Get your structure right to the extent possible the first time. (Database structure gets revised, of course, when the user's requirements evolve, but that does not excuse you from a modest bit of forethought and planning, which saves you much manual labor afterwards.)
Whenever you design a form that is based on a table (using the Form Wizard), the form "inherits" from the current version of the structure. For example, if you designate the value particular column to be looked up from a list of values or from another table (by using the Lookup Wizard), the default appearance of that column, both in the DataSheet view as well as in any automatically generated forms, will be as a Combo Box.
Therefore the more work you do in the structure up front, the less the work you have to do when you create the form.
To minimize manual labor when the structure of a table has been revised after a form has already been created (e.g., a lookup wizard has been employed later), it is sometimes quicker to delete the affected columns from the form and add them again using the Field List, instead of hand-customizing them. Whenever you remove a field and add a new field in its place, you have to ensure that the Tab Order is the same, for the benefit of users who use the keyboard instead of the mouse to move from field to field (use Auto Order).
The Entity-Relationship Diagram was devised by Peter Chen and colleagues in the 1970s as a means of facilitating both the understanding of databases (as well as their visual design, through a GUI). Access's counterpart of the ER diagram is the Relationships view.
When you inherit someone else's database, or to refresh your memory with your own work, the first thing that you do to understand it is to look at the Relationships. If a database does not have relationships defined, that is the sign of an amateur design. Note that, if the tables in the system are ODBC-attached from a back-end database such as Oracle, many (or most) relationships may be missing in the Access relationships diagram.
In this situation, one tool that does a superb job of extracting relationships is Microsoft Visio's E-R Diagramming tool. If you select all the tables in the back-end database, Visio will automatically discover any relationships that have been defined between them, and even automatically arrange the tables in the diagram in such a way that related tables are placed close together in the diagram.
Relationships between tables are what make databases work. You set a relationship by dragging from a column in one table and dropping on a column in another table. A dialog then pops up, and you are given the choice of setting referential integrity, and also designating cascading update and cascading delete.
When you enforce referential integrity between two tables, the ends of the relationship are labeled with the "1" and "Infinity"(∞) symbols. The infinity symbol implies "many": the "1" symbol is always against the primary key. The column/s against the infinity symbol (in the "many" table) are the foreign key.
(For a definition and discussion of Referential Integrity and other database-design-related terms, go to BIS 560 course materials- Design Principles. If you don't want to peek at that page, just remember that Access, by default, does not enforce cascading updates and deletes. Ninety-five times out of hundred, this is the right thing to do, if you want to protect your sanity. Automatic cascading deletes - e.g., deleting payment records for a customer when a customer is deleted, can cause your valuable data to be erased with minimal warning. I only use cascading deletes after I first execute code that backs up the rows to be erased in a safe place, such as a table that serves as an audit trail.)
Always use referential integrity: if you don't, errors will creep in to the database. There are two kinds of errors, which Access will catch if you attempt to set referential integrity, and not catch if you don't.
1. Setting a relationship between a column in a "many" table and a column in the one table that is NOT the primary key.
Relationships should ONLY be between primary keys and non-primary keys- not between two non-primary keys.
In exceptional circumstances, such as when you have inherited flat-file data that you are cleaning up and restructuring into a proper database structure, you may occasionally designate non-primary links. For example, suppose you have two tables containing first name, last name and middle name of persons, you may create a triple link between these tables on these three pairs of columns temporarily, in a query, while you seek to create an artificial primary key such as a person ID (or use an existing one such as social security number).
However, the relationships window is intended to represent permanent relationships between tables, not temporary ones.
2. Setting a relationship between two columns in different tables where the two have different or incompatible data types. For example, a link between a number of type double in one column, and a number of type long in another (or between two text columns of different width, or between a text column and a date column).
When you designate a column as a Number, Access gives you several choices:
Fractional numbers can be represented in one of two ways:
Fixed precision numbers are more suitable for business-type calculations, the latter for scientific computation.
In general, if you hope to port your database to a high-end system such as Oracle or SQL Server and use an Access front-end to this system, stay away from variable precision numbers. Oracle doesn't support them, and even though SQL Server does, when you try to edit a record containing a variable precision number, Microsoft Access falsely reports that this record has been changed by another user and that your changes cannot be saved. So, if you need fractional numbers, use the decimal data type.
This is because certain decimal numbers cannot be precisely represented in binary, just as the number 2/3 cannot be represented exactly in the decimal system. Access mistakenly tries to compare the local value of this number on your machine with the value stored on the remote database using a test for exact equality. As stated above, this comparison will almost always return inequality, and Access fools itself into believing that this field has been edited by someone else even though it hasn't been touched.
When you designate an Autonumber column in Access, then by default the first record/row gets the value 1 for that column, and subsequent rows, this value increases by one. By default, autonumbers are long integers (2 raised to 31, or two billion, maximum), though they can also be Replication IDs, which are 128-bit numbers. This behavior can change- new values can be randomly assigned, though I've never found a use for this feature)
In high-end databases like SQL Server, the starting value for autonumbers can be other than 1, as can the increments. They also allow a maximum of 2 raised to the power of 64, or 2 billion squared.
If a record is deleted, the autonumber value for that column does not get recycled. Given the large range of numbers that can be accommodated, the loss of recyclability is not an issue.
The advantage of autonumbers as primary keys is that they are compact and space-efficient (and therefore efficiently indexed), and so foreign keys that refer to them are also compact. Having a single-column primary key takes up much less space than a multi-column one. (However, you must always create a unique index on the logical combination of columns whose combined values you expect to be unique).
Designate an autonumber primary key for a table ONLY IF the table is going to be linked to from another table via a foreign key.
In some databases that you will see, the designers have tried to create the equivalent of autonumber columns, but designate them as text, with the first value named, for example "D000001", the second "D000002" and so on. Another table may have keys beginning with "C000001", and so on. In my opinion (shared by most database architects, I might add) this approach is a misguided waste of effort.
There are two kinds of situations where you have multiple links between one table and another.
- Pedigree Databases. Here we have two tables: A table of Demographics, containing Person ID, first name, last name, Date of Birth, etc, and A table of Family_Relationships, containing three columns: Individual ID, Mother ID, Father ID, all three pointing to the Person ID column of the Demographics table.
- Drug Interactions databases. (A Drug Interaction is a phenomenon where, when two or more drugs are given to a patient around the same time, the combined effects are not what might be expected if the two were not given individually- sometimes, the drugs cancel each other out, sometimes, one drug increases the toxic effects of the other.
Here, a Drugs table lists Drug ID, Drug name, etc, while an Interactions table has the columns Drug1, Drug2 (both pointing to Drug ID of Drugs), Interaction_Type, Interaction_Severity, etc.
When you try to create a relationship of the second type and a relationship already exists between the pair of tables, Access warns you about this. After having created the extra relationship, Access will create a second copy of the primary-key table in the Relationships window, name it as <tablename>_1, and draw the link. To reduce visual clutter, it is worth superimposing the shape of <tablename>_1 on top of <tablename> to indicate the double relationship (or triple relationship, as in the Pedigree database example).
This "second copy" feature is limited to Microsoft Access: smarter database-modeling tools such as Microsoft Visio are smart enough not to clutter up your relationships diagram with multiple copies of the same table.
The Lookup Wizard, in addition to determining the visual appearance of a column (as a combo box), will also automatically set a relationship between two tables when the values are looked up from another table. However, it will not set referential integrity, and you have to go into the Relationships and do this yourself. In most cases, this two-step process is still faster than setting relationships manually first, because specifying the cosmetics of a lookup field (hiding the ID column, for instance) is more laborious when done manually rather than via the wizard, which makes many sensible assumptions.
The Lookup Wizard is not perfectly smart: in some cases, the "primary key" table does not contain columns that are mnemonic or human-understandable. In this case, you have to get to another table that contains the understandable values, so that you must effectively look up a combination of two tables. In such a case, you can let the Wizard do some of the work, go to the SQL and modify it visually to incorporate the additional table.
In Access 2002 and earlier, the lookup wizard was not helpful enough to volunteer to sort the values in a rational fashion. (For example, if you want to look up Patient IDs in a patients table, the lookup columns will typically be last name, first name, date of birth, but you want the list of patients sorted in this order, rather than by date of birth first. ) This has been remedied in Access 2003, but if you are working with earlier versions, you have to modify the SQL visually by specifying the column/s that you want the values to be sorted by.
When you are working with a database (or data) that you have inherited, or have purchased, don't assume it is clean. The drug database that we are using as a teaching tool contains some inconsistencies.
Attempting to set primary keys, and relationships with referential integrity enforced is the first and most important step in ensuring cleanliness.