Course Material and Recommended Reading
No single text covers every aspect of this rapidly evolving field. Much of the most current information appears in magazine articles rather than in books. The recommended references are:
1. SQL: The Complete Reference. James Groff and Paul Weinberg (McGraw-Hill). Much of this course will emphasize SQL, and this will be the book you use most of the time.
2. Special Edition: Using Microsoft Access 2000. Roger Jennings (Que). This is essential for the course work. Roger Jennings is a serious developer, and this book is an honorable exception to the typical Que book, which is generally run-of-the-mill in terms of quality.
3. Object-Oriented Systems Analysis: Modeling the World in Data. Sally Shlaer and Stephen J. Mellor (Yourdon Press, 1988).
4. An Introduction to Database Systems. C. J. Date (Addison-Wesley). The bible, but it is a little heavy going.
5. SQL for Smarties. Joe Celko (M&T Books, 1995). Use this only after you've gone through the other sources. Celko writes in a very entertaining style.
6. SAS mavens will need to go through the SAS manuals for version 8 to understand SAS's PROC SQL capability. (Warning: Many texts on SAS do not mention a word of SQL, which was added relatively recently.)
We also recommend that students periodically scan the Intelligent Enterprise Web site (www.intelligententerprise.com) for articles (Celko's monthly column, among other things, is posted there), and that they read magazines such as PC Magazine and Byte.
Requirements and Course Work:
Learning databases is like learning cooking: you have to work in the kitchen, and all the theory in the world won't help you if you haven't done so. This course emphasizes a mix of theory and practical work, and will use Microsoft Access for the latter. In addition to two tests, there will be a single assignment, which will be done over several weeks. Each student will choose an assignment in advance; the instructor will check the choice to make sure that it is not too simple. If a student is already doing a dissertation, the material of the dissertation is sometimes a suitable topic for an assignment. Students who are auditing the course have often joined with a specific problem in mind: this problem can serve as the assignment.
Students will be expected to demonstrate their assignments directly on the computer. Credit will be given for effort in design and for accurate, comprehensive modeling of the specific area for which the database is being created. No one is expected to get the design right the first time: database design is an iterative process even for professionals. Each student must meet with the instructor several times during the course of the assignment: before creation of the database, after design, after building a user interface, etc. This way, problems can be resolved as soon as they are perceived.
Definition of Database. Architecture from the users' and designers' viewpoints. Types of logical design: hierarchical, network, relational, object-oriented, hybrid architectures.
How to design a database: Entity-Relationship and Object-oriented modeling. Practical considerations.
The relational model:
Advantages and disadvantages
Structure: Tables, fields, keys, constraints, referential integrity. Indexes and how they work.
SQL. Query, Insert, Update. Set-based operations. Procedural extensions.
User interface considerations: building a database for someone else to use.
Technology related to databases: Spreadsheets, statistical analysis programs. Selecting the right tool for the job.
Wide-Area Networks. Client-server databases. Portability versus efficiency in design. Microsoft ODBC and other connectivity standards.
Accessing databases through the World-Wide Web.
The Medical Case Record: difficulties in implementation. Data interchange standards (HL-7). Attempts at standardizing the content of the Patient Record. The National Library of Medicine's UMLS.
Designing a database for use in epidemiology: case studies.
Embedded databases: spelling checkers, thesauri, medical expert systems.
Bibliographic (free-text) databases.
Specialized databases: Image (e.g., fingerprints), Sound. Efficient retrieval methods for unusual types of data.
Fundamentals of Data Warehousing.
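As a small taste of the relational-model topics listed above (tables, keys, referential integrity, and SQL query, insert, and update), here is a minimal sketch. It is not part of the course materials: it uses Python's built-in sqlite3 module rather than Microsoft Access so that it runs anywhere, and the patient and visit tables, their columns, and the sample data are all hypothetical names chosen for illustration.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE patient ("
        " patient_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE visit ("
        " visit_id INTEGER PRIMARY KEY,"
        " patient_id INTEGER NOT NULL REFERENCES patient(patient_id),"
        " visit_date TEXT NOT NULL)"
    )
    # Insert: one parent row and one child row that refers to it.
    conn.execute("INSERT INTO patient (patient_id, name) VALUES (1, 'A. Example')")
    conn.execute("INSERT INTO visit (patient_id, visit_date) VALUES (1, '2000-01-15')")
    # Update: a set-based change to every visit belonging to patient 1.
    conn.execute("UPDATE visit SET visit_date = '2000-01-16' WHERE patient_id = 1")
    return conn


def orphan_visit_rejected(conn):
    """Return True if the database refuses a visit for a patient that does not exist."""
    try:
        conn.execute("INSERT INTO visit (patient_id, visit_date) VALUES (99, '2000-02-01')")
    except sqlite3.IntegrityError:
        return True
    return False


conn = build_demo()
# Query: join the two tables on the key they share.
rows = conn.execute(
    "SELECT p.name, v.visit_date"
    " FROM patient AS p JOIN visit AS v ON v.patient_id = p.patient_id"
).fetchall()
print(rows)
print(orphan_visit_rejected(conn))
```

The foreign key on visit.patient_id is what provides referential integrity here: the database itself, not the application, rejects a visit that points at a nonexistent patient.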