NotesFAQContact Us
Collection
Advanced
Search Tips
Back to results
Peer reviewed Peer reviewed
Direct linkDirect link
ERIC Number: EJ807684
Record Type: Journal
Publication Date: 2007
Pages: 17
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1547-9714
EISSN: N/A
Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints
Philip, George C.
Journal of Information Technology Education, v6 p481-497 2007
This paper identifies several areas of database modeling and design that have been problematic for students and even are likely to confuse faculty. Major contributing factors are the lack of clarity and inaccuracies that persist in the presentation of some basic database concepts in textbooks. The paper analyzes the problems and discusses ways to minimize them. Specifically, the paper discusses the practical role of normalization in database design, addresses the confusion in representing attributes with repeating values, discusses how to remove inconsistencies in defining relations and first normal form, simplifies the process of identifying candidate keys to normalize relations, clarifies the conditions under which insertion and deletion anomalies may occur, and sheds light on the confusion in defining weak entities. Normalization plays a vital role in both the theory and the practice of database design. The top-down approach popularly used in relational database design creates a conceptual schema that is represented by entity-relationship (E-R) models, and then uses mapping rules to convert the conceptual schema to relation schemas. Because E-R modeling is an intuitive process, errors could occur in identifying entities and their relationships, resulting in un-normalized relations. Un-normalized relations also could result from converting files in legacy systems and spreadsheets to relational tables. Normalization plays a key role in verifying the goodness of design of such relations and in improving the design. The concept of repeating values in relations plays a major role in defining relations and first normal form. Yet, textbooks in general do not distinguish between multi-valued and single-valued attributes in a schema. This lack of clarity may result in conflicting interpretations of the schema. The paper presents a simple solution to the problem. The lack of clarity in defining the terms tables, relations, and first normal form (1NF) in textbooks is another potential source of confusion. Some books define relation as a table with no duplicate tuples, and "only atomic values." These books then redundantly define 1NF as a relation with "only atomic values." Others define a relation as a table with columns and rows, and state that a relation is in 1NF if each value is atomic. These definitions fail to specify an important requirement of 1NF that there are no duplicate tuples. A third definition of 1NF that fails to include this property is that a table is in first normal form if each value is atomic. A challenging task for many students during the normalization process is checking whether a determinant is a candidate key. The standard method is to check whether every attribute of the relation is functionally dependent on the determinant. The paper presents a method that involves only the determinants, and therefore makes it easier to identify candidate keys. The paper also provides an alternate definition of Boyce-Codd Normal Form (BCNF), which is easier to apply. Discussions in textbooks and other literature on the topic of normalization often give students the impression that data redundancy in un-normalized relations leads to all three types of anomalies--insertion, deletion, and update. The paper shows that though data redundancy generally results in insertion and deletion anomalies, that is not always the case. The conditions under which insertion/deletion anomalies don't occur are discussed. Guidelines for mapping conceptual schema to relational tables often use the terms strong entity and weak entity to provide separate mapping rules for each. It is shown that the definition of weak entity as presented in many textbooks, however, is inaccurate. These books define weak entity using logical dependence rather than identifier dependence of entities. The paper shows that several database design concepts and techniques commonly are presented inaccurately or ambiguously in textbooks and are problematic for students. However, as presented in the paper, simple solutions exist to minimize students' problems in these areas. (Contains 7 tables and 1 figure.)
Informing Science Institute. 131 Brookhill Court, Santa Rosa, CA 95409. Tel: 707-537-2211; Fax: 480-247-5724; Web site: http://JITE.org
Publication Type: Journal Articles; Reports - Descriptive
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A