- Tech know how online


The term normalization is used in connection with relational database models and characterizes the step-oriented approach to designing the structure of a relational database. In this process, the relations are decomposed in such a way that anomalies cannot occur and redundancies are avoided. The relations contain the logically related data.

The goal of normalization is to generate a clear and as transparent as possible data structure that avoids inconsistencies and ambiguities and supports easy data maintenance. Up to five rules - the normalizations - can be distinguished for the normalization process. Provided that all these rules are observed when generating the data structures, this is also referred to as fully normalized data structures, which, however, result in an increased system load. Normalization supports the preservation of the integrity of a database - this refers to the correctness of a data structure.

If you look at it figuratively, normalization splits relations - also tables - with many attributes of a higher degree into several tables - each of which then has a lower degree. This then avoids both redundancies and difficulties in changing, inserting and deleting so-called anomalies. A relation is called unnormalized if it has attributes with attribute values that are composed of several elements of, for example, a value list. Thus normalization is to be defined inversely as the elimination of dependencies between the attributes of a relation. Functional and transitive dependencies are differentiated.

The normalization of a relational schema

Several steps characterize the normalization of a relational schema. In each step, the relations must satisfy certain rules - also conditions. Thus, the so-called normal form of the relational schema is then formed. This process of normalization can contain up to five steps - however, it usually makes sense to satisfy only the first three conditions. Since the normalization process can result in many small relations, the administration of the database often becomes too complex with further normalization steps. The normalization process is often referred to as decomposition. The higher the degree of decomposition of a relation, the more complex a database operation becomes - here, the performance aspect of a database must also be taken into account. Often, even controlled redundancies are allowed in this context, as in the case of multiple existence of attributes - often also key attributes. The so-called duality principle specifies that the original relation must be recoverable after normalization - the normalization process must therefore be free of losses in any case.

1st normal form. The first normal form (1NF) exists when there are no attributes with attribute values that are composed of multiple elements. Relations in the 1NF are characterized by the fact that they are a two-dimensional structure of rows and columns. Conversely, this means that in the 1NF all multiple entries in an attribute are to be removed and each attribute can thus have only one value at most.

2nd normal form. The second normal form (2NF) is formed by breaking the relation into smaller relations so that in each relation all non-key fields depend only on the primary key. It should be noted that the attributes each depend on an entire key field. It ischecked whether a primary key consists of several attributes, where then the attributes depend only on a part of the primary key. If this is the case, this part of the primary key with its attributes is transferred into a new relation.

3Normal Form. The 3NF assumes that all data fields (attributes) of a table are independent of each other and depend only on the entire key. Thus, each attribute not belonging to the key is functionally dependent only on the entire key. Transitive dependency is the condition when a non-key field is identifiable only through another non-key field. The consequences of transitive dependencies are redundancy and inconsistency of data structures. Thus, for the 3NF, all transitive dependencies must be removed by decomposition of relations. In the resulting relations, all non-key fields are then directly dependent on the entire key field.

Boyce Codd Normal Form (BCNF). Removes dependencies of individual keys or key attributes on each other. This is not considered in the 3NF. The BCNF satisfies a relation exactly if no attribute is functionally dependent on an attribute group without a key property. Here we also speak of so-called minimal keys - these are non-compound keys. Relations that have several overlapping minimal keys create a functional dependency and in turn lead to redundancies.

4. normal form. This is satisfied by relations that are in the 3NF and do not contain multi-valued dependencies of attributes. Multivalued dependency is said to exist when several non-key attributes are directly dependent on the key, but not on each other.

5. normal form. If decomposition in the 4NF is no longer possible without loss, the 5th normal form (5NF) adds more primary keys. This continues until there are only single attribute dependencies on one or more primary keys.

The last two normal forms have less meaning in practice, since here both relations are added and redundancies are generated. With each increase in the number of relations, there are often difficulties in the performance of queries and transactions.

Englisch: normalization
Updated at: 28.10.2013
#Words: 855
Links: connection, database (DB), indium (In), process, data
Translations: DE

All rights reserved DATACOM Buchverlag GmbH © 2022