[BiO BB] RE: Normalization WAS: database design question

JRambla jrambla at hotmail.com
Tue Mar 2 08:21:01 EST 2004


Hi,

According to my experience (near 20 years now) in designing/consulting about
enterprise databases:

- Normalization is good/desirable in all online systems (like ones where
several users can be reading and updating data simultaneously), usually
called OLTP systems. Exceptions are not significant at all.
- De-normalization is good (indeed mandatory) for datawarehouse & data
mining systems where grouping, sorting and summarized data is the real
interest. This is due to performance reasons associated to intensive
calculations. Also, we will apply de-normalization in history files or logs,
where you actually need a snapshot of relationships and data in the moment
of the entry.

The kind of database I remember starting the thread is a sequence database.
I will classify it in the first group, although I have little experience in
that field nowadays.

As I mentioned in the previous e-mail, normalization (usually only the
higher levels count as normalized) means not allowing repetitive data to
live in the system. I.e. not copying customer address data in every invoice
in the Invoices table.

That way any change to the data is done only in the "master" record, and you
don't need to keep track of all places where those data can be copied
before. Keeping track of data copies is, usually, a tricky and error prone
affair. So, you keep out of it as much as you can.

Opposing to that, using record keys (primary and foreign keys) is good,
because you define relationships at database design time, and database
engine helps enforcing those relationships when entering data.

Further details or more concrete questions will allow being more specific.

Hope this clarifies a bit more.

Jordi Rambla
Barcelona (Spain)



-----Mensaje original-----
De: bio_bulletin_board-admin at bioinformatics.org
[mailto:bio_bulletin_board-admin at bioinformatics.org] En nombre de Svensson,
B.A.T. (HKG)
Enviado el: martes, 02 de marzo de 2004 11:46
Para: 'bio_bulletin_board at bioinformatics.org'
Asunto: RE: [BiO BB] RE: Normalization WAS: database design question

Thank u for you suggested readings, but I did seek an elaboration on
why (high/er?) normalization should be regarded as a good design?

-----Original Message-----
From: JRambla
To: bio_bulletin_board at bioinformatics.org
Sent: 2004-03-02 10:28
Subject: RE: [BiO BB] RE: Normalization WAS: database design question

Hi all,

Normalization is a concept that comes from relational database theory,
created by the recently deceased Dr.Edgar F. Codd, a mathematician at
IBM. That theory is the base of all SQL-whatever world.

Normalization is a group of rules (5, if my memory is right) to apply to
table design in order, basically, to eliminate redundancy on data. That
redundancy will arise in the form of embarrassing, and sometimes hard to
find, consistency problems on data stored in the database.

As was suggested, following those rules is a good starting point to
design a database.

You can find a good introduction in

http://www.sequoia.be/consult/method/english.htm

Hope this helps,

Jordi Rambla
Barcelona (Spain)

-----Mensaje original-----
De: bio_bulletin_board-admin at bioinformatics.org
[mailto:bio_bulletin_board-admin at bioinformatics.org] En nombre de
Svensson,
B.A.T. (HKG)
Enviado el: martes, 02 de marzo de 2004 2:13
Para: 'bio_bulletin_board at bioinformatics.org '
Asunto: [BiO BB] RE: Normalization WAS: database design question

> Normalization is the process of designing a good data model.

Please, explain this statement.
_______________________________________________
BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
_______________________________________________
BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
_______________________________________________
BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board



More information about the BBB mailing list