Analysis Trap 2 - Incomplete business understandingThe Analysis Trap 2 is about incomplete business understanding or misinterpretation of the business processes.
This is a common source for inadequate data models. There are (at least) two factors that may lead to this; The system analyst is lacking experience either from the specific business area, or in general, and/or the business fails to bring forward enough detailed information about the business needs.
I have come to the conclusion, after more than 15 years of ER modeling, and a total of 28 years in the software industry, that more often than not, the "professional" part in the business analysis task, the system analyst, must take a fair share of the burden here.
Being a system analyst is not the same as being a programmer. These are two distinct different professions, and different skills are necessary. It is not a disadvantage for a system analyst to have a programming background; you may quickly see areas where the model will influence the development phase in a negative way.
Actually, problems with building queries against a relational database are very often an effect of an inadequate (denormalized) database structure. The opposite is also a fact: It is much easier to write optimal queries and updates against a normalized database structure.
The business probably do not know data modeling in detail. That is why they use consultants. However, some basic knowledge is required in order to participate in the analysis phase. You may forward them the free eBook on Entity Relationship Modeling - Principles as a primer to understand it. You find it in my eBooks section. If the customer learns the basic principles, he is much better prepared to communicate his business and have a dialogue with the system analyst(s). Here are some common errors in ER modeling:
Failing to understand that the same information is repeated:
The business may say: For each customer, we need his business address and his delivery address. That is two addresses. If you model it like this,
you are violating the First Normal Form: Repeating attributes/group of attributes.
Ask questions such as:
This model will let him have as many choices as he may wish. I don't say this is good enough, though;
Failing to build flexibility:
The business says; We have customers and suppliers. You might model it like this,
While instead, you could ask:
Does it ever happen that a customer is also a supplier of other goods? It happens more often than one should think. You could solve it like this,
We changed the name of CUSTOMER into BUSINESS CONTACT. Using the many-to-many relationship in an early stage of analysis is a powerful mechanism. It shows that a relationship is more complex than first anticipated, and as long as that many-to-many relation is there, it is a reminder to resolve it at some point in time.
Even worse; The business might say: We identify our customers and suppliers by a number from separate series to identify them, so we will need the customer type as an additional means to identify them. Truly, this happens. The analyst gives them this,
where the unique combination of Business contact id and contact type is the primary key. The vertical bar at the crow's foot notates that the relationship is (a part of) the primary key.
This analysis trap breaks Second Normal Form. Name, address and so on is not dependent on contact type, only on business contact. If a business contact is both a customer and a supplier, all names and address information etc. must be stored twice.
But what about the whole concept of customers and suppliers? They are only two of several groups that are relating to the business. In an early stage, we should rather say: We have several groups of business contacts, such as customers, suppliers, and maybe other interest groups, all of which we may have to deal with. In such cases, one may use the terms super-entity and sub-entity as a very powerful illustration of generic (common) structures, like this,
We are trying to find all attributes for each sub-entity. Since they obviously are closely related to each other, many of the attributes will be common. As the analysis moves forward, we add attributes to each sub-entity. Where an attribute exists in all sub-entities, we move it to the super-entity. We se that actually, some of the sub-entities are different types of the same thing, because they do not have un-common attributes:
Now we can be more specific:
The model is still not complete - Primary key definitions are insufficient. Since we are at the end of this keynote, I leave it to you to investigate further...
Also, the arc between business contact and shareholder has to
be studied closer. Arcs are, in my opinion, a signal that the area is
insufficiently studied: What we really say with an arc is: It's either
this or that. Choose your pick. Not good enough, in my opinion. It will
force you to represent both relationships as optional foreign keys in
the physical data model, and optional is a violation of relational
The use of arcs and mutually exclusive relationships are explained in detail in my free eBook Entity Relationship Modeling - Principles.
This keynote is a (slightly) compressed chapter from the eBook, in which the shareholder is removed from the diagram altogether.
Free eBookSubscribe to my newsletter and get my ebook on Entity Relationship Modeling Principles as a free gift:
What visitors say...
"I just stumbled accross your site looking for some normalization theory and I have to say it is fantastic.Read more Testimonials
I have been in the database field for 10+ years and I have never before come across such a useful site. Thank you for taking the time to put this site together."
Theory & Practice
SQL Server DBA
Install SQL Server
Database Normalization eBook:
Copyright © www.databasedesign-resource.com / Alf A. Pedersen
All rights reserved.
All information contained on this website is for informational purposes only.
Disclaimer: www.databasedesign-resource.com does not warrant any company, product, service or any content contained herein.
Return to top
The name Oracle is a trademark of Oracle Corporation.