Proper documentation has always been one of the major concerns in any IT set-up irrespective of the scale. Many IT engineers tend to jump on the tools and software to start coding or solving problems without proper planning and design staging (which I believe is an important step to maneuver unforeseen situations). For business analysts and developers, their day-to-day duties, a data warehouse is foremost. And to do so, the following seven-steps must be considered, according to David Walls, Mark D. Scott, in ITPro Today, Dec 20, 1999:
- Determine business objectives
- Collect and analyze the information
- Identify core business processes
- Construct a conceptual data model
- Locate data sources and plan data transformations
- Establish tracking duration
- Implement the plan
A data engineer helps the business owner during the planning and designing of the warehouse that fulfills the business requirements. This process is called Data Modeling. According to Techopedia, data modeling is a tabular representation of the data structures of a company’s database. It is considered a very powerful representation of the company’s business requirements.
Why is data modeling important?
Imagine you want to build a house. Will you start by digging the land immediately and building blocks to make the walls or will you start by acquiring the land first and get an architect to plan the construction of house the way you want it, with all the functionality and specifications you wish your house to have?
Naturally, you will approach an architect, who understands the nitty-gritties, prepares the design proposals, produce detailed drawings, negotiate with contractors and other professionals involved and so on.
Data modeling needs similar planning and vision. It is important to understand the elements in the process of building the warehouse, the connections between those elements, the links, the data to be loaded, and the step-by-step process of putting the data together. By doing so, one could easily detect defects before the development phase starts in order to rectify them at an early stage. Without a proper plan and design, a data model could lead to performance issues, slow query processing, and delays in building the actual warehouse.
The diagram below draws a parallel representation of the modeling process for a house and that of a data warehouse.
A real-case scenario
A very successful international retailer company went from spreadsheet data mart and SQLs that have some drawbacks, like irregularity and late delivery to decision makers towards different production environments/database to a data model that supported on time delivery of weekly and monthly financial reports on sale and other KPIs. The advantage, in this case, was that all the measurements were equally defined across multiple systems and countries. This allowed the company to better control, the ratio between the number of people visiting the shop and the number of people buying the product.
On the other hand, a communications company faced many challenges by building an analytical platform to make the data available and make the company more data-driven. The challenges faced here were because the company had not conformed both artifacts and defined parameters in the data prepared.
A model is a concept or a representation of a real-world object, a data model act as a guideline to the developers in building the warehouse. The advantage of building a model is that we can easily see where the missing entities are, allowing the developers to stay ahead when implementing and building the actual warehouse. This also help the business owner to save money and time. Therefore, discussing the model with important stakeholders and business owner before building it is essential. Everyone is on the same page and everyone agrees to the design and the implementation strategies.
In the next publication, we will go deeper into the topic of data modeling architecture.