An enterprise data warehouse is a collection of an entire organization’s data in a central location, accessed by the organization’s users to get a holistic view of the business.

Data within an enterprise data warehouse is organized to establish a single version of the truth. It is a highly curated environment, and a lot of effort goes into curating the data.

Here are some of the guiding principles for designing an enterprise data warehouse.

  1. Data within the enterprise data warehouse is reconciled with the source data to ensure accuracy.
  2. Data within the enterprise data warehouse is subjected to various data quality checks to ensure that it is a true reflection of reality.
  3. Data within the enterprise data warehouse is well documented to ensure that it is correctly interpreted. Data catalogue and definitions are made available to all authorised users through a convenient channel.
  4. Data within the enterprise data warehouse can be traced back to its original source, to ensure integrity of data.
  5. Data within the enterprise data warehouse is consolidated from various discrete sources to create an enterprise view of the business.
  6. Data within the enterprise data warehouse is homogenized so that each data element has a single, unambiguous interpretation.
  7. Data within the enterprise data warehouse is accessible only to authorised users.
  8. Data within the enterprise data warehouse is audited, and any changes to data over time are captured and preserved for reference.
  9. Data within the enterprise data warehouse is well organized, with high levels of referential integrity enforced at the database level.
  10. Data within the enterprise data warehouse can be accessed by users easily and efficiently.
  11. Data within the enterprise data warehouse is well managed and governed to ensure quick resolution of data quality and data integrity issues.
  12. Data within the enterprise data warehouse always reflects the latest state of business and gets refreshed as frequently as deemed necessary for effective usage.
  13. Data within the enterprise data warehouse is cleansed to ensure good quality and readability for the users.
  14. Data within the enterprise data warehouse is never edited or manipulated and is a true reflection of its source. Data gets loaded into the data warehouse from source systems through automated and scheduled jobs.
  15. Data within the enterprise data warehouse is organized such that any historic view of the data can be reproduced whenever needed. Historic data is retained in the data warehouse for a sufficiently long time.
  16. Data within the enterprise data warehouse is structured and well organized to ensure easy retrieval and further processing. Common structures are third normal form (3NF) and data vault.
  17. Data within the enterprise data warehouse is kept to a minimum, since managing it is a costly affair. Only data that is required for the project is brought into the data warehouse.
  18. Master data within the enterprise data warehouse is de-duplicated so that each entity keeps a single, up-to-date identity.
  19. Data within the enterprise data warehouse represents the lowest grain in its most basic form, and does not contain any processed or aggregated data.
  20. Data within the enterprise data warehouse never gets deleted, but simply changes to a closed status for entities that no longer exist (unless this conflicts with regulatory requirements to forget personally identifiable data).
  21. Data within the enterprise data warehouse is loaded incrementally, such that only new data (or changes to the previously loaded data) gets loaded. This reduces the data processing load, which is important because loading data into the data warehouse can get very costly.
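
Principles 15, 20 and 21 fit together naturally: an incremental load writes only new or changed rows, closes superseded versions instead of deleting them, and so keeps every historic view reproducible. The sketch below illustrates one possible way to do this in Python using validity dates. The `CustomerVersion` structure, the `city` attribute and the function names are assumptions made for illustration, not part of any particular warehouse product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CustomerVersion:
    """One historised version of a customer row (hypothetical structure)."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the version is still open


def load_increment(history, snapshot, load_date):
    """Apply a source snapshot {customer_id: city} to the history in place.

    Returns the number of new versions written.
    """
    open_rows = {v.customer_id: v for v in history if v.valid_to is None}
    written = 0
    for customer_id, city in snapshot.items():
        current = open_rows.get(customer_id)
        if current is not None and current.city == city:
            continue  # unchanged row: nothing is loaded
        if current is not None:
            current.valid_to = load_date  # close the old version, keep it
        history.append(CustomerVersion(customer_id, city, load_date))
        written += 1
    # Entities missing from the source are closed, never deleted.
    for customer_id, current in open_rows.items():
        if customer_id not in snapshot:
            current.valid_to = load_date
    return written


def as_of(history, when):
    """Reproduce the state of the data as it was on a given date."""
    return {
        v.customer_id: v.city
        for v in history
        if v.valid_from <= when and (v.valid_to is None or when < v.valid_to)
    }
```

Calling `load_increment` once per scheduled run and `as_of` with any past date gives the point-in-time view. A real warehouse would typically do the same comparison in SQL or in the ETL tool, and would detect changes from source timestamps or change data capture rather than from full snapshots.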
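
The reconciliation described in principle 1 is often done by comparing control totals between the source extract and the loaded warehouse table. The following is a minimal sketch of that idea, assuming rows are available as dictionaries and that a numeric field named `amount` exists; both the field name and the function names are hypothetical.

```python
from decimal import Decimal


def control_totals(rows, amount_field):
    """Return (row count, sum of amount_field) for an iterable of dict rows."""
    count = 0
    total = Decimal("0")
    for row in rows:
        count += 1
        # Convert via str so float inputs do not carry binary rounding noise.
        total += Decimal(str(row[amount_field]))
    return count, total


def reconcile(source_rows, warehouse_rows, amount_field="amount"):
    """Compare control totals; return a list of human-readable mismatches."""
    src_count, src_total = control_totals(source_rows, amount_field)
    dwh_count, dwh_total = control_totals(warehouse_rows, amount_field)
    issues = []
    if src_count != dwh_count:
        issues.append(f"row count: source={src_count}, warehouse={dwh_count}")
    if src_total != dwh_total:
        issues.append(
            f"{amount_field} total: source={src_total}, warehouse={dwh_total}"
        )
    return issues
```

An empty list means the two sides agree on these totals. Matching totals do not prove that every row is identical, so this kind of check is usually complemented by the data quality checks in principle 2.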
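
Principle 9 asks for referential integrity to be enforced by the database itself rather than by the loading code. As a small illustration, the sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customer` and `sales_order` tables are invented for the example, and a production warehouse would run on a different engine with its own constraint syntax.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a foreign key between two tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE sales_order ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
    )
    return conn


def try_insert_order(conn, order_id, customer_id):
    """Insert an order; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO sales_order (order_id, customer_id) VALUES (?, ?)",
            (order_id, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised for an order that points at a customer that does not exist.
        return False
```

With the constraint in place, an order that references a missing customer is rejected at insert time, whichever job or user attempts the load.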