Sunday 8 February 2015

Part 4: Types of Data Warehouses

In this series, the previous post was about Part 3: Types of Data Warehouse Architectures. Please go through it, if you missed it.

A data warehouse is a relational/multidimensional database that is designed for query and analysis rather than transaction processing. A data warehouse usually contains historical data that is derived from transaction data. It separates analysis workload from transaction workload and enables a business to consolidate data from several sources.
In addition to a relational/multidimensional database; a data warehouse environment often consists of an ETL solution, an OLAP engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

There are three types of data warehouses:

  • Enterprise Data Warehouse
  • ODS (Operational Data Store)
  • Data Mart


Enterprise Data Warehouse

In an enterprise Data Warehouse, the entire data of the organisation is fetched from the heterogeneous data sources into the ODS, then the data is loaded in to the data warehouse after cleansing and applying business transformations. 
The data marts are built from the data warehouse. Each data mart will hold the data specific to the departments in the organization. Each user group will have access to the respective data marts. This mechanism helps to secure the data by avoiding unnecessary access and it also speed up the report generation.The user groups uses analytic tools to generate the reports from the respective data marts.

ODS (Operational Data Store)

ODS is the place where the data from multiple heterogeneous sources are stored for further processing. Data is loaded to the ODS in batches. Data once loaded in the ODS will not go back to the OLTP / Source / Originating system. Some amount of cleansing and standardization is applied to the data before storing it in the Operational Data Store. It act as a single version of truth to the organization. It enables the user to get all the organizational information at one place. ODS is not a data warehouse. But it act as an intermediate storage in the data warehouse. In the above diagram, the data from files, CRM, ERP, Web Service, etc are loaded into the ODS using ETL process.


Data Mart

Data Mart is a subset of Data Warehouse. There are 2 different approaches in Data Warehouse.

Top-Down Approach

Top-Down Approach is introduced by Bill Inmon. In this approach, the data warehouse is created first. Then the data marts are created from the broad data warehouse. For example Sales, Purchase, Marketing, Human Resource, etc.

Advantages

  • The entire organizational data will be available to uses from the data warehouse.

Disadvantages

  • Development time is more.
  • Scalability is complex.

Bottom-Up Approach

Bottom-Up Approach is introduced by Ralph Kimball. In this approach, the data marts are created first. Each data mart serves a particular business area's reporting requirements in an organization. For example Sales, Purchase, Marketing, Human Resource, etc.
Once the data marts are created, all these data marts are integrated together to form the data warehouse.

Advantages
  • Development time is minimal since it will not create the entire data warehouse at once.
  • Faster report availability.
  • Highly scalable.

Disadvantages
  • The entire information of the organization will not be available since the data is scattered in different data marts.

Thanks for reading my blog.