The term “Data Warehouse” intuitively brings to mind a unified representation of data across an enterprise. While this is the standard theoretical picture of a data warehouse, many practical scenarios demand alternate approaches. It is thus not uncommon for an organization to implement multiple data warehouses to handle data across the enterprise. Although it may be mistaken for a sign of ineffective data organization, the use of multiple warehouses (also known as the plural data warehouse concept) is often simply a sign of an evolving firm.
CHALLENGES OF RETAINING SINGULARITY
For most small-to-medium firms, the customer base and product line-up are limited enough that a single data warehouse can handle all operations. However, as a firm grows in size, several issues arise. The largest firms have branches in different regions across the world, where region-specific business needs and restrictions often apply. Furthermore, a large firm with a diverse product line-up has to deal with different client bases for different products, each with its own needs and restrictions. Even inter-department business perspectives on the same product may differ (for example, “1-tonne steel” may be the “end product” for the metallurgy department while it is seen as “raw material” by the manufacturing department).
The geographical and organizational issues that these challenges raise mean that most large-scale firms cannot practically work with a single data warehouse. Further complicating things, large firms also tend to participate in mergers and acquisitions, which again leave the same parent firm dealing with multiple warehouses. Thus, a situation where a firm is forced to deal with multiple data warehouses is not always an indication of improper data organization. It is often an indication that a small-to-medium firm is evolving into a large-scale firm.
THE RISE OF THE PLURAL DATA WAREHOUSE
Attempting to force the diverse data produced in a large firm under one roof leads to difficulties and inefficiencies in both information retrieval and knowledge extraction. Large-scale firms must thus confront the fact that they need to employ multiple warehouses for better performance. In fact, the best way to improve both processes is to organize these various data warehouses so that they can behave both as individual data warehouse units and as a single federated “Plural Data Warehouse”.
The Plural Data Warehouse can be designed using one of two Federation methodologies: Region-based or Functional.
Region-based Federation deals with the issue of multiple data warehouses located in multiple locations. This form of Federation consists of two levels: a lower level of individual, independent “regional” data warehouses, and a higher level consisting of a “global warehouse” to which all the lower-level warehouses connect. Data flows through this arrangement in two ways:
- Upward Federation: Data moves from Regional Level to Global Level Warehouse.
- Downward Federation: Data moves from Global Level to Regional Level Warehouse(s).
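The two flows above can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration (the `Warehouse` class, the `sales` and `products` fields, the region names); it models each warehouse as an in-memory object rather than a real database, and is not a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    """Toy stand-in for a warehouse (hypothetical, in-memory only)."""
    name: str
    sales: list = field(default_factory=list)     # fact rows owned by this warehouse
    products: dict = field(default_factory=dict)  # shared reference data


def federate_upward(regions, global_wh):
    """Upward Federation: copy regional rows to the global level,
    tagging each row with the region it came from."""
    for region in regions:
        for row in region.sales:
            global_wh.sales.append({**row, "region": region.name})


def federate_downward(global_wh, regions):
    """Downward Federation: push reference data held at the global
    level out to every regional warehouse."""
    for region in regions:
        region.products.update(global_wh.products)


# Assumed example data: two regional warehouses and one global warehouse.
emea = Warehouse("EMEA", sales=[{"sku": "A1", "units": 3}])
apac = Warehouse("APAC", sales=[{"sku": "A1", "units": 5}])
global_wh = Warehouse("GLOBAL", products={"A1": "1-tonne steel"})

federate_upward([emea, apac], global_wh)
federate_downward(global_wh, [emea, apac])

# The global level can now answer a cross-region question.
total_units = sum(row["units"] for row in global_wh.sales)
```

Note that the regional warehouses keep their own rows after the upward flow, so each one can still operate as an independent unit.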
Functional Federation deals with the issue of organizational boundaries giving rise to multiple data warehouses. In this arrangement, departments that share common data points have their data flows integrated into the appropriate data warehouses.
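As a rough sketch of this idea, the routing of shared data points can be pictured as a lookup table that sends each record to the warehouse responsible for it, regardless of which department produced it. The table `ROUTES`, the warehouse names, and the record fields below are all hypothetical assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical routing table: which functional warehouse receives each data point.
ROUTES = {
    "steel_output": "supply_chain_dw",
    "steel_input": "supply_chain_dw",
    "invoice": "finance_dw",
}


def route(records, routes=ROUTES):
    """Group records by destination warehouse according to their data point."""
    warehouses = defaultdict(list)
    for record in records:
        target = routes[record["data_point"]]  # KeyError if a data point is unmapped
        warehouses[target].append(record)
    return dict(warehouses)


# Assumed example: two departments report on the same steel from different
# perspectives, and both flows end up in the same functional warehouse.
records = [
    {"dept": "metallurgy", "data_point": "steel_output", "tonnes": 1},
    {"dept": "manufacturing", "data_point": "steel_input", "tonnes": 1},
    {"dept": "sales", "data_point": "invoice", "amount": 500},
]
warehouses = route(records)
```

Keeping the mapping in one table is a design choice for the sketch: it makes the department boundaries explicit and easy to change as the organization evolves.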
PLURAL DATA WAREHOUSES: THE PROS AND CONS
Much like any other implementation, plural data warehouses have their fair share of pros and cons. Here are a few:
Pros:
- A Plural Data Warehouse allows different types of warehouses (including legacy warehouses) to continue serving their specialized purposes uninterrupted while still providing a common access route.
- Integration of the various data warehouses allows for cross-departmental information retrieval and analytics.
- The option to integrate legacy BI systems greatly shortens implementation time compared to designing an enterprise data warehouse from scratch.
Cons:
- The Plural Data Warehouse can only be successful if all the constituent warehouses operate with the same O/S, Hardware, Backups, and Archiving Systems.
- Due to the varied nature of the warehouses and their constituent data, it is difficult to come up with a uniform information security implementation.
- Attempting to build a plural data warehouse without standardization can be highly expensive.
Encountering a plural data warehouse scenario is no cause for alarm. It is rather an opportunity for firms to learn and experiment with the organizational structure of their existing data warehouses. Firms should embrace plural data warehouses as a means of achieving ease of implementation and cross-functionality in a cost-effective manner.
The post What are Plural Data Warehouses? appeared first on Pyramid Solutions.