In this article Dain Hansen unravels the mystery surrounding data federation.
What are the essential requirements for a data federation solution? How
is data federation considered a stepping stone to a successful SOA? Why
are both data quality and data profiling so critical in bringing about
manageable data services? What’s the impact of these emerging
federation techniques on data warehousing and business intelligence
applications? This article will answer these questions through industry
data, architectural patterns in data integration, and compelling case
studies that illustrate the importance of embracing data federation.
Introduction: The Need for Reusable Data
In the struggle for information agility, take one part SOA, add two parts data, and the result is the formidable and extremely popular concept of data services. But what lies beyond the hype? Data services transform data sources into reusable data components so that data can be accessed, aggregated, and managed more effectively. Our SOA forefathers taught us that service-oriented architectures tend to be much more agile than their constrained EAI contemporaries. We need to think about our data architectures in the same way.
There are a number of reasons that an organization might undertake a data-services implementation.
• Data is everywhere. Data volumes in organizations are growing quickly, and so is the importance of turning that data into useful information.
• Data originates from a variety of structured and unstructured sources.
• Without data services, data access is too complex a task, involving a different data model and data format for each data source.
• Point-to-point connections between data consumers and providers proliferate as IT implementations evolve.
• The lack of a real-time, unified view across multiple sources makes data even more difficult to leverage consistently.
For most organizations, data appears on the architectural radar only when there is a mandate to derive specific value from it. The data needed may relate to sales forecasts, campaign results, customer-service metrics, regulatory compliance, operational performance results, or customer profiling. Companies initiating projects that require data must make choices. For instance, should they continue to connect data in the same customized, rigid, point-to-point way that they use for applications (Figure 1)?
Experienced IT professionals know that single-use data integration projects create long-term maintenance challenges and offer negligible ROI on the data integration portions of the project. The better strategy is to apply SOA principles to data integration, turning data into a service that is available as logical modules, each with a standards-based interface. This enables users to access and use the service more easily, improves data visibility, and promotes greater reuse.
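The idea of exposing data through a common interface can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CustomerDataService` interface, the two wrapper classes, and the sample records are hypothetical names invented for this example, and an in-memory SQLite database and a CSV string stand in for real enterprise sources.

```python
import csv
import io
import sqlite3
from typing import Optional, Protocol


class CustomerDataService(Protocol):
    """Hypothetical shared interface; consumers depend only on this."""

    def get_customer(self, customer_id: int) -> Optional[dict]: ...


class SqlCustomerService:
    """Wraps a relational source (an in-memory SQLite database here)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get_customer(self, customer_id: int) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else None


class CsvCustomerService:
    """Wraps a flat-file source (CSV text held in memory here)."""

    def __init__(self, csv_text: str) -> None:
        reader = csv.DictReader(io.StringIO(csv_text))
        self._rows = {int(r["id"]): r["name"] for r in reader}

    def get_customer(self, customer_id: int) -> Optional[dict]:
        name = self._rows.get(customer_id)
        return {"id": customer_id, "name": name} if name is not None else None


def greeting(service: CustomerDataService, customer_id: int) -> str:
    """A consumer written once against the interface, reusable with any source."""
    customer = service.get_customer(customer_id)
    return f"Hello, {customer['name']}" if customer else "Unknown customer"


# Sample data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
sql_service = SqlCustomerService(conn)
csv_service = CsvCustomerService("id,name\n2,Grace\n")
```

The consumer function `greeting` never sees SQL or CSV details, so swapping or adding a source means writing one new wrapper rather than another point-to-point integration.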
Understanding Data Federation Patterns
There is broad agreement in the industry that data services have a transformational influence on enterprise data-centric architectures. Our analysis indicates that there are three important scenarios where data can be exposed as reusable services (Figure 2):
• Simple Data Access
• Data Federation Services
Finally, data federation is defined as the capability to aggregate information across multiple sources into a single view. It leaves data at the source and consolidates information virtually, much as an enterprise service bus (ESB) virtualizes messages, but for data. In essence, data federation lets companies aggregate data across multiple sources into a real-time view that can be reused as a service.
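To make the definition concrete, here is a minimal Python sketch of a federated view. It assumes two hypothetical sources invented for this example: an orders database (an in-memory SQLite table) and a CRM system (a plain dictionary). The function name `customer_view` and all sample records are likewise assumptions, and a real federation product would do far more (query planning, caching, security).

```python
import sqlite3

# Source 1 (assumed): an orders database, simulated with in-memory SQLite.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 40.0), (1, 60.0), (2, 25.0)],
)

# Source 2 (assumed): a CRM system, simulated with a dictionary.
crm_records = {1: {"name": "Ada"}, 2: {"name": "Grace"}}


def customer_view(customer_id):
    """Build a unified customer view on demand.

    Each call reads both sources at request time and combines the
    results; nothing is copied into a central store.
    """
    profile = crm_records.get(customer_id)
    if profile is None:
        return None
    total = orders_db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()[0]
    return {"id": customer_id, "name": profile["name"], "total_spent": total}
```

Because the view is assembled at call time, a new row written to the orders source appears in the next call to `customer_view` without any load or synchronization step, which is the sense in which the data stays at the source.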
We’ll next look at these three examples in detail and discuss the pros and cons of implementing each pattern.