Wednesday, November 23, 2011

Data Virtualisation As An Approach To Data Integration

Many approaches to Data Integration are now available, yet Extract, Transform and Load (ETL) remains by far the most popular.
However, the pace of business change and the need for agility demand that organizations support multiple styles of data integration.

Three leading options present themselves; the sections below describe how these three major styles of integration differ.

1. Physical Movement and Consolidation

Probably the most commonly used approach is physical data movement, which is used when you need to replicate data from one database to another.  There are two major genres of physical data movement: Extract, Transform and Load (ETL) and Change Data Capture (CDC).
ETL typically runs on a schedule and is used for bulk data movement, usually in batch.  CDC is event driven and delivers real-time incremental replication.  Example products in these areas are Informatica (ETL) and GoldenGate (CDC).
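
The contrast between the two genres can be sketched in a few lines of Python. This is only a minimal illustration using the standard sqlite3 module with in-memory databases; the table names (customers, customers_copy), the event tuple format and the function names are assumptions made for the example and do not reflect how Informatica or GoldenGate work internally.

```python
import sqlite3


def batch_etl(source, target):
    """Scheduled bulk movement: extract every row, transform, reload the target."""
    rows = source.execute("SELECT id, name FROM customers").fetchall()
    # Transform step: tidy up the names before loading.
    transformed = [(cid, name.strip().upper()) for cid, name in rows]
    target.execute("DELETE FROM customers_copy")
    target.executemany("INSERT INTO customers_copy VALUES (?, ?)", transformed)
    target.commit()
    return len(transformed)


def apply_change_event(target, event):
    """CDC style: apply one incremental change as soon as it is captured."""
    op, cid, name = event  # hypothetical event format: (operation, id, name)
    if op == "upsert":
        target.execute(
            "INSERT OR REPLACE INTO customers_copy VALUES (?, ?)",
            (cid, name.strip().upper()),
        )
    elif op == "delete":
        target.execute("DELETE FROM customers_copy WHERE id = ?", (cid,))
    target.commit()


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, " alice "), (2, "bob")]
    )
    target.execute("CREATE TABLE customers_copy (id INTEGER PRIMARY KEY, name TEXT)")

    loaded = batch_etl(source, target)                   # nightly bulk load
    apply_change_event(target, ("upsert", 3, "carol"))   # change arrives later
    apply_change_event(target, ("delete", 1, None))      # another change
    final = target.execute(
        "SELECT id, name FROM customers_copy ORDER BY id"
    ).fetchall()
    return loaded, final
```

In this sketch batch_etl moves the whole table in one go, while apply_change_event touches only the row named in each event, which is the essential difference between the two genres.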


2. Message based synchronization & propagation

Whilst ETL and CDC are database-to-database integration approaches, the next approach, message-based synchronisation and data propagation, is used for application-to-application integration.  Once again there are two main genres, Enterprise Application Integration (EAI) and Enterprise Service Bus (ESB), and both are used primarily for event-driven business process automation.  A leading product example in this area is the ESB from Tibco.
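
The event-driven idea behind this style can be illustrated with a toy in-process publish/subscribe bus. This is a minimal sketch only: the MessageBus class, the "order.created" topic and the handlers are hypothetical names invented for the example, and a real ESB adds routing, transformation, persistence and delivery guarantees that are not shown here.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-process bus: applications subscribe to topics and react to events."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Propagate the message to every application listening on the topic.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


def demo():
    bus = MessageBus()
    billing, shipping = [], []  # stand-ins for two separate applications

    bus.subscribe("order.created", lambda m: billing.append(("invoice", m["order_id"])))
    bus.subscribe("order.created", lambda m: shipping.append(("pick", m["order_id"])))

    # One business event from the ordering application triggers both consumers.
    delivered = bus.publish("order.created", {"order_id": 42})
    return delivered, billing, shipping
```

The publishing application knows nothing about billing or shipping; it only announces the event, and each subscribed application keeps its own data in step.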

3. Abstraction / Virtual Consolidation (aka Federation)

Thirdly, there is Data Virtualization (DV).  The key here is that the data source (usually a database) and the target or consuming application (usually a business application) are isolated from each other.  The information is delivered on demand to the business application when the user needs it.  The consuming application can consume the data as though it were a database table, a star schema, an XML message or many other forms, because the form of the underlying source data is hidden from it.  The rationale for Data Virtualization within an overall Data Integration strategy is to overcome complexity, increase agility and reduce cost.  A leading product example in this area is Composite Software.
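
A rough feel for virtual consolidation can be had with SQLite, where a temporary view can span several attached databases. The sketch below is an assumption-laden stand-in, not how Composite Software or any other DV product is implemented: the two in-memory databases (crm, billing), their tables and the customer_revenue view are all invented for the example, and a real DV layer federates remote, heterogeneous sources.

```python
import sqlite3


def build_virtual_layer():
    """Expose two separate source databases through one virtual view."""
    conn = sqlite3.connect(":memory:")
    # Two independent sources, attached rather than copied.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")]
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    # The view stores no data; SQLite allows TEMP views to reference
    # tables in other attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS customer, SUM(i.amount) AS revenue
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.id, c.name
        """
    )
    return conn


def demo():
    conn = build_virtual_layer()
    query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    before = conn.execute(query).fetchall()
    # A change in the source is visible on the next query, with no reload step.
    conn.execute("INSERT INTO billing.invoices VALUES (2, 25.0)")
    after = conn.execute(query).fetchall()
    return before, after
```

The consuming code queries customer_revenue as though it were an ordinary table and never sees how the sources are laid out; because nothing is copied, the result reflects the sources at the moment of the query.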

ETL or DV?
The suitability of Data Integration approaches needs to be considered for each case.  Here are 6 key considerations to ponder:

1. Will the data be replicated in both the DW and the Operational System?

      Will data need to be updated in one or both locations?
      If data is physically held in two locations, beware of the regulatory & compliance issues associated with having additional copies of the data (e.g. SOX, HIPAA, Basel II, FDA etc.)

2. Data Governance

      Is the data only to be managed in the originating Operational System?

      What is the certainty that a DW will be a reporting DW only (vs Operational DW)?

3. Currency of the data, i.e. does it need to be up to the minute?

      How up to date does the data in the DW need to be?
      Is there a need to see the operational data?

4. Time to solution, i.e. how quickly is the solution required?

      Immediate requirement?
      Confirmed users & usage?

5. What is the life expectancy of source system(s)?
      Are any of the source systems likely to be retired?
      Will new systems be commissioned?
      Are new sources of data likely to be required?

6. Need for historical / summary / aggregate data
      How much historical data is required in the DW solution?
      How much aggregated / summary data is required in the DW solution?

Leading analyst firms such as Gartner recommend adding data virtualization to your integration tool kit and using the right style of data integration for each job for optimal results.
As with so many things in Information Management, there is more than one way to accomplish Data Integration; ETL is not the only way.  Data Virtualisation is well worth considering as a part of your overall strategy.
