This blog presents an overview of an ODS (Operational Data Store) for integrating disparate systems. An ODS is one approach to an EIA (Enterprise Integration Architecture). Of course, there are viable alternatives for integrating systems; for those, you’re invited to review additional pages on this website. For now, I would like to provide material regarding the common practice of using an ODS. Let’s begin with a brief look at the system components involved.

A Brief Description of System Components

Let’s start by answering the question: what is an Operational Data Store? An ODS provides the basis for operational processing, and it may also be used to feed data warehouses. According to Inmon, Imhoff and Battas in ‘Building the Operational Data Store’ (Wiley and Sons, 1996), an ODS is subject-oriented, integrated, volatile, current-valued and comprised of only detailed corporate data. From this point, we’ll focus on several technical and non-technical areas that make an ODS system integration project successful.

It would be helpful to have the related schematics page open in an additional window while reading this post. In the Integration Architecture Overview you will see many system components. On the left half of the diagram you will notice systems that reside at a corporation’s headquarters. These applications are centralized around the host (legacy) software. On the right half of the schematic we find decentralized systems, which are dispersed to the divisions or various subsidiaries. To the far right we find client software and distributed applications. These applications can be personal, departmental or divisional in scope.

The systems (on the right plane) can be feeder systems to the central host. The ERP package, running on a mainframe or minicomputer, can also feed the downstream applications. In such a scenario the centralized data is parsed and pushed down to the divisions. However, data originating at the divisions also has to be replicated back to headquarters in order to be rolled up to the host. This is a very complex environment in which data can originate from multiple sources (centralized or decentralized). Although it is complex, integration is a requirement.
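
To make that flow a bit more concrete, here is a minimal Python sketch of the push-down and roll-up pattern described above. The record fields, division codes and routing rules are hypothetical stand-ins; a real environment would use the ERP’s extract interfaces and a proper replication or messaging layer.

```python
# A minimal sketch of the hub-and-spoke flow: centralized data is parsed and
# pushed down to the divisions, and division-originated data is rolled back up.
# Record fields and division names are hypothetical.

from collections import defaultdict

def push_down(host_records, divisions):
    """Parse centralized host data and route each record to its division."""
    outbound = defaultdict(list)
    for rec in host_records:
        if rec["division"] in divisions:          # route only to known divisions
            outbound[rec["division"]].append(rec)
    return outbound

def roll_up(division_batches):
    """Replicate division-originated data back to headquarters for roll-up."""
    inbound = []
    for division, records in division_batches.items():
        for rec in records:
            rec["origin"] = division              # tag the source for auditability
            inbound.append(rec)
    return inbound

# Example: two hypothetical item records flowing down, one order flowing up.
host_records = [
    {"item": "A100", "division": "EAST", "std_cost": 12.50},
    {"item": "B200", "division": "WEST", "std_cost": 7.25},
]
print(push_down(host_records, {"EAST", "WEST"}))
print(roll_up({"EAST": [{"order": 9001, "item": "A100", "qty": 40}]}))
```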

A Few Critical Factors

Several layers of software exist in a complex environment such as this. A data scrubber is used to cleanse and/or transform data before it is imported into a downstream system. It is important to trap erroneous data as close as possible to its source; it cannot be permitted to enter the other systems. If bad data were allowed to proliferate into downstream applications, the cost to contain and eradicate it would grow exponentially. For this reason, it is critical to cleanse data and ensure a set of ‘bullet-proof’ validation procedures are in place.
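
To illustrate what such a validation step might look like, here is a minimal Python sketch that traps bad records at the extract stage, before they can enter downstream systems. The field names and rules are hypothetical; the real rules come from the business and the source-system documentation.

```python
# A minimal sketch of 'bullet-proof' validation applied close to the source.
# Field names and tolerances are hypothetical placeholders.

from datetime import datetime

def validate_record(rec):
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    if not rec.get("customer_id"):
        errors.append("missing customer_id")
    if rec.get("amount") is None or rec["amount"] < 0:
        errors.append("amount must be present and non-negative")
    try:
        datetime.strptime(str(rec.get("posting_date", "")), "%Y-%m-%d")
    except ValueError:
        errors.append("posting_date is not a valid YYYY-MM-DD date")
    return errors

def scrub(batch):
    """Split a batch into clean records and rejects held for correction at the source."""
    clean, rejects = [], []
    for rec in batch:
        errs = validate_record(rec)
        if errs:
            rejects.append({"record": rec, "errors": errs})
        else:
            clean.append(rec)
    return clean, rejects
```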

If a corporation is considering an ODS, it will be important to avoid re-inventing the wheel. Realistically, data can be restructured, reformatted and engineered. However, the sheer size of the program logic that creates that data ought not to be underestimated. For example, a closed-loop MRP package is said to take fifty man-years to create. It may be possible to complement applications which already exist, but most companies will find it impractical to re-code their ERP packages into their ODS.

Data Quality and Data Transformation

That said, I would like to stress that the ‘transform’ piece of ETL (extract, transform and load) should be identified and quantified in a successful integration project. To repeat: it is impractical to re-create all the data validation rules and business logic found in the source systems for an ODS. This doesn’t mean the solution can avoid extensive logic of its own; it is logic the company will still have to invest in. The transformation and cleansing logic in a system integration project typically represents a significant effort.
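
As a concrete illustration of the ‘transform’ piece, here is a minimal Python sketch that reshapes a raw legacy record into an ODS layout. The source layout, code mappings and target fields are hypothetical; the real mappings come from your data requirements and source-system documentation.

```python
# A minimal sketch of the 'transform' step: reshaping a hypothetical legacy
# record into a hypothetical ODS record layout.

# Map legacy division codes to the enterprise-standard codes used in the ODS.
DIVISION_CODES = {"01": "EAST", "02": "WEST", "03": "CENTRAL"}

def transform(source_rec):
    """Convert a raw source record into the ODS record layout."""
    return {
        "customer_id": source_rec["CUSTNO"].strip().zfill(8),       # pad fixed-width key
        "division": DIVISION_CODES.get(source_rec["DIV"], "UNKNOWN"),
        "amount": round(float(source_rec["AMT"]) / 100, 2),         # cents to dollars
        "posting_date": f"{source_rec['PDATE'][0:4]}-{source_rec['PDATE'][4:6]}-{source_rec['PDATE'][6:8]}",
    }

# Example usage with a hypothetical legacy record.
raw = {"CUSTNO": " 1234 ", "DIV": "02", "AMT": "125075", "PDATE": "19961115"}
print(transform(raw))
```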

The quality of a company’s data assets will influence what is chosen to feed downstream systems. It also impacts the development effort. For example, the general ledger ought to be a good source of data because it should contain information that has been thoroughly reviewed before it becomes the system of record for downstream systems. A volatile data source, by contrast, would be a poor choice to feed other systems. To increase your odds of success, select your data sources carefully.

Key Measurements for Establishing Realistic Deliverables

Corporations often have weak documentation, especially about their data: how it is created, who is responsible for it, what the acceptable tolerances are, and so on. When documentation is available, it is rarely kept current. This is simply a business reality, as companies find themselves devoting resources to other areas. However, correct and meaningful documentation is critical for a successful system integration project. To set realistic plans, measure the quality of your documentation and your data.

No one will question how easy it is to produce reports and analytics once data becomes available. We can all agree that is the easy part. We have to consider how much more difficult it is to get data to the point where it can be reported against. As much as eighty to ninety percent of the work to roll out a DSS (decision support system), a data warehouse or an ODS can consist of the back-end tasks; that is, creating the metadata, building the transformation processes and scrubbing the data.

Some Help Can Be Provided by Toolkits, but …

Back-end tasks, including analysis, do not receive a lot of glory, even though they are fundamental to any rollout. As with any application, IT can create the program logic, but the end users will have to be responsible for data quality. Data scrubbers and transformation software help address data quality challenges. However, as we all know, they are only tools. The human resource aspect cannot be overlooked. This has a direct impact on where management should put their attention and where they will need to allocate resources. Getting data to the place where it becomes useful information is a big job. It requires collaboration, many disciplines and human effort.

Enterprise software tools come in many varieties. They are often very expensive investments and should be critiqued as carefully as any other capital investment. For example, care should be taken when selecting ‘point and click’ programming tools because they often provide little documentation about their internal parameters. If they produce weak documentation, it will be difficult to maintain your application, scale it, and create ownership. They can also ‘hit the wall’ and fail to resolve complex problems. In other words, you may find issues with their scalability, adaptability and maintainability.

Rapid Application Development: Friend or Foe

Many report writers run in a ‘point and click’ environment, which shouldn’t be considered a major fault of their design. In fact, it can be considered an asset for RAD (Rapid Application Development). Some parts of an enterprise application can be considered disposable. When nominal time is required to build a given report, we can classify it as having an immaterial development cost. In such cases, rapid application development tools serve an important role.

When creating reports, don’t be overly alarmed if your tool does not use or generate source code. With this type of application, in most instances, this should not be an issue. However, if we focus on other parts of an enterprise system, we may find that a ‘point and click’ software package is a liability. This is especially true when dealing with complex components or system designs. An alarm should be raised when the ‘point and click’ parameters are kept deep within the bowels of a proprietary package and no one is able to produce documentation about them.

A Lesson to Learn from History

Not too long ago the IT industry moved away from proprietary solutions as it adopted open systems. Today’s toolkits should be evaluated in the same light. We should evaluate technologies and tools critically. Do they create vendor lock-in? Is it the right technology? Is it standards-based? When selecting software or building enterprise applications, corporations can cut exposure and isolate risk by considering factors like these. Certainly this is a step removed from the nitty-gritty details of record layouts, data maps and flow diagrams. Therefore, to some, these questions may not appear important.

However, as with many IT projects, these are important considerations beyond the obvious. At an abstract level, seasoned professionals can look for areas of vulnerability in order to mitigate the risks of complex systems. An enterprise architecture has many parts and requires many disciplines to implement successfully. These applications are, by their very nature, systems that span very large companies. Consequently, due to their size, politics will be involved. This makes the selection of technologies and software packages more complicated, but it still needs to be done.

An Important System Analysis Deliverable

Now let’s put our attention on data requirements. By reviewing the schematics, you’ll notice that an entire series of reports was analyzed. They were reviewed, drawn from a set of weekly performance measures, in order to glean a comprehensive list of all the various data items being called out. When automating any application it is prudent to review all related ‘source documents’ as well as the various data entry screens, spreadsheets, etc. In the process of conducting this type of study, all entities are identified, along with their attributes and their relationships to one another.
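
One lightweight way to record what such a study uncovers is sketched below in Python. The entity and attribute names are hypothetical; a real data requirements deliverable would also capture sources, owners and acceptable tolerances for each item.

```python
# A minimal sketch of capturing entities, their attributes and their
# relationships as gleaned from the report review. Names are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    data_type: str
    source_document: str          # report, screen or spreadsheet it was found on

@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)
    related_to: list = field(default_factory=list)   # names of related entities

# Example entries drawn from a hypothetical weekly performance report.
customer = Entity("Customer", [Attribute("customer_id", "char(8)", "Weekly Sales Report")])
order = Entity("Order",
               [Attribute("order_total", "decimal(12,2)", "Weekly Sales Report")],
               related_to=["Customer"])
```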

The data requirements document is just one deliverable among many in a data-centric project. Make it as complete as possible, and pay attention to detail, for it will be key to avoiding rework. As you would expect, the data requirements will be the basis for the data model and the software development effort. Going through these steps is necessary to design an ODS, a data warehouse, or a traditional OLTP application. If it is not done, the project is not likely to be successful because it will fall short of fulfilling all the data requirements.

ODS Data Preparation

OLTP systems tend to have their data normalized, whereas decision support systems store data de-normalized. However, large ERP systems often contain data that is unusable as-is because they commonly do not normalize it. As mentioned above, data scrubbing and reformatting is often a large part of an ODS effort. Once the data has been structured properly for the ODS, it can then be used as a building block for decision support systems.

Design Guidelines for Structuring Data

While the primary focus of this article is on an ODS, let’s briefly contrast the data structures below. This will be familiar to many, but others will benefit from the high-level perspective.

Design Guidelines for Constructing OLTP Systems

Whenever practically possible, an enterprise database should be normalized to third normal form. This is done to provide referential integrity and cut redundant data. Below you will find a list of four normalization rules. These rules do not apply to data warehouses or data marts, since those systems require an altogether different set of design considerations.

However, they do apply to the construction of an Operational Data Store, which is why they are mentioned here. A small illustration follows the rules below.

First normal form: At each row and column position in the table there exists exactly one value, never a set of values.
Second normal form: Each column that is not part of the key provides a fact that depends on the entire key.
Third normal form: Each non-key column provides a fact that is independent of the other non-key columns and depends only on the key.
Fourth normal form: No row contains two or more independent multi-valued facts about an entity.
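
To make these rules concrete, here is a minimal Python sketch of a hypothetical order record that violates first normal form, followed by the same data restructured to third normal form. The record and its fields are illustrative only.

```python
# A flat record that violates the rules above: 'items' holds a set of values in
# one position, and 'customer_name' depends on customer_id rather than the order key.

denormalized_order = {
    "order_id": 9001,
    "customer_id": "00001234",
    "customer_name": "Acme Corp",            # fact about the customer, not the order
    "items": ["A100 x 40", "B200 x 15"],     # repeating group in a single column
}

# Normalized to third normal form: one fact per row, and every non-key fact
# depends on the key, the whole key, and nothing but the key.
customers = {"00001234": {"customer_name": "Acme Corp"}}
orders = {9001: {"customer_id": "00001234"}}
order_lines = [
    {"order_id": 9001, "item": "A100", "qty": 40},
    {"order_id": 9001, "item": "B200", "qty": 15},
]
```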

Design Guidelines for Constructing OLAP Systems

Star schemas and snowflake schemas are used for the design of data warehouses and data marts, and they can appear in many forms. Another distinction between OLTP and OLAP is how data is stored. As a rule, decision support cannot be served from highly structured data because too much overhead is required to reformat it in real time. A warehouse will commonly contain pre-processed data, stored in aggregate form, to provide information on demand.
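
For contrast with the normalized structures above, here is a minimal Python sketch of a tiny star schema and the kind of pre-aggregated summary a warehouse might serve on demand. The table layouts and values are hypothetical.

```python
# A minimal star-schema sketch: dimension tables describe the "who/what/when";
# the fact table holds the measures. Values are hypothetical.

from collections import defaultdict

dim_division = {1: "EAST", 2: "WEST"}
dim_period = {199611: "Nov 1996", 199612: "Dec 1996"}
fact_sales = [
    {"division_key": 1, "period_key": 199611, "amount": 1250.75},
    {"division_key": 1, "period_key": 199611, "amount": 980.00},
    {"division_key": 2, "period_key": 199612, "amount": 432.10},
]

# Pre-aggregate at load time so queries read summaries instead of raw detail.
summary = defaultdict(float)
for row in fact_sales:
    summary[(row["division_key"], row["period_key"])] += row["amount"]

for (div_key, per_key), total in summary.items():
    print(dim_division[div_key], dim_period[per_key], round(total, 2))
```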

Closing Remarks

In this blog we have reviewed introductory concepts for integrating large-scale enterprise systems. Even as an introduction, those not familiar with enterprise architectures may have found this post overwhelming. Nonetheless, several technical and non-technical issues have been presented to a diverse audience. They were raised here to help anyone who can benefit from the discussion. Some people would consider these topics controversial; and, it should be stated, the above approach may not be suitable for your environment.

Please keep the following in mind. It can be helpful to have a map when walking into a minefield. In other words, exposing some of the obstacles to a successful system integration project can help folks arrive at the desired business objectives. That’s what I’ve tried to do here: I’ve laid out some areas which need to be discussed in an EIA / ODS based system. I’m sure you’ll discover a set of “opportunities” and challenges of your own as you integrate an enterprise.

This post and the related schematics also provided an opportunity to present a range of my skill sets. I hope you enjoyed the work samples as you read this article. I have provided insights from diverse experience in IT that is not readily available elsewhere. Solutions have been offered, and potential pitfalls were exposed so that they can be avoided. Consider sharing your thoughts by posting a comment, and please contact me by email regarding potential business opportunities. Thank you.