This paper analyzes the differences between data and information, the importance of and justification for making that distinction, and its relevance as a critical undercurrent of a successful SOA.
Introduction
The fundamental goal of Service Oriented Architecture (SOA), i.e. to bring about a closer alignment between business and IT, is well understood. It is no surprise that most enterprises today are involved, one way or another, in efforts to evolve towards an SOA. Typically, any enterprise that aims to adopt SOA realizes that there are a multitude of issues to consider, such as technology, process, and infrastructure, and that these issues span both the tactical and the strategic levels.
While those are relevant and critically important issues, this article aims to address a much more fundamental question: As an enterprise’s strategy evolves towards an SOA, what strategies should be adopted to evolve its data in lock-step? Although it is well understood that data is a critical component of any modern business enterprise, the data question is often not given due importance or, at best, is treated as an afterthought.
An obvious, off-the-cuff response to the question posed above is data interoperability. This article delves deeper and analyzes what data interoperability entails. There has been much treatment of the subject of interoperability in SOA, but this treatment tends to focus on interoperability at the system/platform and programming language level. In other words, the scope of interoperability is limited to the development-time aspects of SOA.
However, the complete value-add of interoperability is realized only when two services in an SOA can communicate with each other and exchange data in an ultimately fruitful, productive manner. The data passed by the service provider to the service consumer becomes valuable only when that data is consumable, and when the consumer can easily process it into information.
Data Interoperability: transforming data into useful Information
When a service consumer requests data from a service provider, the data sent across by the provider is still raw, in the sense that the consumer still needs to massage and process the data and convert it into useful information. The implication here is that there is an evolutionary process by which data gets transformed into useful information. The stages of that evolution are described below.
Raw Data
Raw Data refers to the contents of enterprise databases and other such artifacts within the enterprise where data is stored. It is raw because it has not been processed, and is not yet in a sufficiently usable form.
Processed Information
Processed Information can be thought of as data which has evolved from its most rudimentary state into something more meaningful to the enterprise. Typically, software applications consume the raw data and present it as more meaningful business entities and their inter-relationships.
Business Knowledge
Business Knowledge is a stage of data evolution in which a deeper and more comprehensive understanding of the dynamic environment where the business operates has been captured. Business knowledge can be thought of as processed information at a higher level of evolution, enriched with semantic meaning and cross-references.
Business Intelligence
At the Business Intelligence stage, data has evolved to its most refined state and presents an actionable picture to decision-makers, helping them arrive at the right decisions. The result is reduced risk, effective utilization of resources, and enhanced efficiency in business operations.
Every enterprise has data in each of the above stages of information evolution, so it is critically important to consider strategies for evolving data into useful information, especially when undertaking a transformational activity like adopting an SOA.
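The four stages described above can be illustrated with a small sketch. It is only an illustration, not a prescribed implementation: the sales rows, the field names and the "below average revenue" rule are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Stage 1, Raw Data: untyped rows as they might sit in a database table.
# All values here are hypothetical.
raw_rows = [
    ("2024-01-05", "EU", "widget", "3", "19.99"),
    ("2024-01-06", "EU", "gadget", "1", "249.00"),
    ("2024-01-06", "US", "widget", "10", "19.99"),
    ("2024-01-07", "US", "gadget", "2", "249.00"),
]


def to_sale(row):
    """Stage 2, Processed Information: turn a raw row into a typed business entity."""
    date, region, product, qty, unit_price = row
    return {
        "date": date,
        "region": region,
        "product": product,
        "revenue": int(qty) * float(unit_price),
    }


def revenue_by_region(sales):
    """Stage 3, Business Knowledge: relate entities to each other (revenue per region)."""
    totals = defaultdict(float)
    for sale in sales:
        totals[sale["region"]] += sale["revenue"]
    return dict(totals)


def regions_needing_attention(totals):
    """Stage 4, Business Intelligence: an actionable picture for a decision-maker.

    The rule used here (revenue below the average across regions) is an assumed example.
    """
    average = mean(totals.values())
    return sorted(region for region, total in totals.items() if total < average)


sales = [to_sale(row) for row in raw_rows]
knowledge = revenue_by_region(sales)
attention = regions_needing_attention(knowledge)
```

Each function consumes only the output of the previous stage, which mirrors the idea that information is derived from data step by step rather than stored in its final form.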
How to make information out of raw data
A further problem arises when there are multiple stakeholders in the enterprise interested in this information (ostensibly, to aid their own decision-making), but their semantic interpretation of that information varies widely. In other words, if the information were to be universally understood and accepted by all within the organization, all would be well. However, this is typically NOT the case. Each consumer of the information needs to have their own narrow interpretation of how they perceive the raw data and how it needs to evolve to be presented with actionable intelligence.
At first glance, this problem is easily resolved by developing a universal, canonical information model (metadata) and deploying it across the enterprise. In practice, however, this is akin to forcing everybody in the world to speak English, which is not acceptable from a functional, cultural, political and perhaps even a legal point of view. Individual groups of consumers therefore need a certain amount of leeway to transform the data to suit their specific needs. In fact, this leeway can be instituted as a design pattern or a best practice.
A further, refined model might involve a federated control mechanism over the information metadata model, in which a common, global, abstract definition exists, but each consumer may define transformational contracts on the global model. These transformational contracts allow the common, global definition of the data to be modified into a domain-specific data definition. When applied to actual data, the contracts modify the data so that it is easily consumable within the local domain of the consumer. Thus, each consumer is empowered, and responsible, to transform the common stream of data into a contextual definition for its own consumption. Note also that once a consumer defines its own set of transformational contracts, they become artifacts which can be re-used by subsequent interested parties.
The following illustration shows an example where sales data is consumed by various interested parties within an enterprise, but each party has their unique view of data. They might be interested in only certain parts of that data, or may view it in certain other formats, by applying individual transformations.
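The same scenario can be sketched in code. The sketch below assumes a hypothetical canonical sales record and two hypothetical consumers (finance and marketing); the field names and contract functions are invented for illustration and do not come from any particular product or standard.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CanonicalSale:
    """Common, global definition of a sale (hypothetical fields)."""
    order_id: str
    region: str
    amount_usd: float
    customer_email: str


# A transformational contract maps the canonical record to a consumer's local view.
Contract = Callable[[CanonicalSale], Dict[str, object]]


def finance_view(sale: CanonicalSale) -> Dict[str, object]:
    # Finance only cares about the order and its amount, expressed in cents.
    return {"order_id": sale.order_id, "amount_cents": round(sale.amount_usd * 100)}


def marketing_view(sale: CanonicalSale) -> Dict[str, object]:
    # Marketing only cares about where the sale happened and whom to contact.
    return {"region": sale.region, "contact": sale.customer_email.lower()}


# Federated control: one global model, with contracts owned by each consumer.
CONTRACTS: Dict[str, Contract] = {
    "finance": finance_view,
    "marketing": marketing_view,
}

# Re-use: a later consumer adopts an existing contract instead of writing a new one.
CONTRACTS["regional_campaigns"] = CONTRACTS["marketing"]


def consume(consumer: str, sale: CanonicalSale) -> Dict[str, object]:
    """Apply the named consumer's contract to a canonical record."""
    return CONTRACTS[consumer](sale)


sale = CanonicalSale("A-1", "EU", 1250.5, "Pat@Example.com")
finance_record = consume("finance", sale)
campaign_record = consume("regional_campaigns", sale)
```

The canonical record is immutable, so a contract can only derive a new local view and cannot alter the shared definition that other consumers depend on.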
This model fits well with the Enterprise Service Bus (ESB) design pattern, which is used in most SOA designs. The ESB not only provides the messaging backbone over which the service provider and the consumer communicate, but also provides data mediation and information transformation capabilities. It can thus serve as a centralized repository of the transformational contracts (for example, XML schemas and XSL stylesheets).
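As a rough sketch of this mediation role, the toy class below stands in for an ESB within a single process. It is an assumption-laden simplification: `MiniBus` and its methods are hypothetical names, and plain Python callables take the place of the XML schemas and XSL stylesheets a real bus would store.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]
Transform = Callable[[Message], Message]
Handler = Callable[[Message], None]


class MiniBus:
    """Toy stand-in for an ESB: routes messages and mediates data per consumer."""

    def __init__(self) -> None:
        # Centralized repository of transformational contracts, keyed by name.
        self._contracts: Dict[str, Transform] = {}
        # topic -> list of (contract name, consumer handler)
        self._subscribers: Dict[str, List[Tuple[str, Handler]]] = {}

    def register_contract(self, name: str, transform: Transform) -> None:
        self._contracts[name] = transform

    def subscribe(self, topic: str, contract_name: str, handler: Handler) -> None:
        if contract_name not in self._contracts:
            raise KeyError(f"unknown contract: {contract_name}")
        self._subscribers.setdefault(topic, []).append((contract_name, handler))

    def publish(self, topic: str, message: Message) -> None:
        # Each consumer receives its own transformed copy of the message.
        for contract_name, handler in self._subscribers.get(topic, []):
            transform = self._contracts[contract_name]
            handler(transform(dict(message)))


bus = MiniBus()
bus.register_contract(
    "totals_only", lambda m: {"order_id": m["order_id"], "total": m["total"]}
)
bus.register_contract("region_only", lambda m: {"region": m["region"]})

finance_inbox: List[Message] = []
marketing_inbox: List[Message] = []
bus.subscribe("sales", "totals_only", finance_inbox.append)
bus.subscribe("sales", "region_only", marketing_inbox.append)

bus.publish("sales", {"order_id": "A-1", "region": "EU", "total": 308.97})
```

The provider publishes one message in the common format and never needs to know which consumers exist or which views they require; that knowledge lives in the bus.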
Conclusion
If the stated goal of SOA, to bring about a closer alignment of business and IT, is to become a reality, the seamless migration of information between enterprise sources and sinks (or information providers and consumers) is one of the critical foundational requirements. It is, indeed, a critical success factor. Given that a goal of SOA efforts is getting the right data, at the right time, to the right people, we can see the value of investments in information strategy and management. SOA architects and enterprise managers would do well to pay close attention to this very important issue.
References:
• Ackoff, R. L., “From Data to Wisdom”, Journal of Applied Systems Analysis, Volume 16, 1989, pp. 3-9.