Jay Yusko is a Consulting Engineer in Application Architectures for Information Resources. He has an MS in Artificial Intelligence from DePaul University and a Ph.D. in Artificial Intelligence from IIT. His research has been in the area of Ontologies and Ontology Inference Engines. Jay developed rule-based systems at Bell Laboratories, Navistar and United Airlines. Information Resources provides a combination of real-time market content, advanced analytics and performance management software.
The technical challenges for complex rules engines are many, especially when dealing with terabytes of information that need to be integrated into many processes. Yusko compared it to a pipeline, where the data comes in one end and leaves at the other, and every data manipulation has to happen inside the pipe in order to maintain the integrity of the data. The challenges include:
- The requirement for complex decision making in the ETL (Extract, Transform, and Load) flow with the ability to fix data live in the data stream
- Ability for the departments to maintain the business logic in English in a central repository
- Embedding a rules engine in a DataStage custom operator (DataStage jobs are broken up into stages) in a way that is natural for a DataStage developer to maintain
- Getting the required speed and scalability
- Having the ability to spread the decision process across a grid
The business logic and the programming logic need to be separate. Quality control is needed from both the business and the programming side. Yusko advises doing test runs with nine or ten million records to ensure the rules do what is intended. Yusko went into detail on the written code for one of these rules. The rule, written in C, was almost in the same English as the business specification, making it understandable to everyone.
The ideal is for each rule to be a black box, where the inputs go in and the outputs come out and what happens inside doesn’t have to be known or changed. Only the inputs and the outputs need to be specified, which makes things easier for the developer. Most of the rules are written in C++, but then these need to be converted to JRules so they will work with a Java engine that can handle object attributes. Yusko explained how this process works for all the stages and links.
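The black-box idea can be illustrated with a minimal sketch in Python. This is not the code shown in the talk: the `Rule` class, the field names, and the `unit_price` example are all hypothetical, chosen only to show a rule that declares its inputs and outputs and hides everything else.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """Hypothetical black-box rule: only inputs and outputs are declared."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    body: Callable[[Record], Record]

    def apply(self, record: Record) -> Record:
        # The rule refuses to run if a declared input is missing.
        missing = [f for f in self.inputs if f not in record]
        if missing:
            raise KeyError(f"{self.name}: missing inputs {missing}")
        # The body sees only the declared inputs, nothing else in the record.
        visible = {f: record[f] for f in self.inputs}
        result = self.body(visible)
        # The body may write only the declared outputs.
        unexpected = set(result) - set(self.outputs)
        if unexpected:
            raise ValueError(f"{self.name}: undeclared outputs {sorted(unexpected)}")
        # All other fields pass through the stage untouched.
        return {**record, **result}


# Illustrative rule (an assumption, not from the talk): derive a unit price.
unit_price = Rule(
    name="unit_price",
    inputs=("price", "units"),
    outputs=("unit_price",),
    body=lambda r: {"unit_price": r["price"] / r["units"]},
)

enriched = unit_price.apply({"price": 10.0, "units": 4, "store": "A"})
```

Because the caller depends only on the declared inputs and outputs, the body of a rule like this could be replaced without touching the surrounding stage.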
If a record fails for any reason, there is a link going out so a human can find out why. Once the engine is set up, any number of records can be handled through any number of nodes, and in any kind of system. Data also can leave the stream for Business Activity Monitoring (BAM).
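The reject link described above can be sketched as a stage with two outgoing paths. The following Python is an illustration under assumptions, not the production design: `run_stage`, `require_positive_price`, and the record fields are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]


def run_stage(
    records: Iterable[Record],
    rule: Callable[[Record], Record],
) -> Tuple[List[Record], List[Tuple[Record, str]]]:
    """Apply a rule to each record; failures go to a separate reject list."""
    output: List[Record] = []
    rejects: List[Tuple[Record, str]] = []
    for record in records:
        try:
            output.append(rule(record))
        except Exception as exc:
            # Keep the original record and the reason so a person can review it.
            rejects.append((record, f"{type(exc).__name__}: {exc}"))
    return output, rejects


def require_positive_price(record: Record) -> Record:
    # Illustrative rule (an assumption): reject non-positive prices.
    if record["price"] <= 0:
        raise ValueError("price must be positive")
    return {**record, "checked": True}


passed, rejected = run_stage(
    [{"price": 5}, {"price": -1}, {"sku": "x"}],
    require_positive_price,
)
```

Here one record passes and two are rejected, each with a reason attached. The main stream keeps flowing while the failed records wait for inspection.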
One of the benefits of implementing a rules system is that it centralizes and makes accessible all the business rules of the organization, rather than continuing the practice of having some important parts of the business knowledge existing in isolation on some desk somewhere.
The project benefits include:
- Enables very complex sets of rules to handle the decision process
- Ability to have explanation facilities
- Ability to quickly add and change rules in the DataStage job
- Enables scalability by embedding a rules engine in a DataStage custom operator tied into the Parallel Extender, which allows the processing to be spread across a grid
- Enhances the speed of processing by using a continuous flow of data
- Rules can be maintained by the proper department in a format that they can understand
- The business logic is stored in a central repository
- Real-time complex alerts are enabled
Yusko said the new product is in production now. The engine can be installed free-standing or used as a web service, and it can work with any kind of rule.