Many BRMS vendors, especially the leaders Fair Isaac and ILOG, consider their implementation of the RETE algorithm a critical part of their product offering. Yet you might not understand what RETE is and why it can be either useful or unnecessary, depending on the rules problem domain. So what is RETE?
The RETE algorithm was developed to improve evaluation performance with large knowledge or fact sets. For example, a diagnostic expert system might describe many signs and symptoms for the diagnosis of a condition. Because of the diagnosis, an action may be taken.
Many BRMS vendors, especially the leaders Fair Isaac and ILOG, consider their implementation of the RETE algorithm a critical part of their product offering. Yet you might not understand what RETE is and why it can be either useful or unnecessary, depending on the rules problem domain. So what is RETE?
The RETE algorithm was developed to improve evaluation performance with large knowledge or fact sets. For example, a diagnostic expert system might describe many signs and symptoms for the diagnosis of a condition. Because of the diagnosis, an action may be taken. The subject, perhaps a patient, equipment, or whatnot, has associated facts. In RETE the facts are knows as the left-hand side of the diagnosis and the action is the right-hand side. To diagnose a condition, the expert system must search through, and settle on, a list of relevant facts. Settling the diagnosis is the goal.
RETE will progressively narrow the subject’s facts associated with a diagnosis. Suppose there are ten separate conditions associated with a subject. The simplest RETE algorithm starts with the first fact and filters or, effectively, queries the diagnosis set for the first condition. Next, the list is trimmed according to the second fact. The algorithm progresses through the facts until the list is exhausted or a goal is reached. As conditions progressively filter facts, unneeded rules are removed from memory. The facts may assume different forms: strings, numeric or computational.
The commercial RETE algorithms are more subtle and have been tuned for performance. RETE is memory intensive and proprietary implementations use different caching strategies to create an ideal execution map. Yet, essentially, RETE efficiently performs inferences against large fact sets (goals) and decides actions. The BRMS can chain, or combine these inferences into and over powerful and complex rulesets.
RETE is a tool for developing inferences from large fact sets. The classical AI definition of inference is ‘From some facts, others can be inferred’. Examples are relationships such as:
(Financial-Backer a b) => True
(Financial-Backer c a) => True
(Indirect-Financial Backer c b) => True
Suppose a large financial database holds facts about who has co-signed or financed whom. Perhaps there are many rules to define a financial backer.
How could we use this in a business rule? I suppose you could have a rule that said:
Conduct special investigations on each person whose indirect financial backing is at-risk.
(You can abstract the financial-backing into any clause or action type). This business rule could connect several lengthy rulesets.
It has been my experience that most organization’s business rules are usually not this opaque. I believe teams that mine for business rules should adhere to Barbara von Halle’s principles. According to Barbara’s business rule quality criteria, rules should be atomic, self-contained and complete (among other things). Atomic rules represent only one policy. Precise rules have only one interpretation. Complete rules contain all the intellectual content needed to apply it to action or to decisions. A good business rule mining team will strive to build concise, self-contained logic in the statements. Unless, they are building a diagnostic or expert system, the outcome is usually a small ruleset for each business area.
The RETE algorithm is not efficient for small rulesets. There is much computational overhead in filtering fact elements for conditions. As I mentioned earlier, RETE is efficient for matching many conditions across many facts. The number of facts that approach this condition is unknown; it is, likely, in the thousands. So, ordinary rulesets are often too small to need RETE’s inference strategy.
A BRMS can carry out a ruleset as production rules or as a sequence of predicate actions. Production rules are the fact-action memory elements used by RETE. A predicate is an ‘If [guard] then [action]… else [action]’ where guards equal the production rule facts. You might think of this as a block of code in an ordinary program. Production rule support a goal. A sequence of predicates have no goal, the sequence evaluates Boolean conditions in the guard and assign variable values in its action. High-performance, high-throughput transactions processes need the predicate approach, or a RETE algorithm that adjusts to smaller rulesets.
Many of the business processes I have designed invoke a small (<1000) ruleset for a single business area. Rulesets I have worked with often number in the many thousands. It is not the number of rules in an entire ruleset that dictates the need for RETE; it is the number of rule invoked by a process. Here, carrying out a ruleset as a sequence of predicates will be more computationally efficient.
Sometimes RETE can be misused to gloss over poor business rule gathering. Consider the following business case:
Married people with children must be offered the family healthcare plan. Single people under the age of 40 must be offered the individual healthcare plan. Married people over 50 must be offered the high option healthcare plan. Married people under 40 should be offered the standard healthcare plan.
The last sentence produces different results depending on where you find the sentence in the set of production rules. You can tweak this ruleset by setting the salience. Salience is a numeric factor that affects the execution order of facts.
Business rules for this scenario are not complete because they cannot be atomic, precise or complete, according to Barbara von Halle’s Business Rule Quality Criteria. I would argue that adding salience to a ruleset glosses over business knowledge. We have sacrificed business rule precision for the accuracy of the outcome.
Does your application of the Business Rules Approach need the power of the RETE algorithm? That depends on the problem you are trying to solve. If you are developing a customer or ordering ruleset for an order-to-cash process for high-volume systems, then your rules engine’s algorithm should adjust to this. If you are developing a complex ruleset for a diagnosis of conditions, economic analysis or fraud-detection, then RETE will help.