CN2 Induction Algorithm

Context:
- It can be implemented by a CN2 Induction System to solve a CN2 Induction Task.
- It was initially developed by Clark & Nibblet (1989).
- It outputs a If-Then Rule Set.
- …
Example(s):
- Clark & Nibblet (1989) - CN2 Algorithm (pseudocode).
- Wikipedia - CN2 Algorithm (pseudocode)
Counter-Example(s):
See: Pattern Mining Algorithm, Decision Tree Induction Algorithm, Inductive Logic Programming, Propositional Rule, Firs-Order Logic Rule.

References

(Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/CN2_algorithm Retrieved:2019-11-3.
- The CN2 induction algorithm is a learning algorithm for rule induction (Clark & Nibblet, 1989). The CN2 induction algorithm. Machine Learning 3(4):261-283. </ref> It is designed to work even when the training data is imperfect. It is based on ideas from the AQ algorithm and the ID3 algorithm. As a consequence it creates a rule set like that created by AQ but is able to handle noisy data like ID3.

(Clark & Nibblet, 1989) ⇒ Peter Clark, and Tim Niblett. (1989). “The CN2 Induction Algorithm”. In: Machine Learning, 3(4) DOI:10.1023/A:1022641700528
- QUOTE: Table 3 presents a summary of the CN2 algorithm. This works in an iterative fashion, each iteration searching for a complex that covers a large number of examples of a single class C and few of other classes...

**Table 3:** The CN2 induction algorithm.

Let E be a set of classified examples.
Let SELECTORS be the set of all possible selectors.
Procedure CN2(E)
	Let RULE-LIST be the empty list.
	Repeat until BEST_CPX is nil or E is empty:
		Let BEST_CPX be Find-Best_Complex(E).
		If BEST_CPX is not nil,
		Then let E' be the examples covered by BEST_CPX.
			Remove from E the examples E' covered by BEST_CPX.
			Let C be the most common class of examples in E'.
			Add the rule 'If BEST_CPX then the class is C'
				to the end of RULE-LIST.
	Return RULE-LIST.

Procedure Find-Best_Complex(E)
	Let STAR be the set containing the empty complex.
	Let BEST_CPX be nil.
	While STAR is not empty,
		Specialize all complexes in STAR as follows:
		Let NEWSTAR be the set {x ∧ y\|x ∈ STAR, y ∈ SELECTORS}.
		Remove all complexes in NEWSTAR that are either in STAR (i.e., the unspecialized ones) or null (e.g., big = y ∈ big = n).
		For every complex C_i in NEWSTAR:
			If C_i is statistically significant and better than BEST_CPX by user-defined criteria when tested on E,
				Then replace the current value of BEST_CPX by C_{i_.}
		Repeat until size of NEWSTAR ≤ user-defined maximum:
		Remove the worst complex from NEWSTAR.
		Let STAR be NEWSTAR.
	Return BEST_CPX.