Emerging Pattern (EP)
An Emerging Pattern (EP) is an itemset whose support increases significantly from one dataset to another.
- Context:
- It can be discovered or learned by an Emerging Pattern Mining System.
- It can be retrieved by an Emerging Pattern Retrieval System.
- …
- Example(s):
- Counter-Example(s):
- See: Data Mining, Pattern Discovery, Frequent-Pattern Mining Task.
References
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut (editor), and Geoffrey I. Webb (editor). (2017). "Emerging Patterns". In: (Sammut & Webb, 2017). DOI:10.1007/978-1-4899-7687-1_250
- QUOTE: Emerging pattern mining is an area of supervised descriptive rule induction. Emerging patterns are defined as itemsets whose support increases significantly from one data set to another (Dong 1999). Emerging patterns are said to capture emerging trends in time-stamped databases, or to capture differentiating characteristics between classes of data.
2012
- (Yu et al., 2012) ⇒ Kui Yu, Wei Ding, Dan A. Simovici, and Xindong Wu. (2012). “Mining Emerging Patterns by Streaming Feature Selection.” In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2012). ISBN:978-1-4503-1462-6 doi:10.1145/2339530.2339544
- QUOTE: An emerging pattern (EP for short) is a pattern whose support value changes significantly from one class to another [9]. Highly accurate classifiers can be built by aggregating the differentiating power of EPs [7, 10](...)
Given a threshold [math]\displaystyle{ \rho \gt 0 }[/math], an EP from [math]\displaystyle{ D_l }[/math] to [math]\displaystyle{ D_m }[/math] is an itemset X where [math]\displaystyle{ GR_{D_l\rightarrow D_m}(X)\geq\rho }[/math].
- An EP e from [math]\displaystyle{ D_l }[/math] to [math]\displaystyle{ D_m }[/math] is also called an EP of [math]\displaystyle{ D_m }[/math]. If [math]\displaystyle{ GR(e)=\infty }[/math], e is called a Jumping EP (JEP). The goal of EP mining is to extract the EP set [math]\displaystyle{ E_i }[/math] for each class [math]\displaystyle{ C_i }[/math], which consists of EPs from [math]\displaystyle{ D-D_i }[/math] to [math]\displaystyle{ D_i }[/math], given a minimum growth rate threshold and a minimum support threshold.
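The definitions quoted above can be made concrete with a small sketch. It assumes the usual growth-rate convention from the EP literature (0 when the itemset does not occur in the target dataset, infinity when it occurs only in the target dataset, otherwise the ratio of supports), which the quote itself does not spell out. The function names (`support`, `growth_rate`, `mine_eps`) and the toy datasets are hypothetical, and the enumeration is brute force for illustration only, not the method of any of the cited papers.

```python
from itertools import combinations
from math import inf


def support(itemset, dataset):
    """Fraction of transactions in `dataset` containing every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for t in dataset if itemset <= t) / len(dataset)


def growth_rate(itemset, d_from, d_to):
    """Growth rate from d_from to d_to (assumed convention, see lead-in)."""
    s_from, s_to = support(itemset, d_from), support(itemset, d_to)
    if s_to == 0:
        return 0.0          # absent from the target dataset
    if s_from == 0:
        return inf          # occurs only in the target dataset
    return s_to / s_from


def mine_eps(d_from, d_to, rho, min_support):
    """Brute-force EP enumeration; exponential in the number of items."""
    items = sorted(set().union(*d_to))
    eps = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            if support(x, d_to) < min_support:
                continue    # below the minimum support threshold in d_to
            gr = growth_rate(x, d_from, d_to)
            if gr >= rho:
                eps[x] = gr
    return eps


# Hypothetical toy data: d_l plays the role of D_l, d_m the role of D_m.
d_l = [frozenset("ab"), frozenset("a"), frozenset("b"), frozenset("ac")]
d_m = [frozenset("abc"), frozenset("bc"), frozenset("bc"), frozenset("a")]

eps = mine_eps(d_l, d_m, rho=2.0, min_support=0.5)
jeps = [x for x, gr in eps.items() if gr == inf]  # Jumping EPs: GR is infinite
```

On this toy data, {c} is an EP of `d_m` with growth rate 3.0 (support 0.25 in `d_l`, 0.75 in `d_m`), and {b, c} is a Jumping EP because it never occurs in `d_l`. The itemset {a, b, c} also has infinite growth rate but is filtered out by the minimum support threshold.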
1999
- (Dong & Li, 1999) ⇒ Guozhu Dong, and Jinyan Li. (1999). “Efficient Mining of Emerging Patterns: Discovering Trends and Differences.” In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, pp. 43–52. DOI:10.1145/312129.312191
- QUOTE: We introduce a new kind of patterns, called emerging patterns (EPs), for knowledge discovery from databases. EPs are defined as itemsets whose supports increase significantly from one dataset to another. EPs can capture emerging trends in time-stamped databases, or useful contrasts between data classes. EPs have been proven useful: we have used them to build very powerful classifiers, which are more accurate than C4.5 and CBA, for many datasets. We believe that EPs with low to medium support, such as 1%-20%, can give useful new insights and guidance to experts, in even “well understood” applications. The efficient mining of EPs is a challenging problem, since (i) the Apriori property no longer holds for EPs, and (ii) there are usually too many candidates for high dimensional databases or for small support thresholds such as 0.5%. Naive algorithms are too costly. To solve this problem, (a) we promote the description of large collections of itemsets using their concise borders (the pair of sets of the minimal and of the maximal itemsets in the collections). (b) We design EP mining algorithms which manipulate only borders of collections (especially using our multiborder-differential algorithm), and which represent discovered EPs using borders. All EPs satisfying a constraint can be efficiently discovered by our border-based algorithms, which take the borders, derived by Max-Miner, of large itemsets as inputs. In our experiments on large and high dimensional datasets including the US census and Mushroom datasets, many EPs, including some with large cardinality, are found quickly. We also give other algorithms for discovering general or special types of EPs.
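The remark that the Apriori property no longer holds for EPs can be illustrated with a tiny counterexample. The sketch below uses hypothetical toy datasets and helper names, and assumes the same growth-rate convention as the standard EP definition (infinite when the itemset occurs only in the target dataset). Support is still anti-monotone, but the growth rate is not: a superset can qualify as an EP while one of its subsets does not, so subset-based candidate pruning on the EP condition is unsound.

```python
from math import inf


def support(itemset, dataset):
    """Fraction of transactions in `dataset` containing every item of `itemset`."""
    return sum(1 for t in dataset if itemset <= t) / len(dataset)


def growth_rate(itemset, d_from, d_to):
    """Growth rate from d_from to d_to (assumed convention, see lead-in)."""
    s_from, s_to = support(itemset, d_from), support(itemset, d_to)
    if s_to == 0:
        return 0.0
    if s_from == 0:
        return inf
    return s_to / s_from


# Hypothetical toy data: 'a' and 'b' only co-occur in the target dataset.
background = [frozenset("a"), frozenset("b"), frozenset("a"), frozenset("b")]
target = [frozenset("ab"), frozenset("ab"), frozenset("a"), frozenset("b")]

rho = 2.0
gr_a = growth_rate(frozenset("a"), background, target)    # 0.75 / 0.5 = 1.5
gr_ab = growth_rate(frozenset("ab"), background, target)  # inf (absent from background)

subset_is_ep = gr_a >= rho      # False: {a} is not an EP at rho = 2
superset_is_ep = gr_ab >= rho   # True: {a, b} is an EP, in fact a Jumping EP
```

Here {a, b} passes the growth-rate threshold although its subset {a} fails it, whereas the support of {a, b} in `target` (0.5) is still no larger than that of {a} (0.75).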