Frequent Trees Pattern Mining Task
A Frequent Trees Pattern Mining Task is a Frequent Pattern Mining Task that is required to produce Frequent Tree Patterns.
- AKA: Frequent-Pattern Tree Mining Task, Frequent Tree Mining Task, Frequent-Pattern Tree Learning Task, Frequent-Pattern Tree Recognition Task.
- Context:
- Input:
- a Transaction Database, a Tree Database (e.g., TreeBASE and PhyLoTA), or a set of Labeled Trees.
- a Minimum Support Threshold (or user-specified threshold).
- Output: FP-Tree of the Database.
- It can be solved by a Frequent Trees Pattern Mining System that implements a Frequent Trees Pattern Mining Algorithm.
- It can range from being a Frequent Ordered Tree Mining Task to being a Frequent Unordered Tree Mining Task.
- It can range from being a Frequent Rooted Tree Mining Task to being a Frequent Unrooted Tree Mining Task.
- …
- Example(s):
- Counter-Example(s):
- See: Pattern Recognition Task, Association Rule Learning Task, Apriori Algorithm, Tree Structure.
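The input/output contract above (a set of Labeled Trees plus a Minimum Support Threshold in, frequent tree patterns out) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not from this page: trees are rooted and encoded as nested `(label, children)` tuples, support is counted once per database tree, and only bottom-up subtrees (a node together with all of its descendants) are treated as patterns, which is a much narrower notion than the induced or embedded subtrees that full mining algorithms handle. All function names are hypothetical, and this is not the method of any system cited here.

```python
from collections import Counter

# Assumed encoding: a rooted labeled tree is (label, children),
# where children is a tuple of trees.
def leaf(label):
    return (label, ())

def canonical(tree, ordered=True):
    """Return a canonical form; for unordered trees, children are sorted recursively."""
    label, children = tree
    kids = [canonical(child, ordered) for child in children]
    if not ordered:
        kids.sort()  # sibling order is irrelevant for unordered trees
    return (label, tuple(kids))

def bottom_up_subtrees(tree, ordered=True):
    """Yield the canonical bottom-up subtree rooted at every node of the tree."""
    stack = [canonical(tree, ordered)]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node[1])

def frequent_bottom_up_subtrees(database, min_support, ordered=True):
    """Map each bottom-up subtree to its support, keeping those at or above min_support."""
    support = Counter()
    for tree in database:
        # set(...) ensures each database tree contributes at most 1 to a pattern's support
        support.update(set(bottom_up_subtrees(tree, ordered)))
    return {pattern: s for pattern, s in support.items() if s >= min_support}

# Hypothetical toy database of three labeled trees.
db = [
    ("a", (leaf("b"), leaf("c"))),
    ("a", (leaf("c"), leaf("b"))),
    ("d", (leaf("b"),)),
]
ordered_result = frequent_bottom_up_subtrees(db, min_support=2, ordered=True)
unordered_result = frequent_bottom_up_subtrees(db, min_support=2, ordered=False)
```

In the ordered setting the first two trees count as different patterns, so only the single-node patterns `b` and `c` reach the threshold. In the unordered setting both trees share one canonical form, so the pattern rooted at `a` is also frequent. This illustrates why the ordered/unordered distinction changes the result of the task.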
References
2016
- (Yan, 2016) ⇒ Xifeng Yan (2016). "Frequent Pattern Mining". In: KDD Topics 2016.
- QUOTE: Frequent patterns are itemsets, subsequences, or substructures that appear in a data set with frequency no less than a user-specified threshold. For example, a set of items, such as milk and bread, that appear frequently together in a transaction data set, is a frequent itemset. A subsequence, such as buying first a PC, then a digital camera, and then a memory card, if it occurs frequently in a shopping history database, is a (frequent) sequential pattern. A substructure can refer to different structural forms, such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. If a substructure occurs frequently in a graph database, it is called a (frequent) structural pattern. Finding frequent patterns plays an essential role in mining associations, correlations, and many other interesting relationships among data. Moreover, it helps in data indexing, classification, clustering, and other data mining tasks as well. Frequent pattern mining is an important data mining task and a focused theme in data mining research. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications [1]. A few text books are available on this topic, e.g., [2].
- [1] J. Han, H. Cheng, D. Xin, and X. Yan. (2007). "Frequent Pattern Mining: Current Status and Future Directions." In: Data Mining and Knowledge Discovery, 15(1), pp. 55-86.
- [2] Charu Aggarwal and Jiawei Han (Eds.). (2014). "Frequent Pattern Mining." Springer.
2014
- (Deepak et al., 2014) ⇒ Akshay Deepak, David Fernández-Baca, Srikanta Tirthapura, Michael J. Sanderson, and Michelle M. McMahon. (2014). "EvoMiner: Frequent Subtree Mining in Phylogenetic Databases." In: Knowledge and Information Systems Journal, 41(3). doi:10.1007/s10115-013-0676-0
2004a
- (Han et al., 2004) ⇒ Jiawei Han, Jian Pei, Yiwen Yin, and Runying Mao. (2004). “Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach.” In: Journal Data Mining and Knowledge Discovery, 8(1). doi:10.1023/B:DAMI.0000005258.31418.83
2004b
- (Chi et al., 2004) ⇒ Yun Chi, Richard R. Muntz, Siegfried Nijssen, and Joost N. Kok. (2001, 2004). "Frequent Subtree Mining - An Overview". In: Fundamenta Informaticae Journal, 66.
2002
- (Zaki, 2002) ⇒ Mohammed J. Zaki. (2002). “Efficiently Mining Frequent Trees in a Forest.” In: Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. doi:10.1145/775047.775058
- QUOTE: Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on.