Frequent-Pattern (FP) Tree
A Frequent-Pattern (FP) Tree is a Frequent Data Pattern in which a given Tree Structure occurs in Tree Dataset with a frequency greater than a specified threshold.
- Context:
- It can be discovered by a Frequent Tree Pattern Mining Task.
- It can range from being a Frequent Ordered Tree Pattern to being an Frequent Unordered Tree Pattern.
- It can range from being a Frequent Rooted Tree Pattern to being an Frequent Unrooted Tree Pattern.
- …
- Example(s):
- Counter-Example(s):
- See: Frequent Structural Pattern, Pattern Recognition Task, Data Mining, Maximum Agreement Subtrees, Majority Rule Trees.
References
2016
- (Yan, 2016) ⇒ Xifeng Yan (2016). "Frequent Pattern Mining". In: KDD Topics 2016.
- QUOTE: Frequent patterns are itemsets, subsequences, or substructures that appear in a data set with frequency no less than a user-specified threshold. For example, a set of items, such as milk and bread, that appear frequently together in a transaction data set, is a frequent itemset. A subsequence, such as buying first a PC, then a digital camera, and then a memory card, if it occurs frequently in a shopping history database, is a (frequent) sequential pattern. A substructure can refer to different structural forms, such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. If a substructure occurs frequently in a graph database, it is called a (frequent) structural pattern. Finding frequent patterns plays an essential role in mining associations, correlations, and many other interesting relationships among data. Moreover, it helps in data indexing, classification, clustering, and other data mining tasks as well. Frequent pattern mining is an important data mining task and a focused theme in data mining research. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications [1]. A few text books are available on this topic, e.g., [2].
- ↑ Frequent Pattern Mining: Current Status and Future Directions, by J. Han, H. Cheng, D. Xin and X. Yan, 2007 Data Mining and Knowledge Discovery archive, Vol. 15 Issue 1, pp. 55 – 86, 2007.
- ↑ Frequent Pattern Mining, Ed. Charu Aggarwal and Jiawei Han, Springer, 2014.
2014
- (Deepak et al., 2014) ⇒ Akshay Deepak, David Fernández-Baca, Srikanta Tirthapura, Michael J. Sanderson, and Michelle M. Mcmahon. (2014)."EvoMiner: Frequent Subtree Mining in Phylogenetic Databases".; In: Knowledge and Information Systems Journal, 41(3). doi:10.1007/s10115-013-0676-0
2004a
- (Han et al., 2004) ⇒ Jiawei Han, Jian Pei, Yiwen Yin, and Runying Mao. (2004). “Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach.” In: Journal Data Mining and Knowledge Discovery, 8(1). doi:10.1023/B:DAMI.0000005258.31418.83
- QUOTE: To facilitate tree traversal, an item header table is built in which each item points to its first occurrence in the tree via a node-Link. Nodes with the same item-name are linked in sequence via such node-links. After scanning all the transactions, the tree, together with the associated node-links, are shown in Figure 1.
Based on this example, a frequent-pattern tree can be designed as follows.
DEFINITION 1 (FP-tree). A frequent-pattern tree (or FP-tree in short) is a tree structure defined below.
1. It consists of one root labeled as “null”, a set of item-prefix subtrees as the children of the root, and a frequent-item-header table.
2. Each node in the item-prefix subtree consists of three fields: item-name, count, and node-link, where item-name registers which item this node represents, count registers the number of transactions represented by the portion of the path reaching this node, and node-link links to the next node in the FP-tree carrying the same item-name, or null if there is none.
3. Each entry in the frequent-item-header table consists of two fields, (1) item-name and (2) head of node-link (a pointer pointing to the first node in the FP-tree carrying the item-name).
- QUOTE: To facilitate tree traversal, an item header table is built in which each item points to its first occurrence in the tree via a node-Link. Nodes with the same item-name are linked in sequence via such node-links. After scanning all the transactions, the tree, together with the associated node-links, are shown in Figure 1.
2004b
- (Chi et al., 2004) ⇒ Yun Chi, Richard R. Muntz, Siegfried Nijssen, and Joost N. Kok. (2001, 2004). "Frequent Subtree Mining - An Overview". In: Fundamenta Informaticae Journal, 66.
2002
- (Zaki, 2002) ⇒ Mohammed J. Zaki. (2002). “Efficiently Mining Frequent Trees in a Forest.” In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2002). doi:10.1145/775047.775058