2012 GigaTensorScalingTensorAnalysis
- (Kang et al., 2012) ⇒ U. Kang, Evangelos Papalexakis, Abhay Harpale, and Christos Faloutsos. (2012). “GigaTensor: Scaling Tensor Analysis Up by 100 Times - Algorithms and Discoveries.” In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2012). ISBN:978-1-4503-1462-6 doi:10.1145/2339530.2339583
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222012%22+GigaTensor%3A+Scaling+Tensor+Analysis+Up+by+100+Times+-+Algorithms+and+Discoveries
- http://dl.acm.org/citation.cfm?id=2339530.2339583&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Many data are modeled as tensors, or multi-dimensional arrays. Examples include the predicates (subject, verb, object) in knowledge bases, hyperlinks and anchor texts in Web graphs, sensor streams (time, location, and type), social networks over time, and DBLP conference-author-keyword relations. Tensor decomposition is an important data mining tool with various applications including clustering, trend detection, and anomaly detection. However, current tensor decomposition algorithms are not scalable to large tensors with mode sizes in the billions and hundreds of millions of nonzeros: the largest tensors in the literature have mode sizes only in the thousands and hundreds of thousands of nonzeros.
Consider a knowledge base tensor consisting of about 26 million noun-phrases. The intermediate data explosion problem, associated with naive implementations of tensor decomposition algorithms, would require the materialization and the storage of a matrix whose largest dimension would be ≈7 × 10^14; this amounts to ~10 Petabytes, or equivalently a few data centers' worth of storage, thereby rendering the tensor analysis of this knowledge base, in the naive way, practically impossible. In this paper, we propose GIGATENSOR, a scalable distributed algorithm for large scale tensor decomposition. GIGATENSOR exploits the sparseness of real world tensors, and avoids the intermediate data explosion problem by carefully redesigning the tensor decomposition algorithm.
Extensive experiments show that our proposed GIGATENSOR solves 100 times bigger problems than existing methods. Furthermore, we employ GIGATENSOR in order to analyze a very large real world knowledge base tensor and present our astounding findings, which include the discovery of potential synonyms among millions of noun-phrases (e.g., the noun 'pollutant' and the noun-phrase 'greenhouse gases').
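The "intermediate data explosion" the abstract describes arises in CP/PARAFAC alternating least squares, where the naive update materializes a Khatri-Rao product whose row count is the product of two mode sizes. A minimal single-machine sketch (not the authors' Hadoop-based GigaTensor implementation; the function name `sparse_mttkrp` is this note's own) of how sparsity avoids that blow-up: each nonzero of the tensor contributes directly to the result, so the huge intermediate matrix is never formed.

```python
import numpy as np

def sparse_mttkrp(coords, vals, B, C, n_rows):
    """Mode-1 MTTKRP, M = X_(1) (C kr B), for a sparse 3-way tensor.

    coords: (nnz, 3) int array of (i, j, k) indices of nonzeros
    vals:   (nnz,) corresponding values
    B:      (J, R) factor matrix for mode 2
    C:      (K, R) factor matrix for mode 3
    Returns an (n_rows, R) matrix.
    """
    R = B.shape[1]
    M = np.zeros((n_rows, R))
    for (i, j, k), v in zip(coords, vals):
        # Each nonzero adds v * (B[j] * C[k]) to row i; the (J*K, R)
        # Khatri-Rao matrix C kr B is never materialized.
        M[i] += v * B[j] * C[k]
    return M
```

For a dense J×K pair of modes the Khatri-Rao product has J·K rows (the ≈7 × 10^14 figure in the abstract), while this loop does work proportional only to the number of nonzeros.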
References
 | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---|---
2012 GigaTensorScalingTensorAnalysis | Christos Faloutsos, U. Kang, Evangelos Papalexakis, Abhay Harpale | | 2012 | GigaTensor: Scaling Tensor Analysis Up by 100 Times - Algorithms and Discoveries | | | | 10.1145/2339530.2339583 | | 2012