2006 SpatialScanStatisticsApproximat
- (Agarwal et al., 2006) ⇒ Deepak Agarwal, Andrew McGregor, Jeff M. Phillips, Suresh Venkatasubramanian, and Zhengyuan Zhu. (2006). “Spatial Scan Statistics: Approximations and Performance Study.” In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ISBN:1-59593-339-5 doi:10.1145/1150402.1150410
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222006%22+Spatial+Scan+Statistics%3A+Approximations+and+Performance+Study
- http://dl.acm.org/citation.cfm?id=1150402.1150410&preflayout=flat#citedby
Quotes
Abstract
Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invested in designing efficient algorithms for finding such " high discrepancy " regions, with methods ranging from fast heuristics for special cases, to general grid-based methods, and to efficient approximation algorithms with provable guarantees on performance and quality.In this paper, we make a number of contributions to the computational study of spatial scan statistics. First, we describe a simple exact algorithm for finding the largest discrepancy region in a domain. Second, we propose a new approximation algorithm for a large class of discrepancy functions (including the Kulldorff scan statistic) that improves the approximation versus run time trade-off of prior methods. Third, we extend our simple exact and our approximation algorithms to data sets which lie naturally on a grid or are accumulated onto a grid. Fourth, we conduct a detailed experimental comparison of these methods with a number of known methods, demonstrating that our approximation algorithm has far superior performance in practice to prior methods, and exhibits a good performance-accuracy trade-off. All extant methods (including those in this paper) are suitable for data sets that are modestly sized; if data sets are of the order of millions of data points, none of these methods scale well. For such massive data settings, it is natural to examine whether small-space streaming algorithms might yield accurate answers. Here, we provide some negative results, showing that any streaming algorithms that even provide approximately optimal answers to the discrepancy maximization problem must use space linear in the input.
References
;
Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year | |
---|---|---|---|---|---|---|---|---|---|---|
2006 SpatialScanStatisticsApproximat | Deepak Agarwal Jeff M. Phillips Suresh Venkatasubramanian Andrew McGregor Zhengyuan Zhu | Spatial Scan Statistics: Approximations and Performance Study | 10.1145/1150402.1150410 | 2006 |