2007 FilteringProductReviewsFromWebSearchResults
- (Thet et al., 2007) ⇒ Tun Thura Thet, Jin-Cheon Na, and Christopher S. G. Khoo. (2007). “Filtering Product Reviews from Web Search Results.” In: Proceedings of the 2007 ACM symposium on Document Engineering.
Subject Headings: Text Categorization Algorithm, Product Review Classification Task
Notes
- It compares the performance of a Supervised Learning Algorithm (an SVM) and a Heuristic Approach on a Text Categorization Task that classifies Search Snippets as review or non-review.
- The Search Snippets are from Google queries using the format “[product name] review”.
- http://www.springerlink.com/content/85p31628j25r7505/
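The query format and the idea of a heuristic snippet filter noted above can be illustrated with a minimal Python sketch. The cue-word list, the `min_hits` threshold, and the function names are assumptions made for illustration only; this page does not list the domain-knowledge rules that the paper actually used.

```python
import re

# Hypothetical cue words; the paper's actual heuristic rules are not given on this page.
REVIEW_CUES = ("review", "rating", "pros", "cons", "verdict", "hands-on")


def build_query(product_name: str) -> str:
    """Form a search query in the "[product name] review" format."""
    return f"{product_name.strip()} review"


def looks_like_review(title: str, snippet: str, url: str, min_hits: int = 2) -> bool:
    """Label a search result as a review if enough distinct cue words appear.

    The title, snippet text, and URL are pooled, lowercased, and tokenized;
    a result counts as a review when at least `min_hits` cue words are present.
    """
    text = f"{title} {snippet} {url}".lower()
    # Tokens are runs of letters, optionally joined by hyphens (e.g. "hands-on").
    tokens = set(re.findall(r"[a-z]+(?:-[a-z]+)*", text))
    hits = sum(1 for cue in REVIEW_CUES if cue in tokens)
    return hits >= min_hits


if __name__ == "__main__":
    print(build_query("Canon PowerShot A640"))
    print(looks_like_review(
        "Canon A640 Review",
        "Our verdict: pros and cons of the A640",
        "http://example.com/reviews/a640",
    ))
```

A supervised alternative would replace the fixed cue list with features learned from labeled snippets, which is the comparison the paper reports.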
Cited By
Quotes
Abstract
This study seeks to develop an automatic method to identify product reviews on the Web using the snippets (summary information) returned by search engines. Determining whether a snippet is a review or non-review is a challenging task, since the snippet usually does not contain many useful features for identifying review documents. Firstly we applied a common machine learning technique, SVM (Support Vector Machine), to investigate which features of snippets are useful for the classification. Then we employed a heuristic approach utilizing domain knowledge and found that the heuristic approach performs equally well as the machine learning approach. A hybrid approach which combines the machine learning technique and domain knowledge performs slightly better than the machine learning approach alone.
References
- Choi, B. and Yao, Z. Web Page Classification, Foundations and Advances in Data Mining, Studies in Fuzziness and Soft Computing 180, 2005, 221--274, Springer Berlin/Heidelberg.
- Aidan Finn, Nicholas Kushmerick, Barry Smyth, Genre Classification and Domain Transfer for Information Filtering, Proceedings of the 24th BCS-IRSG European Colloquium on IR Research: Advances in Information Retrieval, p.353-362, March 25-27, 2002
- Thorsten Joachims, Text Categorization with Support Vector Machines: Learning with Many Relevant Features, Proceedings of the 10th European Conference on Machine Learning, p.137-142, April 21-23, 1998
- Brett Kessler, Geoffrey Nunberg, Hinrich Schütze, Automatic detection of text genre, Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, p.32-38, July 07-12, 1997, Madrid, Spain
- Jin-Cheon Na, Christopher S. G. Khoo, Syin Chan, Norraihan Bte Hamzah, Sentiment-based search in digital libraries, Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, June 07-11, 2005, Denver, CO, USA doi:10.1145/1065385.1065416
- Jones, K. S. and Willett, P. Readings in Information Retrieval, Morgan Kaufmann, 1997.
- Bo Pang, Lillian Lee, Shivakumar Vaithyanathan, Thumbs up?: sentiment classification using machine learning techniques, Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, p.79-86, July 06, 2002 doi:10.3115/1118693.1118704
- J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, 1993
- Fabrizio Sebastiani, Machine learning in automated text categorization, ACM Computing Surveys (CSUR), v.34 n.1, p.1-47, March 2002 doi:10.1145/505282.505283
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2007 FilteringProductReviewsFromWebSearchResults | Tun Thura Thet, Jin-Cheon Na, Christopher S. G. Khoo | | | Filtering Product Reviews from Web Search Results | | Proceedings of the 2007 ACM symposium on Document Engineering | | 10.1145/1284420.1284467 | | 2007 |