Product Mention Normalization Task
A Product Mention Normalization Task is a domain-specific entity mention normalization task that is restricted to mapping product mentions to canonical product records.
- AKA: Product Entity Mention Dereferencing, PMNT.
- Context:
- Input: Annotated Text (with annotated product mentions); and a Consumer Product Offer Database.
- It can be supported by a Product Mention Recognition Task and a Product Mention Disambiguation Task.
- It can support a Consumer Product Mention Hyperlinking Task.
- It can be a non-trivial task because:
- there is no standard Product Database.
- there are so many ways to spell a Product Name.
- companies have internal Nicknames for their products, e.g. Montevina ⇒ Centrino 2.
- companies release marketing names, e.g. Intel's X58 chipset is referred to as ICH10R.
- Example(s):
- PMNT(“Let's show off the [PRODUCT|Nokia N95 8G] during the Nokia convention.”, {Open IceCat Product Catalog}) ⇒ (“Let's show off the Nokia N95 8G during the Nokia convention.”).
- PMNT(“Just a very basic question, would it be alright to put 2 x [PRODUCT|30GB OCZ Solid Series SATA II 2.5" SSD] into RAID0 on an [PRODUCT|ICH10R controller]?”, {Open IceCat Product Catalog, Intel Product Catalog, ...}) ⇒ “Just a very basic question, would it be alright to put 2 x 30GB OCZ Solid Series SATA II 2.5" SSD into RAID0 on an ICH10R controller?”
- PMNT(“I imported my [Samsung i900 Omnia 16gb] - I like [the phone] [it] has so many features wifi, blutooth, word pad, excel, 5mp camera to name a few i900 has a 3.2" screen which is not used very well as the size of the font used in text messages in so tiny its nearly impossible to see it which defeats the object of having a large screen. [It] can perform any type of file i have thrown at it and even downloaded stardust movie onto [it] and can watch it full screen. Overall if you got good eye sight to read tiny print and don't mind charging [it] every two days [this phone] is very good.” from http://www.reviewcentre.com/review401108.html):
- Samsung i900 Omnia 16gb ⇒ http://www.gsmarena.com/samsung_i900_omnia-2422.php
- PMNT(“As we near the end of our wait for AMD's long overdue response to [Centrino], Intel has fired another salvo; [Centrino 2]. This [Toshiba Satellite A300-02C] is my first notebook using Intel's latest bits and I am giddy like a school girl to test it out.” from http://www.notebookreview.com/default.asp?newsID=4598):
- (Intel) Centrino ⇒ http://www.intel.com/products/centrino/centrino/
- (Intel) Centrino 2 ⇒ http://www.intel.com/products/centrino/centrino2/ http://en.wikipedia.org/wiki/Centrino#Montevina_platform_.282008.29
- Toshiba Satellite A300-02C ⇒ (??nearest??) http://explore.toshiba.com/laptops/satellite/A300/A300-ST4505
- Counter-Example(s):
- See: CPROD1 Task.
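The input/output shape of the task can be illustrated with a minimal Python sketch. It assumes mentions are annotated as `[PRODUCT|...]`, as in the examples above, and resolves them through a hand-built alias table. The catalog, the alias table, the record identifiers, and the function names are all hypothetical and only stand in for a real product database such as the Open IceCat Product Catalog.

```python
import re

# Hypothetical toy catalog: record identifier -> canonical product name.
CATALOG = {
    "intel-centrino-2": "Intel Centrino 2",
    "nokia-n95-8gb": "Nokia N95 8GB",
}

# Hypothetical alias table: normalized surface form -> record identifier.
# It covers an internal nickname (Montevina) and a spelling variant (8G).
ALIASES = {
    "montevina": "intel-centrino-2",
    "centrino 2": "intel-centrino-2",
    "nokia n95 8g": "nokia-n95-8gb",
}

# Matches annotations of the form [PRODUCT|surface text].
MENTION = re.compile(r"\[PRODUCT\|([^\]]+)\]")


def normalize_key(surface):
    """Lowercase the mention and collapse runs of whitespace."""
    return " ".join(surface.lower().split())


def pmnt(annotated_text, aliases=ALIASES):
    """Return (mention, record identifier or None) for each annotated mention."""
    results = []
    for match in MENTION.finditer(annotated_text):
        mention = match.group(1)
        results.append((mention, aliases.get(normalize_key(mention))))
    return results


text = "Let's show off the [PRODUCT|Nokia N95 8G] during the Nokia convention."
for mention, record_id in pmnt(text):
    print(mention, "=>", record_id, CATALOG.get(record_id))
```

A mention that is missing from the alias table maps to `None`, which reflects the difficulty noted above: without a standard product database, coverage depends entirely on the aliases that have been collected.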
References
2012
- (Lin, Liu et al., 2012) ⇒ Lei Lin, Ming Liu, Bingquan Liu, and Xuejun Sha. (2012). “A Product Named Entity Normalization Method based on Entity Relations.” In: Proceedings of the 8th International Conference on Information Science and Digital Content Technology (ICIDT).
- QUOTE: With the popularity and prosperity of e-commerce, text mining technologies for e-commerce information processing have been become more and more important. Product named entity normalization technology plays a vital role for the performance of e-commerce information processing because it can resolve the ambiguities of product named entities which is caused by the rich aliases and the complex structures of product names. This work proposed a relation based method for product named entity normalization. The proposed method first detected the relations between entities, and then used the relations to inference the full form of an entity. After that the similarities between the target entity with full form and the entries in a dictionary were calculated. The corresponding identifier of the most similar entries in the dictionary was chosen as the normalization result for the target entity. When calculating the similarity between two entities, the structures of the two entities were considered. Experiments on an annotated corpus consisting of web documents related to electronic product showed promising results of the proposed method, which achieved an accuracy of 88.09%.
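The final step described in the quote, choosing the identifier of the most similar dictionary entry, can be sketched as follows. This is not the authors' method: relation detection and full-form inference are omitted, and a plain token-overlap (Jaccard) score is swapped in for their structure-aware similarity. The dictionary contents and identifiers are hypothetical.

```python
def tokens(name):
    """Split a product name into a set of lowercase tokens."""
    return set(name.lower().split())


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; 0.0 if either name is empty."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(full_form, dictionary):
    """Return (identifier, score) of the most similar entry, or None if empty."""
    if not dictionary:
        return None
    scored = ((ident, jaccard(full_form, name)) for ident, name in dictionary.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical dictionary: identifier -> full product name.
DICTIONARY = {
    "P001": "Samsung i900 Omnia 16GB",
    "P002": "Samsung i900 Omnia 8GB",
    "P003": "Nokia N95 8GB",
}

print(best_match("samsung omnia i900 16gb", DICTIONARY))
```

Token overlap ignores word order, so it tolerates reordered names, but it does not handle abbreviations or misspellings; a practical system would need a richer similarity measure, which is the gap the quoted work addresses by considering entity structure.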