Information Extraction from Tables Task

From GM-RKB

See: Information Extraction Task, Structured Data, HTML Table.



References

2008

2003

2002

2001

  • (Crescenzi et al., 2001) ⇒ Valter Crescenzi, Giansalvatore Mecca, and Paolo Merialdo. (2001). “RoadRunner: Towards Automatic Data Extraction from Large Web Sites.” In: Proceedings of the 27th International Conference on Very Large Data Bases (VLDB 2001).
  • (Chang & Lui, 2001) ⇒ Chia-Hui Chang, and Shao-Chen Lui. (2001). “IEPAD: Information Extraction Based on Pattern Discovery.” In: Proceedings of the 10th International Conference on World Wide Web (WWW 2001).
    • It presents a system that automatically discovers extraction rules from web pages.
    • It utilizes repeated pattern mining and multiple sequence alignment.
    • It can automatically identify record boundaries.
    • It proposes the IEPAD Algorithm, composed of three components: an extraction rule generator, a pattern viewer, and an extractor module.
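
The idea of repeated pattern mining for record boundary detection, noted above for (Chang & Lui, 2001), can be illustrated with a minimal sketch. It is not the IEPAD implementation: it swaps in naive n-gram counting over tag tokens for the paper's pattern discovery step, and it omits multiple sequence alignment entirely. The names `tokenize`, `find_repeated_pattern`, and `extract_records` are hypothetical, as is the rule of picking the pattern whose non-overlapping occurrences cover the most tokens.

```python
import re
from collections import defaultdict

# Matches an opening or closing HTML tag and captures its name.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def tokenize(html):
    """Encode a page as tag tokens; each text run becomes a 'TEXT' token.

    Returns two parallel lists: the tokens, and the text behind each
    token (None for tag tokens).
    """
    tokens, texts = [], []
    pos = 0
    for m in TAG_RE.finditer(html):
        text = html[pos:m.start()].strip()
        if text:
            tokens.append("TEXT")
            texts.append(text)
        tokens.append("<" + m.group(1) + m.group(2).lower() + ">")
        texts.append(None)
        pos = m.end()
    tail = html[pos:].strip()
    if tail:
        tokens.append("TEXT")
        texts.append(tail)
    return tokens, texts

def find_repeated_pattern(tokens, min_count=2):
    """Return (pattern, starts) for the repeated token n-gram whose
    non-overlapping occurrences cover the most tokens (assumed heuristic)."""
    best_pattern, best_starts, best_cover = (), [], 0
    n = len(tokens)
    for length in range(2, n // min_count + 1):
        positions = defaultdict(list)
        for i in range(n - length + 1):
            positions[tuple(tokens[i:i + length])].append(i)
        for pattern, starts in positions.items():
            # Keep occurrences greedily, left to right, skipping overlaps.
            kept = []
            for s in starts:
                if not kept or s >= kept[-1] + length:
                    kept.append(s)
            cover = len(kept) * length
            if len(kept) >= min_count and cover > best_cover:
                best_pattern, best_starts, best_cover = pattern, kept, cover
    return best_pattern, best_starts

def extract_records(html):
    """Treat each occurrence of the repeated pattern as one record and
    return the text fields found inside it."""
    tokens, texts = tokenize(html)
    pattern, starts = find_repeated_pattern(tokens)
    return [
        [texts[i] for i in range(s, s + len(pattern)) if texts[i] is not None]
        for s in starts
    ]

page = (
    "<table>"
    "<tr><td>A</td><td>1</td></tr>"
    "<tr><td>B</td><td>2</td></tr>"
    "<tr><td>C</td><td>3</td></tr>"
    "</table>"
)
records = extract_records(page)
print(records)
```

On this toy page the selected pattern is the eight-token row sequence from `<tr>` to `</tr>`, so each table row is recovered as one record. Real pages contain optional and irregular fields, which is where the alignment step described in the paper would be needed.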