2006 TutorialIEandJointInference McCallum

From GM-RKB

Subject Headings: Conditional Random Field, Information Extraction, Text Mining, Relationship Extraction, Protein-Protein Interaction, Coreference Resolution.

Notes

Cited By

Quotes

  • What model structures will capture salient dependencies?
  • Will joint inference actually improve accuracy?
    • How to do inference in these large graphical models?
    • How to do parameter estimation efficiently in these models, which are built from multiple large components?
    • How to do structure discovery in these models?
  • Although information extraction and data mining appear together in many applications, their interface in most current systems would better be described as serial juxtaposition than as tight integration. Information extraction populates slots in a database by identifying relevant subsequences of text, but is usually not aware of the emerging patterns and regularities in the database. Data mining methods begin from a populated database, and are often unaware of where the data came from, or its inherent uncertainties. The result is that the accuracy of both suffers, and significant mining of complex text sources is beyond reach.
  • The task of relation extraction is to discover connections between entities in text. In the simplest case, pattern matching can be used to fill a database of such facts with high precision. These patterns can be brittle to variations in utterances, so we can build a classifier that computes contextual features; for example, using a thesaurus or ontology to improve recall. But what about all the facts that these methods miss because the contextual cues are too sparse, noisy, or complex? In this example, we have to first extract that Bill and George are fellow alumni. Then we have to know the (rather common-sense) rule that fellow alumni attend the same universities. This is an example of an implicit relation, so called because the relation itself is not spelled out by the context. But what about this final example? Here, there is no direct contextual evidence that Bush attended Yale. We only know that they are both mentioned in the same document. But what if we already knew some facts about Bush, like that he's a fellow alumnus of Clinton, or that his father also attended Yale, or that he's a president, or that he lived in Connecticut? All of these facts present some supporting evidence that Bush attended Yale. Can we use machine learning methods to discover these relational patterns, and to decide how much credence to give to each of these pieces of evidence? If we can do this reliably, we can then talk about performing real knowledge discovery from text, where we can trawl a large corpus generating facts that are not explicitly stated, and perhaps are even currently unknown. If you think there's a lot of information on the Web, imagine how much is implied by the Web.
  • Applications of (linear-chain) Conditional Random Fields have attained positive results in:
  • Conclusion
    • Joint inference needed for avoiding cascading errors in information extraction and data mining.
    • Challenge: making inference & learning scale to massive graphical models (e.g., via Markov-chain Monte Carlo).
    • Rexa: New research paper search engine, mining the interactions in our community.
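The implicit-relation idea in the quotes above (e.g., "fellow alumni attend the same universities") can be sketched as forward-chaining over a small fact database. This is a minimal illustration, not the tutorial's actual system; the predicate names and the single hand-written rule are made up for the example, whereas the talk argues for *learning* such rules and weighting their evidence.

```python
# Minimal sketch of implicit relation inference by forward chaining.
# The facts and the single rule ("fellow alumni attend the same
# universities") are illustrative, taken from the quote above.
facts = {
    ("fellow_alumni", "Bush", "Clinton"),
    ("attended", "Clinton", "Yale"),
}

def infer(facts):
    """Apply the fellow-alumni rule until no new facts are derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for rel, a, b in derived:
            if rel != "fellow_alumni":
                continue
            for rel2, person, school in derived:
                if rel2 == "attended" and person in (a, b):
                    other = b if person == a else a
                    fact = ("attended", other, school)
                    if fact not in derived:
                        new.add(fact)  # implicit fact, never stated in text
        if new:
            derived |= new
            changed = True
    return derived

print(("attended", "Bush", "Yale") in infer(facts))  # True
```

A real system would attach a learned confidence to each derived fact rather than treating the rule as deterministic, which is exactly the "how much credence" question the quote raises.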
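Decoding in a linear-chain CRF, as referenced in the quotes, finds the label sequence maximizing the sum of per-position feature scores and label-transition scores. A plain-Python Viterbi sketch (the scores here are toy numbers, not learned parameters):

```python
def viterbi(obs_scores, trans_scores):
    """Viterbi decoding for a linear-chain CRF.

    obs_scores[t][y]  : feature score of label y at position t.
    trans_scores[a][b]: score of transitioning from label a to label b.
    Returns the highest-scoring label sequence (list of label indices).
    """
    T, K = len(obs_scores), len(obs_scores[0])
    delta = [obs_scores[0][:]]   # best score ending in each label
    back = []                    # backpointers for path recovery
    for t in range(1, T):
        row, ptr = [], []
        for y in range(K):
            prev = max(range(K),
                       key=lambda a: delta[t - 1][a] + trans_scores[a][y])
            row.append(delta[t - 1][prev] + trans_scores[prev][y]
                       + obs_scores[t][y])
            ptr.append(prev)
        delta.append(row)
        back.append(ptr)
    # trace back from the best final label
    y = max(range(K), key=lambda k: delta[-1][k])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]

# Toy example with two labels (0 = O, 1 = entity):
print(viterbi([[2, 0], [0, 3], [2, 0]], [[1, 0], [0, 1]]))  # [0, 1, 0]
```

Training would set these scores from weighted features via maximum likelihood; decoding itself is unchanged.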
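The conclusion's scaling challenge points at Markov-chain Monte Carlo for approximate inference in massive graphical models. A minimal Gibbs-sampling sketch over a pairwise binary model, assuming a user-supplied log-potential function (the model and potentials are illustrative, not from the talk):

```python
import math
import random

def gibbs_sample(n_vars, pair_score, n_iters, seed=0):
    """Gibbs sampling sketch for approximate marginals in a pairwise
    binary graphical model. pair_score(i, xi, j, xj) returns the
    log-potential between variables i and j. Returns estimates of
    P(x_i = 1) from the sample counts."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_vars)]
    counts = [0] * n_vars
    for _ in range(n_iters):
        for i in range(n_vars):
            # conditional distribution of x_i given all other variables
            weights = []
            for v in (0, 1):
                s = sum(pair_score(i, v, j, x[j])
                        for j in range(n_vars) if j != i)
                weights.append(math.exp(s))
            p_one = weights[1] / (weights[0] + weights[1])
            x[i] = 1 if rng.random() < p_one else 0
            counts[i] += x[i]
    return [c / n_iters for c in counts]

# Attractive potential: neighbors prefer to agree.
marginals = gibbs_sample(4, lambda i, xi, j, xj: 0.5 if xi == xj else 0.0,
                         n_iters=200)
```

Each sweep touches every variable once, so cost grows with the number of edges per variable rather than with the exponential joint state space, which is the point of using MCMC at this scale.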

References


(McCallum, 2006) ⇒ Andrew McCallum. (2006). "Information Extraction, Data Mining and Joint Inference." Presentation at KDD-2006. http://www.kdd2006.com/docs/presentations/andrewMcCallum06Talk.ppt