Spock Entity Resolution Challenge
The Spock Entity Resolution Challenge was a Multi-Document Coreference Resolution Task Competition that took place from August to December 2007.
- AKA: Spock Benchmark Task, Spock Challenge.
- Context:
- The training data contains 100,000 documents, which include Person Mentions.
- The task is to determine all the distinct people described in the data set.
- The score on a test set is a percentage rank score that depends on how many unique people were correctly identified in the data.
- See: BioCreAtIvE Task, Person Mention.
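The task described above (grouping documents so that each group corresponds to one distinct person) can be illustrated with a minimal sketch. This is not the method used by Spock or by any competitor. The input format (a name plus a set of context terms per document), the Jaccard similarity measure, the `threshold` value, and the function names are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve_entities(docs, threshold=0.25):
    """Group documents that appear to describe the same person.

    docs: list of (name, context_terms) pairs (hypothetical input format).
    Returns a list of clusters, each a sorted list of document indices.
    """
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        (name_i, terms_i), (name_j, terms_j) = docs[i], docs[j]
        # Assumption: same name plus enough shared context means same person.
        if name_i == name_j and jaccard(terms_i, terms_j) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data (invented for this example): four documents, one shared name.
docs = [
    ("Michael Jackson", {"singer", "album", "pop"}),
    ("Michael Jackson", {"football", "team", "player"}),
    ("Michael Jackson", {"pop", "singer", "tour"}),
    ("Michael Jackson", {"team", "football", "season"}),
]
people = resolve_entities(docs)
print(len(people), people)
```

On this toy input the sketch separates the documents into two groups, one per person. The pairwise comparison is quadratic in the number of documents, so at the scale of 100,000 documents a real system would first need some way to limit which pairs are compared, for example by only comparing documents that share a name.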
References
- http://challenge.spock.com
- Redwood City, CA - March 4, 2008 - Spock, the leading people search engine, announced today the winners of the Spock Challenge. A six-person team of researchers, faculty and students from Germany's Bauhaus University Weimar were awarded the $50,000 grand prize.
- The challenge called for leading computer scientists and engineers to solve how to distinguish many people with the same name, e.g., Michael Jackson the singer from Michael Jackson the football player.
- "With billions of documents and people online, we are now able to more precisely categorize and cluster web documents to unique individuals," said Jaideep Singh, Co-Founder & CEO of Spock. "Mapping named entities from documents to the correct person was the essence of the Spock Challenge, and the team from Germany did a tremendous job," added Singh.
- http://radar.oreilly.com/2007/04/the-spock-entity-resolution-ch.html
- To improve our technology and to create a better user experience, we decided to share the fun! We have selected one of our most interesting problems, namely Entity Resolution, to share with the community, allowing other leading computer scientists and engineers to compete in an open contest. The winners of this global competition will reap a handsome reward, and perhaps even employment at Spock.
- You can work individually and in teams. The competition will last 4 months and the winning team will win a Grand Prize of $50,000! Most importantly you’ll be working on a very important and widely applicable problem. We will also be issuing prizes for 2nd and 3rd place.
- A common problem that we face is that there are many people with the same name. Given that, how do we distinguish a document about Michael Jackson the singer from Michael Jackson the football player?
- With billions of documents and people on the web, we need to identify and cluster web documents accurately to the people they are related to. Mapping these named entities from documents to the correct person is the essence of the Spock Challenge.
- In order to constrain the problem so that it can be successfully solved by an individual or a small team, we provide you with real world data with ground truth. This data contains 100,000 documents about people, and the challenge is to determine all the distinct people described in the data set. This data can be your training set. Once you’ve got your basic algorithm working against the training set, we let you further tune your code by running it against a second test data set.
- We give you instant accuracy feedback in the form of a percentage rank score. The score depends on how many correct unique people you can identify in the data. This way you can continue to refine your work and see how well you are holding up against your competitors.