Billion Triple Challenge 2009 Dataset

See: Semantic Search Benchmark Task, Benchmark Dataset.



    • Data: We provide a corpus of datasets containing entity descriptions in the form of RDF. They represent a sample of Web data crawled from publicly available sources. For this evaluation, we use the Billion Triple Challenge 2009 dataset. Further information and detailed statistics can be found here: The original Billion Triple Challenge 2009 dataset contains blank nodes. We will not deal with blank nodes in this evaluation and therefore require participants to encode blank nodes according to the following rule: BNID map to, where BNID is the blank node id. Since the blank node ids in that dataset are unique, this convention is sufficient to map blank nodes to distinct URIs. Instead of encoding the blank nodes using this convention, participants can also download the following version of the Billion Triple Challenge 2009 dataset, in which blank nodes have already been converted to URIs:
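The blank node encoding described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the URI prefix `PLACEHOLDER_PREFIX` is a hypothetical stand-in, since the actual target URI pattern is not given on this page, and the helper name `encode_blank_nodes` is likewise invented for the example. The sketch also assumes that the data is in a line-based N-Triples/N-Quads style serialization with terms separated by whitespace.

```python
import re

# Hypothetical placeholder: substitute the prefix required by the
# evaluation's encoding rule, which is not reproduced on this page.
PLACEHOLDER_PREFIX = "http://example.org/bnode/"

# Match IRIs and quoted literals first so that any "_:" text inside
# them is left untouched; only a bare _:BNID term fills group 1.
_TERM = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"|_:(\S+)')


def encode_blank_nodes(line, prefix=PLACEHOLDER_PREFIX):
    """Rewrite every blank node term _:BNID in one line as <prefix + BNID>."""
    def repl(match):
        bnid = match.group(1)
        if bnid is None:
            # An IRI or a literal was matched: return it unchanged.
            return match.group(0)
        return "<" + prefix + bnid + ">"
    return _TERM.sub(repl, line)


if __name__ == "__main__":
    sample = '_:b1 <http://xmlns.com/foaf/0.1/knows> _:b2 <http://example.org/ctx> .'
    print(encode_blank_nodes(sample))
```

Because the blank node ids are unique across the dataset, applying such a function line by line needs no lookup table: the same id always yields the same URI, and distinct ids yield distinct URIs.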
