RDF Graph Dataset

From GM-RKB

An RDF Graph Dataset is a labeled directed graph dataset represented as a set of RDF triples.



References

2014a

  • http://www.rdfhdt.org/datasets/
    • QUOTE: We provide some of the most useful/popular datasets from the LOD cloud in HDT for you to use them easily. If the dataset you need is not available here, you can create your own or kindly ask the data provider to publish their datasets in HDT format for all the community to enjoy.

2014b

  • http://www.w3.org/TR/rdf-mt/#graphdefs
    • QUOTE: An RDF graph, or simply a graph, is a set of RDF triples.

      A subgraph of an RDF graph is a subset of the triples in the graph. A triple is identified with the singleton set containing it, so that each triple in a graph is considered to be a subgraph. A proper subgraph is a proper subset of the triples in the graph.

      A ground RDF graph is one with no blank nodes.

      A name is a URI reference or a literal. These are the expressions that need to be assigned a meaning by an interpretation. Note that a typed literal comprises two names: itself and its internal type URI reference.

      A set of names is referred to as a vocabulary. The vocabulary of a graph is the set of names which occur as the subject, predicate or object of any triple in the graph. Note that URI references which occur only inside typed literals are not required to be in the vocabulary of the graph.

      Suppose that M is a mapping from a set of blank nodes to some set of literals, blank nodes and URI references; then any graph obtained from a graph G by replacing some or all of the blank nodes N in G by M(N) is an instance of G. Note that any graph is an instance of itself, an instance of an instance of G is an instance of G, and if H is an instance of G then every triple in H is an instance of some triple in G.

      An instance with respect to a vocabulary V is an instance in which all the names in the instance that were substituted for blank nodes in the original are names from V.

      A proper instance of a graph is an instance in which a blank node has been replaced by a name, or two blank nodes in the graph have been mapped into the same node in the instance.

      Any instance of a graph in which a blank node is mapped to a new blank node not in the original graph is an instance of the original and also has it as an instance, and this process can be iterated so that any 1:1 mapping between blank nodes defines an instance of a graph which has the original graph as an instance. Two such graphs, each an instance of the other but neither a proper instance, which differ only in the identity of their blank nodes, are considered to be equivalent. We will treat such equivalent graphs as identical; this allows us to ignore some issues which arise from 're-naming' nodeIDs, and is in conformance with the convention that blank nodes have no label. Equivalent graphs are mutual instances with an invertible instance mapping.

      An RDF graph is lean if it has no instance which is a proper subgraph of the graph. Non-lean graphs have internal redundancy and express the same content as their lean subgraphs. For example, the graph

       <ex:a> <ex:p> _:x . _:y <ex:p> _:x .
      is not lean, but
       <ex:a> <ex:p> _:x . _:x <ex:p> _:x .
      is lean.

      A merge of a set of RDF graphs is defined as follows. If the graphs in the set have no blank nodes in common, then the union of the graphs is a merge; if they do share blank nodes, then it is the union of a set of graphs that is obtained by replacing the graphs in the set by equivalent graphs that share no blank nodes. This is often described by saying that the blank nodes have been 'standardized apart'. It is easy to see that any two merges are equivalent, so we will refer to the merge, following the convention on equivalent graphs. Using the convention on equivalent graphs and identity, any graph in the original set is considered to be a subgraph of the merge.

      One does not, in general, obtain the merge of a set of graphs by concatenating their corresponding N-Triples documents and constructing the graph described by the merged document. If some of the documents use the same node identifiers, the merged document will describe a graph in which some of the blank nodes have been 'accidentally' identified. To merge N-Triples documents it is necessary to check if the same nodeID is used in two or more documents, and to replace it with a distinct nodeID in each of them, before merging the documents. Similar cautions apply to merging graphs described by RDF/XML documents which contain nodeIDs, see RDF/XML Syntax Specification (Revised) [RDF-SYNTAX].
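The definitions of ground graph, vocabulary, instance, and lean graph quoted above can be illustrated with a minimal Python sketch. It is not part of the W3C text. It assumes that a graph is a set of 3-tuples of strings, that blank nodes are exactly the terms starting with `_:`, and that all other terms are opaque names. The function names are hypothetical, and the leanness check is a brute-force search suitable only for very small graphs.

```python
from itertools import product


def is_blank(term):
    """Assumption: blank nodes are written with a leading '_:'."""
    return term.startswith("_:")


def blank_nodes(graph):
    return {term for triple in graph for term in triple if is_blank(term)}


def is_ground(graph):
    """A ground graph has no blank nodes."""
    return not blank_nodes(graph)


def vocabulary(graph):
    """Names (non-blank terms) occurring as subject, predicate, or object."""
    return {term for triple in graph for term in triple if not is_blank(term)}


def apply_mapping(graph, mapping):
    """Build the instance of `graph` obtained by replacing blank nodes via `mapping`."""
    return frozenset(
        tuple(mapping.get(term, term) for term in triple) for triple in graph
    )


def is_lean(graph):
    """True if no instance of `graph` is a proper subgraph of `graph`.

    An instance that is a subgraph can only contain terms already present
    in the graph, so it is enough to try every mapping from the blank
    nodes into the graph's own terms.
    """
    graph = frozenset(graph)
    blanks = sorted(blank_nodes(graph))
    terms = sorted({term for triple in graph for term in triple})
    for images in product(terms, repeat=len(blanks)):
        instance = apply_mapping(graph, dict(zip(blanks, images)))
        if instance < graph:  # proper subset, i.e. a proper subgraph
            return False
    return True


# The two example graphs from the quoted passage.
not_lean_graph = {("<ex:a>", "<ex:p>", "_:x"), ("_:y", "<ex:p>", "_:x")}
lean_graph = {("<ex:a>", "<ex:p>", "_:x"), ("_:x", "<ex:p>", "_:x")}
```

In the first example graph, mapping `_:y` to `<ex:a>` collapses the two triples into one, which is why the quoted text calls it not lean. No mapping of `_:x` in the second graph yields a proper subgraph.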
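The merge procedure and the nodeID caution quoted above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation. Graphs are modeled as sets of 3-tuples of strings, blank nodes are the terms starting with `_:`, and the helper names `standardize_apart` and `merge` are hypothetical. For simplicity, every blank node in every input graph is renamed to a fresh label, which always produces equivalent graphs that share no blank nodes.

```python
from itertools import count


def is_blank(term):
    """Assumption: blank nodes are written with a leading '_:'."""
    return term.startswith("_:")


def standardize_apart(graphs):
    """Rename blank nodes so that no two graphs share a blank node.

    Each graph gets its own 1:1 renaming onto fresh labels, so every
    renamed graph is equivalent to its original.
    """
    fresh = count()
    renamed_graphs = []
    for graph in graphs:
        mapping = {}  # per-graph renaming: old blank node -> fresh blank node
        renamed = set()
        for triple in graph:
            new_triple = []
            for term in triple:
                if is_blank(term):
                    if term not in mapping:
                        mapping[term] = f"_:b{next(fresh)}"
                    term = mapping[term]
                new_triple.append(term)
            renamed.add(tuple(new_triple))
        renamed_graphs.append(renamed)
    return renamed_graphs


def merge(graphs):
    """Union of the graphs after their blank nodes are standardized apart."""
    merged = set()
    for graph in standardize_apart(graphs):
        merged |= graph
    return merged


# Two graphs that happen to use the same blank node label.
g1 = {("<ex:a>", "<ex:p>", "_:x")}
g2 = {("<ex:b>", "<ex:p>", "_:x")}

naive_union = g1 | g2      # '_:x' is accidentally identified across graphs
merged = merge([g1, g2])   # the two blank nodes stay distinct
```

In `naive_union` both triples point at the same blank node, which corresponds to concatenating N-Triples documents without renaming. In `merged` the two objects carry different labels, as the quoted definition of a merge requires.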

2008