RDF Graph Dataset

An RDF Graph Dataset is a labeled directed graph of RDF triples.

Context:
- It can be a member of an RDF Database (e.g. within a semantic graph DBMS).
Example(s):
- …
Counter-Example(s):
- an OWL Ontology.
See: HDT Format, RDF Schema, Semantic Graph.

References

2014a

http://www.rdfhdt.org/datasets/
- QUOTE: We provide some of the most useful/popular datasets from the LOD cloud in HDT for you to use them easily. If the dataset you need is not available here, you can create your own or kindly ask the data provider to publish their datasets in HDT format for all the community to enjoy.

2014b

http://www.w3.org/TR/rdf-mt/#graphdefs
- An RDF graph, or simply a graph, is a set of RDF triples.
  A subgraph of an RDF graph is a subset of the triples in the graph. A triple is identified with the singleton set containing it, so that each triple in a graph is considered to be a subgraph. A proper subgraph is a proper subset of the triples in the graph.
  A ground RDF graph is one with no blank nodes.
  A name is a URI reference or a literal. These are the expressions that need to be assigned a meaning by an interpretation. Note that a typed literal comprises two names: itself and its internal type URI reference.
  A set of names is referred to as a vocabulary. The vocabulary of a graph is the set of names which occur as the subject, predicate or object of any triple in the graph. Note that URI references which occur only inside typed literals are not required to be in the vocabulary of the graph.
  Suppose that M is a mapping from a set of blank nodes to some set of literals, blank nodes and URI references; then any graph obtained from a graph G by replacing some or all of the blank nodes N in G by M(N) is an instance of G. Note that any graph is an instance of itself, an instance of an instance of G is an instance of G, and if H is an instance of G then every triple in H is an instance of some triple in G.
  An instance with respect to a vocabulary V is an instance in which all the names in the instance that were substituted for blank nodes in the original are names from V.
  A proper instance of a graph is an instance in which a blank node has been replaced by a name, or two blank nodes in the graph have been mapped into the same node in the instance.
  Any instance of a graph in which a blank node is mapped to a new blank node not in the original graph is an instance of the original and also has it as an instance, and this process can be iterated so that any 1:1 mapping between blank nodes defines an instance of a graph which has the original graph as an instance. Two such graphs, each an instance of the other but neither a proper instance, which differ only in the identity of their blank nodes, are considered to be equivalent. We will treat such equivalent graphs as identical; this allows us to ignore some issues which arise from 're-naming' nodeIDs, and is in conformance with the convention that blank nodes have no label. Equivalent graphs are mutual instances with an invertible instance mapping.
  An RDF graph is lean if it has no instance which is a proper subgraph of the graph. Non-lean graphs have internal redundancy and express the same content as their lean subgraphs. For example, the graph
```
 <ex:a> <ex:p> _:x . _:y <ex:p> _:x .
```
  is not lean, but
```
 <ex:a> <ex:p> _:x . _:x <ex:p> _:x .
```
  is lean.
  A merge of a set of RDF graphs is defined as follows. If the graphs in the set have no blank nodes in common, then the union of the graphs is a merge; if they do share blank nodes, then it is the union of a set of graphs that is obtained by replacing the graphs in the set by equivalent graphs that share no blank nodes. This is often described by saying that the blank nodes have been 'standardized apart'. It is easy to see that any two merges are equivalent, so we will refer to the merge, following the convention on equivalent graphs. Using the convention on equivalent graphs and identity, any graph in the original set is considered to be a subgraph of the merge.
  One does not, in general, obtain the merge of a set of graphs by concatenating their corresponding N-Triples documents and constructing the graph described by the merged document. If some of the documents use the same node identifiers, the merged document will describe a graph in which some of the blank nodes have been 'accidentally' identified. To merge N-Triples documents it is necessary to check if the same nodeID is used in two or more documents, and to replace it with a distinct nodeID in each of them, before merging the documents. Similar cautions apply to merging graphs described by RDF/XML documents which contain nodeIDs, see RDF/XML Syntax Specification (Revised) [RDF-SYNTAX].

2008

(Quilitz & Leser, 2008) ⇒ Bastian Quilitz, and Ulf Leser. (2008). “Querying Distributed RDF Data Sources with SPARQL.” In: Proceedings of the 5th European Semantic Web Conference (ESWC 2008).
- ABSTRACT: Integrated access to multiple distributed and autonomous RDF data sources is a key challenge for many semantic web applications. As a reaction to this challenge, SPARQL, the W3C Recommendation for an RDF query language, supports querying of multiple RDF graphs. However, the current standard does not provide transparent query federation, which makes query formulation hard and lengthy. Furthermore, current implementations of SPARQL load all RDF graphs mentioned in a query to the local machine. This usually incurs a large overhead in network traffic, and sometimes is simply impossible for technical or legal reasons. To overcome these problems we present DARQ, an engine for federated SPARQL queries. DARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression to query one single RDF graph despite the real data being distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed-up query execution. Experiments show that these optimizations significantly improve query performance even when only a very limited amount of statistical information is available. DARQ is available under GPL License at http://darq.sf.net/

RDF Graph Dataset

References

2014a

2014b

2008

Navigation menu

Search