SPARQL Tutorial @ TarguMures Semantic Web Meetup
June 22, 2011
Adonis Damian
Introduction - SPARQL

SPARQL is a query language/engine for RDF graphs.
• An RDF graph is a set of triples: a flexible and extensible way to represent information about resources.
• A concept similar to SQL for databases.
• A W3C standard query language to fetch data from distributed Semantic Web data models.
• Can query a triple store or data on the Web (at a given URL).

It provides facilities to:
• extract information in the form of URIs, blank nodes, plain and typed literals
• extract RDF subgraphs
• construct new RDF graphs based on information in the queried graphs
What is RDF?

• RDF is a data model of graphs of subject, predicate, object triples.
• Resources are represented with URIs, which can be abbreviated as prefixed names.
• Objects can be literals: strings, integers, booleans, etc.
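As a rough illustration of the data model, the sketch below represents triples as plain Python tuples and expands prefixed names into full URIs. The `expand` helper and the `http://example.org/people/alice` subject are made up for the example; only the RDF and FOAF namespace URIs are real.

```python
# Prefix table: the namespace URIs are the real RDF and FOAF ones.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(prefixed_name):
    """Turn a prefixed name such as foaf:Person into its full URI (hypothetical helper)."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


# The subject URI is invented for this example.
ALICE = "http://example.org/people/alice"

# A triple is (subject, predicate, object); here the object is a resource.
type_triple = (ALICE, expand("rdf:type"), expand("foaf:Person"))
# Here the object is a literal (a plain string rather than a URI).
name_triple = (ALICE, expand("foaf:name"), "Alice")

# An RDF graph is a set of triples.
graph = {type_triple, name_triple}
```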
Short Ontology Introduction
Reasoning OWL
A SPARQL query comprises, in order:
• Prefix declarations, for abbreviating URIs
• Dataset definition, stating what RDF graph(s) are being queried
• A result clause, identifying what information to return from the query
• The query pattern, specifying what to query for in the underlying dataset
• Query modifiers, slicing, ordering, and otherwise rearranging query results

    # prefix declarations
    PREFIX foo: <http://example.com/resources/...>
    # dataset definition
    FROM ...
    # result clause
    SELECT ...
    # query pattern
    WHERE { ... }
    # query modifiers
    ORDER BY ...
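To make the ordering of the parts concrete, here is a minimal sketch that assembles a query from those parts and encodes it as a SPARQL Protocol GET request, where the query text travels in the `query` parameter. The endpoint URL is an assumption, the dataset clause is left out so the endpoint's default dataset would apply, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The parts appear in the same order as in the outline above.
query = "\n".join([
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>",  # prefix declarations
    "SELECT ?person",                             # result clause
    "WHERE { ?person a foaf:Person }",            # query pattern
    "ORDER BY ?person",                           # query modifiers
    "LIMIT 10",
])

# Assumed endpoint URL, used only to build the request string.
ENDPOINT = "http://dbpedia.org/sparql"
request_url = ENDPOINT + "?" + urlencode({"query": query})

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(request_url).query)["query"][0]
```

An HTTP client could fetch `request_url` as-is; the encoding step matters because characters such as `?`, `{` and newlines are not safe in a raw URL.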
SPARQL Landscape

SPARQL 1.0 became a standard in January 2008, and included:
• SPARQL 1.0 Query Language
• SPARQL 1.0 Protocol
• SPARQL Results XML Format

SPARQL 1.1 is in progress, and includes:
• Updated 1.1 versions of SPARQL Query and SPARQL Protocol
• SPARQL 1.1 Update
• SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs
• SPARQL 1.1 Service Descriptions
• SPARQL 1.1 Basic Federated Query
First Query

Query DBpedia at http://dbpedia.org/snorql/

Give me all the objects that are a person:

    SELECT ?person
    WHERE {
      ?person rdf:type foaf:Person .
    }

• SPARQL variables start with a ? and can match any node (resource or literal) in the RDF dataset.
• Triple patterns are just like triples, except that any of the parts of a triple can be replaced with a variable.
• The SELECT result clause returns a table of variables and values that satisfy the query.

Dataset: http://downloads.dbpedia.org/3.6/dbpedia_3.6.owl
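The way a single triple pattern yields a table of bindings can be sketched in a few lines of Python. This is only an illustration over an in-memory list of triples, not how a real triple store works; the `match` function and the `ex:` resources are invented for the example.

```python
def match(pattern, triple):
    """Return variable bindings if the triple fits the pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable matches anything, but a repeated variable
            # must bind to the same value each time.
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant term must match exactly.
            return None
    return bindings


# Tiny made-up graph, written with prefixed names for brevity.
graph = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:paris", "rdf:type", "ex:City"),
]

pattern = ("?person", "rdf:type", "foaf:Person")
rows = [b for b in (match(pattern, t) for t in graph) if b is not None]
```

Each entry in `rows` is one row of the SELECT result table, mapping variable names to the values that satisfied the pattern.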
Multiple triple patterns

Give me all the people that had a notable idea:

    SELECT ?person ?idea
    WHERE {
      ?person rdf:type foaf:Person .
      ?person <http://dbpedia.org/ontology/notableIdea> ?idea .
    }

Alternative:

    SELECT *
    WHERE {
      ?person a foaf:Person ;
              <http://dbpedia.org/ontology/notableIdea> ?idea .
    }

• We can use multiple triple patterns to retrieve multiple properties about a particular resource.
• Shortcut: SELECT * selects all variables mentioned in the query.
• Use a instead of rdf:type.
• Use ; to refer to the same subject.
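Several triple patterns behave like a join: a solution must satisfy every pattern with the same value for each shared variable. The sketch below shows that idea with a naive nested loop over a made-up graph; `match`, `solve` and the `ex:` and `dbo:` names are illustrative, and real engines use far more efficient strategies.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that the pattern fits the triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, graph):
    """Keep only the solutions that satisfy all patterns (a naive join)."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            for triple in graph:
                extended = match(pattern, triple, solution)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Made-up data: only ex:alice has both a type and a notable idea.
graph = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "dbo:notableIdea", "ex:idea1"),
    ("ex:bob", "rdf:type", "foaf:Person"),
]
patterns = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "dbo:notableIdea", "?idea"),
]
solutions = solve(patterns, graph)
```

Because `?person` appears in both patterns, `ex:bob` drops out: he matches the first pattern but has no triple that satisfies the second.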
Multiple triple patterns: traversing a graph

Find me all the artists that were born in Brazil:

    SELECT *
    WHERE {
      ?person a <http://dbpedia.org/ontology/Artist> ;
              <http://dbpedia.org/ontology/birthPlace> ?birthPlace .
      ?birthPlace <http://dbpedia.org/ontology/country> ?country .
      ?country rdfs:label "Brazil"@en .
    }

[Diagram: Artist -> birthPlace -> Birth Place -> country -> Country -> rdfs:label -> label]
Limit the number of results

Find me 50 example concepts in the DBpedia dataset:

    SELECT DISTINCT ?concept
    WHERE {
      ?s a ?concept .
    }
    ORDER BY DESC(?concept)
    LIMIT 50

• LIMIT is a solution modifier that limits the number of rows returned from a query.
• SPARQL has two other solution modifiers:
  • ORDER BY for sorting query solutions on the value of one or more variables
  • OFFSET, used in conjunction with LIMIT and ORDER BY to take a slice of a sorted solution set (e.g. for paging)
• The DISTINCT modifier eliminates duplicate rows from the query results.
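The interplay of ORDER BY, DISTINCT, OFFSET and LIMIT can be sketched over an in-memory list of solution rows. The `apply_modifiers` function and the `ex:` concept names are invented for the illustration; it applies the modifiers in the order sort, de-duplicate, slice.

```python
def apply_modifiers(rows, order_key, descending=False, limit=None, offset=0):
    """Apply ORDER BY, then DISTINCT, then OFFSET/LIMIT to solution rows."""
    # ORDER BY: sort on one variable.
    ordered = sorted(rows, key=lambda r: r[order_key], reverse=descending)
    # DISTINCT: drop duplicate rows while keeping the sorted order.
    seen, unique = set(), []
    for row in ordered:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    # OFFSET / LIMIT: take a slice, e.g. one page of results.
    end = None if limit is None else offset + limit
    return unique[offset:end]


# Made-up solution rows with one duplicate.
rows = [{"concept": c} for c in
        ["ex:City", "ex:Artist", "ex:City", "ex:River", "ex:Person"]]

first_page = apply_modifiers(rows, "concept", descending=True, limit=2)
second_page = apply_modifiers(rows, "concept", descending=True, limit=2, offset=2)
```

Paging only gives stable results when the rows are sorted first, which is why OFFSET is normally paired with ORDER BY.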
Basic SPARQL filters

Find me all landlocked countries with a population greater than 15 million:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX type: <http://dbpedia.org/class/yago/>
    PREFIX prop: <http://dbpedia.org/property/>
    SELECT ?country_name ?population
    WHERE {
      ?country a type:LandlockedCountries ;
               rdfs:label ?country_name ;
               prop:populationEstimate ?population .
      FILTER (?population > 15000000) .
    }

• FILTER constraints use boolean conditions to filter out unwanted query results.
• Shortcut: a semicolon (;) can be used to separate multiple triple patterns that share the same subject. (?country is the shared subject above.)
• rdfs:label is a common predicate for giving a human-friendly label to a resource.
• Note all the translated duplicates in the results. How can we deal with that?
Basic SPARQL filters - language

Find me all landlocked countries with a population greater than 15 million and show me their English name:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX type: <http://dbpedia.org/class/yago/>
    PREFIX prop: <http://dbpedia.org/property/>
    SELECT ?country_name ?population
    WHERE {
      ?country a type:LandlockedCountries ;
               rdfs:label ?country_name ;
               prop:populationEstimate ?population .
      FILTER (?population > 15000000) .
      FILTER (lang(?country_name) = "en") .
    }
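A FILTER can be thought of as a boolean test applied to each candidate solution, and several FILTERs must all hold. The sketch below models language-tagged literals as `(text, language)` pairs; the country names and population figures are invented, and `lang` here is a stand-in for the SPARQL accessor of the same name.

```python
def lang(literal):
    """Return the language tag of a (text, language) pair."""
    return literal[1]


# Invented solutions: one country appears twice with labels in two languages.
solutions = [
    {"country_name": ("Examplestan", "en"), "population": 80_000_000},
    {"country_name": ("Beispielstan", "de"), "population": 80_000_000},
    {"country_name": ("Smallland", "en"), "population": 500_000},
]

# Each FILTER is a predicate over one solution.
filters = [
    lambda s: s["population"] > 15_000_000,
    lambda s: lang(s["country_name"]) == "en",
]

# A solution survives only if every FILTER evaluates to true.
kept = [s for s in solutions if all(f(s) for f in filters)]
```

The language test removes the translated duplicate, and the population test removes the small country, leaving a single row.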
Filters using a range

Find me all the artists born in Austria in the 19th century:

    SELECT *
    WHERE {
      ?person a <http://dbpedia.org/ontology/Artist> ;
              <http://dbpedia.org/ontology/birthPlace> ?birthPlace .
      ?birthPlace <http://dbpedia.org/ontology/country> ?country .
      ?country rdfs:label "Austria"@en .
      ?person <http://dbpedia.org/property/dateOfBirth> ?dob .
      FILTER (?dob >= "1800-01-01"^^xsd:date && ?dob <= "1899-12-31"^^xsd:date)
    }

Note: xsd:date literals must use the YYYY-MM-DD lexical form.
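Date range filters depend on the literals being valid xsd:date values, whose lexical form is YYYY-MM-DD. As a small sketch of the comparison, Python's `date.fromisoformat` accepts that same form and rejects strings such as `1/1/1800`. The artist identifiers and birth dates are made up, and time zones are ignored.

```python
from datetime import date


def xsd_date(lexical):
    """Parse the YYYY-MM-DD lexical form of xsd:date (time zones not handled)."""
    return date.fromisoformat(lexical)


start, end = xsd_date("1800-01-01"), xsd_date("1899-12-31")

# Invented birth dates keyed by invented artist identifiers.
births = {
    "ex:artist_a": "1862-07-14",
    "ex:artist_b": "1923-03-02",
    "ex:artist_c": "1800-01-01",
}

# Inclusive range test, the counterpart of >= and <= in a FILTER.
in_range = sorted(p for p, dob in births.items() if start <= xsd_date(dob) <= end)


def is_valid(lexical):
    """Report whether a string parses as a date in ISO form."""
    try:
        xsd_date(lexical)
        return True
    except ValueError:
        return False
```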
SPARQL built-in filter functions

• Logical: !, &&, ||
• Math: +, -, *, /
• Comparison: =, !=, >, <, ...
• SPARQL tests: isURI, isBlank, isLiteral, bound
• SPARQL accessors: str, lang, datatype
• Other: sameTerm, langMatches, regex
Finding artists' info - the wrong way

Find all Jamendo artists along with their image, home page, and the location they're near:

    PREFIX mo: <http://purl.org/ontology/mo/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?img ?hp ?loc
    WHERE {
      ?a a mo:MusicArtist ;
         foaf:name ?name ;
         foaf:img ?img ;
         foaf:homepage ?hp ;
         foaf:based_near ?loc .
    }

• Jamendo has information on about 3,500 artists.
• Trying the query, though, we only get 2,667 results. What's wrong?

Query at: DBTune.org's Jamendo-specific SPARQL endpoint
Finding artists' info - the right way

    PREFIX mo: <http://purl.org/ontology/mo/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?img ?hp ?loc
    WHERE {
      ?a a mo:MusicArtist ;
         foaf:name ?name .
      OPTIONAL { ?a foaf:img ?img }
      OPTIONAL { ?a foaf:homepage ?hp }
      OPTIONAL { ?a foaf:based_near ?loc }
    }

• OPTIONAL tries to match a graph pattern, but doesn't fail the whole query if the optional match fails.
• If an OPTIONAL pattern fails to match for a particular solution, any variables in that pattern remain unbound (no value) for that solution.
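The difference between a required pattern and an OPTIONAL one resembles the difference between an inner join and a left join. The following sketch contrasts the two over invented artist data; `required` and `optional` are hypothetical helper names, and the homepage lookup stands in for the foaf:homepage triples.

```python
# Invented solutions for the mandatory part of the query.
artists = [
    {"a": "ex:artist1", "name": "Artist One"},
    {"a": "ex:artist2", "name": "Artist Two"},
]
# Only one of the two artists has a homepage.
homepages = {"ex:artist1": "http://example.org/one"}


def required(solutions, lookup, var):
    """Required pattern: solutions without a match are dropped."""
    return [dict(s, **{var: lookup[s["a"]]}) for s in solutions if s["a"] in lookup]


def optional(solutions, lookup, var):
    """OPTIONAL pattern: every solution is kept; var stays unbound on no match."""
    out = []
    for s in solutions:
        extended = dict(s)
        if s["a"] in lookup:
            extended[var] = lookup[s["a"]]
        out.append(extended)
    return out


strict = required(artists, homepages, "hp")
lenient = optional(artists, homepages, "hp")
```

`strict` loses the artist without a homepage, which mirrors the missing rows in the all-required version of the query, while `lenient` keeps both artists and simply leaves `hp` absent for the second one.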
Querying alternatives

Find me everything about Hawaii:

    SELECT ?property ?hasValue ?isValueOf
    WHERE {
      { <http://dbpedia.org/resource/Hawaii> ?property ?hasValue }
      UNION
      { ?isValueOf ?property <http://dbpedia.org/resource/Hawaii> }
    }

The UNION keyword forms a disjunction of two graph patterns. Solutions to both sides of the UNION are included in the results.
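A UNION can be pictured as evaluating each side separately and concatenating the solutions, so each row binds only the variables of the side it came from. The sketch below does this over a small invented graph; the `ex:` names are placeholders.

```python
# Invented triples: one with the target as subject, one with it as object.
graph = [
    ("ex:Hawaii", "ex:capital", "ex:Honolulu"),
    ("ex:Honolulu", "ex:isPartOf", "ex:Hawaii"),
    ("ex:Texas", "ex:capital", "ex:Austin"),
]
target = "ex:Hawaii"

# Left side: { <target> ?property ?hasValue }
left = [{"property": p, "hasValue": o} for s, p, o in graph if s == target]
# Right side: { ?isValueOf ?property <target> }
right = [{"isValueOf": s, "property": p} for s, p, o in graph if o == target]

# UNION: solutions from both sides end up in the result.
results = left + right
```

In the result table, rows from the left side have no value in the `isValueOf` column and rows from the right side have none in `hasValue`.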
RDF Datasets

• We said earlier that SPARQL queries are executed against RDF datasets, consisting of RDF graphs.
• So far, all of our queries have been against a single graph. In SPARQL, this is known as the default graph.
• RDF datasets are composed of the default graph and zero or more named graphs, identified by a URI.
• Named graphs can be specified with one or more FROM NAMED clauses, or they can be hardwired into a particular SPARQL endpoint.
• The SPARQL GRAPH keyword allows portions of a query to match against the named graphs in the RDF dataset. Anything outside a GRAPH clause matches against the default graph.
RDF Datasets
Querying named graphs

Find me people who have been involved with at least three ISWC or ESWC conference events:

    SELECT DISTINCT ?person ?name
    WHERE {
      ?person foaf:name ?name .
      GRAPH ?g1 { ?person a foaf:Person }
      GRAPH ?g2 { ?person a foaf:Person }
      GRAPH ?g3 { ?person a foaf:Person }
      FILTER(?g1 != ?g2 && ?g1 != ?g3 && ?g2 != ?g3) .
    }

• N.B. The FILTER ensures that we're finding a person who occurs in three distinct graphs.
• N.B. The Web interface we use for this SPARQL query defines the foaf: prefix, which is why we omit it here.
• Try it with the data.semanticweb.org SPARQL endpoint.
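One way to picture the three GRAPH clauses is as three graph variables ranging over the named graphs, with the FILTER forcing them to be pairwise different. The sketch below models a dataset as a dictionary from graph name to a set of triples; the graph names and people are invented, and `itertools.permutations` plays the role of the inequality FILTER.

```python
from itertools import permutations

# Invented dataset: ex:ann appears in three graphs, ex:bob in two.
named_graphs = {
    "ex:iswc2008": {("ex:ann", "rdf:type", "foaf:Person"),
                    ("ex:bob", "rdf:type", "foaf:Person")},
    "ex:eswc2009": {("ex:ann", "rdf:type", "foaf:Person")},
    "ex:iswc2009": {("ex:ann", "rdf:type", "foaf:Person"),
                    ("ex:bob", "rdf:type", "foaf:Person")},
}


def person_triple(person):
    """The triple each GRAPH clause looks for."""
    return (person, "rdf:type", "foaf:Person")


candidates = {s for triples in named_graphs.values() for s, _, _ in triples}

# permutations() only yields pairwise distinct graph names, like the FILTER.
# Using a set for the result mirrors SELECT DISTINCT.
result = set()
for g1, g2, g3 in permutations(named_graphs, 3):
    for person in candidates:
        if all(person_triple(person) in named_graphs[g] for g in (g1, g2, g3)):
            result.add(person)
```

Without the distinctness requirement, a person present in a single graph would match with `g1`, `g2` and `g3` all bound to that same graph.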
Transforming between vocabularies

Convert FOAF data to VCard data:

    PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    CONSTRUCT {
      ?X vCard:FN ?name .
      ?X vCard:URL ?url .
      ?X vCard:TITLE ?title .
    }
    FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
    WHERE {
      OPTIONAL { ?X foaf:name ?name . FILTER isLiteral(?name) . }
      OPTIONAL { ?X foaf:homepage ?url . FILTER isURI(?url) . }
      OPTIONAL { ?X foaf:title ?title . FILTER isLiteral(?title) . }
    }

• The result RDF graph is created by taking the results of the equivalent SELECT query and filling in the values of variables that occur in the CONSTRUCT template.
• Triples are not created in the result graph for template patterns that involve an unbound variable.
• Try it with ARQ or OpenLink's Virtuoso.
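Filling in a CONSTRUCT template can be sketched as substituting each solution into every template triple and skipping any triple that would still contain an unbound variable. The `construct` function, the `ex:` subjects and the sample values below are invented for the illustration.

```python
template = [
    ("?X", "vCard:FN", "?name"),
    ("?X", "vCard:URL", "?url"),
]

# Invented solutions; ?url is unbound in the second one.
solutions = [
    {"?X": "ex:tim", "?name": "Tim", "?url": "http://example.org/tim"},
    {"?X": "ex:ann", "?name": "Ann"},
]


def construct(template, solutions):
    """Instantiate the template once per solution, skipping incomplete triples."""
    graph = set()
    for solution in solutions:
        for pattern in template:
            # Variables are replaced by their value, or None when unbound.
            triple = tuple(
                solution.get(term) if term.startswith("?") else term
                for term in pattern
            )
            if None in triple:
                continue  # an unbound variable means no triple is created
            graph.add(triple)
    return graph


result_graph = construct(template, solutions)
```

This is why the OPTIONAL clauses in the WHERE part are safe: a person with a name but no homepage still contributes a `vCard:FN` triple, just no `vCard:URL` triple.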
ASKing a question

Is the Amazon River longer than the Nile River?

    PREFIX prop: <http://dbpedia.org/property/>
    ASK {
      <http://dbpedia.org/resource/Amazon_River> prop:length ?amazon .
      <http://dbpedia.org/resource/Nile> prop:length ?nile .
      FILTER(?amazon > ?nile) .
    }

• As with SELECT queries, the boolean result is (by default) encoded in a SPARQL Results Format XML document.
• Shortcut: the WHERE keyword is optional, not only in ASK queries but in all SPARQL queries.
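For reference, a boolean answer in the SPARQL Query Results XML Format is a `sparql` element containing a `boolean` element. The sketch below parses such a document with the standard library; the response text is hand-written for the example and is not the actual answer from any endpoint, and `parse_ask` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample response, not real endpoint output.
response = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head/>
  <boolean>true</boolean>
</sparql>"""


def parse_ask(xml_text):
    """Extract the boolean from an ASK result document."""
    root = ET.fromstring(xml_text)
    return root.find("sr:boolean", NS).text.strip() == "true"


answer = parse_ask(response)
```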
Learning about a resource

Tell me whatever you'd like to tell me about the Amazon River:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    DESCRIBE ?amazon {
      ?amazon rdfs:label "Amazon River"@en .
    }

• Because the server is free to interpret DESCRIBE as it sees fit, DESCRIBE queries are not interoperable.
• Common implementations include concise-bounded descriptions, named graphs, minimum self-contained graphs, etc.
What's new in SPARQL 1.1

Learn about SPARQL 1.1 by David Beckett
