Introduction to Semantic Web Technologies
Ivan Herman, W3C
June 22nd, 2010
The Music site of the BBC
How to build such a site 1.
- Site editors roam the Web for new facts
  - they may discover further links while roaming
- They update the site manually
- And the site soon gets out of date
How to build such a site 2.
- Editors roam the Web for new data published on Web sites
- “Scrape” the sites with a program to extract the information
  - i.e., write some code to incorporate the new data
- The site easily gets out of date again…
How to build such a site 3.
- Editors roam the Web for new data via API-s
- Understand those…
  - input and output arguments, datatypes used, etc.
- Write some code to incorporate the new data
- The site easily gets out of date again…
The choice of the BBC
- Use external, public datasets
  - Wikipedia, MusicBrainz, …
- They are available as data
  - not API-s or hidden on a Web site
  - data can be extracted using, e.g., HTTP requests or standard queries
In short…
- Use the Web of Data as a Content Management System
- Use the community at large as content editors
And this is no secret…
Data on the Web
- There are more and more data on the Web
  - government data, health related data, general knowledge, company information, flight information, restaurants, …
- More and more applications rely on the availability of that data
But… data are often in isolation, in “silos”
Photo credit Alex (ajagendorf25), Flickr
Imagine…
- A “Web” where
  - documents are available for download on the Internet
  - but there would be no hyperlinks among them
And the problem is real…
Data on the Web is not enough…
- We need a proper infrastructure for a real Web of Data
  - data is available on the Web
    - accessible via standard Web technologies
  - data are interlinked over the Web
    - i.e., data can be integrated over the Web
- This is where Semantic Web technologies come in
A Web of Data unleashes new applications
A nice usage of UK government data
In what follows…
- We will use a simplistic example to introduce the main Semantic Web concepts
The rough structure of data integration
- Map the various data onto an abstract data representation
  - make the data independent of its internal representation…
- Merge the resulting representations
- Start making queries on the whole!
  - queries not possible on the individual data sets
We start with a book...
A simplified bookstore data (dataset “A”)
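1st: export your data as a set of relations
[Figure: dataset “A” as a graph]
- http://…isbn/000651409X has a:title “The Glass Palace”, a:year 2000, an a:publisher node, and an a:author node
- the publisher node has a:p_name “Harper Collins” and a:city “London”
- the author node has a:name “Ghosh, Amitav” and a:homepage http://www.amitavghosh.com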
Some notes on exporting the data
- Relations form a graph
  - the nodes refer to the “real” data or contain some literal
  - how the graph is represented in the machine is immaterial for now
Some notes on exporting the data
- Data export does not necessarily mean physical conversion of the data
  - relations can be generated on-the-fly at query time
    - via SQL “bridges”
    - scraping HTML pages
    - extracting data from Excel sheets
    - etc.
- One can export part of the data
Same book in French…
Another bookstore data (dataset “F”)
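2nd: export your second set of data
[Figure: dataset “F” as a graph]
- http://…isbn/2020386682 has f:titre “Le palais des miroirs”, f:original http://…isbn/000651409X, an f:auteur node, and an f:traducteur node
- the f:auteur node has f:nom “Ghosh, Amitav”
- the f:traducteur node has f:nom “Besse, Christianne”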
3rd: start merging your data
[Figure: the graphs of datasets “A” and “F” placed side by side]
3rd: start merging your data (cont)
[Figure: the same two graphs; http://…isbn/000651409X appears in both. Same URI!]
3rd: start merging your data
[Figure: the two graphs joined at the shared node http://…isbn/000651409X, which now carries the dataset “A” properties as well as the incoming f:original link from http://…isbn/2020386682]
Start making queries…
- User of data “F” can now ask queries like:
  - “give me the title of the original”
    - well, … « donne-moi le titre de l’original »
- This information is not in the dataset “F”…
- …but can be retrieved by merging with dataset “A”!
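The merge-and-query step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a graph is modelled as a set of (subject, property, object) tuples, the full example.org URIs and the blank-node labels are hypothetical stand-ins for the elided ones on the slides, and `objects` and `title_of_original` are helper names introduced here.

```python
# A graph is a set of (subject, property, object) tuples.
# The example.org URIs and "_:" labels are hypothetical stand-ins.
BOOK_EN = "http://example.org/isbn/000651409X"
BOOK_FR = "http://example.org/isbn/2020386682"

dataset_a = {
    (BOOK_EN, "a:title", "The Glass Palace"),
    (BOOK_EN, "a:year", "2000"),
    (BOOK_EN, "a:author", "_:a_author"),
    ("_:a_author", "a:name", "Ghosh, Amitav"),
    ("_:a_author", "a:homepage", "http://www.amitavghosh.com"),
}
dataset_f = {
    (BOOK_FR, "f:titre", "Le palais des miroirs"),
    (BOOK_FR, "f:original", BOOK_EN),
    (BOOK_FR, "f:auteur", "_:f_auteur"),
    ("_:f_auteur", "f:nom", "Ghosh, Amitav"),
}


def objects(graph, subject, prop):
    """Return every o such that (subject, prop, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == prop}


def title_of_original(graph, book):
    """Follow f:original from the book, then read a:title of the target."""
    return {title
            for original in objects(graph, book, "f:original")
            for title in objects(graph, original, "a:title")}


# Merging is a plain set union; the shared book URI joins the two graphs.
merged = dataset_a | dataset_f
```

On dataset “F” alone `title_of_original(dataset_f, BOOK_FR)` is empty, while on `merged` it yields the English title, because the identical URI is the only thing needed to connect the two graphs.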
However, more can be achieved…
- We “feel” that a:author and f:auteur should be the same
- But an automatic merge does not know that!
- Let us add some extra information to the merged data:
  - a:author same as f:auteur
  - both identify a “Person”
  - a term that a community may have already defined:
    - a “Person” is uniquely identified by his/her name and, say, homepage
    - it can be used as a “category” for certain types of resources
3rd revisited: use the extra knowledge
[Figure: the merged graph of datasets “A” and “F”, in which the person nodes reached via a:author, f:auteur and f:traducteur are now typed, through r:type, as http://…foaf/Person]
Start making richer queries!
- User of dataset “F” can now query:
  - “donne-moi la page d’accueil de l’auteur de l’original”
    - well… “give me the home page of the original’s ‘auteur’”
- The information is not in datasets “F” or “A”…
- …but was made available by:
  - merging datasets “A” and “F”
  - adding three simple extra statements as an extra “glue”
Combine with different datasets
- Using, e.g., the “Person”, the dataset can be combined with other sources
- For example, data in Wikipedia can be extracted using dedicated tools
  - e.g., the “dbpedia” project can extract the “infobox” information from Wikipedia already…
Merge with Wikipedia data
[Figure: the merged bookstore graph, with the author node linked via w:reference to http://dbpedia.org/../Amitav_Ghosh, which has a foaf:name and r:type http://…foaf/Person]
Merge with Wikipedia data
[Figure: as before, plus http://dbpedia.org/../The_Glass_Palace linked to the book via w:isbn, and w:author_of links from http://dbpedia.org/../Amitav_Ghosh to The_Glass_Palace, The_Hungry_Tide and The_Calcutta_Chromosome]
Merge with Wikipedia data
[Figure: as before, plus a w:born_in link from http://dbpedia.org/../Amitav_Ghosh to http://dbpedia.org/../Kolkata, which has w:lat and w:long values]
Is that surprising?
- It may look like it but, in fact, it should not be…
- What happened via automatic means is done every day by Web users!
- The difference: a bit of extra rigour so that machines could do this, too
It could become even more powerful
- We could add extra knowledge to the merged datasets
  - e.g., a full classification of various types of library data
  - geographical information
  - etc.
- This is where ontologies, extra rules, etc, come in
  - ontologies/rule sets can be relatively simple and small, or huge, or anything in between…
- Even more powerful queries can be asked as a result
What did we do?
[Figure: layered diagram. Data in various formats; Map, Expose, …; Data represented in abstract format; Manipulate, Query, …; Applications]
So where is the Semantic Web?
- The Semantic Web provides technologies to make such integration possible!
- Hopefully you get a full picture at the end of the tutorial…
The Basis: RDF
RDF triples
- Let us begin to formalize what we did!
  - we “connected” the data…
  - but a simple connection is not enough… data should be named somehow
  - hence the RDF Triples: a labelled connection between two resources
RDF triples (cont.)
- An RDF Triple (s,p,o) is such that:
  - “s”, “p” are URI-s, i.e., resources on the Web; “o” is a URI or a literal
  - “s”, “p”, and “o” stand for “subject”, “property”, and “object”
  - here is the complete triple:
    (<http://…isbn…6682>, <http://…/original>, <http://…isbn…409X>)
- RDF is a general model for such triples
  - with machine readable formats like RDF/XML, Turtle, N3, RDFa, …
RDF triples (cont.)
- Resources can use any URI
  - http://www.example.org/file.html#home
  - http://www.example.org/file2.xml#xpath(//q[@a=b])
  - http://www.example.org/form?a=b&c=d
- RDF triples form a directed, labeled graph (the best way to think about them!)
A simple RDF example (in RDF/XML)
[Figure: http://…isbn/2020386682 with f:titre “Le palais des miroirs” and f:original http://…isbn/000651409X]
    <rdf:Description rdf:about="http://…/isbn/2020386682">
      <f:titre xml:lang="fr">Le palais des miroirs</f:titre>
      <f:original rdf:resource="http://…/isbn/000651409X"/>
    </rdf:Description>
(Note: namespaces are used to simplify the URI-s)
A simple RDF example (in Turtle)
[Figure: http://…isbn/2020386682 with f:titre “Le palais des miroirs” and f:original http://…isbn/000651409X]
    <http://…/isbn/2020386682>
        f:titre "Le palais des miroirs"@fr ;
        f:original <http://…/isbn/000651409X> .
A simple RDF example (in RDFa)
[Figure: http://…isbn/2020386682 with f:titre “Le palais des miroirs” and f:original http://…isbn/000651409X]
    <p about="http://…/isbn/2020386682">The book entitled
      “<span property="f:titre" lang="fr">Le palais des miroirs</span>”
      is the French translation of the
      “<span rel="f:original" resource="http://…/isbn/000651409X">Glass Palace</span>”</p>
“Internal” nodes
- Consider the following statement:
  - “the publisher is a «thing» that has a name and an address”
- Until now, nodes were identified with a URI. But…
- …what is the URI of «thing»?
[Figure: http://…isbn/000651409X has a:publisher pointing to an unnamed node, which has a:p_name “Harper Collins” and a:city “London”]
One solution: create an extra URI
- The resource will be “visible” on the Web
  - care should be taken to define unique URI-s
    <rdf:Description rdf:about="http://…/isbn/000651409X">
      <a:publisher rdf:resource="urn:uuid:f60ffb40-307d-…"/>
    </rdf:Description>
    <rdf:Description rdf:about="urn:uuid:f60ffb40-307d-…">
      <a:p_name>Harper Collins</a:p_name>
      <a:city>London</a:city>
    </rdf:Description>
Internal identifier (“blank nodes”)
    <rdf:Description rdf:about="http://…/isbn/000651409X">
      <a:publisher rdf:nodeID="A234"/>
    </rdf:Description>
    <rdf:Description rdf:nodeID="A234">
      <a:p_name>Harper Collins</a:p_name>
      <a:city>London</a:city>
    </rdf:Description>

    <http://…/isbn/000651409X> a:publisher _:A234.
    _:A234 a:p_name "Harper Collins".
- Internal = these resources are not visible outside
[Figure: http://…isbn/000651409X has a:publisher pointing to a blank node, which has a:p_name “Harper Collins” and a:city “London”]
Blank nodes: the system can do it
- Let the system create a “nodeID” internally (you do not really care about the name…)
    <http://…/isbn/000651409X> a:publisher [a:p_name "Harper Collins"; …].
[Figure: http://…isbn/000651409X has a:publisher pointing to a blank node, which has a:p_name “Harper Collins” and a:city “London”]
Blank nodes when merging
- Blank nodes require attention when merging
  - blank nodes with identical nodeID-s in different graphs are different
  - implementations must be careful…
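Why blank nodes need care during a merge can be shown with a small Python sketch. It assumes graphs are sets of (subject, property, object) string tuples and that any term starting with `_:` is a blank node (a simplification: a literal that happens to start with `_:` would be renamed too). The function names, the example.org URIs and the publisher name "Example Press" are hypothetical.

```python
def rename_blank_nodes(graph, tag):
    """Give every blank node ("_:" prefix) a label that is unique to this graph."""
    def fix(term):
        return f"_:{tag}_{term[2:]}" if term.startswith("_:") else term
    return {(fix(s), p, fix(o)) for (s, p, o) in graph}


def merge(*graphs):
    """Union of the graphs after making their blank-node labels disjoint."""
    merged = set()
    for index, graph in enumerate(graphs):
        merged |= rename_blank_nodes(graph, f"g{index}")
    return merged


# Two independent graphs that happen to reuse the same nodeID.
graph_1 = {("http://example.org/book1", "a:publisher", "_:A234"),
           ("_:A234", "a:p_name", "Harper Collins")}
graph_2 = {("http://example.org/book2", "a:publisher", "_:A234"),
           ("_:A234", "a:p_name", "Example Press")}

naive = graph_1 | graph_2          # wrongly fuses the two publishers
careful = merge(graph_1, graph_2)  # keeps them apart
```

In `naive` the single node `_:A234` ends up with two publisher names; in `careful` each book keeps its own publisher node.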
RDF in programming practice
- For example, using Java+Jena (HP’s Bristol Lab):
  - a “Model” object is created
  - the RDF file is parsed and results stored in the Model
  - the Model offers methods to retrieve:
    - triples
    - (property,object) pairs for a specific subject
    - (subject,property) pairs for a specific object
    - etc.
  - the rest is conventional programming…
- Similar tools exist in Python, PHP, etc.
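Jena example
    // Jena's model API (older releases used the com.hp.hpl.jena.rdf.model package)
    import org.apache.jena.rdf.model.*;

    // create a model
    Model model = ModelFactory.createDefaultModel();
    Resource subject = model.createResource("URI_of_Subject");
    // 'in' refers to the input stream of the RDF file; no base URI is given
    model.read(in, null);
    // list every statement that has 'subject' as its subject
    StmtIterator iter = model.listStatements(subject, null, (RDFNode) null);
    while (iter.hasNext()) {
        Statement st = iter.nextStatement();
        Property p = st.getPredicate();
        RDFNode o = st.getObject();
        do_something(p, o);
    }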
Merge in practice
- Environments merge graphs automatically
  - e.g., in Jena, the Model can load several files
  - the load merges the new statements automatically
    - merge takes care of blank node issues, too
Another relatively simple application
- Goal: reuse of older experimental data
- Keep data in databases or XML, just export key “facts” as RDF
- Use a faceted browser to visualize and interact with the result
Courtesy of Nigel Wilkinson, Lee Harland, Pfizer Ltd, Melliyal Annamalai, Oracle (SWEO Case Study)
One level higher up (RDFS, Datatypes)
Need for RDF schemas
- First step towards the “extra knowledge”:
  - define the terms we can use
  - what restrictions apply
  - what extra relationships are there?
- Officially: “RDF Vocabulary Description Language”
  - the term “Schema” is retained for historical reasons…
Classes, resources, …
- Think of well known traditional vocabularies:
  - use the term “novel”
  - “every novel is a fiction”
  - “«The Glass Palace» is a novel”
  - etc.
- RDFS defines resources and classes:
  - everything in RDF is a “resource”
  - “classes” are also resources, but…
  - …they are also a collection of possible resources (i.e., “individuals”)
    - “fiction”, “novel”, …
Classes, resources, … (cont.)
- Relationships are defined among resources:
  - “typing”: an individual belongs to a specific class
    - “«The Glass Palace» is a novel”
    - to be more precise: “«http://.../000651409X» is a novel”
  - “subclassing”: all instances of one are also the instances of the other (“every novel is a fiction”)
- RDFS formalizes these notions in RDF
Classes, resources in RDF(S)
[Figure: http://…isbn/000651409X has rdf:type #Novel; #Novel has rdf:type rdfs:Class]
- RDFS defines the meaning of these terms
  - (these are all special URI-s, we just use the namespace abbreviation)
Inferred properties
[Figure: http://…isbn/000651409X has rdf:type #Novel; #Novel has rdfs:subClassOf #Fiction; the inferred rdf:type link goes from the book to #Fiction]
- (<http://…/isbn/000651409X> rdf:type #Fiction)
  - is not in the original RDF data…
  - …but can be inferred from the RDFS rules
- RDFS environments return that triple, too
Inference: let us be formal…
- The RDF Semantics document has a list of (33) entailment rules:
  - “if such and such triples are in the graph, add this and this”
  - do that recursively until the graph does not change
- The relevant rule for our example:
    If:
      uuu rdfs:subClassOf xxx .
      vvv rdf:type uuu .
    Then add:
      vvv rdf:type xxx .
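The “apply the rule until the graph does not change” procedure can be sketched as a small fixpoint loop in Python. This is a minimal illustration of the one rule quoted above, not of the full RDFS rule set; triples are plain string tuples, and the `:Literature` class and the `ex:` book identifier are hypothetical additions used to show that the rule fires repeatedly.

```python
def subclass_closure(graph):
    """Apply 'u subClassOf x, v type u => v type x' until nothing new appears."""
    graph = set(graph)
    while True:
        inferred = {
            (v, "rdf:type", x)
            for (u, p1, x) in graph if p1 == "rdfs:subClassOf"
            for (v, p2, u2) in graph if p2 == "rdf:type" and u2 == u
        }
        if inferred <= graph:   # fixpoint: the graph no longer changes
            return graph
        graph |= inferred


data = {
    (":Novel", "rdfs:subClassOf", ":Fiction"),
    (":Fiction", "rdfs:subClassOf", ":Literature"),   # hypothetical extra class
    ("ex:000651409X", "rdf:type", ":Novel"),
}
closed = subclass_closure(data)
```

The first pass adds the `:Fiction` typing, the second pass uses that new triple to add `:Literature`, and the third pass finds nothing new and stops.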
Properties
- Property is a special class (rdf:Property)
  - properties are also resources identified by URI-s
- There is also a possibility for a “sub-property”
  - all resources bound by the “sub” are also bound by the other
- Range and domain of properties can be specified
  - i.e., what type of resources serve as object and subject
Example for property characterization
    :title rdf:type rdf:Property;
           rdfs:domain :Fiction;
           rdfs:range rdfs:Literal.
What does this mean?
- Again, new relations can be deduced. Indeed, if
    :title rdf:type rdf:Property;
           rdfs:domain :Fiction;
           rdfs:range rdfs:Literal.
    <http://…/isbn/000651409X> :title "The Glass Palace" .
- then the system can infer that:
    <http://…/isbn/000651409X> rdf:type :Fiction .
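The domain inference above can be mimicked with a one-pass Python sketch. It covers only the rdfs:domain rule (the range rule is left out because the object here is a literal); the tuple representation and the `ex:` identifier are assumptions made for illustration.

```python
def apply_domain_rule(graph):
    """'p rdfs:domain c' and 's p o' together imply 's rdf:type c'."""
    domains = {(p, c) for (p, q, c) in graph if q == "rdfs:domain"}
    inferred = {(s, "rdf:type", c)
                for (s, p, o) in graph
                for (dp, c) in domains if p == dp}
    return set(graph) | inferred


data = {
    (":title", "rdf:type", "rdf:Property"),
    (":title", "rdfs:domain", ":Fiction"),
    (":title", "rdfs:range", "rdfs:Literal"),
    ("ex:000651409X", ":title", "The Glass Palace"),
}
result = apply_domain_rule(data)
```

Merely using `:title` on the book is enough for the system to classify the book as `:Fiction`.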
Literals
- Literals may have a data type
  - floats, integers, booleans, etc, defined in XML Schemas
  - full XML fragments
- (Natural) language can also be specified
Examples for datatypes
    <http://…/isbn/000651409X>
        :page_number "543"^^xsd:integer ;
        :publ_date "2000"^^xsd:gYear ;
        :price "6.99"^^xsd:float .
A bit of RDFS can take you far…
- Remember the power of merge?
- We could have used, in our example:
  - f:auteur is a subproperty of a:author and vice versa
    - (although we will see other ways to do that…)
- Of course, in some cases, more complex knowledge is necessary (see later…)
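How two mutual sub-property statements make f:auteur and a:author interchangeable can be sketched in Python. The sketch implements only the RDFS sub-property rule on string tuples; the function name, the `ex:` book identifiers and the blank-node labels are hypothetical.

```python
def apply_subproperty_rule(graph):
    """'p rdfs:subPropertyOf q' and 's p o' imply 's q o'; repeat to a fixpoint."""
    graph = set(graph)
    while True:
        inferred = {(s, q, o)
                    for (p, r, q) in graph if r == "rdfs:subPropertyOf"
                    for (s, p2, o) in graph if p2 == p}
        if inferred <= graph:
            return graph
        graph |= inferred


data = {
    ("f:auteur", "rdfs:subPropertyOf", "a:author"),
    ("a:author", "rdfs:subPropertyOf", "f:auteur"),
    ("ex:2020386682", "f:auteur", "_:ghosh_f"),
    ("ex:000651409X", "a:author", "_:ghosh_a"),
}
result = apply_subproperty_rule(data)
```

After the rule has run, a query phrased with either property finds the authors of both books.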
Find the right experts at NASA
- Expertise locator for nearly 70,000 NASA civil servants
- Integrates 6 or 7 geographically distributed databases, …
Michael Grove, Clark & Parsia, LLC, and Andrew Schain, NASA (SWEO Case Study)
How to get and create RDF Data?
Simple approach
- Write RDF/XML, RDFa, or Turtle “manually”
- In some cases that is necessary, but it really does not scale…
RDF with XHTML
- Obviously, a huge source of information
- By adding some “meta” information, the same source can be reused for, e.g., data integration, better mashups, etc
  - typical example: your personal information, like address, should be readable for humans and processable by machines
RDF with XML/(X)HTML (cont)
- Two solutions have emerged:
  - use microformats and convert the content into RDF
    - XSLT is the favorite approach
  - add RDF-like statements directly into XHTML via RDFa
Bridge to relational databases
- Data on the Web are mostly stored in databases
- “Bridges” are being defined:
  - a layer between RDF and the relational data
    - RDB tables are “mapped” to RDF graphs, possibly on the fly
    - different mapping approaches are being used
  - a number of RDB systems offer this facility already (e.g., Oracle, OpenLink, …)
- W3C is working on a standard in this area
Linked Open Data
Linked Open Data Project
- Goal: “expose” open datasets in RDF
- Set RDF links among the data items from different datasets
- Set up, if possible, query endpoints
Example data source: DBpedia
- DBpedia is a community effort to
  - extract structured (“infobox”) information from Wikipedia
  - provide a query endpoint to the dataset
  - interlink the DBpedia dataset with other datasets on the Web
Extracting structured data from Wikipedia
    @prefix dbpedia: <http://dbpedia.org/resource/> .
    @prefix dbterm: <http://dbpedia.org/property/> .
    dbpedia:Amsterdam
        dbterm:officialName "Amsterdam" ;
        dbterm:longd "4" ;
        dbterm:longm "53" ;
        dbterm:longs "32" ;
        dbterm:leaderName dbpedia:Lodewijk_Asscher ;
        ...
        dbterm:areaTotalKm "219" ;
        ...
    dbpedia:ABN_AMRO
        dbterm:location dbpedia:Amsterdam ;
        ...
Automatic links among open datasets
    <http://dbpedia.org/resource/Amsterdam>
        owl:sameAs <http://rdf.freebase.com/ns/...> ;
        owl:sameAs <http://sws.geonames.org/2759793> ;
        ...
    <http://sws.geonames.org/2759793>
        owl:sameAs <http://dbpedia.org/resource/Amsterdam> ;
        wgs84_pos:lat "52.3666667" ;
        wgs84_pos:long "4.8833333" ;
        geo:inCountry <http://www.geonames.org/countries/#NL> ;
        ...
- Processors can switch automatically from one to the other…
The LOD “cloud”, June 2009
Remember the BBC example?
NYT articles on university alumni
Query RDF Data (SPARQL)
Querying RDF graphs
- Remember the Jena idiom:
    StmtIterator iter = model.listStatements(subject, null, (RDFNode) null);
    while (iter.hasNext()) {
        Statement st = iter.nextStatement();
        Property p = st.getPredicate();
        RDFNode o = st.getObject();
        do_something(p, o);
    }
- In practice, more complex queries into the RDF data are necessary
  - something like “give me (a,b) pairs for which there is an x such that (x parent a) and (b brother x) holds” (i.e., return the uncles)
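The goal of SPARQL (Query Language for RDF)
- Analyze the Jena example
    StmtIterator iter = model.listStatements(subject, null, (RDFNode) null);
    while (iter.hasNext()) {
        Statement st = iter.nextStatement();
        Property p = st.getPredicate();
        RDFNode o = st.getObject();
        do_something(p, o);
    }
[Figure: the node “subject” with several outgoing arcs labelled ?p, each leading to a node ?o]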
General: graph patterns
- The fundamental idea: use graph patterns
  - the pattern contains unbound symbols
  - by binding the symbols, subgraphs of the RDF graph are selected
  - if there is such a selection, the query returns bound resources
Our Jena example in SPARQL
    SELECT ?p ?o
    WHERE { subject ?p ?o }
- The triples in WHERE define the graph pattern, with ?p and ?o “unbound” symbols
- The query returns all p,o pairs
[Figure: the node “subject” with several outgoing arcs labelled ?p, each leading to a node ?o]
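The idea of binding unbound symbols against a graph can be illustrated with a toy pattern matcher in Python. It is only a sketch of the principle, not of how a SPARQL engine is built: terms are plain strings, anything starting with `?` is treated as a variable, and `match`, `select` and the `ex:` names are hypothetical.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None if impossible."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier, if any.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(graph, patterns):
    """Return every set of bindings that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [extended
                     for bindings in solutions
                     for triple in graph
                     if (extended := match(pattern, triple, bindings)) is not None]
    return solutions


graph = {
    ("ex:subject", "ex:p1", "ex:o1"),
    ("ex:subject", "ex:p2", "ex:o2"),
    ("ex:other", "ex:p1", "ex:o3"),
}
# Counterpart of: SELECT ?p ?o WHERE { ex:subject ?p ?o }
rows = select(graph, [("ex:subject", "?p", "?o")])
pairs = {(row["?p"], row["?o"]) for row in rows}
```

Passing several patterns that share a variable performs the join that the Jena loop would otherwise need hand-written code for.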
Simple SPARQL example
    SELECT ?isbn ?price ?currency   # note: not ?x!
    WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency. }
[Figure: data graph. http://…isbn/000651409X has two a:price nodes (rdf:value 33 with p:currency :£, and rdf:value 50 with p:currency :€); http://…isbn/2020386682 has two a:price nodes (rdf:value 60 with p:currency :€, and rdf:value 78 with p:currency :$); both books have an a:author node with a:name “Ghosh, Amitav”]
- Returns: [<…409X>,33,:£], [<…409X>,50,:€], [<…6682>,60,:€], [<…6682>,78,:$]
Pattern constraints
    SELECT ?isbn ?price ?currency   # note: not ?x!
    WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.
            FILTER(?currency = :€) }
[Figure: the same data graph as in the simple SPARQL example]
- Returns: [<…409X>,50,:€], [<…6682>,60,:€]
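The effect of the three joined patterns and of the FILTER can be reproduced in plain Python. This is a sketch under assumptions: the price data from the figure is written as tuples, the `isbn:` identifiers and the `_:price…` labels are hypothetical, and the comprehension is a hand-written join rather than a real query engine.

```python
graph = {
    ("isbn:000651409X", "a:price", "_:price1"),
    ("_:price1", "rdf:value", 33), ("_:price1", "p:currency", ":£"),
    ("isbn:000651409X", "a:price", "_:price2"),
    ("_:price2", "rdf:value", 50), ("_:price2", "p:currency", ":€"),
    ("isbn:2020386682", "a:price", "_:price3"),
    ("_:price3", "rdf:value", 60), ("_:price3", "p:currency", ":€"),
    ("isbn:2020386682", "a:price", "_:price4"),
    ("_:price4", "rdf:value", 78), ("_:price4", "p:currency", ":$"),
}

# The three triple patterns share ?x, so they are joined on the price node.
rows = [(isbn, price, currency)
        for (isbn, p1, x) in graph if p1 == "a:price"
        for (x2, p2, price) in graph if p2 == "rdf:value" and x2 == x
        for (x3, p3, currency) in graph if p3 == "p:currency" and x3 == x]

# FILTER(?currency = :€) keeps only the solutions that pass the test.
euro_rows = sorted(row for row in rows if row[2] == ":€")
```

The intermediate node `?x` takes part in the join but is not returned, just as in the SELECT clause of the query.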
Many extra SPARQL features
- Limit the number of returned results; remove duplicates, sort them, …
- Optional branches: if some part of the pattern does not match, ignore it
- Specify several data sources (via URI-s) within the query (essentially, a merge on-the-fly!)
- Construct a graph using a separate pattern on the query results
- In SPARQL 1.1: updating data, not only query
SPARQL usage in practice
- SPARQL is usually used over the network
  - separate documents define the protocol and the result format
    - SPARQL Protocol for RDF with HTTP and SOAP bindings
    - SPARQL results in XML or JSON formats
- Big datasets often offer “SPARQL endpoints” using this protocol
  - typical example: SPARQL endpoint to DBpedia
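What a protocol request to an endpoint looks like can be sketched with the Python standard library. The sketch only builds the HTTP request and does not send it; the endpoint address is an assumption, and the query is a hypothetical example. The `query` parameter and the JSON results media type come from the SPARQL protocol and results documents.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed address of the DBpedia endpoint

query = """
SELECT ?p ?o
WHERE { <http://dbpedia.org/resource/Amsterdam> ?p ?o }
LIMIT 10
"""

# The protocol carries the query text in the 'query' parameter of a GET request.
url = ENDPOINT + "?" + urlencode({"query": query})

# Ask for results in the SPARQL JSON format; the request is built, not sent.
request = Request(url, headers={"Accept": "application/sparql-results+json"})
```

Sending the request (for instance with `urllib.request.urlopen`) needs network access; the response would then carry the variable bindings in the requested result format.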
SPARQL as a unifying point
[Figure: architecture diagram. Labels: Application; SPARQL Processor; SPARQL Endpoint (twice); SPARQL Construct (twice); Database; Triple store; RDF Graph; Relational Database, reached through SQL and RDF; HTML and XML/XHTML, reached through RDFa and GRDDL; Unstructured Text, reached through NLP Techniques]
Integrate knowledge for Chinese Medicine
- Integration of a large number of TCM databases
  - around 80 databases, around 200,000 records each
Courtesy of Huajun Chen, Zhejiang University (SWEO Case Study)
Vocabularies
Vocabularies
- Data integration needs agreements on
  - terms: “translator”, “author”
  - categories used: “Person”, “literature”
  - relationships among those: “an author is also a Person…”, “historical fiction is a narrower term than fiction”
    - i.e., new relationships can be deduced
Vocabularies
- There is a need for “languages”
  - to define such vocabularies
  - to assign clear “semantics” on how new relationships can be deduced
But what about RDFS?
- Indeed RDFS is such a framework:
  - there is typing, subtyping
  - properties can be put in a hierarchy
  - datatypes can be defined
- RDFS is enough for many vocabularies
- But not for all!
Three technologies have emerged
- To re-use thesauri, glossaries, etc: SKOS
- To define more complex vocabularies with a strong logical underpinning: OWL
- Generic framework to define rules on terms and data: RIF
Using thesauri, glossaries (SKOS)
SKOS
- Represent and share classifications, glossaries, thesauri, etc
  - for example:
    - Dewey Decimal Classification, Art and Architecture Thesaurus, ACM classification of keywords and terms…
    - classification/formalization of Web 2.0 type tags
- Define classes and properties to add those structures to an RDF universe
  - allow for a quick port of this traditional data, combine it with other data
Example: the term “Fiction”, as defined by the Library of Congress
Thesauri have identical structures…
- The structure of the LOC page is fairly typical
  - label, alternate label, narrower, broader, …
  - there is even an ISO standard for such structures
- SKOS provides a basic structure to create an RDF representation of these
LOC’s “Fiction” in SKOS/RDF
[Figure: the concept http://id.loc.gov/…#concept has rdf:type skos:Concept, skos:prefLabel “Fiction”, skos:altLabel “Metafiction” and “Novels”, a skos:broader concept with skos:prefLabel “Literature”, and skos:narrower concepts with skos:prefLabel “Allegories” and “Adventure stories”]
Usage of the LOC graph
[Figure: the book http://…/isbn/… has dc:title “The Glass Palace” and dc:subject pointing to a skos:Concept with skos:prefLabel “Historical Fiction”, which has skos:broader pointing to the concept with skos:prefLabel “Fiction”]
Importance of SKOS
- SKOS provides a simple bridge between the “print world” and the (Semantic) Web
- Thesauri, glossaries, etc, from the library community can be made available
  - LOC is a good example
- SKOS can also be used to organize tags, annotate other vocabularies, …
Importance of SKOS
- Anybody in the World can refer to common concepts
  - they mean the same for everybody
- Applications may exploit the relationships among concepts
  - e.g., SPARQL queries may be issued on the merge of the library data and the LOC terms
Semantic portal for art collections
Courtesy of Jacco van Ossenbruggen, CWI, and Guus Schreiber, VU Amsterdam
Ontologies (OWL)
SKOS is not enough…
- SKOS may be used to provide simple vocabularies
- But it is not a complete solution
  - it concentrates on the concepts only
  - no characterization of properties in general
  - simple from a logical perspective
    - i.e., few inferences are possible
Application may want more…
- Complex applications may want more possibilities:
  - characterization of properties
  - identification of objects with different URI-s
  - disjointness or equivalence of classes
  - construct classes, not only name them
  - more complex classification schemes
  - can a program reason about some terms? E.g.:
    - “if «Person» resources «A» and «B» have the same «foaf:email» property, then «A» and «B» are identical”
  - etc.
Web Ontology Language = OWL
- OWL is an extra layer, a bit like RDFS or SKOS
  - own namespace, own terms
  - it relies on RDF Schemas
- It is a separate recommendation
  - actually… there is a 2004 version of OWL (“OWL 1”)
  - and there is an update (“OWL 2”) published in 2009
OWL is complex…
- OWL is a large set of additional terms
- We will not cover the whole thing here…
Term equivalences
- For classes:
  - owl:equivalentClass: two classes have the same individuals
  - owl:disjointWith: no individuals in common
- For properties:
  - owl:equivalentProperty
    - remember the a:author vs. f:auteur?
  - owl:propertyDisjointWith
Term equivalences
- For individuals:
  - owl:sameAs: two URIs refer to the same concept (“individual”)
  - owl:differentFrom: negation of owl:sameAs
Other example: connecting to French
[Figure: a:Novel owl:equivalentClass f:Roman; a:author owl:equivalentProperty f:auteur]
Typical usage of owl:sameAs
- Linking our example of Amsterdam from one data set (DBpedia) to the other (Geonames):
    <http://dbpedia.org/resource/Amsterdam> owl:sameAs <http://sws.geonames.org/2759793> .
- This is the main mechanism of “Linking” in the Linked Open Data project
Property characterization
- In OWL, one can characterize the behavior of properties (symmetric, transitive, functional, reflexive, inverse functional…)
- One property can be defined as the “inverse” of another
What this means is…
- If the following holds in our triples:
    :email rdf:type owl:InverseFunctionalProperty.
    <A> :email "mailto:a@b.c".
    <B> :email "mailto:a@b.c".
- then, processed through OWL, the following holds, too:
    <A> owl:sameAs <B>.
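The inverse functional inference can be sketched in Python as a grouping step: subjects that share a value of such a property are declared the same. This is an illustration on string tuples, not an OWL reasoner; the function name and the extra individual `C` are hypothetical.

```python
from itertools import combinations


def infer_same_as(graph):
    """Subjects sharing a value of an inverse functional property are identical."""
    ifps = {s for (s, p, o) in graph
            if p == "rdf:type" and o == "owl:InverseFunctionalProperty"}
    subjects_by_value = {}
    for (s, p, o) in graph:
        if p in ifps:
            subjects_by_value.setdefault((p, o), set()).add(s)
    return {(a, "owl:sameAs", b)
            for subjects in subjects_by_value.values()
            for a, b in combinations(sorted(subjects), 2)}


data = {
    (":email", "rdf:type", "owl:InverseFunctionalProperty"),
    ("A", ":email", "mailto:a@b.c"),
    ("B", ":email", "mailto:a@b.c"),
    ("C", ":email", "mailto:c@b.c"),   # hypothetical third individual
}
same = infer_same_as(data)
```

Only `A` and `B` are linked; `C` has a different address and stays a separate individual.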
Keys
- Inverse functional properties are important for identification of individuals
  - think of the email example
- But… identification based on one property may not be enough
Keys
- “if two persons have the same emails and the same homepages then they are identical”
- Identification is based on the identical values of two properties
- The rule applies to persons only
Previous rule in OWL
    :Person rdf:type owl:Class;
            owl:hasKey (:email :homepage) .
What it means is…
- If:
    <A> rdf:type :Person ;
        :email "mailto:a@b.c";
        :homepage "http://www.ex.org".
    <B> rdf:type :Person ;
        :email "mailto:a@b.c";
        :homepage "http://www.ex.org".
- then, processed through OWL, the following holds, too:
    <A> owl:sameAs <B>.
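A key over two properties can be sketched in Python along the same lines. The sketch simplifies the OWL semantics by assuming at most one value per key property and by requiring every key property to be present; the function name and the individuals `C` and `D` are hypothetical.

```python
from itertools import combinations


def infer_same_as_by_key(graph, cls, key_properties):
    """Members of cls that agree on every key property are declared identical."""
    def value(subject, prop):
        # Simplification: at most one value per property is assumed.
        return next((o for (s, p, o) in graph if s == subject and p == prop), None)

    members_by_key = {}
    for (s, p, o) in graph:
        if p == "rdf:type" and o == cls:
            key = tuple(value(s, prop) for prop in key_properties)
            if None not in key:   # a missing key value identifies nobody
                members_by_key.setdefault(key, set()).add(s)
    return {(a, "owl:sameAs", b)
            for members in members_by_key.values()
            for a, b in combinations(sorted(members), 2)}


data = {
    ("A", "rdf:type", ":Person"), ("A", ":email", "mailto:a@b.c"),
    ("A", ":homepage", "http://www.ex.org"),
    ("B", "rdf:type", ":Person"), ("B", ":email", "mailto:a@b.c"),
    ("B", ":homepage", "http://www.ex.org"),
    # Same email but another homepage: the key does not match.
    ("C", "rdf:type", ":Person"), ("C", ":email", "mailto:a@b.c"),
    ("C", ":homepage", "http://other.example.org"),
    # Same values but not a :Person: the key does not apply.
    ("D", ":email", "mailto:a@b.c"), ("D", ":homepage", "http://www.ex.org"),
}
same = infer_same_as_by_key(data, ":Person", (":email", ":homepage"))
```

Both restrictions of the slide show up here: the match needs both properties, and it is limited to members of the class that carries the key.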
Classes in OWL
- In RDFS, you can subclass existing classes… that’s all
- In OWL, you can construct classes from existing ones:
  - enumerate its content
  - through intersection, union, complement
  - etc
Enumerate class content
    :Currency rdf:type owl:Class;
              owl:oneOf (:€ :£ :$).
- I.e., the class consists of exactly those individuals and nothing else
Union of classes
    :Novel rdf:type owl:Class.
    :Short_Story rdf:type owl:Class.
    :Poetry rdf:type owl:Class.
    :Literature rdf:type owl:Class;
                owl:unionOf (:Novel :Short_Story :Poetry).
- Other possibilities: owl:complementOf, owl:intersectionOf, …
For example…
- If:
    :Novel rdf:type owl:Class.
    :Short_Story rdf:type owl:Class.
    :Poetry rdf:type owl:Class.
    :Literature rdf:type owl:Class;
                owl:unionOf (:Novel :Short_Story :Poetry).
    <myWork> rdf:type :Novel .
- then the following holds, too:
    <myWork> rdf:type :Literature .
It can be a bit more complicated…
- If:
    :Novel rdf:type owl:Class.
    :Short_Story rdf:type owl:Class.
    :Poetry rdf:type owl:Class.
    :Literature rdf:type owl:Class;
                owl:unionOf (:Novel :Short_Story :Poetry).
    fr:Roman owl:equivalentClass :Novel .
    <myWork> rdf:type fr:Roman .
- then, through the combination of different terms, the following still holds:
    <myWork> rdf:type :Literature .
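The way the two terms combine can be sketched as a small fixpoint computation in Python. It covers only two inferences (type propagation across owl:equivalentClass, and membership in a union through one of its parts); the union is stored as a Python tuple rather than an RDF list, which is a simplification, and the function name is hypothetical.

```python
def infer_types(graph):
    """Propagate rdf:type across owl:equivalentClass and into owl:unionOf classes."""
    graph = set(graph)
    while True:
        typed = [(x, k) for (x, q, k) in graph if q == "rdf:type"]
        inferred = set()
        for (c, p, d) in graph:
            if p == "owl:equivalentClass":
                # Equivalence works in both directions.
                inferred |= {(x, "rdf:type", d) for (x, k) in typed if k == c}
                inferred |= {(x, "rdf:type", c) for (x, k) in typed if k == d}
            elif p == "owl:unionOf":
                # A member of any part is a member of the union.
                inferred |= {(x, "rdf:type", c) for (x, k) in typed if k in d}
        if inferred <= graph:
            return graph
        graph |= inferred


data = {
    (":Literature", "owl:unionOf", (":Novel", ":Short_Story", ":Poetry")),
    ("fr:Roman", "owl:equivalentClass", ":Novel"),
    ("myWork", "rdf:type", "fr:Roman"),
}
result = infer_types(data)
```

The first pass derives that `myWork` is a `:Novel`; only then can the second pass place it in `:Literature`, which is why the rules are applied repeatedly.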
What we have so far…
- The OWL features listed so far are already fairly powerful
- E.g., various databases can be linked via owl:sameAs, functional or inverse functional properties, etc.
- Many inferred relationships can be found using a traditional rule engine
However… that may not be enough
- Very large vocabularies might require even more complex features
  - some major issues
    - the way classes (i.e., “concepts”) are defined
    - handling of datatypes like intervals
- OWL includes those extra features but… the inference engines become (much) more complex
Example: property value restrictions
- New classes are created by restricting the property values on a class
- For example: how would I characterize a “listed price”?
  - it is a price that is given in one of the “allowed” currencies (€, £, or $)
  - this defines a new class
But: OWL is hard!
- The combination of class constructions with various restrictions is extremely powerful
- What we have so far follows the same logic as before
  - extend the basic RDF and RDFS possibilities with new features
  - define their semantics, i.e., what they “mean” in terms of relationships
  - expect to infer new relationships based on those
- However… a full inference procedure is hard
  - not implementable with simple rule engines, for example
OWL “species” or profiles
- OWL species come to the fore:
  - restricting which terms can be used and under what circumstances (restrictions)
  - if one abides by those restrictions, then simpler inference engines can be used
- They reflect compromises: expressiveness vs. implementability
OWL Species
[Figure: diagram of the OWL species: OWL Full, OWL DL, OWL RL, OWL EL, OWL QL]
OWL RL
- Goal: to be implementable with rule engines
- Usage follows a similar approach to RDFS:
  - merge the ontology and the instance data into an RDF graph
  - use the rule engine to add new triples (as long as it is possible)
What can be done in OWL RL?
- Many features are available:
  - identity of classes, instances, properties
  - subproperties, subclasses, domains, ranges
  - union and intersection of classes (but with some restrictions)
  - property characterizations (functional, symmetric, etc)
  - property chains
  - keys
  - some property restrictions
- All examples so far could be inferred with OWL RL!
Improved Search via Ontology (GoPubMed)
- Search results are re-ranked using ontologies
- Related terms are highlighted
Improved Search via Ontology (Go3R)
- Same dataset, different ontology
  - (ontology is on non-animal experimentation)
Rules (RIF)
Why rules on the Semantic Web?Some conditions may be complicated in ontologies (ie, OWL)eg, Horn rules: (P1 & P2 & …) -> CIn many cases applications just want 2-3 rules to complete integrationIe, rules may be an alternative to (OWL based) ontologies
Things you may want to express
  An example from our bookshop integration:
    “I buy a novel with over 500 pages if it costs less than €20”
    something like (in an ad-hoc syntax):
      { ?x rdf:type p:Novel;
           p:page_number ?n;
           p:price [ p:currency :€; rdf:value ?z ].
        ?n > "500"^^xsd:integer.
        ?z < "20.0"^^xsd:double. }
      => { <me> p:buys ?x }
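Things you may want to express (cont.)
  (Diagram: the same rule drawn as a graph pattern. ?x has rdf:type p:Novel, p:page_number ?n with ?n > 500, and a p:price node with p:currency :€ and rdf:value ?z with ?z < 20; the conclusion is me p:buys ?x)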
RIF (Rule Interchange Format)
  The goals of the RIF work:
    define simple rule language(s) for the (Semantic) Web
    define interchange formats for rule based systems
  RIF defines several “dialects” of languages
  RIF is not bound to RDF only
    eg, relationships may involve more than 2 entities
    there are dialects for production rule systems
RIF Core
  The simplest RIF “dialect”
  A Core document consists of:
    directives like import, prefix settings for URI-s, etc
    a sequence of logical implications
RIF Core example
  Document(
    Prefix(cpt http://example.com/concepts#)
    Prefix(person http://example.com/people#)
    Prefix(isbn http://…/isbn/)
    Group (
      Forall ?Buyer ?Book ?Seller (
        cpt:buy(?Buyer ?Book ?Seller) :- cpt:sell(?Seller ?Book ?Buyer)
      )
      cpt:sell(person:John isbn:000651409X person:Mary)
    )
  )
  This infers the following relationship:
    cpt:buy(person:Mary isbn:000651409X person:John)
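To make the inference step concrete, the buy/sell implication can be mimicked in a few lines of Python. This is a sketch under simplifying assumptions: facts are tuples of the form (predicate, arg1, arg2, arg3), the single rule is hard-coded in the hypothetical function `apply_buy_rule`, and no RIF processor is involved.

```python
# The one fact from the example document.
facts = {("cpt:sell", "person:John", "isbn:000651409X", "person:Mary")}


def apply_buy_rule(facts):
    """buy(Buyer Book Seller) :- sell(Seller Book Buyer), for all bindings."""
    derived = {
        ("cpt:buy", buyer, book, seller)
        for (pred, seller, book, buyer) in facts
        if pred == "cpt:sell"
    }
    return facts | derived


result = apply_buy_rule(facts)
```

The argument positions are swapped exactly as in the rule: the third argument of `cpt:sell` becomes the first argument of `cpt:buy`.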
Expressivity of RIF Core
  Formally: definite Horn without function symbols, a.k.a. “Datalog”
    eg, p(a,b,c) is fine, but p(f(a),b,c) is not
  Includes some extra features
    built-in datatypes and predicates
    “local” symbols, a bit like blank nodes
Expressivity of RIF Core (cont.)
  There are also “safeness measures”
    eg, a variable in a consequent should also be in the antecedent
    this secures a straightforward implementation strategy (“forward chaining”)
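The safeness condition named above (every variable of the consequent also occurs in the antecedent) can be checked mechanically. The sketch below covers only that one condition; the full RIF Core safeness definition has more to it, and the rule representation (a head tuple plus a list of body tuples) and the function names are assumptions made for illustration.

```python
def variables(atom):
    """Return the variables of an atom, ie, the terms starting with '?'."""
    return {term for term in atom if isinstance(term, str) and term.startswith("?")}


def is_safe(head, body):
    """Treat a rule as safe if every head variable also occurs in the body."""
    body_vars = set()
    for atom in body:
        body_vars |= variables(atom)
    return variables(head) <= body_vars


# Safe: all head variables are bound by the body.
safe_rule = (
    ("cpt:buy", "?Buyer", "?Book", "?Seller"),
    [("cpt:sell", "?Seller", "?Book", "?Buyer")],
)
# Unsafe: ?Shop (a made-up variable) appears only in the head.
unsafe_rule = (
    ("cpt:buy", "?Buyer", "?Book", "?Shop"),
    [("cpt:sell", "?Seller", "?Book", "?Buyer")],
)
```

With a safe rule, a forward-chaining engine can always build a fully ground conclusion from a body match, which is why the condition keeps the implementation simple.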
RIF Syntaxes
  RIF defines
    a “presentation syntax”
    a standard XML syntax to encode and exchange the rules
  There is a draft for expressing Core in RDF
    just like OWL is represented in RDF
What about RDF and RIF?
  Typical scenario:
    the “data” of the application is available in RDF
    rules on that data are described using RIF
    the two sets are “bound” (eg, RIF “imports” the data)
    a RIF processor produces new relationships
To make RIF/RDF work
  Some technical issues should be settled:
    RDF triples have to be representable in RIF
    various constructions (typing, datatypes, lists) should be aligned
    the semantics of the two worlds should be compatible
  There is a separate document that brings these together
Remember what we wanted from Rules?
  { ?x rdf:type p:Novel;
       p:page_number ?n;
       p:price [ p:currency :€; rdf:value ?z ].
    ?n > "500"^^xsd:integer.
    ?z < "20.0"^^xsd:double. }
  => { <me> p:buys ?x }
The same with RIF Presentation syntax
  Document (
    Prefix …
    Group (
      Forall ?x ?n ?z (
        <me>[p:buys->?x] :-
          And(
            ?x rdf:type p:Novel
            ?x[p:page_number->?n p:price->_abc]
            _abc[p:currency->:€ rdf:value->?z]
            External( pred:numeric-greater-than(?n "500"^^xsd:integer) )
            External( pred:numeric-less-than(?z "20.0"^^xsd:double) )
          )
      )
    )
  )
Discovering new relationships…
  Forall ?x ?n ?z (
    <me>[p:buys->?x] :-
      And(
        ?x # p:Novel
        ?x[p:page_number->?n p:price->_abc]
        _abc[p:currency->:€ rdf:value->?z]
        External( pred:numeric-greater-than(?n "500"^^xsd:integer) )
        External( pred:numeric-less-than(?z "20.0"^^xsd:double) )
      )
  )
  combined with:
    <http://…/isbn/…> a p:Novel;
      p:page_number "600"^^xsd:integer ;
      p:price [ rdf:value "15.0"^^xsd:double ; p:currency :€ ] .
  yields:
    <me> p:buys <http://…/isbn/…> .
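The way the rule and the data combine can be traced with a short Python sketch. It is an illustration only: triples are tuples, typed literals are replaced by plain Python numbers, `_:price1` is a hypothetical label for the blank price node, and the rule is hand-coded rather than evaluated by a RIF processor.

```python
BOOK = "<http://…/isbn/…>"  # elided URI kept exactly as in the example

triples = {
    (BOOK, "rdf:type", "p:Novel"),
    (BOOK, "p:page_number", 600),
    (BOOK, "p:price", "_:price1"),      # hypothetical blank node label
    ("_:price1", "rdf:value", 15.0),
    ("_:price1", "p:currency", ":€"),
}


def objects(graph, subject, prop):
    """All objects o such that (subject, prop, o) is in the graph."""
    return [o for (s, p, o) in graph if s == subject and p == prop]


def apply_buy_rule(graph):
    """Add (<me>, p:buys, x) for each novel x with > 500 pages and a price < €20."""
    derived = set()
    novels = [s for (s, p, o) in graph if p == "rdf:type" and o == "p:Novel"]
    for x in novels:
        for n in objects(graph, x, "p:page_number"):
            for price in objects(graph, x, "p:price"):
                in_euro = ":€" in objects(graph, price, "p:currency")
                for z in objects(graph, price, "rdf:value"):
                    if in_euro and n > 500 and z < 20.0:
                        derived.add(("<me>", "p:buys", x))
    return graph | derived


result = apply_buy_rule(triples)
```

Each nested loop corresponds to one conjunct of the `And(...)` in the rule body; the two comparisons play the role of the `External(...)` built-in predicates.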
RIF vs. OWL?
  The expressivity of the two is fairly similar
    the emphases are a bit different
  Using rules vs. ontologies may largely depend on
    available tools
    personal technical experience and expertise
    taste…
What about OWL RL?
  OWL RL stands for “Rule Language”…
  OWL RL is in the intersection of RIF Core and OWL
    inferences in OWL RL can be expressed with RIF rules
    RIF Core engines can act as OWL RL engines
Inferencing and SPARQL
  Question: how do SPARQL queries and inferences work together?
    RDFS, OWL, and RIF produce new relationships
    on what data do we query?
  Answer: in current SPARQL, that is not defined
  But, in SPARQL 1.1 it is…
SPARQL 1.1 and RDFS/OWL/RIF
  (Diagram: a SPARQL engine with entailment. Labels: RDF data, RDFS/OWL/RIF data, entailment, RDF data with extra triples, SPARQL pattern, pattern matching, query result)
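The difference between querying the raw data and querying the data with extra triples can be shown with a toy pattern matcher. This is a sketch, not a SPARQL engine: `match` handles a single triple pattern, `rdfs_entail` implements only the subclass rule for `rdf:type`, and both function names are hypothetical.

```python
def match(graph, pattern):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def rdfs_entail(graph):
    """Add the rdf:type triples implied by rdfs:subClassOf, up to a fixpoint."""
    graph = set(graph)
    changed = True
    while changed:
        changed = False
        for (c, p, d) in list(graph):
            if p != "rdfs:subClassOf":
                continue
            for (x, p2, c2) in list(graph):
                if p2 == "rdf:type" and c2 == c and (x, "rdf:type", d) not in graph:
                    graph.add((x, "rdf:type", d))
                    changed = True
    return graph


data = {
    ("<http://…/isbn/000651409X>", "rdf:type", ":Novel"),
    (":Novel", "rdfs:subClassOf", ":Fiction"),
}
pattern = ("?book", "rdf:type", ":Fiction")

plain_answers = match(data, pattern)                   # no entailment
entailed_answers = match(rdfs_entail(data), pattern)   # entailment first
```

The same pattern returns no answer on the original data and one answer once entailment has been applied, which is the question that an entailment-aware query engine has to settle.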
What have we achieved? (putting all this together)
Remember the integration example?
  (Diagram labels: Data in various formats; Map, Expose, …; Data represented in abstract format; Manipulate, Query, …; Applications)
Same with what we learned
  (Diagram labels: Data in various formats; RDB to RDF, GRDDL, RDFa, …; Data represented in RDF with extra knowledge (RDFS, SKOS, RIF, OWL, …); SPARQL, Inferences, …; Applications)
eTourism: provide personalized itinerary
  Integration of relevant data in Zaragoza (using RDF and ontologies)
  Use rules on the RDF data to provide a proper itinerary
  Courtesy of Jesús Fernández, Mun. of Zaragoza, and Antonio Campos, CTIC (SWEO Use Case)
Available documents, resources
Available specifications: Primers, Guides
  The “RDF Primer” and the “OWL Guide” give a formal introduction to RDF(S) and OWL
  SKOS has its separate “SKOS Primer”
  GRDDL Primer and RDFa Primer have been published
  The W3C Semantic Web Activity Wiki has links to all the specifications

Introduction to Semantic Web Technologies

  • 1.
    Introduction to SemanticWeb TechnologiesIvan Herman, W3CJune 22nd, 2010
  • 2.
    The Music siteof the BBC
  • 3.
    The Music siteof the BBC
  • 4.
    How to buildsuch a site 1.Site editors roam the Web for new factsmay discover further links while roaming They update the site manuallyAnd the site gets soon out-of-date
  • 5.
    How to buildsuch a site 2.Editors roam the Web for new data published on Web sites“Scrape” the sites with a program to extract the informationie, write some codeto incorporate the new dataEasily get out of date again…
  • 6.
    How to buildsuch a site 3.Editors roam the Web for new data via API-sUnderstand those…input, output arguments, datatypes used, etcWrite some codeto incorporate the new dataEasily get out of date again…
  • 7.
    The choice ofthe BBCUse external, public datasetsWikipedia, MusicBrainz, …They are available as data not API-s or hidden on a Web sitedata can be extracted using, eg, HTTP requests or standard queries
  • 8.
    In short…Use theWeb of Data as a Content Management SystemUse the community at large as content editors
  • 9.
    And this isno secret…
  • 10.
    Data on theWebThere are more an more data on the Webgovernment data, health related data, general knowledge, company information, flight information, restaurants,…More and more applications rely on the availability of that data
  • 11.
    But… data areoften in isolation, “silos”Photo credit Alex (ajagendorf25), Flickr
  • 12.
    Imagine…A “Web” wheredocumentsare available for download on the Internetbut there would be no hyperlinks among them
  • 13.
    And the problemis real…
  • 14.
    Data on theWeb is not enough…We need a proper infrastructure for a real Web of Datadata is available on the Webaccessible via standard Web technologiesdata are interlinked over the Webie, data can be integrated over the WebThis is where Semantic Web technologies come in
  • 15.
    A Web ofData unleashes now applications
  • 16.
    A nice usageof UK government data
  • 17.
    In what follows…Wewill use a simplistic example to introduce the main Semantic Web concepts
  • 18.
    The rough structureof data integrationMap the various data onto an abstract data representationmake the data independent of its internal representation…Merge the resulting representationsStart making queries on the whole!queries not possible on the individual data sets
  • 19.
    We start witha book...
  • 20.
  • 21.
    1st: export yourdata as a set of relationsa:titleThe Glass Palacehttp://…isbn/000651409Xa:year2000a:publishera:cityLondona:authora:p_nameHarper Collinsa:namea:homepagehttp://www.amitavghosh.comGhosh, Amitav
  • 22.
    Some notes onthe exporting the dataRelations form a graphthe nodes refer to the “real” data or contain some literalhow the graph is represented in machine is immaterial for now
  • 23.
    Some notes onthe exporting the dataData export does not necessarily mean physical conversion of the datarelations can be generated on-the-fly at query timevia SQL “bridges”scraping HTML pagesextracting data from Excel sheetsetc.One can export part of the data
  • 24.
    Same book inFrench…
  • 25.
    Another bookstore data(dataset “F”)
  • 26.
    2nd: export yoursecond set of datahttp://…isbn/000651409XLe palais des miroirsf:originalf:titref:auteurhttp://…isbn/2020386682f:traducteurf:nomf:nomGhosh, AmitavBesse, Christianne
  • 27.
    3rd: start mergingyour dataa:titleThe Glass Palacehttp://…isbn/000651409Xa:year2000a:publishera:cityLondona:authorHarper Collinsa:p_namehttp://…isbn/000651409Xa:namea:homepageLe palais des miroirsf:originalGhosh, Amitavhttp://www.amitavghosh.comf:titref:auteurhttp://…isbn/2020386682f:traducteurf:nomf:nomGhosh, AmitavBesse, Christianne
  • 28.
    3rd: start mergingyour data (cont)a:titleThe Glass Palacehttp://…isbn/000651409Xa:year2000Same URI!a:publishera:cityLondona:authorHarper Collinsa:p_namehttp://…isbn/000651409Xa:namea:homepageLe palais des miroirsf:originalGhosh, Amitavhttp://www.amitavghosh.comf:titref:auteurhttp://…isbn/2020386682f:traducteurf:nomf:nomGhosh, AmitavBesse, Christianne
  • 29.
    3rd: start mergingyour dataa:titleThe Glass Palacehttp://…isbn/000651409Xa:year2000a:publishera:cityLondona:authorHarper Collinsa:p_namef:originalf:auteura:namea:homepageLe palais des miroirsGhosh, Amitavhttp://www.amitavghosh.comf:titrehttp://…isbn/2020386682f:traducteurf:nomf:nomGhosh, AmitavBesse, Christianne
  • 30.
    Start making queries…Userof data “F” can now ask queries like:“give me the title of the original”well, … « donnes-moi le titre de l’original »This information is not in the dataset “F”……but can be retrieved by merging with dataset “A”!
  • 31.
    However, more canbe achieved…We “feel” that a:author and f:auteur should be the sameBut an automatic merge doest not know that!Let us add some extra information to the merged data:a:author same as f:auteurboth identify a “Person”a term that a community may have already defined:a “Person” is uniquely identified by his/her name and, say, homepageit can be used as a “category” for certain type of resources
  • 32.
    3rd revisited: usethe extra knowledgea:titleThe Glass Palacehttp://…isbn/000651409X2000a:yearLe palais des miroirsf:originalf:titrea:publishera:cityLondonhttp://…isbn/2020386682a:authorHarper Collinsa:p_namef:auteurf:traducteurr:typer:typea:namehttp://…foaf/Persona:homepagef:nomf:nomBesse, ChristianneGhosh, Amitavhttp://www.amitavghosh.com
  • 33.
    Start making richerqueries!User of dataset “F” can now query:“donnes-moi la page d’accueil de l’auteur de l’original”well… “give me the home page of the original’s ‘auteur’”The information is not in datasets “F” or “A”……but was made available by:merging datasets “A” and datasets “F”adding three simple extra statements as an extra “glue”
  • 34.
    Combine with differentdatasetsUsing, e.g., the “Person”, the dataset can be combined with other sourcesFor example, data in Wikipedia can be extracted using dedicated toolse.g., the “dbpedia” project can extract the “infobox” information from Wikipedia already…
  • 35.
    Merge with Wikipediadataa:titleThe Glass Palacehttp://…isbn/000651409X2000a:yearLe palais des miroirsf:originalf:titrea:publishera:cityLondonhttp://…isbn/2020386682a:authorHarper Collinsa:p_namef:auteurf:traducteurr:typea:namer:typehttp://…foaf/Persona:homepagef:nomf:nomr:typeBesse, ChristianneGhosh, Amitavhttp://www.amitavghosh.comfoaf:namew:referencehttp://dbpedia.org/../Amitav_Ghosh
  • 36.
    Merge with Wikipediadataa:titleThe Glass Palacehttp://…isbn/000651409X2000a:yearLe palais des miroirsf:originalf:titrea:publishera:cityLondonhttp://…isbn/2020386682a:authorHarper Collinsa:p_namef:auteurf:traducteurr:typea:namer:typehttp://…foaf/Persona:homepagef:nomf:nomr:typew:isbnBesse, ChristianneGhosh, Amitavhttp://www.amitavghosh.comhttp://dbpedia.org/../The_Glass_Palacefoaf:namew:referencew:author_ofhttp://dbpedia.org/../Amitav_Ghoshw:author_ofhttp://dbpedia.org/../The_Hungry_Tidew:author_ofhttp://dbpedia.org/../The_Calcutta_Chromosome
  • 37.
    Merge with Wikipediadataa:titleThe Glass Palacehttp://…isbn/000651409X2000a:yearLe palais des miroirsf:originalf:titrea:publishera:cityLondonhttp://…isbn/2020386682a:authorHarper Collinsa:p_namef:auteurf:traducteurr:typea:namer:typehttp://…foaf/Persona:homepagef:nomf:nomr:typew:isbnBesse, ChristianneGhosh, Amitavhttp://www.amitavghosh.comhttp://dbpedia.org/../The_Glass_Palacefoaf:namew:referencew:author_ofhttp://dbpedia.org/../Amitav_Ghoshw:born_inhttp://dbpedia.org/../Kolkataw:author_ofhttp://dbpedia.org/../The_Hungry_Tidew:latw:longw:author_ofhttp://dbpedia.org/../The_Calcutta_Chromosome
  • 38.
    Is that surprising?Itmay look like it but, in fact, it should not be…What happened via automatic means is done every day by Web users!The difference: a bit of extra rigour so that machines could do this, too
  • 39.
    It could becomeeven more powerfulWe could add extra knowledge to the merged datasetse.g., a full classification of various types of library datageographical informationetc.This is where ontologies, extra rules, etc, come inontologies/rule sets can be relatively simple and small, or huge, or anything in between…Even more powerful queries can be asked as a result
  • 40.
    What did wedo?ManipulateQuery…ApplicationsMap,Expose,…Data represented in abstract formatData in various formats
  • 41.
    So where isthe Semantic Web?The Semantic Web provides technologies to make such integration possible! Hopefully you get a full picture at the end of the tutorial…
  • 42.
  • 43.
    RDF triplesLet usbegin to formalize what we did!we “connected” the data…but a simple connection is not enough… data should be named somehowhence the RDF Triples: a labelled connection between two resources
  • 44.
    RDF triples (cont.)AnRDF Triple (s,p,o) is such that:“s”, “p” are URI-s, ie, resources on the Web; “o” is a URI or a literal“s”, “p”, and “o” stand for “subject”, “property”, and “object”here is the complete triple:(<http://…isbn…6682>, <http://…/original>, <http://…isbn…409X>)RDF is a general model for such triples
  • 45.
    with machine readableformats like RDF/XML, Turtle, N3, RDFa, …RDF triples (cont.)Resources can use any URIhttp://www.example.org/file.html#homehttp://www.example.org/file2.xml#xpath(//q[@a=b])http://www.example.org/form?a=b&c=dRDF triples form a directed, labeled graph (the best way to think about them!)
  • 46.
    A simple RDFexample (in RDF/XML)http://…isbn/2020386682f:originalf:titrehttp://…isbn/000651409XLe palais des miroirs<rdf:Description rdf:about="http://…/isbn/2020386682"> <f:titre xml:lang="fr">Le palais des mirroirs</f:titre> <f:original rdf:resource="http://…/isbn/000651409X"/></rdf:Description>(Note: namespaces are used to simplify the URI-s)
  • 47.
    A simple RDFexample (in Turtle)http://…isbn/2020386682f:originalf:titrehttp://…isbn/000651409XLe palais des miroirs<http://…/isbn/2020386682> f:titre "Le palais des mirroirs"@fr ; f:original <http://…/isbn/000651409X> .
  • 48.
    A simple RDFexample (in RDFa)http://…isbn/2020386682f:originalf:titrehttp://…isbn/000651409XLe palais des miroirs<p about="http://…/isbn/2020386682">The book entitled“<span property="f:title" lang="fr">Le palais des mirroirs</span>” is the French translation of the “<span rel="f:original" resource="http://…/isbn/000651409X">GlassPalace</span>”</p> .
  • 49.
    “Internal” nodesConsider thefollowing statement:“the publisher is a «thing» that has a name and an address”Until now, nodes were identified with a URI. But……what is the URI of «thing»?Londona:citya:publisherhttp://…isbn/000651409Xa:p_nameHarper Collins
  • 50.
    One solution: createan extra URIThe resource will be “visible” on the Webcare should be taken to define unique URI-s<rdf:Description rdf:about="http://…/isbn/000651409X"> <a:publisher rdf:resource="urn:uuid:f60ffb40-307d-…"/></rdf:Description><rdf:Description rdf:about="urn:uuid:f60ffb40-307d-…"> <a:p_name>HarpersCollins</a:p_name> <a:city>HarpersCollins</a:city></rdf:Description>
  • 51.
    Internal identifier (“blanknodes”)<rdf:Description rdf:about="http://…/isbn/000651409X"> <a:publisher rdf:nodeID="A234"/></rdf:Description><rdf:Description rdf:nodeID="A234"> <a:p_name>HarpersCollins</a:p_name> <a:city>HarpersCollins</a:city></rdf:Description><http://…/isbn/2020386682> a:publisher _:A234._:A234 a:p_name "HarpersCollins".Internal = these resources are not visible outsideLondona:citya:publisherhttp://…isbn/000651409Xa:p_nameHarper Collins
  • 52.
    Blank nodes: thesystem can do itLet the system create a “nodeID” internally (you do not really care about the name…)<http://…/isbn/000651409X> a:publisher [a:p_name "HarpersCollins"; …].Londona:citya:publisherhttp://…isbn/000651409Xa:p_nameHarper Collins
  • 53.
    Blank nodes whenmergingBlank nodes require attention when mergingblanks nodes with identical nodeID-s in different graphs are differentimplementations must be careful…
  • 54.
    RDF in programmingpracticeFor example, using Java+Jena (HP’s Bristol Lab):a “Model” object is createdthe RDF file is parsed and results stored in the Modelthe Model offers methods to retrieve:triples(property,object) pairs for a specific subject(subject,property) pairs for specific objectetc.the rest is conventional programming…Similar tools exist in Python, PHP, etc.
  • 55.
    Jena example// createa model Model model=new ModelMem(); Resource subject=model.createResource("URI_of_Subject") // 'in' refers to the input file model.read(new InputStreamReader(in)); StmtIterator iter=model.listStatements(subject,null,null); while(iter.hasNext()) { st = iter.next(); p = st.getProperty(); o = st.getObject(); do_something(p,o); }
  • 56.
    Merge in practiceEnvironmentsmerge graphs automaticallye.g., in Jena, the Model can load several filesthe load merges the new statements automaticallymerge takes care of blank node issues, too
  • 57.
    Another relatively simpleapplicationGoal: reuse of older experimental data
  • 58.
    Keep data indatabases or XML, just export key “fact” as RDF
  • 59.
    Use a facetedbrowser to visualize and interact with the resultCourtesy of Nigel Wilkinson, Lee Harland, Pfizer Ltd, MelliyalAnnamalai, Oracle (SWEO Case Study)
  • 60.
    One level higherup(RDFS, Datatypes)
  • 61.
    Need for RDFschemasFirst step towards the “extra knowledge”:define the terms we can usewhat restrictions applywhat extra relationships are there?Officially: “RDF Vocabulary Description Language”the term “Schema” is retained for historical reasons…
  • 62.
    Classes, resources, …Thinkof well known traditional vocabularies:use the term “novel”“every novel is a fiction”“«The Glass Palace» is a novel”etc.RDFS defines resources and classes:everything in RDF is a “resource”“classes” are also resources, but……they are also a collection of possible resources (i.e., “individuals”)“fiction”, “novel”, …
  • 63.
    Classes, resources, …(cont.)Relationships are defined among resources:“typing”: an individual belongs to a specific class “«The Glass Palace» is a novel”to be more precise: “«http://.../000651409X» is a novel”“subclassing”: all instances of one are also the instances of the other (“every novel is a fiction”)RDFS formalizes these notions in RDF
  • 64.
    Classes, resources inRDF(S)rdfs:Classrdf:typerdf:type#Novelhttp://…isbn/000651409XRDFS defines the meaning of these terms(these are all special URI-s, we just use the namespace abbreviation)
  • 65.
    Inferred properties#Fictionrdf:typerdf:subClassOfrdf:type#Novelhttp://…isbn/000651409X(<http://…/isbn/000651409X> rdf:type#Fiction)is not in the original RDF data……but can be inferred from the RDFS rulesRDFS environments return that triple, too
  • 66.
    Inference: let usbe formal…The RDF Semantics document has a list of (33) entailment rules:“if such and such triples are in the graph, add this and this”do that recursively until the graph does not changeThe relevant rule for our example:If: uuu rdfs:subClassOf xxx . vvv rdf:type uuu .Then add: vvv rdf:type xxx .
  • 67.
    PropertiesProperty is aspecial class (rdf:Property)properties are also resources identified by URI-sThere is also a possibility for a “sub-property”all resources bound by the “sub” are also bound by the otherRange and domain of properties can be specifiedi.e., what type of resources serve as object and subject
  • 68.
    Example for propertycharacterization:title rdf:type rdf:Property; rdfs:domain :Fiction; rdfs:range rdfs:Literal.
  • 69.
    What does thismean?Again, new relations can be deduced. Indeed, if:title rdf:type rdf:Property;rdfs:domain :Fiction; rdfs:range rdfs:Literal.<http://…/isbn/000651409X> :title "The Glass Palace" .then the system can infer that:<http://…/isbn/000651409X> rdf:type :Fiction .
  • 70.
    LiteralsLiterals may havea data typefloats, integers, booleans, etc, defined in XML Schemasfull XML fragments(Natural) language can also be specified
  • 71.
    Examples for datatypes<http://…/isbn/000651409X> :page_number "543"^^xsd:integer ; :publ_date "2000"^^xsd:gYear ; :price "6.99"^^xsd:float .
  • 72.
    A bit ofRDFS can take you far…Remember the power of merge?We could have used, in our example:f:auteur is a subproperty of a:author and vice versa(although we will see other ways to do that…)Of course, in some cases, more complex knowledge is necessary (see later…)
  • 73.
    Find the rightexperts at NASAExpertise locater for nearly 70,000 NASA civil servants,
  • 74.
    integrate 6 or7 geographically distributed databases, …Michael Grove, Clark & Parsia, LLC, and Andrew Schain, NASA, (SWEO Case Study)
  • 75.
    How to getand create RDF Data?
  • 76.
    Simple approachWrite RDF/XML,RDFa, or Turtle “manually”In some cases that is necessary, but it really does not scale…
  • 77.
    RDF with XHTMLObviously,a huge source of informationBy adding some “meta” information, the same source can be reused for, eg, data integration, better mashups, etctypical example: your personal information, like address, should be readable for humans and processable by machines
  • 78.
    RDF with XML/(X)HTML(cont)Two solutions have emerged:use microformats and convert the content into RDFXSLT is the favorite approachadd RDF-like statements directly into XHTML via RDFa
  • 79.
    Bridge to relationaldatabasesData on the Web are mostly stored in databases“Bridges” are being defined:a layer between RDF and the relational dataRDB tables are “mapped” to RDF graphs, possibly on the flydifferent mapping approaches are being useda number RDB systems offer this facility already (eg, Oracle, OpenLink, …) W3C is working on a standard in this area
  • 80.
  • 81.
    Linked Open DataProjectGoal: “expose” open datasets in RDFSet RDF links among the data items from different datasetsSet up, if possible, query endpoints
  • 82.
    Example data source:DBpediaDBpedia is a community effort toextract structured (“infobox”) information from Wikipediaprovide a query endpoint to the datasetinterlink the DBpedia dataset with other datasets on the Web
  • 83.
    Extracting structured datafrom Wikipedia@prefix dbpedia <http://dbpedia.org/resource/>.@prefix dbterm <http://dbpedia.org/property/>.dbpedia:Amsterdam dbterm:officialName "Amsterdam" ; dbterm:longd "4” ; dbterm:longm "53" ; dbterm:longs "32” ; dbterm:leaderName dbpedia:Lodewijk_Asscher ; ... dbterm:areaTotalKm "219" ; ...dbpedia:ABN_AMRO dbterm:location dbpedia:Amsterdam ; ...
  • 84.
    Automatic links amongopen datasets<http://dbpedia.org/resource/Amsterdam> owl:sameAs <http://rdf.freebase.com/ns/...> ; owl:sameAs <http://sws.geonames.org/2759793> ; ...<http://sws.geonames.org/2759793> owl:sameAs <http://dbpedia.org/resource/Amsterdam> wgs84_pos:lat "52.3666667" ; wgs84_pos:long "4.8833333"; geo:inCountry <http://www.geonames.org/countries/#NL> ; ...Processors can switch automatically from one to the other…
  • 85.
  • 86.
  • 87.
    NYT articles onuniversity alumni
  • 88.
  • 89.
    Querying RDF graphsRememberthe Jena idiom:StmtIterator iter=model.listStatements(subject,null,null);while(iter.hasNext()) { st = iter.next(); p = st.getProperty(); o = st.getObject(); do_something(p,o);In practice, more complex queries into the RDF data are necessary
  • 90.
    something like “giveme (a,b) pairs for which there is an x such that (x parent a) and (b brother x) holds” (ie, return the uncles)
  • 91.
    The goal ofSPARQL (Query Language for RDF)Analyze the Jena exampleStmtIterator iter=model.listStatements(subject,null,null);while(iter.hasNext()) { st = iter.next(); p = st.getProperty(); o = st.getObject(); do_something(p,o);?o?p?o?psubject?o?p?o?p
  • 92.
    General: graph patternsThefundamental idea: use graph patternsthe pattern contains unbound symbolsby binding the symbols, subgraphs of the RDF graph are selectedif there is such a selection, the query returns bound resources
  • 93.
    Our Jena examplein SPARQLSELECT ?p ?oWHERE {subject ?p ?o}The triples in WHERE define the graph pattern, with ?p and ?o “unbound” symbolsThe query returns all p,o pairs?o?p?o?psubject?o?p?o?p
  • 94.
    Simple SPARQL exampleSELECT?isbn ?price ?currency # note: not ?x!WHERE {?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 95.
    Simple SPARQL exampleSELECT?isbn ?price ?currency # note: not ?x!WHERE {?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}Returns: [<…409X>,33,:£]a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 96.
    Simple SPARQL exampleSELECT?isbn ?price ?currency # note: not ?x!WHERE {?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}Returns: [<…409X>,33,:£], [<…409X>,50,:€]a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 97.
    Simple SPARQL exampleSELECT?isbn ?price ?currency # note: not ?x!WHERE {?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}Returns: [<…409X>,33,:£], [<…409X>,50,:€],[<…6682>,60,:€]a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 98.
    Simple SPARQL exampleSELECT?isbn ?price ?currency # note: not ?x!WHERE {?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}Returns: [<…409X>,33,:£], [<…409X>,50,:€],[<…6682>,60,:€], [<…6682>,78,:$]a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 99.
    Pattern constraintsSELECT ?isbn?price ?currency # note: not ?x!WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency. FILTER(?currency == :€) }Returns: [<…409X>,50,:€], [<…6682>,60,:€]a:nameGhosh, Amitava:authora:authorhttp://…isbn/2020386682http://…isbn/000651409Xa:pricea:pricea:pricea:pricep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:valuep:currencyrdf:value:£:€:€:$33506078
  • 100.
    Many extra SPARQLfeaturesLimit the number of returned results; remove duplicates, sort them, …Optional branches: if some part of the pattern does not match, ignore itSpecify several data sources (via URI-s) within the query (essentially, a merge on-the-fly!)Construct a graph using a separate pattern on the query resultsIn SPARQL 1.1: updating data, not only query
  • 101.
    SPARQL usage inpracticeSPARQL is usually used over the networkseparate documents define the protocol and the result formatSPARQL Protocol for RDF with HTTP and SOAP bindingsSPARQL results in XML or JSON formatsBig datasets often offer “SPARQL endpoints” using this protocoltypical example: SPARQL endpoint to DBpedia
  • 102.
    SPARQL as aunifying pointApplicationSPARQL ConstructSPARQL ConstructSPARQL EndpointSPARQL EndpointSPARQL ProcessorDatabaseTriple storeNLP TechniquesRDFaGRDDL, RDFaSQLRDFRelationalDatabaseRDF GraphHTMLUnstructured TextXML/XHTML
  • 103.
    Integrate knowledge forChinese MedicineIntegration of a large number of TCM databases
  • 104.
    around 80 databases,around 200,000 records eachCourtesy of Huajun Chen, Zhejiang University, (SWEO Case Study)
  • 105.
  • 106.
    VocabulariesData integration needsagreements onterms “translator”, “author”categories used “Person”, “literature”relationships among those “an author is also a Person…”, “historical fiction is a narrower term than fiction”ie, new relationships can be deduced
  • 107.
    VocabulariesThere is aneed for “languages” to define such vocabulariesto define those vocabulariesto assign clear “semantics” on how new relationships can be deduced
  • 108.
    But what aboutRDFS?Indeed RDFS is such framework:there is typing, subtypingproperties can be put in a hierarchydatatypes can be definedRDFS is enough for many vocabulariesBut not for all!
  • 109.
    Three technologies haveemergedTo re-use thesauri, glossaries, etc: SKOSTo define more complex vocabularies with a strong logical underpinning: OWLGeneric framework to define rules on terms and data: RIF
  • 110.
  • 111.
    SKOSRepresent and shareclassifications, glossaries, thesauri, etcfor example:Dewey Decimal Classification, Art and Architecture Thesaurus, ACM classification of keywords and terms…classification/formalization of Web 2.0 type tagsDefine classes and properties to add those structures to an RDF universeallow for a quick port of this traditional data, combine it with other data
  • 112.
    Example: the term“Fiction”, as defined by the Library of Congress
  • 113.
    Example: the term“Fiction”, as defined by the Library of Congress
  • 114.
    Thesauri have identicalstructures…The structure of the LOC page is fairly typicallabel, alternate label, narrower, broader, …there is even an ISO standard for such structuresSKOS provides a basic structure to create an RDF representation of these
  • 115.
    LOC’s “Fiction” inSKOS/RDFLiteratureskos:ConceptFictionskos:prefLabelrdf:typeskos:prefLabelskos:broaderhttp://id.loc.gov/…#conceptskos:altLabelMetafictionskos:narrowerskos:altLabelskos:narrowerNovelsskos:prefLabelAllegoriesskos:prefLabelAdventure stories
  • 116.
    Usage of theLOC graphFictionskos:ConceptHistorical Fictionskos:prefLabelskos:prefLabelrdf:typeskos:broaderdc:subjectdc:titlehttp:.//…/isbn/…The GlassPalace
  • 117.
    Importance of SKOSSKOSprovides a simple bridge between the “print world” and the (Semantic) WebThesauri, glossaries, etc, from the library community can be made availableLOC is a good exampleSKOS can also be used to organize tags, annotate other vocabularies, …
  • 118.
    Importance of SKOSAnybodyin the World can refer to common conceptsthey mean the same for everybodyApplications may exploit the relationships among conceptseg, SPARQL queries may be issued on the merge of the library data and the LOC terms
  • 119.
    Semantic portal for art collections. Courtesy of Jacco van Ossenbruggen, CWI, and Guus Schreiber, VU Amsterdam
  • 121.
    SKOS is not enough… SKOS may be used to provide simple vocabularies, but it is not a complete solution: it concentrates on the concepts only; there is no characterization of properties in general; it is simple from a logical perspective, i.e., few inferences are possible.
  • 122.
    Applications may want more… Complex applications may want more possibilities: characterization of properties; identification of objects with different URIs; disjointness or equivalence of classes; constructing classes, not only naming them; more complex classification schemes; can a program reason about some terms? E.g.: “if «Person» resources «A» and «B» have the same «foaf:email» property, then «A» and «B» are identical”; etc.
  • 123.
    Web Ontology Language = OWL. OWL is an extra layer, a bit like RDFS or SKOS: own namespace, own terms; it relies on RDF Schemas. It is a separate recommendation; actually, there is a 2004 version of OWL (“OWL 1”) and there is an update (“OWL 2”) published in 2009.
  • 124.
    OWL is complex… OWL is a large set of additional terms. We will not cover the whole thing here…
  • 125.
    Term equivalences. For classes: owl:equivalentClass: two classes have the same individuals; owl:disjointWith: no individuals in common. For properties: owl:equivalentProperty (remember the a:author vs. f:auteur?); owl:propertyDisjointWith.
  • 126.
    Term equivalences. For individuals: owl:sameAs: two URIs refer to the same concept (“individual”); owl:differentFrom: negation of owl:sameAs.
  • 127.
    Other example: connecting to French [Figure: a:Novel owl:equivalentClass f:Roman; a:author owl:equivalentProperty f:auteur]
  • 128.
    Typical usage of owl:sameAs. Linking our example of Amsterdam from one data set (DBpedia) to the other (Geonames):
        <http://dbpedia.org/resource/Amsterdam> owl:sameAs <http://sws.geonames.org/2759793> .
    This is the main mechanism of “Linking” in the Linked Open Data project.
    Property characterization. In OWL, one can characterize the behavior of properties (symmetric, transitive, functional, reflexive, inverse functional, …). One property can be defined as the “inverse” of another.
  • 131.
    What this means is… If the following holds in our triples:
        :email rdf:type owl:InverseFunctionalProperty .
        <A> :email "mailto:a@b.c" .
        <B> :email "mailto:a@b.c" .
    then, processed through OWL, the following holds, too:
        <A> owl:sameAs <B> .
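    The inference above can be mimicked with a few lines of Python over a set of triples. This is a minimal sketch, not how a real OWL reasoner is built; the third resource <C> is an addition made here to show a non-matching case.

```python
from itertools import combinations


def infer_same_as(triples):
    """Derive owl:sameAs triples from inverse functional properties."""
    # Properties declared as owl:InverseFunctionalProperty.
    ifps = {s for s, p, o in triples
            if p == "rdf:type" and o == "owl:InverseFunctionalProperty"}
    # Group subjects by (property, value): sharing a value means same individual.
    owners = {}
    for s, p, o in triples:
        if p in ifps:
            owners.setdefault((p, o), set()).add(s)
    inferred = set()
    for subjects in owners.values():
        for a, b in combinations(sorted(subjects), 2):
            inferred.add((a, "owl:sameAs", b))
    return inferred


triples = {
    (":email", "rdf:type", "owl:InverseFunctionalProperty"),
    ("<A>", ":email", "mailto:a@b.c"),
    ("<B>", ":email", "mailto:a@b.c"),
    ("<C>", ":email", "mailto:c@b.c"),  # hypothetical extra resource
}
print(infer_same_as(triples))
```

    Only <A> and <B> are identified with each other; <C> has a different email value and stays distinct.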
  • 132.
    Keys. Inverse functional properties are important for the identification of individuals (think of the email example). But… identification based on one property may not be enough.
  • 133.
    Keys. “If two persons have the same emails and the same homepages then they are identical.” Identification is based on the identical values of two properties. The rule applies to persons only.
  • 134.
    Previous rule in OWL:
        :Person rdf:type owl:Class ;
            owl:hasKey (:email :homepage) .
  • 135.
    What it means is… If:
        <A> rdf:type :Person ;
            :email "mailto:a@b.c" ;
            :homepage "http://www.ex.org" .
        <B> rdf:type :Person ;
            :email "mailto:a@b.c" ;
            :homepage "http://www.ex.org" .
    then, processed through OWL, the following holds, too:
        <A> owl:sameAs <B> .
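    The key-based identification can be sketched in the same toy style. The function below is an illustration written for this transcript (its name and the extra resources <C> and <D> are hypothetical); it treats two members of the class as identical when they share a value for every key property.

```python
def values(triples, subject, prop):
    """All objects of (subject, prop, ?) in the triple set."""
    return {o for s, p, o in triples if s == subject and p == prop}


def infer_same_as_by_key(triples, cls, key_props):
    """Derive owl:sameAs for members of cls that agree on every key property."""
    members = sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)
    inferred = set()
    for i, a in enumerate(members):
        for b in members[i + 1:]:
            # Every key property must have at least one value in common.
            if all(values(triples, a, k) & values(triples, b, k) for k in key_props):
                inferred.add((a, "owl:sameAs", b))
    return inferred


triples = {
    ("<A>", "rdf:type", ":Person"),
    ("<A>", ":email", "mailto:a@b.c"),
    ("<A>", ":homepage", "http://www.ex.org"),
    ("<B>", "rdf:type", ":Person"),
    ("<B>", ":email", "mailto:a@b.c"),
    ("<B>", ":homepage", "http://www.ex.org"),
    # Hypothetical: same email, different homepage, so the key does not match.
    ("<C>", "rdf:type", ":Person"),
    ("<C>", ":email", "mailto:a@b.c"),
    ("<C>", ":homepage", "http://other.example.org"),
    # Hypothetical: same values, but not typed as :Person, so the key does not apply.
    ("<D>", ":email", "mailto:a@b.c"),
    ("<D>", ":homepage", "http://www.ex.org"),
}
print(infer_same_as_by_key(triples, ":Person", [":email", ":homepage"]))
```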
  • 136.
    Classes in OWL. In RDFS, you can subclass existing classes… that’s all. In OWL, you can construct classes from existing ones: enumerate their content; through intersection, union, complement; etc.
  • 137.
    Enumerate class content:
        :Currency rdf:type owl:Class ;
            owl:oneOf (:€ :£ :$) .
    I.e., the class consists of exactly those individuals and nothing else.
  • 138.
    Union of classes:
        :Novel rdf:type owl:Class .
        :Short_Story rdf:type owl:Class .
        :Poetry rdf:type owl:Class .
        :Literature rdf:type owl:Class ;
            owl:unionOf (:Novel :Short_Story :Poetry) .
    Other possibilities: owl:complementOf, owl:intersectionOf, …
  • 139.
    For example… If:
        :Novel rdf:type owl:Class .
        :Short_Story rdf:type owl:Class .
        :Poetry rdf:type owl:Class .
        :Literature rdf:type owl:Class ;
            owl:unionOf (:Novel :Short_Story :Poetry) .
        <myWork> rdf:type :Novel .
    then the following holds, too:
        <myWork> rdf:type :Literature .
  • 140.
    It can be a bit more complicated… If:
        :Novel rdf:type owl:Class .
        :Short_Story rdf:type owl:Class .
        :Poetry rdf:type owl:Class .
        :Literature rdf:type owl:Class ;
            owl:unionOf (:Novel :Short_Story :Poetry) .
        fr:Roman owl:equivalentClass :Novel .
        <myWork> rdf:type fr:Roman .
    then, through the combination of different terms, the following still holds:
        <myWork> rdf:type :Literature .
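    The two-step inference above (equivalent class first, then union) can be reproduced by applying two simple rules until nothing new appears. This is a sketch under simplifying assumptions: the RDF list of owl:unionOf is modelled as a Python tuple, and only these two rules are implemented.

```python
def entail(triples):
    """Apply the owl:equivalentClass and owl:unionOf typing rules to a fixpoint."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            if p != "rdf:type":
                continue
            for c1, q, c2 in graph:
                if q == "owl:equivalentClass":
                    # Membership carries over in both directions.
                    if o == c1:
                        new.add((s, "rdf:type", c2))
                    if o == c2:
                        new.add((s, "rdf:type", c1))
                elif q == "owl:unionOf" and o in c2:
                    # A member of any listed class is a member of the union.
                    new.add((s, "rdf:type", c1))
        if new <= graph:
            return graph
        graph |= new


data = {
    (":Novel", "rdf:type", "owl:Class"),
    (":Short_Story", "rdf:type", "owl:Class"),
    (":Poetry", "rdf:type", "owl:Class"),
    (":Literature", "rdf:type", "owl:Class"),
    (":Literature", "owl:unionOf", (":Novel", ":Short_Story", ":Poetry")),
    ("fr:Roman", "owl:equivalentClass", ":Novel"),
    ("<myWork>", "rdf:type", "fr:Roman"),
}
result = entail(data)
print(("<myWork>", "rdf:type", ":Literature") in result)
```

    The first pass derives that <myWork> is a :Novel; the second pass uses that new triple to derive :Literature, which is why the loop runs until no new triples are produced.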
  • 141.
    What we have so far… The OWL features listed so far are already fairly powerful. E.g., various databases can be linked via owl:sameAs, functional or inverse functional properties, etc. Many inferred relationships can be found using a traditional rule engine.
  • 142.
    However… that may not be enough. Very large vocabularies might require even more complex features. Some major issues: the way classes (i.e., “concepts”) are defined; handling of datatypes like intervals. OWL includes those extra features, but… the inference engines become (much) more complex.
  • 143.
    Example: property value restrictions. New classes are created by restricting the property values on a class. For example: how would I characterize a “listed price”? It is a price that is given in one of the “allowed” currencies (€, £, or $); this defines a new class.
  • 144.
    But: OWL is hard! The combination of class constructions with various restrictions is extremely powerful. What we have so far follows the same logic as before: extend the basic RDF and RDFS possibilities with new features; define their semantics, i.e., what they “mean” in terms of relationships; expect to infer new relationships based on those. However… a full inference procedure is hard: it is not implementable with simple rule engines, for example.
  • 145.
    OWL “species” or profiles. OWL species come to the fore: they restrict which terms can be used and under what circumstances (restrictions); if one abides by those restrictions, then simpler inference engines can be used. They reflect compromises: expressiveness vs. implementability.
  • 146.
    OWL Species [Figure: diagram relating OWL Full, OWL DL, OWL RL, OWL EL, and OWL QL]
  • 147.
    OWL RL. Goal: to be implementable with rule engines. Usage follows a similar approach to RDFS: merge the ontology and the instance data into an RDF graph; use the rule engine to add new triples (as long as it is possible).
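    The “merge, then add triples as long as possible” procedure is a plain fixpoint loop. The sketch below assumes a rule is just a Python function from a graph to a set of derived triples; the two rules shown (subclass typing and transitive properties) are a tiny, hand-picked subset and not the OWL RL rule set, and the vocabulary (:partOf, <ch1>, <vol1>) is invented for the example.

```python
def rule_subclass(graph):
    """x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D."""
    return {(x, "rdf:type", d)
            for (x, p, c) in graph if p == "rdf:type"
            for (c2, q, d) in graph if q == "rdfs:subClassOf" and c2 == c}


def rule_transitive(graph):
    """P is transitive, a P b, b P c  =>  a P c."""
    transitive = {s for (s, p, o) in graph
                  if p == "rdf:type" and o == "owl:TransitiveProperty"}
    return {(a, p, c)
            for (a, p, b) in graph if p in transitive
            for (b2, q, c) in graph if q == p and b2 == b}


def closure(ontology, data, rules):
    """Merge ontology and instance data, then apply the rules to a fixpoint."""
    graph = set(ontology) | set(data)
    while True:
        new = set().union(*(rule(graph) for rule in rules)) - graph
        if not new:
            return graph
        graph |= new


# Hypothetical ontology and instance data.
ontology = {
    (":Novel", "rdfs:subClassOf", ":Literature"),
    (":partOf", "rdf:type", "owl:TransitiveProperty"),
}
data = {
    ("<myWork>", "rdf:type", ":Novel"),
    ("<ch1>", ":partOf", "<vol1>"),
    ("<vol1>", ":partOf", "<myWork>"),
}
graph = closure(ontology, data, [rule_subclass, rule_transitive])
print(sorted(graph - ontology - data))
```

    Because the rules only ever add triples built from terms that are already in the graph, the loop is guaranteed to stop.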
  • 148.
    What can be done in OWL RL? Many features are available: identity of classes, instances, properties; subproperties, subclasses, domains, ranges; union and intersection of classes (but with some restrictions); property characterizations (functional, symmetric, etc.); property chains; keys; some property restrictions. All examples so far could be inferred with OWL RL!
  • 149.
    Improved Search via Ontology (GoPubMed). Search results are re-ranked using ontologies; related terms are highlighted.
  • 150.
    Improved Search via Ontology (Go3R). Same dataset, different ontology (the ontology is on non-animal experimentation)
  • 151.
    Rules (RIF)
  • 152.
    Why rules on the Semantic Web? Some conditions may be complicated to express in ontologies (i.e., OWL), e.g., Horn rules: (P1 & P2 & …) -> C. In many cases applications just want 2-3 rules to complete integration. I.e., rules may be an alternative to (OWL-based) ontologies.
  • 153.
    Things you may want to express. An example from our bookshop integration: “I buy a novel with over 500 pages if it costs less than €20”. Something like (in an ad-hoc syntax):
        { ?x rdf:type p:Novel;
             p:page_number ?n;
             p:price [ p:currency :€; rdf:value ?z ].
          ?n > "500"^^xsd:integer.
          ?z < "20.0"^^xsd:double. }
        => { <me> p:buys ?x }
  • 154.
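    Things you may want to express [Figure: the same rule drawn as a graph. Node labels: ?x, p:Novel, ?n, ?z, :€, me. Edge labels: rdf:type, p:page_number, p:price, p:currency, rdf:value, p:buys. Conditions: ?n>500, ?z<20.]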
  • 155.
    RIF (Rule Interchange Format). The goals of the RIF work: define simple rule language(s) for the (Semantic) Web; define interchange formats for rule-based systems. RIF defines several “dialects” of languages. RIF is not bound to RDF only: e.g., relationships may involve more than 2 entities; there are dialects for production rule systems.
  • 156.
    RIF Core. The simplest RIF “dialect”. A Core document is: directives like import, prefix settings for URIs, etc.; a sequence of logical implications.
  • 157.
    RIF Core example:
        Document(
          Prefix(cpt http://example.com/concepts#)
          Prefix(person http://example.com/people#)
          Prefix(isbn http://…/isbn/)
          Group (
            Forall ?Buyer ?Book ?Seller (
              cpt:buy(?Buyer ?Book ?Seller) :- cpt:sell(?Seller ?Book ?Buyer)
            )
            cpt:sell(person:John isbn:000651409X person:Mary)
          )
        )
    This infers the following relationship:
        cpt:buy(person:Mary isbn:000651409X person:John)
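    To make the direction of the implication concrete, the same rule can be mimicked over Python tuples. This is a sketch only: facts are modelled as (predicate, arg1, arg2, arg3) tuples, which is an encoding chosen here and not anything RIF prescribes.

```python
# One ternary fact, encoded as (predicate, seller, book, buyer).
facts = {("cpt:sell", "person:John", "isbn:000651409X", "person:Mary")}


def apply_buy_rule(facts):
    """buy(?Buyer ?Book ?Seller) :- sell(?Seller ?Book ?Buyer)."""
    return {("cpt:buy", buyer, book, seller)
            for (pred, seller, book, buyer) in facts
            if pred == "cpt:sell"}


print(apply_buy_rule(facts))
```

    The relationship involves three entities at once, which is exactly the kind of statement that does not fit into a single RDF triple.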
  • 158.
    Expressivity of RIF Core. Formally: definite Horn without function symbols, a.k.a. “Datalog”; e.g., p(a,b,c) is fine, but p(f(a),b,c) is not. Includes some extra features: built-in datatypes and predicates; “local” symbols, a bit like blank nodes.
  • 159.
    Expressivity of RIF Core. There are also “safeness measures”: e.g., a variable in a consequent should be in the antecedent; this secures a straightforward implementation strategy (“forward chaining”).
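    The one safeness condition named above is easy to check mechanically. The sketch below assumes a rule is a head atom plus a list of body atoms, each atom a tuple whose variables are strings starting with “?”; the actual RIF Core safeness definition covers more cases (built-ins, for instance), which are not modelled here.

```python
def variables(atom):
    """Variables of an atom: the terms that start with '?'."""
    return {term for term in atom if isinstance(term, str) and term.startswith("?")}


def is_safe(head, body):
    """True if every variable of the consequent also occurs in the antecedent."""
    body_vars = set().union(*(variables(atom) for atom in body))
    return variables(head) <= body_vars


# The buy/sell rule: all head variables occur in the body.
safe = is_safe(("cpt:buy", "?Buyer", "?Book", "?Seller"),
               [("cpt:sell", "?Seller", "?Book", "?Buyer")])

# A hypothetical unsafe rule: ?Book and ?Seller are never bound by the body.
unsafe = is_safe(("cpt:buy", "?Buyer", "?Book", "?Seller"),
                 [("cpt:person", "?Buyer")])
print(safe, unsafe)
```

    With a safe rule, a forward-chaining engine can always build the conclusion from values it has already matched; with the unsafe one it would have to invent bindings.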
  • 160.
    RIF Syntaxes. RIF defines: a “presentation syntax”; a standard XML syntax to encode and exchange the rules. There is a draft for expressing Core in RDF, just like OWL is represented in RDF.
  • 161.
    What about RDF and RIF? Typical scenario: the “data” of the application is available in RDF; rules on that data are described using RIF; the two sets are “bound” (e.g., RIF “imports” the data); a RIF processor produces new relationships.
  • 162.
    To make RIF/RDF work. Some technical issues should be settled: RDF triples have to be representable in RIF; various constructions (typing, datatypes, lists) should be aligned; the semantics of the two worlds should be compatible. There is a separate document that brings these together.
  • 163.
    Remember what we wanted from Rules?
        { ?x rdf:type p:Novel;
             p:page_number ?n;
             p:price [ p:currency :€; rdf:value ?z ].
          ?n > "500"^^xsd:integer.
          ?z < "20.0"^^xsd:double. }
        => { <me> p:buys ?x }
  • 164.
    The same with RIF Presentation syntax:
        Document (
          Prefix …
          Group (
            Forall ?x ?n ?z (
              <me>[p:buys->?x] :-
                And(
                  ?x # p:Novel
                  ?x[p:page_number->?n p:price->_abc]
                  _abc[p:currency->:€ rdf:value->?z]
                  External( pred:numeric-greater-than(?n "500"^^xsd:integer) )
                  External( pred:numeric-less-than(?z "20.0"^^xsd:double) )
                )
            )
          )
        )
  • 167.
    Discovering new relationships…
        Forall ?x ?n ?z (
          <me>[p:buys->?x] :-
            And(
              ?x # p:Novel
              ?x[p:page_number->?n p:price->_abc]
              _abc[p:currency->:€ rdf:value->?z]
              External( pred:numeric-greater-than(?n "500"^^xsd:integer) )
              External( pred:numeric-less-than(?z "20.0"^^xsd:double) )
            )
        )
    combined with:
        <http://…/isbn/…> a p:Novel ;
            p:page_number "600"^^xsd:integer ;
            p:price [ rdf:value "15.0"^^xsd:double ; p:currency :€ ] .
    yields:
        <me> p:buys <http://…/isbn/…> .
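    What a rule processor does with this rule and this data can be imitated by hand in Python. This is a sketch: the book identifiers (book:long, book:short) and the blank node names stand in for the elided ISBN URI and are hypothetical, and typed literals are modelled as plain Python numbers.

```python
data = {
    # A 600-page novel at 15.0 euro: matches the rule.
    ("book:long", "rdf:type", "p:Novel"),
    ("book:long", "p:page_number", 600),
    ("book:long", "p:price", "_:price1"),
    ("_:price1", "rdf:value", 15.0),
    ("_:price1", "p:currency", ":€"),
    # A 300-page novel at 10.0 euro: too short, so no match.
    ("book:short", "rdf:type", "p:Novel"),
    ("book:short", "p:page_number", 300),
    ("book:short", "p:price", "_:price2"),
    ("_:price2", "rdf:value", 10.0),
    ("_:price2", "p:currency", ":€"),
}


def objects(graph, subject, prop):
    """All objects of (subject, prop, ?) in the graph."""
    return [o for (s, p, o) in graph if s == subject and p == prop]


def books_to_buy(graph):
    """Novels with more than 500 pages and a euro price below 20."""
    inferred = set()
    novels = {s for (s, p, o) in graph if p == "rdf:type" and o == "p:Novel"}
    for x in novels:
        long_enough = any(n > 500 for n in objects(graph, x, "p:page_number"))
        cheap = any(":€" in objects(graph, price, "p:currency")
                    and any(z < 20.0 for z in objects(graph, price, "rdf:value"))
                    for price in objects(graph, x, "p:price"))
        if long_enough and cheap:
            inferred.add(("<me>", "p:buys", x))
    return inferred


print(books_to_buy(data))
```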
  • 168.
    RIF vs. OWL? The expressivity of the two is fairly similar; the emphasis is a bit different. Using rules vs. ontologies may largely depend on: available tools; personal technical experience and expertise; taste…
  • 169.
    What about OWL RL? OWL RL stands for “Rule Language”… OWL RL is in the intersection of RIF Core and OWL: inferences in OWL RL can be expressed with RIF rules; RIF Core engines can act as OWL RL engines.
  • 170.
    Inferencing and SPARQL. Question: how do SPARQL queries and inferences work together? RDFS, OWL, and RIF produce new relationships; on what data do we query? Answer: in current SPARQL, that is not defined. But in SPARQL 1.1 it is…
  • 171.
    SPARQL 1.1 and RDFS/OWL/RIF [Figure: a SPARQL engine with entailment. Labels: RDF Data; RDFS/OWL/RIF data; entailment; RDF Data with extra triples; SPARQL Pattern; pattern matching; Query result.]
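    The difference between matching a pattern on the raw data and on the data with extra triples can be shown with a toy matcher. This is a sketch, assuming a single triple pattern and a single RDFS subclass rule as the entailment step; it does not reflect how SPARQL 1.1 entailment regimes are actually specified or implemented.

```python
def entail(graph):
    """Add rdf:type triples implied by rdfs:subClassOf, up to a fixpoint."""
    graph = set(graph)
    while True:
        new = {(x, "rdf:type", d)
               for (x, p, c) in graph if p == "rdf:type"
               for (c2, q, d) in graph if q == "rdfs:subClassOf" and c2 == c} - graph
        if not new:
            return graph
        graph |= new


def match(pattern, graph):
    """Bindings for one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


rdf_data = {
    (":Novel", "rdfs:subClassOf", ":Literature"),
    ("<myWork>", "rdf:type", ":Novel"),
}
pattern = ("?work", "rdf:type", ":Literature")

print(match(pattern, rdf_data))          # plain pattern matching
print(match(pattern, entail(rdf_data)))  # pattern matching after entailment
```

    The query is identical in both calls; only the graph it is matched against changes.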
  • 172.
    What have we achieved? (putting all this together)
  • 173.
    Remember the integration example? [Figure: integration diagram. Labels: Applications; Manipulate, Query, …; Data represented in abstract format; Map, Expose, …; Data in various formats.]
  • 174.
    Same with what we learned [Figure: the same diagram. Labels: Applications; SPARQL, Inferences, …; Data represented in RDF with extra knowledge (RDFS, SKOS, RIF, OWL, …); RDB → RDF, GRDDL, RDFa, …; Data in various formats.]
  • 175.
    eTourism: provide personalized itinerary. Integration of relevant data in Zaragoza (using RDF and ontologies)
  • 176.
    Use rules on the RDF data to provide a proper itinerary. Courtesy of Jesús Fernández, Mun. of Zaragoza, and Antonio Campos, CTIC (SWEO Use Case)
  • 178.
    Available specifications: Primers, Guides. The “RDF Primer” and the “OWL Guide” give a formal introduction to RDF(S) and OWL. SKOS has its separate “SKOS Primer”. GRDDL Primer and RDFa Primer have been published. The W3C Semantic Web Activity Wiki has links to all the specifications.
  • 179.
    “Core” vocabularies. There are also a number of “core vocabularies”: Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management; FOAF: about people and their organizations; DOAP: on the descriptions of software projects; SIOC: Semantically-Interlinked Online Communities; vCard in RDF; … One should never forget: ontologies/vocabularies must be shared and reused!
  • 180.
    Some books. J. Pollock: Semantic Web for Dummies, 2009. G. Antoniou and F. van Harmelen: A Semantic Web Primer, 2nd edition, 2008. D. Allemang and J. Hendler: Semantic Web for the Working Ontologist, 2008. P. Hitzler, M. Krötzsch, S. Rudolph: Foundations of Semantic Web Technologies, 2009. … See the separate Wiki page collecting book references.
  • 181.
    Lots of Tools (not an exhaustive list!) Categories:
  • 193.
    Jena, AllegroGraph, Mulgara, Sesame, flickurl, …
  • 194.
    TopBraid Suite, Virtuoso environment, Falcon, Drupal 7, Redland, Pellet, …
  • 195.
    Disco, Oracle 11g, RacerPro, IODT, Ontobroker, OWLIM, Talis Platform, …
  • 196.
    RDF Gateway, RDFLib, Open Anzo, DartGrid, Zitgist, Ontotext, Protégé, …
  • 197.
    Thetus publisher, SemanticWorks, SWI-Prolog, RDFStore, …
  • 198.
    Further information. Planet RDF aggregates a number of SW blogs: http://planetrdf.com/ Semantic Web Interest Group: a forum for developers with an archived (and public) mailing list, and a constant IRC presence on freenode.net#swig; anybody can sign up on the list: http://www.w3.org/2001/sw/interest/
  • 199.
    Thank you for your attention! These slides are also available on the Web: http://www.w3.org/2010/Talks/0622-SemTech-IH/

Editor's Notes

  • #17 The point is: they combine data drawn from data.gov.uk to produce a traditional, printed paper that is distributed in the neighborhood and provides practical information (doctors, pharmacies, etc.) in an up-to-date fashion
  • #85 The list of universities comes from DBpedia -> auto-completion is based on that list -> NYT identifiers are present in DBpedia and are used to back-index into the NYT archives; sameAs links to, say, Freebase are also provided