Creating Knowledge out of Interlinked Data Sören Auer
Creating Knowledge out of Interlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 http://lod2.eu
Why the Semantic Web won't work (soon)
1. Reasoning does not scale on the Web
   • IR / one-dimensional indexing scales (Google)
   • Next step: conjunctive querying (OWL-QL?, dynamic scale-out / clustering)
   • Web-scalable DL reasoning is out of sight (maybe fragments; fuzzy reasoning has some chances)
2. If it did scale, it would not be affordable
   • “What is the only former Yugoslav republic in the European Union?”
   • 2880 POWER7 cores, 16 terabytes of memory, 4 terabytes of clustered storage (IBM Watson) still cannot answer this question
If the Semantic Web does not happen soon…
• What shall we do in between?
• How can we make it happen faster?
[Image: Dayton BRANDFIELD (American), Old Hill Road, April 2, 1937 (historical depression)]
We can do what already works now…
… and try to find a shallow migration path
[Image: http://www.flickr.com/photos/jurvetson/]
What works now? What has to be done?
Achievements
1. Extension of the Web with a data commons (25B facts)
2. Vibrant, global RTD community
3. Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly)
4. Emerging governmental adoption in sight
5. Establishing Linked Data as a deployment path for the Semantic Web
Challenges
1. Coherence: relatively few, expensively maintained links
2. Quality: partly low-quality data and inconsistencies
3. Performance: still substantial penalties compared to relational
4. Data consumption: large-scale processing, schema mapping and data fusion still in their infancy
5. Usability: establishing direct end-user tools and network effect
• Web: a global, distributed platform for data, information and knowledge integration
• Exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF
[Figure: snapshots dated July 2007, April 2008, September 2008, July 2009]
Linked Data Lifecycle
• Extraction
• Storage / Querying
• Manual revision / authoring
• Interlinking / Fusing
• Classification / Enrichment
• Quality Analysis
• Evolution / Repair
• Search / Browsing / Exploration
Extraction
Extraction
From unstructured sources
• NLP, text mining, annotation
From semi-structured sources
• DBpedia, LinkedGeoData, SCOVO/DataCube
From structured sources
• RDB2RDF
Transforming Wikipedia into a Knowledge Base
Extract structured information from Wikipedia and make this information available on the Web as LOD:
• Ask sophisticated queries against Wikipedia (e.g. universities in Brandenburg, mayors of elevated towns, soccer players)
• Link other data sets on the Web to Wikipedia data
• Represents a community consensus
The recently launched DBpedia Live transforms Wikipedia into a structured knowledge base
Structure in Wikipedia
• Title
• Abstract
• Infoboxes
• Geo-coordinates
• Categories
• Images
• Links
  – other language versions
  – other Wikipedia pages
  – to the Web
  – redirects
  – disambiguations
13.08.2016 Sören Auer - The emerging Web of Linked Data
Infobox templates
Wikitext syntax:
  {{Infobox Korean settlement
  | title = Busan Metropolitan City
  | img = Busan.jpg
  | imgcaption = A view of the [[Geumjeong]] district in Busan
  | hangul = 부산 광역시
  ...
  | area_km2 = 763.46
  | pop = 3635389
  | popyear = 2006
  | mayor = Hur Nam-sik
  | divs = 15 wards (Gu), 1 county (Gun)
  | region = [[Yeongnam]]
  | dialect = [[Gyeongsang]]
  }}
RDF representation (http://dbpedia.org/resource/Busan):
  dbp:Busan dbpp:title "Busan Metropolitan City"
  dbp:Busan dbpp:hangul "부산 광역시"@Hang
  dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
  dbp:Busan dbpp:pop "3635389"^^xsd:int
  dbp:Busan dbpp:region dbp:Yeongnam
  dbp:Busan dbpp:dialect dbp:Gyeongsang
  ...
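The mapping from infobox wikitext to RDF triples can be illustrated with a minimal Python sketch. This is not the DBpedia extraction framework: the function name `infobox_to_triples`, the shortened `INFOBOX` sample, and the simple typing rules (wiki links become `dbp:` resources, integers and decimals get `xsd:int` / `xsd:float`) are assumptions made for illustration only.

```python
import re

# Shortened copy of the Busan infobox from the slide (assumption: one field per line).
INFOBOX = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""


def infobox_to_triples(subject, wikitext):
    """Turn '| key = value' lines into (subject, predicate, object) tuples."""
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not match:
            continue  # skips the '{{Infobox ...' and '}}' lines
        key, value = match.groups()
        link = re.fullmatch(r"\[\[([^\]|]+)\]\]", value)
        if link:
            # A plain wiki link becomes a reference to another resource.
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'  # untyped plain literal
        triples.append((subject, "dbpp:" + key, obj))
    return triples


triples = infobox_to_triples("dbp:Busan", INFOBOX)
for s, p, o in triples:
    print(s, p, o)
```

Real infoboxes contain nested templates, units, and multi-valued fields, which is where most of the actual extraction effort goes; the sketch deliberately ignores all of that.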
A vast multi-lingual, multi-domain knowledge base
DBpedia extraction results in:
• descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases)
• labels and abstracts for these 3.2 million things in up to 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories
• altogether over 1 billion pieces of information (i.e. RDF triples): 257M from the English edition, 766M from other language editions
• DBpedia Live (http://live.dbpedia.org/sparql/) & Mappings Wiki (http://mappings.dbpedia.org) integrate the community into a refinement cycle
• Upcoming: DBpedia inline
LinkedGeoData: revealing the data behind OpenStreetMap
OpenStreetMap
• Wikipedia for GeoData
• Outperforms commercial map providers in many regions
• Extremely rich source of data (shop hours, trash bins, excavations, …)
[Figure: LinkedGeoData Architecture]
LinkedGeoData Apps
• Stevie: http://tiny.cc/stevie10
• Vicibit: http://vicibit.linkedgeodata.org
• BeAware: http://beaware.at/
DataCube: Publishing Statistical Data
http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html
DataCube Importer: Linked Statistical Data
Extraction: Relational Data
• Many different approaches (D2R, Virtuoso RDF Views, Triplify, …)
• No agreement on a formal semantics of RDB2RDF mapping
  – LOD readiness, SPARQL-SQL translation
• W3C RDB2RDF WG

Tool                Triplify                       D2RQ             Virtuoso RDF Views
Technology          Scripting languages (PHP)      Java             Whole middleware solution
SPARQL endpoint     -                              X                X
Mapping language    SQL                            RDF based        RDF based
Mapping generation  Manual                         Semi-automatic   Manual
Scalability         Medium-high (but no SPARQL)    Medium           High
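The idea of using SQL itself as the mapping language can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of Triplify, D2RQ, or Virtuoso: the `product` table, the `http://example.org/` namespace, and the helper `rows_to_triples` are all hypothetical. The first column of the query result is taken as the row identifier and the remaining column names become predicates.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def rows_to_triples(conn, query, cls):
    """Run an SQL query and emit one triple per non-NULL cell.

    The first result column identifies the row (it forms the subject URI);
    every other column name is used as the predicate name.
    """
    cur = conn.execute(query)
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        subject = f"<{BASE}{cls}/{row[0]}>"
        for col, val in zip(columns[1:], row[1:]):
            if val is None:
                continue  # a NULL cell produces no triple
            triples.append((subject, f"<{BASE}vocab/{col}>", f'"{val}"'))
    return triples


# Tiny in-memory database standing in for an existing relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Widget", 9.5), (2, "Gadget", None)],
)

triples = rows_to_triples(
    conn, "SELECT id, label, price FROM product ORDER BY id", "product"
)
for t in triples:
    print(*t)
```

Because the mapping is just a query, renaming a column with `AS` changes the predicate, which is roughly why an SQL-based mapping is easy to write by hand but offers no SPARQL-to-SQL translation.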
Extraction Challenges
From unstructured sources
• Deploy existing NLP approaches (OpenCalais, Ontos API)
• Develop standardized, LOD-enabled interfaces between NLP tools (NLP2RDF)
From semi-structured sources
• Efficient bi-directional synchronization
From structured sources
• Declarative syntax and semantics of data model transformations (W3C WG RDB2RDF)
Orthogonal challenges
• Using LOD as background knowledge
• Provenance
Storage and Querying
RDF Data Management
Still slower than relational data management by a factor of 5-50 (BSBM, DBpedia Benchmark)
Performance increases steadily
Comprehensive, well-supported open-source and commercial implementations are available:
• OpenLink's Virtuoso (os+commercial)
• Big OWLIM (commercial), Swift OWLIM (os)
• 4store (os)
• Talis (hosted)
• Bigdata (distributed)
• Allegrograph (commercial)
• Mulgara (os)
DBpedia Benchmark
• Uses DBpedia as data and a selection of 25 frequently executed queries
• Can generate fractions and multiples of DBpedia's size
• Does not resemble relational data
Performance differences observed with other benchmarks are amplified
[Figure: Geometric Mean]
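The benchmark chart reports a geometric mean. As a reminder of why that aggregate is commonly used for per-query performance figures, here is a small Python sketch; the sample numbers are hypothetical and are not results from the DBpedia Benchmark.

```python
import math


def geometric_mean(values):
    """Geometric mean: the n-th root of the product, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical queries-per-second figures for three queries on one store.
qps = [1.0, 10.0, 100.0]

print(geometric_mean(qps))   # close to 10
print(sum(qps) / len(qps))   # arithmetic mean: 37.0
```

The arithmetic mean is dominated by the single fast query, while the geometric mean weights each query's relative speed equally, so one outlier cannot hide poor performance on the others.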
Storage and Querying Challenges
• Reduce the performance gap between relational and RDF data management
• SPARQL query extensions
• Spatial/semantic/temporal data management
• More advanced query result caching
• View maintenance / adaptive reorganization based on common access patterns
• More realistic benchmarks
Authoring
Two Kinds of Semantic Wikis
1. Semantic (Text) Wikis
   • Authoring of semantically annotated texts
2. Semantic Data Wikis
   • Direct authoring of structured information (i.e. RDF, RDF-Schema, OWL)
OntoWiki: a semantic data wiki
• Versatile domain-independent tool
• Serves as Linked Data / SPARQL endpoint on the Data Web
• Open-source project hosted at Google Code
• Not just a Wiki UI, but a whole framework for the development of Semantic Web applications
• Developed in PHP based on the Zend framework
• Very active developer and user community
• More than 500 downloads monthly
• Large number of use cases
OntoWiki: Dynamic views on the knowledge base
OntoWiki: RDF triples on the resource detail page
OntoWiki: Dynamic suggestions from the Data Web
Catalogus Professorum Lipsiensis
OntoWiki: Caucasian Spiders
RDFauthor in OntoWiki
OntoWiki: Supporting Requirements Engineering
Semantic Portal with OntoWiki: Vakantieland
RDFaCE: RDFa Content Editor
Linking
[Image: © CC-BY-NC-ND by ~Dezz~ (residae on flickr)]
LOD Linking
Automatic
Semi-automatic
• SILK
• LIMES
Manual
• Sindice integration into UIs
• Semantic Pingback
Creating a network effect around Linking Data: Semantic Pingback
• Update and notification services for LOD
• Backward compatible with Pingback (blogosphere)
http://aksw.org/Projects/SemanticPingBack
Visualizing Pingbacks in OntoWiki
Interlinking Challenges
Only 5% of the information on the Data Web is actually linked
• Make sense of work in the de-duplication/record linkage literature
• Consider the open world nature of Linked Data
• Use LOD background knowledge
• Zero-configuration linking
• Explore active learning approaches, which integrate users in a feedback loop
• Maintain a 24/7 linking service: Linked Open Data Around-The-Clock project (LATC-project.eu)
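As a point of reference for the record-linkage bullet above, the simplest baseline compares labels pairwise and accepts a link when a similarity score passes a threshold. The Python sketch below illustrates that baseline only; it is not how SILK or LIMES work internally, and the `ex:` identifiers, the labels, and the 0.9 threshold are made-up assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Compare every source label with every target label (quadratic cost)
    and propose an owl:sameAs link when the similarity reaches the threshold."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, "owl:sameAs", t_uri))
    return links


# Hypothetical resources from two datasets, keyed by identifier.
source = {"ex:a/Leipzig": "Leipzig", "ex:a/Busan": "Busan"}
target = {"ex:b/leipzig": "leipzig", "ex:b/Berlin": "Berlin"}

links = discover_links(source, target)
print(links)
```

The nested loop is what makes naive linking expensive on large datasets, and a fixed threshold is exactly the kind of manual configuration that zero-configuration and active-learning approaches aim to remove.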
Enrichment
Enrichment & Repair
Linked Data is mainly instance data and !!!
The ORE (Ontology Repair and Enrichment) tool helps improve an OWL ontology by fixing inconsistencies and suggesting further axioms to add.
• Ontology Debugging: uses OWL reasoning to detect inconsistencies and unsatisfiable classes, and identifies the most likely sources of the problems. The user can create a repair plan while maintaining full control.
• Ontology Enrichment: uses the DL-Learner framework to suggest definitions and super classes for existing classes in the KB. Works if instance data is available, for harmonising schema and data.
http://aksw.org/Projects/ORE
Quality Analysis
[Image: CC BY SA Wikipedia]
Linked Data Quality Analysis
Quality on the Data Web varies a lot
• Hand-crafted or expensively curated knowledge bases (e.g. DBLP, UMLS) vs. data extracted from text or Web 2.0 sources (DBpedia)
Research Challenge
• Establish measures for assessing the authority, provenance, and reliability of Data Web resources
Evolution
[Image: © CC-BY-SA by alasis on flickr]
EvoPat: Pattern-based KB Evolution
• Unified method for both data evolution and ontology refactoring
• Modularized, declarative definition of evolution patterns is relatively simple compared to an imperative description of evolution
• Allows domain experts and knowledge engineers to amend the ontology structure and modify data with just a few clicks
• Combined with the RDF representation of evolution patterns and their exposure on the Linked Data Web, EvoPat facilitates the development of an evolution pattern ecosystem
• Patterns can be shared and reused on the Data Web
• Declarative definition of bad smells and corresponding evolution patterns promotes the (semi-)automatic improvement of information quality
Evolution Patterns
Exploration
Visual Query Builder
Relationship Finder in CPL
LOD Lifecycle supported by Debian-based LOD2 Stack (to be released in September)
• Extraction
• Storage / Querying
• Manual revision / authoring
• Interlinking / Fusing
• Classification / Enrichment
• Quality Analysis
• Evolution / Repair
• Search / Browsing / Exploration
Some Benefits of Linked Open Gov't Data
Enables applications that allow for
• getting a 360° view on an issue
• rapid fact-checking
• cross-referencing and checking of statistical claims (“how to lie with statistics”)
• increased transparency in public debate
• release of the creative potential of “the crowd”
Helps citizens in their daily life
• to understand their governments better
• to find good places to live (little pollution, good schools, close to protected natural sites…)
• to locate public services (administrative offices, public toilets…)
Brings citizens and government closer together
PublicData.eu
http://fintrans.publicdata.eu
http://energy.publicdata.eu
http://scoreboard.lod2.eu
Linked Enterprise Data
1. Linked Enterprise Intra Data Webs can fill the gap between Intra-/Extranets and EIS/ERP
2. Facilitates data integration along value chains within and across enterprises
3. The pragmatic, incremental, vocabulary-based Linked Data approach reduces data integration costs significantly
4. The wealth of knowledge available as Linked Open Data can be leveraged as background knowledge for enterprise applications
Thanks for your attention!
Sören Auer
http://www.informatik.uni-leipzig.de/~auer/ | http://aksw.org | http://lod2.org
auer@uni-leipzig.de

Creating knowledge out of interlinked data

  • 1.
    Creating Knowledge outof Interlinked Data Sören Auer
  • 2.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 2 http://lod2.eu 1. Reasoning does not scale on the Web • IR / one dimensional indexing scales (Google) • Next step conjunctive querying (OWL-QL?, dynamic scale-out / clustering) • Web scalable DL reasoning is out-of-sight (maybe fragment, fuzzy reasoning has some chances) 2. If it would scale it would not be affordable • “What is the only former Yugoslav republic in the European Union?” • 2880 POWER7 cores, 16 Terabytes memory, 4 Terabytes clustered storage (IBM Watson) still can not answer this question Why the Semantic Web won‘t work (soon)
  • 3.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 3 http://lod2.eu What shall we do inbetween? How can we make it happen faster? If the Semantic Web does not happen soon… Dayton BRANDFIELD (American) Old Hill Road, April 2, 1937, (historical depression)
  • 4.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 4 http://lod2.eu … and try to find an shallow migration path We can do what works already now… http://www.flickr.com/photos/jurvetson/
  • 5.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 5 http://lod2.eu Achievements 1. Extension of the Web with a data commons (25B facts 2. vibrant, global RTD community 3. Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly) 4. Emerging governmental adoption in sight 5. Establishing Linked Data as a deployment path for the Semantic Web. What works now? What has to be done?  Challenges 1. Coherence: Relatively few, expensively maintained links 2. Quality: partly low quality data and inconsistencies 3. Performance: Still substantial penalties compared to relational 4. Data consumption: large-scale processing, schema mapping and data fusion still in its infancy 5. Usability: Establishing direct end-user tools and network effect • Web - a global, distributed platform for data, information and knowledge integration • exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF July 2007 April 2008 September 2008 July 2009
  • 6.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 6 http://lod2.eu Inter- linking/ Fusing Classifi- cation/ Enrichment Quality Analysis Evolution / Repair Search/ Browsing/ Exploration Extraction Storage/ Querying Manual revision/ authoring Linked Data Lifecycle
  • 7.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 7 http://lod2.eu Extraction
  • 8.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 8 http://lod2.eu From unstructured sources • NLP, text mining, annotation From semi-structured sources • DBpedia, LinkedGeoData, SCOVO/DataCube From structured sources • RDB2RDF Extraction
  • 9.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 9 http://lod2.eu extract structured information from Wikipedia & make this information available on the Web as LOD: • ask sophisticated queries against Wikipedia (e.g. universities in brandenburg, mayors of elevated towns, soccer players), • link other data sets on the Web to Wikipedia data • Represents a community consensus Recently launched DBpedia Live transforms Wikipedia into a structured knowledge base Transforming Wikipedia into an Knowledge Base
  • 10.
    Structure in Wikipedia •Title • Abstract • Infoboxes • Geo-coordinates • Categories • Images • Links – other language versions – other Wikipedia pages – To the Web – Redirects – Disambiguations 13.08.2016 Sören Auer - The emerging Web of Linked Data 10
  • 11.
    Infobox templates {{Infobox Koreansettlement | title = Busan Metropolitan City | img = Busan.jpg | imgcaption = A view of the [[Geumjeong]] district in Busan | hangul = 부산 광역시 ... | area_km2 = 763.46 | pop = 3635389 | popyear = 2006 | mayor = Hur Nam-sik | divs = 15 wards (Gu), 1 county (Gun) | region = [[Yeongnam]] | dialect = [[Gyeongsang]] }} http://dbpedia.org/resource/Busan dbp:Busan dbpp:title ″Busan Metropolitan City″ dbp:Busan dbpp:hangul ″부산 광역시″@Hang dbp:Busan dbpp:area_km2 ″763.46“^xsd:float dbp:Busan dbpp:pop ″3635389“^xsd:int dbp:Busan dbpp:region dbp:Yeongnam dbp:Busan dbpp:dialect dbp:Gyeongsang ... Wikitext-Syntax RDF representation 13.08.2016 Sören Auer - The emerging Web of Linked Data 11
  • 12.
    A vast multi-lingual,multi-domain knowledge base DBpedia extraction results in: • descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases • labels and abstracts for these 3.2 million things in up to 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories • altogether over 1 billion pieces of information (i.e. RDF triples): 257M from English edition, 766M from other language editions • DBpedia Live (http://live.dbpedia.org/sparql/) & Mappings Wiki (http://mappings.dbpedia.org) integrate the community into a refinement cycle • Upcomming DBpedia inline 13.08.2016 Sören Auer - The emerging Web of Linked Data 12
  • 13.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 13 http://lod2.eu OpenStreetMaps • Wikipedia for GeoData • Outperformes commercial map providers in many regions • Extremly rich source of data (shop hours, trash bins, excavations, …) LinkedGeoData – revealing the data behind OpenStreetMaps LinkedGeoData Architecture
  • 14.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 14 http://lod2.eu
  • 15.
  • 16.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 16 http://lod2.eu DataCube – Publishing Statistical Data http://publishing-statistical- data.googlecode.com/svn/trunk/specs/src/main/html/cube.html
  • 17.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 17 http://lod2.eu DataCube Importer – Linked Statistical Data
  • 18.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 18 http://lod2.eu Many different approaches (D2R, Virtuoso RDF Views, Triplify, …) No agreement on a formal semantics of RDF2RDF mapping • LOD readiness, SPARQL-SQL translation W3C RDB2RDF WG Extraction Relational Data Tool Triplify D2RQ Virtuoso RDF Views Technology Scripting languages (PHP) Java Whole middleware solution SPARQL endpoint - X X Mapping language SQL RDF based RDF based Mapping generation Manual Semi- automatic Manual Scalability Medium-high (but no SPARQL) Medium High
  • 19.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 19 http://lod2.eu From unstructured sources • Deploy existing NLP approaches (OpenCalais, Ontos API) • Develop standardized, LOD enabled interfaces between NLP tools (NLP2RDF) From semi-structured sources • Efficient bi-directional synchronization From structured sources • Declarative syntax and semantics of data model transformations (W3C WG RDB2RDF) Orthogonal challenges • Using LOD as background knowledge • Provenance Extraction Challenges
  • 20.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 20 http://lod2.euStorage and Querying
  • 21.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 21 http://lod2.eu Still by a factor 5-50 slower than relational data management (BSBM, DBpedia Benchmark) Performance increases steadily Comprehensive, well-supported open-soure and commercial implementations are available: • OpenLink’s Virtuoso (os+commercial) • Big OWLIM (commercial), Swift OWLIM (os) • 4store (os) • Talis (hosted) • Bigdata (distributed) • Allegrograph (commercial) • Mulgara (os) RDF Data Management
  • 22.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 22 http://lod2.eu • Uses DBpedia as data and a selection of 25 frequently executed queries • Can generate fractions and multiples of DBpedia‘s size • Does not resemble relational data Performance differences, observed with other benchmarks are amplified DBpedia Benchmark Geometric Mean
  • 23.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 23 http://lod2.eu • Reduce the performance gap between relational and RDF data management • SPARQL Query extensions • Spatial/semantic/temporal data management • More advanced query result caching • View maintenance / adaptive reorganization based on common access patterns • More realistic benchmarks Storage and Querying Challenges
  • 24.
  • 25.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 25 http://lod2.eu 1. Semantic (Text) Wikis • Authoring of semantically annotated texts 2. Semantic Data Wikis • Direct authoring of structured information (i.e. RDF, RDF-Schema, OWL) Two Kinds of Semantic Wikis
  • 26.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 26 http://lod2.eu Versatile domain-independent tool Serves as Linked Data / SPARQL endpoint on the Data Web Open-source project hosted at Google code Not just a Wiki UI, but a whole framework for the development of Semantic Web applications Developed in PHP based on the Zend framework Very active developer and user community More than 500 downloads monthly Large number of use cases OntoWiki – a semantic data wiki
  • 27.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 27 http://lod2.eu OntoWiki Dynamische Sichten auf die Wissensbasis 13.08.2016 Sören Auer - The emerging 27
  • 28.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 28 http://lod2.eu OntoWiki RDF-Triple auf Resourcen- Detailseite 13.08.2016 Sören Auer - The emerging Web of Linked 28
  • 29.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 29 http://lod2.eu OntoWiki Dynamische Vorschläge aus dem Daten Web 13.08.2016 Sören Auer - The emerging Web of Linked 29
  • 30.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 30 http://lod2.eu Catalogus Professorum Lipsiensis
  • 31.
    13.08.2016 Sören Auer -The emerging Web of Linked Data 31 OntoWiki: Caucasian Spiders
  • 32.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 32 http://lod2.eu RDFauthor in OntoWiki
  • 33.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 33 http://lod2.eu OntoWiki: Supporting Requirements engineering
  • 34.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 34 http://lod2.eu Semantic Portal with OntoWiki: Vakantieland
  • 35.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 35 http://lod2.eu RDFaCE- RDFa Content Editor
  • 36.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 36 http://lod2.eu © CC-BY-NC-ND by ~Dezz~ (residae on flickr) Linking
  • 37.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 37 http://lod2.eu Automatic Semi-automatic • SILK • LIMES Manual • Sindice integration into UIs • Semantic Pingback LOD Linking
  • 38.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 38 http://lod2.eu update and notification services for LOD Downward compatible with Pingback (blogosphere) http://aksw.org/Projects/SemanticPingBack Creating a network effect around Linking Data: Semantic Pingback
  • 39.
    Creating Knowledge out ofInterlinked Data Sören Auer – WIMS: Creating Knowledge out of Linked Data 26.5.2011 Page 39 http://lod2.eu Visualizing Pingbacks in OntoWiki
  • 40.
    Interlinking Challenges
    Only 5% of the information on the Data Web is actually linked.
    • Make sense of work in the de-duplication/record linkage literature
    • Consider the open world nature of Linked Data
    • Use LOD background knowledge
    • Zero-configuration linking
    • Explore active learning approaches, which integrate users in a feedback loop
    • Maintain a 24/7 linking service: Linked Open Data Around-The-Clock project (LATC-project.eu)
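To make the record-linkage idea concrete, here is a minimal sketch of threshold-based link discovery: labels from two datasets are compared with a string similarity measure, and an `owl:sameAs` link is proposed for the best match above a threshold. This is not the algorithm used by SILK or LIMES; the similarity measure (`difflib.SequenceMatcher`), the threshold of 0.9, and all URIs are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source: dict, target: dict, threshold: float = 0.9) -> list:
    """Propose owl:sameAs links between resources whose labels are similar enough.

    `source` and `target` map resource URIs to labels. For each source resource
    the best-scoring target is kept if its score reaches the threshold.
    """
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, "owl:sameAs", best_uri))
    return links


# Hypothetical datasets
dataset_a = {
    "http://example.org/a/Leipzig": "Leipzig",
    "http://example.org/a/Berlin": "Berlin",
}
dataset_b = {
    "http://example.com/b/leipzig": "leipzig",
    "http://example.com/b/Bern": "Bern",
}

links = discover_links(dataset_a, dataset_b)
print(links)
```

The nested loop compares every pair of resources, which is exactly the cost that dedicated link discovery frameworks try to avoid on large datasets. A feedback loop as mentioned in the challenges could, for example, ask users to confirm or reject candidates whose score is close to the threshold.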
  • 41.
    Enrichment
  • 42.
    Enrichment & Repair
    Linked Data is mainly instance data and !!!
    The ORE (Ontology Repair and Enrichment) tool helps improve an OWL ontology by fixing inconsistencies and suggesting further axioms to add.
    • Ontology Debugging: uses OWL reasoning to detect inconsistencies and unsatisfiable classes, and identifies the most likely sources of the problems. The user can create a repair plan while maintaining full control.
    • Ontology Enrichment: uses the DL-Learner framework to suggest definitions and super classes for existing classes in the KB. Works if instance data is available; useful for harmonising schema and data.
    http://aksw.org/Projects/ORE
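The enrichment idea, suggesting super classes from instance data, can be illustrated with a very small sketch: if (nearly) all known instances of one class are also instances of another, a subclass axiom is proposed for the user to review. This is not DL-Learner's learning algorithm; the coverage heuristic, the function name `suggest_superclasses`, and the example classes are assumptions made for illustration only.

```python
def suggest_superclasses(instances_by_class: dict, min_coverage: float = 1.0) -> list:
    """Suggest rdfs:subClassOf axioms from instance data.

    `instances_by_class` maps a class name to the set of its known instances.
    A class B is suggested as a super class of A if the share of A's instances
    that are also instances of B is at least `min_coverage`.
    """
    suggestions = []
    for sub, sub_members in instances_by_class.items():
        if not sub_members:
            continue  # no instance data, nothing to learn from
        for sup, sup_members in instances_by_class.items():
            if sub == sup:
                continue
            coverage = len(sub_members & sup_members) / len(sub_members)
            if coverage >= min_coverage:
                suggestions.append((sub, "rdfs:subClassOf", sup, coverage))
    return suggestions


# Hypothetical instance data
kb = {
    "ex:Professor": {"ex:a", "ex:b"},
    "ex:Person": {"ex:a", "ex:b", "ex:c"},
    "ex:City": {"ex:x"},
}

suggestions = suggest_superclasses(kb)
print(suggestions)
```

Under the open world assumption such suggestions are only candidates, which is why a tool of this kind leaves the final decision to the user.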
  • 43.
    Quality Analysis
    Image: CC BY SA Wikipedia
  • 44.
    Linked Data Quality Analysis
    Quality on the Data Web varies a lot
    • Hand-crafted or expensively curated knowledge bases (e.g. DBLP, UMLS) vs. knowledge bases extracted from text or Web 2.0 sources (e.g. DBpedia)
    Research Challenge
    • Establish measures for assessing the authority, provenance, and reliability of Data Web resources
  • 45.
    Evolution
    Image: © CC-BY-SA by alasis on flickr
  • 46.
    EvoPat: Pattern-based KB Evolution
    • Unified method for both data evolution and ontology refactoring.
    • Modularized, declarative definition of evolution patterns is relatively simple compared to an imperative description of evolution.
    • Allows domain experts and knowledge engineers to amend the ontology structure and modify data with just a few clicks.
    • Combined with the RDF representation of evolution patterns and their exposure on the Linked Data Web, EvoPat facilitates the development of an evolution pattern ecosystem.
    • Patterns can be shared and reused on the Data Web.
    • Declarative definition of bad smells and corresponding evolution patterns promotes the (semi-)automatic improvement of information quality.
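The contrast between a declarative pattern and imperative evolution code can be sketched as follows: the pattern is plain data (a triple template to delete and one to insert), and a small generic function applies it to a set of triples. This is not EvoPat's actual pattern format, which the slide describes as RDF-based; the dictionary layout, the `?`/`$` variable convention, and the example vocabulary are assumptions for illustration.

```python
# A declarative "rename property" pattern. Terms starting with "?" are bound
# per matching triple; terms starting with "$" are parameters supplied by the user.
RENAME_PROPERTY = {
    "delete": ("?s", "$old", "?o"),
    "insert": ("?s", "$new", "?o"),
}


def apply_pattern(triples: set, pattern: dict, params: dict) -> set:
    """Replace every triple matching pattern['delete'] with pattern['insert']."""
    result = set()
    for triple in triples:
        env = dict(params)  # variable bindings for this triple
        matched = True
        for term, value in zip(pattern["delete"], triple):
            term = params.get(term, term)  # substitute "$" parameters
            if term.startswith("?"):
                # Bind the variable, or check consistency if it is already bound.
                if env.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            result.add(tuple(env.get(t, t) for t in pattern["insert"]))
        else:
            result.add(triple)
    return result


# Hypothetical knowledge base
kb = {
    ("ex:alice", "ex:surname", "Smith"),
    ("ex:alice", "ex:age", "30"),
}

evolved = apply_pattern(kb, RENAME_PROPERTY, {"$old": "ex:surname", "$new": "foaf:familyName"})
print(sorted(evolved))
```

Because the pattern itself is just data, it could be serialized, published, and reused by others, which is the property the slide highlights for sharing patterns on the Data Web.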
  • 47.
    Evolution Patterns
  • 48.
  • 49.
    Exploration
  • 51.
    Visual Query Builder
  • 52.
    Relationship Finder in CPL
  • 53.
    LOD Lifecycle supported by Debian-based LOD2 Stack (to be released in September)
    • Interlinking / Fusing
    • Classification / Enrichment
    • Quality Analysis
    • Evolution / Repair
    • Search / Browsing / Exploration
    • Extraction
    • Storage / Querying
    • Manual revision / authoring
  • 54.
    Some Benefits of Linked Open Gov’t Data
    Enables applications that allow for
    • getting a 360° view on an issue
    • rapid fact-checking
    • cross-referencing and checking of statistical claims (“how to lie with statistics”)
    • increased transparency in public debate
    • release of the creative potential of “the crowd”
    Helps citizens in their daily life
    • to understand their governments better
    • to find good places to live (little pollution, good schools, close to protected natural sites…)
    • to locate public services (administrative offices, public toilets…)
    Brings citizens and government closer together
  • 55.
    PublicData.eu
  • 56.
  • 57.
  • 58.
  • 60.
  • 61.
  • 62.
    Linked Enterprise Data
    1. Linked Enterprise Intra Data Webs can fill the gap between Intra-/Extranets and EIS/ERP
    2. Facilitates data integration along value chains within and across enterprises
    3. The pragmatic, incremental, vocabulary-based Linked Data approach reduces data integration costs significantly
    4. The wealth of knowledge available as Linked Open Data can be leveraged as background knowledge for enterprise applications
  • 63.
    Thanks for your attention!
    Sören Auer
    http://www.informatik.uni-leipzig.de/~auer/ | http://aksw.org | http://lod2.org
    auer@uni-leipzig.de

Editor's Notes

  • #37 http://www.flickr.com/photos/residae/2560241604/#/
  • #46 http://www.flickr.com/photos/alasis/3541341601/sizes/l/in/photostream/
  • #55 Gov/citizen interaction: data as communication aid