knowIT: Mapping out Informatics systems
Laurent Alquier, Keith McCormick, Ed Jaeger
About Laurent Alquier
  • Software engineer, Project lead
  • Johnson & Johnson Pharmaceutical Research & Development, L.L.C.
  • [email_address]
Could you answer these questions? (Based on real questions)
  • Can you give us a list of all of your applications, related servers and stakeholders, and send us an update every six months?
  • All Linux servers need to be patched this weekend. Can you send an outage announcement with a list of affected applications by tomorrow?
  • Is this server still in use? Can we retire it?
  • What is the meaning of DRU?
Systems knowledge
knowIT in a nutshell
  • A collaborative database
      • Semantic wiki
  • Capture knowledge about informatics systems
      • Information Systems components: applications, servers, data sources, plugins
      • Map relationships between components
  • Capture business context around them
      • Organizations, companies, locations
      • Document known issues, procedures, processes
Goals
  • Answer recurring questions
      • Subject Matter Expert lists for application support
      • Application / license rationalization
      • Outage communications
  • Increase knowledge retention
      • Many ways to contribute
  • Facilitate "Transfer In / Transfer Out"
      • Capture knowledge from experts before they leave
      • Facilitate learning for new resources
  • Enable self service
      • Many ways to search and explore
Pragmatic approach
Bottom-up knowledge management in a corporate R&D environment
  • Search is not enough
      • Complementary to a document library with search index
      • Capture details about individual components of systems
      • Rely on queries as much as search
  • Change will happen
      • Plan for future integration and migration from the start
      • Import content from several sources
      • Export content to several formats
  • "Know your content, respect your users." (E. Tufte)
      • Accept incomplete content
      • Evolve the data model as necessary
      • Let real data and use cases drive requirements
  • Above all, remain flexible
Evolution
  • Started as disconnected files
  • Turned into a relational database
      • Rigid design
      • Lack of collaboration tools
  • Solution: collaborative database using a semantic wiki
      • Collaborative features and flexibility of a wiki
      • Structure from semantic annotations
Collaborative database
Flexible yet structured content management
  • Collaborative data model
      • Discussions, comments, community editing
  • Knowledge management tools
      • Redirections, wanted pages
  • Automated maintenance tasks
      • Background jobs to enforce consistency and updates
      • Monitoring tools, change tracking
  • Modular and extensible design
      • Templates
      • Open source components
Semantic MediaWiki
  • Based on MediaWiki
      • Proven platform (Wikipedia)
      • Redirects, wanted pages, templates, API, bots
      • Active development, commercial support
      • No licensing fee (PHP, MySQL)
  • Structure from semantic annotations
      • Inline annotations
      • Supports forms and direct annotations
      • Map complex relationships between objects
      • Allows both search and queries
  • Multiple input / output formats
      • Compatible with Semantic Web integration
  • "Semantic Web in a bottle"
Semantic Annotations
Tags with meaning
  • Syntax
      • Triple: Page -> Property -> Value
      • [[Has support contact::Help Desk]]
  • Data types
      • Page, URL, Date, String, Text, Number, Geo-location
      • Custom units for Number
  • Browse properties
      • Summary of all properties for a page
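To make the Page -> Property -> Value idea concrete, here is a minimal Python sketch that pulls inline annotations out of a page's wikitext and returns them as triples. It only illustrates the syntax shown on the slide; it is not how Semantic MediaWiki parses pages internally. The page name "Example Application" and the property "Runs on" are hypothetical.

```python
import re
from typing import List, Tuple

# Matches [[Property::Value]] and [[Property::Value|label]].
# Plain links such as [[Main Page]] or [[Category:Server]] have no "::" and are skipped.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_triples(page: str, wikitext: str) -> List[Tuple[str, str, str]]:
    """Return one (page, property, value) triple per inline annotation."""
    return [
        (page, match.group(1).strip(), match.group(2).strip())
        for match in ANNOTATION.finditer(wikitext)
    ]


if __name__ == "__main__":
    text = (
        "Supported by [[Has support contact::Help Desk]] and hosted on "
        "[[Runs on::Server01]]. See also [[Main Page]]."
    )
    for triple in extract_triples("Example Application", text):
        print(triple)
```

Each triple reads as a statement about the page, which is what makes the content queryable rather than just searchable.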
Relationships
  • Defined as links to other pages
  • Enhanced with semantic properties
  • Tracking lists of things is not enough
      • Knowledge comes from understanding relationships
  • SMW-assisted ontology design
Wiki? What wiki?
  • Focus on content, not technology
      • Occasional users are less intimidated when wiki tools are not visible
      • But keep wiki tools available to advanced users
  • Use forms to standardize data capture
      • Make semantic annotations invisible using forms and templates
      • Enforce (some) naming conventions
      • Auto-completion
      • Automated page names
  • Be ready to provide help with difficult tasks
      • Provide guidance and training
      • Front-load the wiki with data users care about
Content Migration
From relational tables to categories and pages
  • Review data model, drop unnecessary attributes
  • Create forms, templates, properties in Semantic MediaWiki
      • One category per page
      • Separate 'semantic categories' from 'supporting categories'
  • Extract old content into tabular form
  • Review, clean up, correct
      • Unique titles (disambiguation)
      • Special characters in titles
  • Load pages in bulk using PHP API (bulkinsert.php)
  • Consider specialized import forms if content needs detailed review
      • Example: support articles
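The clean-up steps above (tabular extract, unique titles, special characters) can be sketched as a small pre-processing script that turns each row into a page title plus a template call, ready for a bulk loader. This is an illustration in Python, not the bulkinsert.php script used in the project. The column names, the "Application" template and the disambiguation rule are assumptions.

```python
import csv
import io
import re
from typing import Dict, Set, Tuple

# Characters MediaWiki does not allow in page titles.
FORBIDDEN = re.compile(r"[#<>\[\]|{}]")


def clean_title(raw: str, category: str, seen: Set[str]) -> str:
    """Strip forbidden characters and disambiguate duplicate titles."""
    title = re.sub(r"\s+", " ", FORBIDDEN.sub("", raw)).strip()
    if title in seen:
        # Hypothetical rule: disambiguate by appending the category name.
        title = f"{title} ({category})"
    if title in seen:
        raise ValueError(f"Title still not unique, review manually: {title}")
    seen.add(title)
    return title


def row_to_page(row: Dict[str, str], category: str, seen: Set[str]) -> Tuple[str, str]:
    """Turn one tabular row into (page title, wikitext calling a template)."""
    title = clean_title(row["name"], category, seen)
    fields = "\n".join(
        f"|{key}={value.strip()}"
        for key, value in row.items()
        if key != "name" and value.strip()  # accept incomplete content
    )
    return title, "{{" + category + "\n" + fields + "\n}}"


if __name__ == "__main__":
    extract = "name,owner,location\nAtlas [v2],Help Desk,Site A\nAtlas v2,,Site B\n"
    seen: Set[str] = set()
    for row in csv.DictReader(io.StringIO(extract)):
        title, text = row_to_page(row, "Application", seen)
        print(title)
        print(text)
```

Keeping the semantic annotations inside the template, rather than in the generated page text, means the data model can evolve later without regenerating every page.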
Queries
  • Visualize structure of content
  • Ad-hoc reports
  • Interactive queries (Exhibit)
  • Automate system configuration pages
  • Architectural layers
      • Business, Functional, Process, Data, Applications, Physical
  • Network diagrams
  • Concepts
      • Saved queries, dynamic categories
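As an illustration of an ad-hoc report, the Python sketch below assembles the wikitext of a Semantic MediaWiki #ask inline query, for instance to answer the "which applications are affected by patching the Linux servers" question from the opening slide. The #ask syntax is standard SMW. The properties "Has operating system" and "Hosts application" are hypothetical and would need to match the wiki's actual data model.

```python
from typing import Dict, Sequence


def build_ask(
    category: str,
    filters: Dict[str, str],
    printouts: Sequence[str],
    fmt: str = "table",
) -> str:
    """Build the wikitext for an SMW inline query."""
    # Conditions: one category plus one [[Property::Value]] per filter.
    conditions = [f"[[Category:{category}]]"]
    conditions += [f"[[{prop}::{value}]]" for prop, value in filters.items()]
    # Printouts: properties to show as columns, then the output format.
    parts = ["".join(conditions)]
    parts += [f"?{prop}" for prop in printouts]
    parts.append(f"format={fmt}")
    return "{{#ask: " + "\n |".join(parts) + "\n}}"


if __name__ == "__main__":
    print(
        build_ask(
            "Server",
            {"Has operating system": "Linux"},
            ["Hosts application", "Has support contact"],
        )
    )
```

Pasting the generated text into a wiki page yields a table that stays current as annotations change, which is the difference between a query and a one-off search.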
Enhanced Search
Default search replaced by the Sphinx Search extension
  • Faceted search
      • Drill down by properties
      • Search results grouped by category
  • Semantic search
      • Semantic summary instead of excerpt
      • Customized by category
  • Annotations used to improve results
      • Aliases, keywords
      • Related terms
  • Selection of default category
  • Feedback option
      • Ask a question
Input flexibility - Data capture
  • Import
      • Manually using forms
      • Remote CSV files, databases, LDAP
      • FOAF format to retrieve and provide vocabularies
      • OWL DL ontologies can be imported
          • Explicit statements only; no support for reasoning
  • Query remote sources
      • Linked data import
      • SMW+ can enrich page annotations with queries across multiple sources
      • Supports OpenCalais, DBpedia, RSS feeds
Output flexibility - Data integration
  • Export
      • HTML, PDF, CSV, XML, Email, Maps (Yahoo, Google, OpenLayers), Timeline (Simile), Google graphs, vCard, iCalendar
  • Machine readable
      • Default RSS feed replaced by #ask query for recent content
      • RDF view for each page
      • RDFa, CSV index, FOAF files, Web Services (SMW+)
  • Ontology and content export
      • RDF dumps / SPARQL endpoint available
  • Follows Linked Data principles
      • One page per entity
      • One HTTP URI for each entity
      • RDF information available from each page
      • RDF statements are browsable
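The "one HTTP URI for each entity" principle can be illustrated with a short Python sketch that maps a page title to its page URI and to the URI of its RDF view. It assumes a hypothetical wiki served at https://wiki.example.org/wiki/ and relies on Semantic MediaWiki's Special:ExportRDF page. The actual URL layout depends on how a given wiki is configured.

```python
from typing import Dict
from urllib.parse import quote


def entity_uris(base_url: str, title: str) -> Dict[str, str]:
    """Return the page URI and the RDF export URI for one wiki page."""
    # MediaWiki URLs use underscores for spaces; keep ":" and "/" readable.
    name = quote(title.strip().replace(" ", "_"), safe="/:")
    base = base_url.rstrip("/")
    return {
        "page": f"{base}/{name}",
        "rdf": f"{base}/Special:ExportRDF/{name}",
    }


if __name__ == "__main__":
    # Hypothetical base URL; "Help Desk" is the page used in the annotation example.
    for kind, uri in entity_uris("https://wiki.example.org/wiki/", "Help Desk").items():
        print(kind, uri)
```

Because both URIs derive from the page title alone, external tools can fetch the RDF for any entity they can name.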
Familiar look and feel
  • Consistent with other intranet sites, familiar interface
  • Integration with MS SharePoint look and feel using the RILPoint theme
  • Login using global directory
Make basic tasks explicit
  • Search, Explore, Contribute
  • On main page and on side bar
Consistent navigation for every page
  • 'Table of Contents' links
  • Browse content
      • Using Semantic Drilldown
  • Categories
      • Using Nice Categories List for recursive tree view
  • Topic
      • #ask query for pages with Topic defined as a property
  • A-Z index / Glossary
      • Using a mix of Table of Contents template, #urlget and #ask queries
  • Single link to add new content
      • With list of forms available
Reduce clutter
Advanced tasks moved to the bottom of pages
  • Maintenance tasks
  • Upload file
  • Page tools
  • RDF link
  • Browse properties
UI Simplification – Special Pages
  • Custom-made administrative tasks page
UI Simplification – Recent changes
  • Simplified Recent changes using the Dynamic Page List extension
UI Customization – Category:Location
Customization of categories according to page type
  • Maps for locations
  • Timelines for events
  • A-Z index for people
UI Customization – Category:Events
Status - Usage
After a year:
  • 2900 pages of content (4600 pages total)
  • 31 registered users (5 active contributors)
  • Between 15 and 75 updates a day
  • 130 unique visitors a month
  • 400 visits / 600 searches a month
  • Entering a phase of growing interest
Status - Content
  • Data imported from old system except for Articles and Persons
  • Built an ontology of IT systems components
      • 550+ Applications, 90+ Databases and 280+ Servers
      • Portfolio mostly RED systems at this point
  • 145 data sources
      • Semi-automated generation of Data landscape
  • A glossary of 950+ acronyms and definitions
      • Imported from multiple sources within J&J and outside
  • About 170 support articles, how-tos and FAQs
      • Another 400 old articles pending review
  • 340+ Organizations
      • Including 44 J&J Operating Companies
  • Google Maps of J&J PRD sites
Features
KnowIT currently includes:
  • IT systems portfolio management (inventory)
  • A configuration management tool for these systems (components and relationships)
  • A communication component (calendar / timeline of announcements, outages and training sessions)
  • A question / feedback list (similar to WikiAnswers)
  • A logging mechanism (to track events, outages)
  • Service account password expiration management (with notification by RSS and email)
  • Semantic / faceted search results
  • Dynamic maps of known locations (with built-in form for driving directions)
  • A self-service help system (knowledge base of solutions)
  • An advanced glossary (terms organized by domains, with synonyms, related terms, etc.)
Future directions
  • Advanced bulk manipulations
  • Dynamic visualizations of the relationships network
  • Automated annotations using internal and external sources
  • Improved semantic search
Observations from day-to-day use
  • SMW is structured yet flexible
      • Allows for exceptions and changes as well as standardization
  • SMW doesn't get in the way
      • New content can be added and edited very quickly
  • Remember to monitor response time of page edits and search
      • Use PHP cache and optimization strategies to keep the wiki as fast as possible
  • Keep a single structure of 'semantic categories'
      • Separate from other categories
      • Use semantic properties for complex categorizations of pages
  • Keep realistic expectations
      • A long way to go before shared ownership and fully documented systems
Acknowledgements
We would like to thank current and past contributors for their patience, ideas and support: Jim Gainor, Brian Wegner, Deborah Yates, David Epstein, John Baum, Lisa Valetta, Dimitris Agrafiotis, Mario Dolbec, Brian Johnson, Emmanouil Skoufos.
Resources
  • Semantic MediaWiki: http://semantic-mediawiki.org
  • Referata tips for SMW: http://smw.referata.com/wiki/Special:BrowseData/Tips
  • Wiki Patterns: http://www.wikipatterns.com/display/wikipatterns/Wikipatterns
  • Sphinx search extension: http://www.mediawiki.org/wiki/Extension:SphinxSearch
  • RILPoint, SharePoint theme for MediaWiki: http://www.rilnet.com/en/rilpoint-sharepoint-look-alike-drupal-and-mediawiki-skin
  • Gruff, triple store browser for AllegroGraph (relationships graph): http://www.franz.com/agraph/gruff/
  • Cytoscape, network graph: http://www.cytoscape.org/

