WikibaseEntityStore is a small library that provides a unified interface to interact with Wikibase entities.
It currently has two backends:
- An API-based backend, slow but very easy to deploy
- A MongoDB-based backend, faster but requires maintaining a local copy of the Wikibase content
Install the library using one of the methods below:
1 - Use composer to install the library and all its dependencies using the master branch:
composer require "ppp/wikibase-entity-store:dev-master"
2 - Create a composer.json file that just defines a dependency on version 1.0 of this package, and run 'composer install' in the directory:
{
    "require": {
        "ppp/wikibase-entity-store": "~1.0"
    }
}
The entity storage system is based on the abstract class EntityStore, which provides several services to manipulate entities.
The following example shows these services in use:
$storeBuilder = new EntityStoreFromConfigurationBuilder();
// See the backend sections for examples of configuration files
$store = $storeBuilder->buildEntityStore( 'MY_CONFIG_FILE.json' );

// Retrieves the item Q1
try {
    $item = $store->getItemLookup()->getItemForId( new ItemId( 'Q1' ) );
} catch( ItemNotFoundException $e ) {
    // Item not found
}

// Retrieves the property P1
try {
    $property = $store->getPropertyLookup()->getPropertyForId( new PropertyId( 'P1' ) );
} catch( PropertyNotFoundException $e ) {
    // Property not found
}

// Retrieves the item Q1 as an EntityDocument
try {
    $entity = $store->getEntityLookup()->getEntityDocumentForId( new ItemId( 'Q1' ) );
} catch( EntityNotFoundException $e ) {
    // Entity not found
}

// Retrieves the item Q1 and the property P1 as EntityDocuments
$entities = $store->getEntityLookup()->getEntityDocumentsForIds(
    array( new ItemId( 'Q1' ), new PropertyId( 'P1' ) )
);

// Retrieves the ids of the items that have the term "Nyan Cat" as label or alias in English (case-insensitive comparison)
$itemIds = $store->getItemIdForTermLookup()->getItemIdsForTerm( new Term( 'en', 'Nyan Cat' ) );

// Retrieves the ids of the properties that have the term "foo" as label or alias in French (case-insensitive comparison)
$propertyIds = $store->getPropertyIdForTermLookup()->getPropertyIdsForTerm( new Term( 'fr', 'Foo' ) );

// Runs a query on items using the Ask query language: retrieves the first 10 items with P1: Q1
$itemIds = $store->getItemIdForQueryLookup()->getItemIdsForQuery(
    new Query(
        new SomeProperty(
            new EntityIdValue( new PropertyId( 'P1' ) ),
            new ValueDescription( new EntityIdValue( new ItemId( 'Q1' ) ) )
        ),
        array(),
        new QueryOptions( 10, 0 )
    )
);
The API backend is the easiest one to use. It uses the API of a Wikibase instance and, if you use this EntityStore as a backend for Wikidata data and want query support, WikidataQuery.
The configuration file looks like:
{
    "backend": "api",
    "api": {
        "url": "http://www.wikidata.org/w/api.php",
        "wikidataquery-url": "http://wdq.wmflabs.org/api"
    }
}
Replace http://www.wikidata.org/w/api.php with the URL of your MediaWiki API if you want to use the store with a Wikibase instance other than Wikidata.
The parameter wikidataquery-url is optional and may be left unset if you don't want query support using Wikidata content.
Without configuration file:
$store = new \Wikibase\EntityStore\Api\ApiEntityStore(
    new \Mediawiki\Api\MediawikiApi( 'http://www.wikidata.org/w/api.php' ),
    new \WikidataQueryApi\WikidataQueryApi( 'http://wdq.wmflabs.org/api' )
);
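When the store should target another Wikibase instance, the configuration file can also be generated from PHP rather than written by hand. The sketch below is only an illustration: the endpoint URL and the output path are assumptions, and only the keys documented above ("backend", "api", "url") are used. The optional wikidataquery-url key is left out because it applies to Wikidata content only.

```php
<?php
// Hypothetical endpoint of a self-hosted Wikibase instance (assumption for illustration).
$apiUrl = 'https://wikibase.example.org/w/api.php';

// Same structure as the API backend configuration, without "wikidataquery-url".
$config = array(
    'backend' => 'api',
    'api' => array(
        'url' => $apiUrl,
    ),
);

// Encode as readable JSON and keep the slashes of the URL unescaped.
$json = json_encode( $config, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES );

// Hypothetical location for the generated file.
$configFile = sys_get_temp_dir() . '/entitystore-config.json';
file_put_contents( $configFile, $json );
```

The resulting path can then be passed to buildEntityStore() in place of 'MY_CONFIG_FILE.json'.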
The MongoDB backend stores entities in a MongoDB database. It requires the doctrine/mongodb package.
The configuration file looks like:
{
    "backend": "mongodb",
    "mongodb": {
        "server": SERVER,
        "database": DATABASE
    }
}
server should be a MongoDB server connection string and database the name of the database to use.
Without configuration file:
// Connect to MongoDB
$connection = new \Doctrine\MongoDB\Connection( MY_CONNECTION_STRING );
if( !$connection->connect() ) {
    throw new RuntimeException( 'Failed to connect to the database' );
}

// Get the database where entities are stored
$database = $connection->selectDatabase( 'wikibase' );

$store = new \Wikibase\EntityStore\MongoDB\MongoDBEntityStore( $database );
You can fill the MongoDB database from Wikidata JSON dumps using this script:
php entitystore import-json-dump MY_JSON_DUMP MY_CONFIGURATION_FILE
Or from incremental XML dumps using this script:
php entitystore import-incremental-xml-dump MY_XML_DUMP MY_CONFIGURATION_FILE
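For readers curious about what a JSON dump import has to deal with: Wikidata JSON dumps are one large JSON array with one entity per line, each line ending with a comma. The sketch below shows one way to stream such a file without loading it fully into memory. It is not the code of the import script; the function name readEntitiesFromJsonDump and the tiny sample dump are hypothetical.

```php
<?php
/**
 * Yields decoded entities from a Wikidata-style JSON dump, one line at a time.
 * Hypothetical helper, not part of WikibaseEntityStore.
 */
function readEntitiesFromJsonDump( $path ) {
    $handle = fopen( $path, 'r' );
    if ( $handle === false ) {
        throw new RuntimeException( "Cannot open $path" );
    }

    try {
        while ( ( $line = fgets( $handle ) ) !== false ) {
            // Strip whitespace and the trailing comma that separates entities.
            $line = rtrim( trim( $line ), ',' );

            // Skip the opening and closing brackets of the array and blank lines.
            if ( $line === '' || $line === '[' || $line === ']' ) {
                continue;
            }

            $entity = json_decode( $line, true );
            if ( is_array( $entity ) ) {
                yield $entity;
            }
        }
    } finally {
        fclose( $handle );
    }
}

// Build a tiny sample dump with the same layout as a real one.
$dumpFile = tempnam( sys_get_temp_dir(), 'dump' );
file_put_contents(
    $dumpFile,
    "[\n{\"id\":\"Q1\",\"type\":\"item\"},\n{\"id\":\"P1\",\"type\":\"property\"}\n]\n"
);

$ids = array();
foreach ( readEntitiesFromJsonDump( $dumpFile ) as $entity ) {
    $ids[] = $entity['id'];
}
unlink( $dumpFile );
```

Reading line by line keeps memory use bounded by the largest single entity, which matters because full dumps are far too large to decode in one call.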
The in-memory backend is based on an array of EntityDocuments. It is useful for tests.
$store = new \Wikibase\EntityStore\InMemory\InMemoryEntityStore( array(
    new Item( new ItemId( 'Q42' ) )
) );
The different backends support a shared set of options. These options are:
- string[] languages: Filters the set of internationalized values. Default value: null (i.e. all languages).
- bool languagefallback: Applies the language fallback system to the languages defined using the languages option. Default value: false.
They can be injected in the configuration:
{
    "options": {
        "languages": ["en", "fr"],
        "languagefallback": true
    }
}
They can also be passed as the last parameter of EntityStore constructors:
$options = new \Wikibase\EntityStore\EntityStoreOptions( array(
    EntityStore::OPTION_LANGUAGES => array( 'en', 'fr' ),
    EntityStore::OPTION_LANGUAGE_FALLBACK => true
) );

$store = new \Wikibase\EntityStore\Api\ApiEntityStore(
    new \Mediawiki\Api\MediawikiApi( 'http://www.wikidata.org/w/api.php' ),
    null,
    $options
);
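To make the effect of the languages option concrete, here is a minimal sketch of what filtering internationalized values by language code amounts to. It is an assumption about the observable behavior, not the library's implementation: the function filterTermsByLanguage and the sample labels are hypothetical, and the language fallback system is not modeled.

```php
<?php
/**
 * Keeps only the terms whose language code is listed in $languages.
 * A null $languages keeps every term, mirroring the documented default.
 * Hypothetical helper, not part of WikibaseEntityStore.
 */
function filterTermsByLanguage( array $terms, $languages = null ) {
    if ( $languages === null ) {
        return $terms;
    }

    // array_flip turns the list of codes into keys so terms can be matched by key.
    return array_intersect_key( $terms, array_flip( $languages ) );
}

// Sample labels indexed by language code.
$labels = array(
    'en' => 'Nyan Cat',
    'fr' => 'Nyan Cat',
    'ja' => 'Nyan Cat',
);

$filtered = filterTermsByLanguage( $labels, array( 'en', 'fr' ) );
$unfiltered = filterTermsByLanguage( $labels );
```

Restricting the languages reduces the size of every entity that is returned, which can matter when entities carry labels in hundreds of languages.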
To get much better performance, you can add a cache layer on top of EntityStore.
To do so, add a cache section to the configuration file.
The following example uses a two-layer cache: the first layer is a PHP array and the second one is a Memcached instance on localhost:11211.
{
    "cache": {
        "array": true,
        "memcached": {
            "host": "localhost",
            "port": 11211
        }
    }
}
Without configuration file:
$store = MY_ENTITY_STORE;

$cache = new \Doctrine\Common\Cache\ArrayCache(); // A very simple cache
$cacheLifeTime = 100000; // Lifetime of the cache in seconds

$cachedStore = new \Wikibase\EntityStore\Cache\CachedEntityStore( $store, $cache, $cacheLifeTime );
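The idea behind a two-layer cache can be sketched in plain PHP. The example below is an illustration under stated assumptions, not the code of CachedEntityStore: the class LayeredCache is hypothetical, ArrayObject instances stand in for the PHP array and Memcached layers, and cache lifetime is not modeled.

```php
<?php
/**
 * Hypothetical layered cache: layers are checked in order (fastest first)
 * and the backend loader is only called when every layer misses.
 */
class LayeredCache {

    private $layers;

    public function __construct( array $layers ) {
        $this->layers = $layers;
    }

    public function get( $key, callable $loader ) {
        foreach ( $this->layers as $index => $layer ) {
            if ( $layer->offsetExists( $key ) ) {
                $value = $layer->offsetGet( $key );

                // Copy the value into the faster layers that missed it.
                for ( $i = 0; $i < $index; $i++ ) {
                    $this->layers[$i]->offsetSet( $key, $value );
                }
                return $value;
            }
        }

        // Every layer missed: ask the backend and fill all layers.
        $value = $loader( $key );
        foreach ( $this->layers as $layer ) {
            $layer->offsetSet( $key, $value );
        }
        return $value;
    }
}

$local = new ArrayObject();  // Stands in for the PHP array layer
$shared = new ArrayObject(); // Stands in for the Memcached layer
$cache = new LayeredCache( array( $local, $shared ) );

// Counts how often the slow backend is actually reached.
$backendCalls = 0;
$loader = function ( $id ) use ( &$backendCalls ) {
    $backendCalls++;
    return "entity $id";
};

$first = $cache->get( 'Q1', $loader );
$second = $cache->get( 'Q1', $loader );
```

Putting the in-process layer first avoids a network round trip for entities requested repeatedly within one request, while the shared layer keeps results available across requests.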
- Add support for WikibaseDataModel 5.0 and 6.0 and Number 0.7
- Update to WikibaseDataModel 4.0, WikibaseDataModelServices 3.2 and MediaWikiApiBase 2.0
- Change encoding of label indexes in MongoDB
Initial release with these features:
- Retrieve entities from ids or terms
- Import Wikidata JSON full dumps and incremental XML dumps
- Initial support for simple queries
- API and MongoDB backends
- Basic cache system