dennis-de-swart / php-stanford-nlp-datastore
Stores NLP data from Stanford CoreNLP server.
Installs: 70
Dependents: 0
Suggesters: 0
Security: 0
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
pkg:composer/dennis-de-swart/php-stanford-nlp-datastore
Requires
- php: >=5.3
- dennis-de-swart/php-stanford-corenlp-adapter: *
- doctrine/dbal: *
This package is auto-updated.
Last update: 2025-11-29 02:29:47 UTC
README
Stores NLP data from Stanford CoreNLP server.
What does it do?
It analyses a text using Stanford CoreNLP server, then stores the result.
Which data gets stored?
- OpenIE: these are "Subject-Relation-Object" triples. The concept is similar to "Subject-Verb-Object" triples.
http://stanfordnlp.github.io/CoreNLP/openie.html - Named-Entities: if a word is a "Named Entity", like a Location, Name or Time, it will store this data
http://stanfordnlp.github.io/CoreNLP/ner.html - Coreference: if there is a reference to a word in another sentence.
http://stanfordnlp.github.io/CoreNLP/coref.html How does it work?
- You submit a text.
- The text is analyzed by the Stanford CoreNLP server
- Results are stored in a SQLite file based database. The database file is called "datastore.db"
https://sqlite.org/ https://github.com/sqlitebrowser - The results are displayed on screen
- There is also a search form to find data
This package depends on Stanford CoreNLP Server
http://stanfordnlp.github.io/CoreNLP/index.html#download This package also depends on PHP-Stanford-CoreNLP-Adapter
https://github.com/DennisDeSwart/php-stanford-corenlp-adapter Note: since this package contains a full version of the CoreNLP Adapter, you can use all of it's features with this package.
Installation
This package depends on these packages:
http://stanfordnlp.github.io/CoreNLP/index.html#download https://github.com/DennisDeSwart/php-stanford-corenlp-adapter https://github.com/doctrine/dbal https://github.com/guzzle/guzzle Install procedure using the ZIP files
- Install Stanford CoreNLP Server. Check the "php-stanford-corenlp-adapter" package for an installation walkthrough
- Download and unpack the files from this package.
- Copy the files to your to your webserver directory. Usually "htdocs" or "var/www".
- Run a Composer update to install the dependencies
Install as part of another project
- Install Stanford CoreNLP Server. Check the "php-stanford-corenlp-adapter" package for an installation walkthrough
- Add the following lines to your main project's "composer.json" require section:
{ "require": { "dennis-de-swart/php-stanford-nlp-datastore": "*" } } - Run a Composer update to install the dependencies
Copy these files from "/vendor/dennis-de-swart/php-stanford-nlp-datastore" to your webserver directory. Usually "htdocs" or "var/www". - datastore.db - bootstrap.php - Example code for your main project:
// instantiate constants and the database require_once __DIR__.'/bootstrap.php'; // startup Corenlp Adapter $coreNLP = new CorenlpAdapter(); $coreNLP->getOutput($yourText); print_r($coreNLP->serverMemory); // result from CoreNLP Adapter // Save result to database $datastore = new Datastore($db->conn); $datastore->storeNLP($coreNLP); Requirements
- PHP 5.6 or higher: it also works on PHP 7
- Java SE Runtime Enviroment, version 1.8
- Stanford CoreNLP Server 3.7.0
- Windows or Linux/Unix 64-bit OS, 8Gb or more memory recommended.
- Composer for PHP
https://getcomposer.org/ SQLite Browser
If you need a SQLite browser check here:
http://sqlitebrowser.org/ Important notes
- Starting the CoreNLP server for the first time, takes some time because it will load a large amount of data.
- After the first startup, the server will be much faster.
- In my experience the Stanford CoreNLP server runs best with 8Gb of memory or more. Start the server with "-mx8g" instead of "-mx4g".
- Also use version 3.7.0 of the server, this gives you the best and quickest results.
Example output
See
- "datastore_result_a.PNG"
- "datastore_result_b.PNG"
- "datastore_result_search.PNG"
and "example.db", this is how a filled database looks like
Any questions?
Let me know. You can create an issue on GitHub. Any bugs will be fixed ASAP.