JEDI

PUBLICATIONS

Discovering Diversified Paths in Knowledge Bases

Demo paper in VLDB 2018. Download
Authors: Christian Aebeloe, Gabriela Montoya, Vinay Setty and Katja Hose.

Vast amounts of world knowledge are now accessible through Knowledge Graphs (KGs) in RDF format and can be queried using SPARQL. Yet, finding paths between nodes in such graphs is not part of the official SPARQL 1.1 standard; only the simpler functionality of checking reachability is supported, i.e., assessing whether two nodes are connected based on certain conditions formalized as property paths, but without providing information on how they are actually connected. To close this gap in functionality, we propose JEDI, a Jena Extension for DIscovering diversified paths in knowledge bases. JEDI extends a popular SPARQL engine, Jena, with the ability to compute the paths connecting entities in a KG. JEDI shows the k most relevant results to the user, where relevance is assessed as a trade-off between path length and diversification of the intermediate nodes in the path. Moreover, our solution is not limited to a single property path pattern but supports queries containing multiple property path patterns. While JEDI is able to work with any KG, for demonstration purposes some predefined KGs, such as YAGO and DBLP, are provided, as well as example queries. Attendees will be encouraged to interact with the JEDI system to try their own queries.

Top-K Diversification for Path Queries in Knowledge Graphs

Poster paper in ISWC 2018. Download
Authors: Christian Aebeloe, Vinay Setty, Gabriela Montoya and Katja Hose.

To explore the relationships between entities in RDF graphs, property path queries were introduced in SPARQL 1.1. However, existing RDF engines report only whether the entities are reachable, ignoring the intermediate nodes in the path. When the paths are output, there are too many of them, which makes it difficult for users to find the most relevant ones. To address this issue, we propose a generalized top-k ranking technique that balances the trade-off between relevance and diversity. We propose a shortest-path-based relevance scoring in combination with several path similarity measures for diversification. With preliminary experiments and examples, we show that our diversification strategies provide more informative paths than shortest-path-based ranking.
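
The trade-off between relevance and diversity described in the abstracts above can be illustrated with a small greedy selection. This is only a sketch under stated assumptions, not the ranking implemented in JEDI: the relevance score (1 / number of edges), the Jaccard similarity over intermediate nodes, the weight `lam`, and all function names are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Set


def intermediate_nodes(path: Sequence[str]) -> Set[str]:
    """Return the nodes strictly between the first and last node of a path."""
    return set(path[1:-1])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two node sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def relevance(path: Sequence[str]) -> float:
    """Assumed relevance: shorter paths score higher (needs at least two nodes)."""
    return 1.0 / (len(path) - 1)


def top_k_diverse(paths: List[Sequence[str]], k: int, lam: float = 0.5) -> List[Sequence[str]]:
    """Greedily pick up to k paths, penalising similarity to already picked paths."""
    selected: List[Sequence[str]] = []
    candidates = list(paths)
    while candidates and len(selected) < k:

        def score(p: Sequence[str]) -> float:
            # Highest similarity between p and any path chosen so far.
            max_sim = max(
                (jaccard(intermediate_nodes(p), intermediate_nodes(s)) for s in selected),
                default=0.0,
            )
            return lam * relevance(p) - (1.0 - lam) * max_sim

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy input with placeholder node names.
example_paths = [
    ["A", "x", "B"],
    ["A", "x", "y", "B"],
    ["A", "z", "w", "B"],
]
example_result = top_k_diverse(example_paths, k=2)
```

With this toy input the shortest path is picked first; the second pick is the path through `z` and `w`, because the path through `x` and `y` shares an intermediate node with the first pick and is penalised for it.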

DEMONSTRATION

INSTALLATION

Requirements

Apache Tomcat 9.0.2 or newer.
ElasticSearch 6.1.1 or newer (used for auto-complete).
Java 8 or newer.
SPARQL 1.1 endpoint.

Auto-Complete

Download and install ElasticSearch and Kibana.
Create an index in ElasticSearch with the following request:

PUT sparql
{
  "mappings": {
    "iri": {
      "properties": {
        "prefix": { "type": "text" },
        "value": { "type": "text" },
        "val_suggest": { "type": "completion" }
      }
    }
  }
}

Download .ttl files for wanted data (for example YAGO).
Download the reader.py script and run with the downloaded .ttl files.
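
For orientation, the sketch below shows one way IRIs from a .ttl file could be turned into documents matching the `sparql` index mapping above. It is not the actual reader.py script: the function names, the regular expression, the rule of splitting an IRI at its last `#` or `/`, and the choice to use the local name as `val_suggest` are all assumptions made for illustration. It only builds a payload in the ElasticSearch bulk format and does not contact a server.

```python
import json
import re
from typing import Iterable, List, Tuple

# Crude matcher for IRIs written as <...>; a real Turtle parser would be more
# robust (prefixed names and '<' inside string literals are not handled here).
IRI_PATTERN = re.compile(r"<([^<>\s]+)>")


def split_iri(iri: str) -> Tuple[str, str]:
    """Split an IRI at its last '#' or '/' into (prefix, local value)."""
    cut = max(iri.rfind("#"), iri.rfind("/")) + 1
    return iri[:cut], iri[cut:]


def extract_iris(turtle_lines: Iterable[str]) -> List[str]:
    """Collect distinct IRIs in first-seen order, skipping directives and comments."""
    seen = {}
    for line in turtle_lines:
        if line.lstrip().startswith(("@prefix", "@base", "#")):
            continue
        for iri in IRI_PATTERN.findall(line):
            seen.setdefault(iri, None)
    return list(seen)


def bulk_payload(iris: Iterable[str], index: str = "sparql", doc_type: str = "iri") -> str:
    """Build a newline-delimited bulk body: one action line and one document per IRI."""
    lines = []
    for iri in iris:
        prefix, value = split_iri(iri)
        if not value:
            continue  # namespace-only IRI, nothing to suggest
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps({"prefix": prefix, "value": value, "val_suggest": value}))
    return "\n".join(lines) + "\n"


# Toy input; in practice the lines would be read from the downloaded .ttl file.
sample_lines = [
    "@prefix yago: <http://yago-knowledge.org/resource/> .",
    '<http://yago-knowledge.org/resource/Margrethe_II_of_Denmark> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Margrethe II"@en .',
]
sample_payload = bulk_payload(extract_iris(sample_lines))
```

A payload of this shape could then be sent to the `_bulk` endpoint of the ElasticSearch server with the content type `application/x-ndjson`.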

Web

Download the web.zip file and unzip it to ${TOMCAT_HOME}/webapps/ROOT/.
Start the Tomcat server. JEDI is then available at the address and port specified in the Tomcat configuration.

Configuration

JEDI is configured through the ./WEB-INF/config.json file inside web.zip. The file is in JSON format and contains the settings for accessing the SPARQL 1.1 endpoint and the ElasticSearch server (IP, port and other information).

EXAMPLES

For the demonstration, conference attendees will be able to use the following knowledge graphs: YAGO, DBpedia and DBLP.
The following example shows how to retrieve all common ancestors of the Queen of Denmark and the King of Sweden:

PREFIX yago: <http://yago-knowledge.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT * WHERE {
?a <http://yago-knowledge.org/resource/hasChild>-> yago:Margrethe_II_of_Denmark .
?a <http://yago-knowledge.org/resource/hasChild>-> yago:Carl_XVI_Gustaf_of_Sweden .
?a rdfs:label ?b
}
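
To make the intent of the query concrete, the sketch below computes common ancestors on a tiny in-memory graph. It assumes that the path patterns follow hasChild edges over one or more hops; that reading, the function names, and the placeholder node names are assumptions for illustration and say nothing about how JEDI evaluates the query.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def parents_index(has_child_edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each child to its parents, given (parent, child) hasChild edges."""
    index: Dict[str, List[str]] = {}
    for parent, child in has_child_edges:
        index.setdefault(child, []).append(parent)
    return index


def ancestors(child_to_parents: Dict[str, List[str]], person: str) -> Set[str]:
    """Return every node that reaches `person` via one or more hasChild edges."""
    found: Set[str] = set()
    queue = deque([person])
    while queue:
        current = queue.popleft()
        for parent in child_to_parents.get(current, []):
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return found


# Toy hasChild edges with placeholder names (not real genealogy data).
edges = [("G", "P1"), ("G", "P2"), ("P1", "X"), ("P2", "Y"), ("Q", "Y")]
index = parents_index(edges)
common = ancestors(index, "X") & ancestors(index, "Y")
```

Here `common` contains only `G`, the single node from which both `X` and `Y` can be reached. A reachability answer like this says nothing about the connecting paths themselves, which is the information JEDI adds.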

DOWNLOADS

Download the web.zip file or download the sources.
Download the reader.py script.
Download the jenaPathExtension sources.