PUBLICATIONS
Discovering Diversified Paths in Knowledge Bases
Vast amounts of world knowledge are now accessible through Knowledge Graphs (KGs) in RDF format and can be queried using SPARQL. Yet, finding paths between nodes in such graphs is not part of the official SPARQL 1.1 standard; only the simpler functionality of checking reachability is supported, i.e., assessing whether two nodes are connected based on certain conditions formalized as property paths, but without providing information on how they are actually connected. To close this functionality gap, we propose JEDI, a Jena Extension for DIscovering diversified paths in knowledge bases. JEDI extends a popular SPARQL engine, Jena, with the ability to compute the paths connecting entities in a KG. JEDI shows the k most relevant results to the user, where relevance is assessed as a trade-off between path length and diversification of the intermediate nodes in the path. Moreover, our solution is not limited to a single property path pattern but supports queries containing multiple property path patterns. While JEDI is able to work with any KG, some predefined KGs, such as YAGO and DBLP, are provided for demonstration purposes, together with example queries. Attendees will be encouraged to interact with the JEDI system and try their own queries.
Top-K Diversification for Path Queries in Knowledge Graphs
To explore the relationships between entities in RDF graphs, property path queries were introduced in SPARQL 1.1. However, existing RDF engines return only whether the entities are reachable, ignoring the intermediate nodes in the path. If the paths are output, there are too many of them, which makes it difficult for users to find the most relevant ones. To address this issue, we propose a generalized top-k ranking technique that balances the trade-off between relevance and diversity. We propose a shortest-path-based relevance scoring in combination with several path similarity measures for diversification. With preliminary experiments and examples, we show that our diversification strategies provide more informative paths than shortest-path-based ranking.
DEMONSTRATION
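The trade-off between path length and diversity described in the abstracts above can be illustrated with a small sketch. This is not the JEDI implementation: the scoring functions, the greedy selection loop (in the style of maximal marginal relevance), the Jaccard similarity over intermediate nodes, and the weight `lam` are all assumptions chosen for illustration only.

```python
from typing import List, Sequence, Set


def intermediate_nodes(path: Sequence[str]) -> Set[str]:
    """Return the nodes strictly between the first and last node of a path."""
    return set(path[1:-1])


def relevance(path: Sequence[str]) -> float:
    """Assumed relevance score: shorter paths score higher (1 / number of edges)."""
    return 1.0 / max(len(path) - 1, 1)


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed similarity: Jaccard overlap of the intermediate node sets."""
    na, nb = intermediate_nodes(a), intermediate_nodes(b)
    if not na and not nb:
        return 1.0  # two direct edges share the same (empty) set of intermediates
    return len(na & nb) / len(na | nb)


def top_k_diverse(paths: List[List[str]], k: int, lam: float = 0.5) -> List[List[str]]:
    """Greedily pick k paths, trading relevance against similarity to earlier picks.

    lam = 1.0 ranks by relevance only; smaller values penalise paths that
    reuse the intermediate nodes of already selected paths.
    """
    selected: List[List[str]] = []
    candidates = list(paths)
    while candidates and len(selected) < k:
        def score(p: List[str]) -> float:
            penalty = max((similarity(p, s) for s in selected), default=0.0)
            return lam * relevance(p) - (1.0 - lam) * penalty

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    paths = [
        ["A", "B", "Z"],
        ["A", "B", "C", "Z"],
        ["A", "D", "E", "Z"],
    ]
    print(top_k_diverse(paths, k=2, lam=0.5))
    print(top_k_diverse(paths, k=2, lam=1.0))
```

With `lam=0.5` the second pick avoids the path that reuses node `B`; with `lam=1.0` the ranking falls back to path length alone.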
INSTALLATION
Requirements
Apache Tomcat 9.0.2 or newer.
ElasticSearch 6.1.1 or newer (used for auto-complete).
Java 8 or newer.
SPARQL 1.1 endpoint.
Auto-Complete
PUT sparql
{
  "mappings": {
    "iri": {
      "properties": {
        "prefix": { "type": "text" },
        "value": { "type": "text" },
        "val_suggest": { "type": "completion" }
      }
    }
  }
}
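To illustrate how the mapping above might be used, the following sketch builds a document with the three mapped fields and a completion-suggester request body for the `val_suggest` field. How reader.py actually fills these fields is not documented here; splitting an IRI at its last `#` or `/` and the helper names `split_iri`, `iri_document` and `suggest_query` are assumptions made for this example. No server is contacted; the functions only build the JSON bodies.

```python
import json
from typing import Dict, Tuple


def split_iri(iri: str) -> Tuple[str, str]:
    """Split an IRI into (namespace prefix, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/")) + 1
    return iri[:cut], iri[cut:]


def iri_document(iri: str) -> Dict[str, object]:
    """Build a document for index 'sparql', type 'iri' (field usage is assumed)."""
    prefix, local = split_iri(iri)
    return {
        "prefix": prefix,
        "value": local,
        # A completion field accepts an object with a list of inputs.
        "val_suggest": {"input": [local]},
    }


def suggest_query(typed: str, size: int = 5) -> Dict[str, object]:
    """Build a completion-suggester search body for the text typed so far."""
    return {
        "suggest": {
            "iri_suggest": {  # arbitrary name chosen for this suggestion
                "prefix": typed,
                "completion": {"field": "val_suggest", "size": size},
            }
        }
    }


if __name__ == "__main__":
    doc = iri_document("http://yago-knowledge.org/resource/Margrethe_II_of_Denmark")
    print(json.dumps(doc, indent=2))
    print(json.dumps(suggest_query("Marg"), indent=2))
```

The first body could be sent when indexing a document and the second to the index's `_search` endpoint.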
Download the .ttl files for the wanted data (for example YAGO).
Download the reader.py script and run it with the downloaded .ttl files.
Web
Download the web.zip file and unzip it to ${TOMCAT_HOME}/webapps/ROOT/.
Configuration
The configuration is stored in the ./WEB-INF/config.json file inside the web.zip file. The configuration file is in JSON format and contains the settings for accessing the SPARQL 1.1 endpoint and the ElasticSearch server (IP, port and other information).
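As a rough illustration of working with such a JSON configuration, the sketch below parses a configuration string and derives the two service URLs. The real key names in config.json are not documented here, so every key in `EXAMPLE_CONFIG` (`sparql`, `elasticsearch`, `host`, `port`, `path`, `index`), the addresses, and the function name `load_endpoints` are hypothetical.

```python
import json
from typing import Dict

# Hypothetical layout: the actual keys in ./WEB-INF/config.json may differ.
EXAMPLE_CONFIG = """
{
  "sparql": {"host": "127.0.0.1", "port": 3030, "path": "/ds/sparql"},
  "elasticsearch": {"host": "127.0.0.1", "port": 9200, "index": "sparql"}
}
"""


def load_endpoints(config_text: str) -> Dict[str, str]:
    """Parse the JSON configuration and build the URLs of both services."""
    cfg = json.loads(config_text)
    sparql, es = cfg["sparql"], cfg["elasticsearch"]
    return {
        "sparql_url": f"http://{sparql['host']}:{sparql['port']}{sparql['path']}",
        "elasticsearch_url": f"http://{es['host']}:{es['port']}/{es['index']}",
    }


if __name__ == "__main__":
    for name, url in load_endpoints(EXAMPLE_CONFIG).items():
        print(name, url)
```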
EXAMPLES
For the demonstration, conference attendees will be able to use the following knowledge graphs: YAGO, DBpedia and DBLP.
PREFIX yago: <http://yago-knowledge.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT * WHERE {
?a <http://yago-knowledge.org/resource/hasChild>-> yago:Margrethe_II_of_Denmark .
?a <http://yago-knowledge.org/resource/hasChild>-> yago:Carl_XVI_Gustaf_of_Sweden .
?a rdfs:label ?b
}
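A query like the one above could also be submitted programmatically over the standard SPARQL protocol. The sketch below only builds the HTTP request and parses a response in the SPARQL JSON results format; it does not contact any server. The endpoint URL is a placeholder, the function names are made up for this example, and it is an assumption that the endpoint accepts the query syntax shown above.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List


def build_sparql_request(endpoint_url: str, query: str) -> urllib.request.Request:
    """Build a SPARQL protocol POST request with a URL-encoded 'query' parameter."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=data,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(results_json: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON result set into one {variable: value} dict per row."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    query = "SELECT ?a ?b WHERE { ?a ?p ?b } LIMIT 1"
    # Placeholder URL: replace with the endpoint configured for the installation.
    request = build_sparql_request("http://localhost:3030/ds/sparql", query)
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request).read().decode("utf-8")
    sample = (
        '{"head": {"vars": ["a", "b"]}, "results": {"bindings": ['
        '{"a": {"type": "uri", "value": "http://example.org/x"},'
        ' "b": {"type": "literal", "value": "X"}}]}}'
    )
    print(parse_bindings(sample))
```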
DOWNLOADS
Download the web.zip file or download the sources.
Download the reader.py script.
Download the jenaPathExtension sources.