RDF2vec.org

The hitchhiker's guide to RDF2vec.

About RDF2vec

RDF2vec is a tool for creating vector representations of RDF graphs. In essence, RDF2vec creates a numeric vector for each node in an RDF graph.

RDF2vec was developed by Petar Ristoski as a key contribution of his PhD thesis Exploiting Semantic Web Knowledge Graphs in Data Mining [Ristoski, 2019], which he defended in January 2018 at the Data and Web Science Group at the University of Mannheim, supervised by Heiko Paulheim. In 2019, he was awarded the SWSA Distinguished Dissertation Award for this outstanding contribution to the field.

RDF2vec was inspired by the word2vec approach [Mikolov et al., 2013] for representing words in a numeric vector space. word2vec takes a set of sentences as input and trains a neural network using one of two variants: predicting a word given its context words (continuous bag of words, or CBOW), or predicting the context words given a word (skip-gram, or SG).

This approach can be applied to RDF graphs as well. In the original version presented at ISWC 2016 [Ristoski and Paulheim, 2016], random walks on the RDF graph are used to create sequences of RDF nodes, which are then used as input for the word2vec algorithm. It has been shown that such a representation can be utilized in many application scenarios, such as using knowledge graphs as background knowledge in data mining tasks, or for building content-based recommender systems [Ristoski et al., 2019].
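The core procedure — extracting random walks and feeding them to word2vec as "sentences" — can be sketched in a few lines of plain Python. This is a simplified illustration, not the original implementation; the graph, entity, and predicate names are invented:

```python
import random

# Toy RDF graph as adjacency lists: subject -> [(predicate, object), ...].
# Entity and predicate names are made up for illustration.
graph = {
    "dbr:Mannheim": [("dbo:country", "dbr:Germany"), ("rdf:type", "dbo:City")],
    "dbr:Germany": [("dbo:capital", "dbr:Berlin"), ("rdf:type", "dbo:Country")],
    "dbr:Berlin": [("dbo:country", "dbr:Germany"), ("rdf:type", "dbo:City")],
}

def random_walk(graph, start, depth, rng):
    """One random walk of the given depth, alternating entities and predicates."""
    walk = [start]
    node = start
    for _ in range(depth):
        edges = graph.get(node)
        if not edges:           # dead end: stop the walk early
            break
        predicate, obj = rng.choice(edges)
        walk.extend([predicate, obj])
        node = obj
    return walk

def extract_walks(graph, walks_per_entity, depth, seed=42):
    """Generate several walks starting from every entity in the graph."""
    rng = random.Random(seed)
    return [random_walk(graph, entity, depth, rng)
            for entity in graph for _ in range(walks_per_entity)]

walks = extract_walks(graph, walks_per_entity=2, depth=2)
# Each walk is a "sentence" of tokens that can then be passed to a word2vec
# implementation, e.g. gensim: Word2Vec(sentences=walks, vector_size=100, sg=1)
```

The real implementations differ in scale and walk strategies, but the shape of the output is the same: token sequences in which entities and predicates alternate.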

The resulting vectors have properties similar to those of word2vec embeddings. In particular, similar entities are closer in the vector space than dissimilar ones, which makes those representations ideal for learning patterns about those entities. In the example below, showing embeddings for DBpedia and Wikidata, countries and cities are grouped together, and European and Asian cities and countries form clusters.

The two figures above indicate that classes (in the example: countries and cities) can be separated well in the projected vector space, as indicated by the dashed lines. [Zouaq and Martel, 2020] have compared different knowledge graph embedding methods with respect to their suitability for separating classes in a knowledge graph. They have shown that RDF2vec outperforms other embedding methods like TransE, TransH, TransD, ComplEx, and DistMult, in particular on smaller classes. On the task of entity classification, RDF2vec achieves results competitive with more recent graph convolutional neural networks [Schlichtkrull et al., 2018].
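Similarity in the embedding space is typically measured with cosine similarity. A minimal sketch, using invented three-dimensional vectors (real RDF2vec vectors have hundreds of dimensions):

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical embeddings; the values are invented for illustration.
embeddings = {
    "Berlin":  [0.9, 0.1, 0.2],
    "Paris":   [0.8, 0.2, 0.1],
    "Germany": [0.1, 0.9, 0.3],
}

# Two cities should be more similar to each other than a city and a country.
assert cosine_similarity(embeddings["Berlin"], embeddings["Paris"]) > \
       cosine_similarity(embeddings["Berlin"], embeddings["Germany"])
```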

RDF2vec has been tailored to RDF graphs by respecting the type of edges (i.e., the predicates). Related variants, like node2vec [Grover and Leskovec, 2016] or DeepWalk [Perozzi et al., 2014], are defined for graphs with just one type of edge. They create sequences of nodes, while RDF2vec creates alternating sequences of entities and predicates.

This video by Petar Ristoski introduces the main ideas of RDF2vec:

Implementations

There are a few different implementations of RDF2vec out there:

  • The original implementation from the 2016 paper. Not well documented. Uses Java for walk generation, and Python/gensim for the embedding training.
  • jRDF2vec is a more versatile and better-performing Java-based implementation. Like the original one, it uses Java to generate the walks, and Python/gensim for training the embedding. It also implements the RDF2vec Light variant (see below). There is also a Docker image available here.
  • pyRDF2vec is a pure Python-based implementation. It implements multiple strategies to generate the walks, not only random walks.
  • ataweel55's implementation is another pure Python-based implementation. It includes all strategies for biasing the walks described in [Cochez et al., 2017a] and [Al Taweel and Paulheim, 2020].

Models and Services

Training RDF2vec from scratch can take quite a bit of time. Here is a list of pre-trained models we know of:

There is also an alternative to downloading and processing an entire knowledge graph embedding (which may consume several GB):

  • KGvec2go provides a REST API for retrieving pre-computed embedding vectors for selected entities one by one, as well as further functions, such as computing the vector space similarity of two concepts, and retrieving the n closest concepts. There is also a service for RDF2vec Light (see below) [Portisch et al., 2020].

Extensions and Variants

There are quite a few variants of RDF2vec which have been examined in the past.

  • Walking RDF and OWL pursues exactly the same idea as RDF2vec, and the two can be considered identical. It uses random walks and skip-gram embeddings. The approach was developed at the same time as RDF2vec. [Alshahrani et al., 2017]
  • KG2vec pursues a similar idea as RDF2vec by first transforming the directed, labeled RDF graph into an undirected, unlabeled graph (using nodes for the relations) and then extracting walks from that transformed graph. [Wang et al., 2021] Although no direct comparison is available, we assume that the embeddings are comparable.
  • Wembedder is a simplified version of RDF2vec which uses the raw triples of a knowledge graph as input to the word2vec implementation, instead of random walks. It serves pre-computed vectors for Wikidata. [Nielsen, 2017]
  • KG2vec (not to be confused with the aforementioned approach also named KG2vec) follows the same idea of using triples as input to a Skip-Gram algorithm. [Soru et al., 2018]

RDF2vec always generates embedding vectors for an entire knowledge graph. In many practical cases, however, we only need vectors for a small set of target entities. In such cases, generating vectors for an entire large graph like DBpedia would not be a practical solution.

  • RDF2vec Light is an alternative which can be used in such scenarios. It only creates random walks on a subset of the knowledge graph and can thus produce embedding vectors for a target subset of entities quickly. In many cases, the results are competitive with those achieved with embeddings of the full graph. [Portisch et al., 2020] Details about the implementation are found here.
  • LODVec uses the same mechanism as RDF2vec Light, but creates sequences across different datasets by exploiting owl:sameAs links, and unifying classes and predicates by exploiting owl:equivalentClass and owl:equivalentProperty definitions. [Mountantonakis and Tzitzikas, 2021]
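The central idea behind these lightweight variants can be sketched in plain Python: instead of walking from every node of the graph, walks are generated only from a small target set, so the training corpus stays proportional to the number of target entities. This is an illustrative simplification with invented names, not the RDF2vec Light implementation:

```python
import random

# Toy graph; only a few entities are of interest for the downstream task.
graph = {
    "ex:A": [("ex:rel", "ex:B")],
    "ex:B": [("ex:rel", "ex:C")],
    "ex:C": [("ex:rel", "ex:A")],
    "ex:D": [("ex:rel", "ex:A")],
}
targets = {"ex:A", "ex:B"}   # the subset we need vectors for

def walks_for_targets(graph, targets, n_walks, depth, seed=0):
    """Random walks starting only at the target entities."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(targets):
        for _ in range(n_walks):
            walk, node = [start], start
            for _ in range(depth):
                edges = graph.get(node)
                if not edges:
                    break
                pred, obj = rng.choice(edges)
                walk += [pred, obj]
                node = obj
            walks.append(walk)
    return walks

walks = walks_for_targets(graph, targets, n_walks=2, depth=2)
# ex:D never appears as a walk start, so the walk corpus (and hence the
# training time) does not grow with the size of the full graph.
```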

One area which has undergone extensive research is the creation of walks for the RDF2vec algorithm. While the original implementation uses random walks, alternatives that have been explored include:

  • The use of different heuristics for biasing the walks, e.g., preferring edges with more/less frequent predicates, preferring links to nodes with higher/lower PageRank, etc. An extensive study is available in [Cochez et al., 2017a].
  • A similar approach is analyzed in [Al Taweel and Paulheim, 2020], where embeddings for DBpedia are trained with external edge weights derived from page transition probabilities in Wikipedia.
  • In [Vandewiele et al., 2020], we have analyzed different alternatives to using random walks, such as walk strategies with teleportation within communities. While random walks are usually a good choice, there are scenarios in which other walking strategies are superior.
  • In [Saeed and Prasanna, 2018], the identification of specific properties for groups of entities is discussed as a means to find task-specific edge weights.
  • Mukherjee et al. [Mukherjee et al., 2019] also observe that biasing the walks with prior knowledge on relevant properties and classes for a domain can improve the results obtained with RDF2vec.
  • The ontowalk2vec approach [Gkotse, 2020] combines the random walk strategies of RDF2vec and node2vec, and trains a language model on the union of both walk sets.
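The biasing strategies above share a common mechanism: the next edge of a walk is drawn proportionally to an edge weight rather than uniformly. A minimal sketch, with invented names and weights (which could stand in for predicate frequencies, PageRank scores, or page transition probabilities):

```python
import random

# Toy weighted graph: each outgoing edge carries a weight (values invented).
graph = {
    "ex:A": [("ex:common", "ex:B", 9.0), ("ex:rare", "ex:C", 1.0)],
    "ex:B": [("ex:common", "ex:A", 1.0)],
    "ex:C": [("ex:common", "ex:A", 1.0)],
}

def biased_walk(graph, start, depth, rng):
    """Random walk that picks the next edge proportionally to its weight."""
    walk, node = [start], start
    for _ in range(depth):
        edges = graph.get(node)
        if not edges:
            break
        weights = [w for _, _, w in edges]
        pred, obj, _ = rng.choices(edges, weights=weights, k=1)[0]
        walk += [pred, obj]
        node = obj
    return walk

rng = random.Random(1)
walks = [biased_walk(graph, "ex:A", 1, rng) for _ in range(1000)]
# With a 9:1 weight ratio, roughly 90% of the walks should end in ex:B.
share_b = sum(w[-1] == "ex:B" for w in walks) / len(walks)
```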

RDF2vec relies on the word2vec embedding mechanism once the sequences are created. This is not the only choice:

While the original RDF2vec approach is agnostic to the type of knowledge encoded in RDF, it is also possible to extend the approach to specific types of datasets.

To materialize or not to materialize? While it might look like a good idea to enrich the knowledge graph with implicit knowledge before training the embeddings, experimental results show that materializing implicit knowledge actually makes the resulting embedding worse, not better.

  • In [Iana and Paulheim, 2020], we have conducted a series of experiments training embeddings on DBpedia as is, vs. training embeddings on DBpedia with implicit knowledge materialized. In most settings, the results on downstream tasks get worse when adding implicit knowledge. Our hypothesis is that missing information in many knowledge graphs is not missing at random, but a signal of lesser importance, and that signal is canceled out by materialization. A similar observation was made by [Alshahrani et al., 2017].

Other Resources

Other useful resources for working with RDF2vec:

Applications

RDF2vec has been used in a variety of applications. In the following, we list a number of those, organized by different fields of applications.

Knowledge Graph Refinement

Knowledge Graph Refinement subsumes the usage of embeddings for adding additional information to a knowledge graph (e.g., link/relation or type prediction), for extending its schema/ontology, or for identifying (and potentially correcting) erroneous facts in the graph [Paulheim, 2017]. In most of the applications, RDF2vec embedding vectors are used as representations for training a machine learning classifier for the task at hand, e.g., a predictive model for entity types. Applications in this area include:
  • TIEmb is an approach for learning subsumption relations using RDF2vec embeddings. [Ristoski et al., 2017]
  • Kejriwal and Szekely discuss the use of RDF2vec embeddings for entity type prediction in knowledge graphs. [Kejriwal and Szekely, 2017] Another approach in that direction is proposed by Sofronova et al., who contrast supervised and unsupervised methods for exploiting RDF2vec embeddings for type prediction. [Sofronova et al., 2020] Furthermore, the usage of RDF2vec for type prediction in knowledge graphs is discussed in [Weller, 2021] and [Jain et al., 2021].
  • GraphEmbeddings4DDI utilizes RDF2vec for predicting drug-drug interactions [Çelebi et al., 2018]. A similar system is introduced by Karim et al., using a complex LSTM on top of the entity embeddings generated with RDF2vec [Karim et al., 2019]. Since the drug-drug interactions are modeled as a relation in the knowledge graphs used for the experiments, this task is essentially a relation prediction task.
  • Ammar and Celebi showcase the use of RDF2vec embeddings for the fact validation task at the 2019 edition of the Semantic Web Challenge. [Ammar and Celebi, 2019]. A similar approach is pursued by Pister and Atemezing [Pister and Atemezing, 2019].
  • Chen et al. show that RDF2vec embeddings can be used for relation prediction and yields results competitive with TransE and DistMult [Chen et al., 2020].
  • Yao and Barbosa combine RDF2vec and outlier detection for detecting wrong type assertions in knowledge graphs [Yao and Barbosa, 2021].
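The common pattern in these works is to treat the embedding vectors as feature vectors for a downstream classifier, e.g., for type prediction. A deliberately minimal stdlib sketch using a nearest-centroid classifier over invented two-dimensional "embeddings" (real setups use high-dimensional RDF2vec vectors and libraries such as scikit-learn):

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Invented embedding vectors for labeled entities (names for illustration).
train = {
    "dbr:Berlin":  ([0.9, 0.1], "City"),
    "dbr:Paris":   ([0.8, 0.2], "City"),
    "dbr:Germany": ([0.1, 0.9], "Country"),
    "dbr:France":  ([0.2, 0.8], "Country"),
}

def centroids(train):
    """Mean vector per class label."""
    sums, counts = {}, {}
    for vec, label in train.values():
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [x / counts[lbl] for x in acc] for lbl, acc in sums.items()}

def predict_type(vec, cents):
    """Assign the label whose class centroid is closest to the entity vector."""
    return min(cents, key=lambda lbl: dist(vec, cents[lbl]))

cents = centroids(train)
predicted = predict_type([0.85, 0.15], cents)   # an unseen "city-like" vector
```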

Knowledge Matching and Integration

In knowledge matching and integration, entity embedding vectors are mostly utilized to determine whether two entities in two datasets are similar enough to each other to merge them into one. Different approaches have been proposed using RDF2vec for matching and integration both on the schema as well as on the instance level:
  • MERGILO is a tool for merging structured knowledge extracted from text. A refinement of MERGILO using RDF2vec embeddings on FrameNet is discussed in [Alam et al., 2017].
  • EARL is a named entity linking tool which uses pre-trained RDF2vec embeddings. [Dubey et al., 2018]
  • ALOD2vec Matcher is an ontology matching system which uses pre-trained embeddings on the WebIsALOD knowledge graph to determine the similarity of two concepts. [Portisch and Paulheim, 2018]. A similar approach is pursued by the DESKMatcher system, which uses domain specific embeddings from the business domain, e.g., the FIBO ontology [Monych et al., 2020].
  • AnyGraphMatcher is another ontology matching system which leverages RDF2vec embeddings trained on the two input ontologies to match [Lütke, 2019].
  • Azmy et al. use RDF2vec for entity matching across knowledge graphs, and show a large-scale study for matching DBpedia and Wikidata [Azmy et al., 2019]. A similar approach is introduced by Aghaei and Fensel, who combine RDF2vec embeddings with clustering and BERT sentence embeddings to identify related entities in two knowledge graphs [Aghaei and Fensel, 2021].
  • In a showcase for the MELT ontology matching framework, Hertling et al. show that by learning a non-linear mapping between RDF2vec embeddings of different ontologies, ontology matching can be performed at least for structurally similar ontologies [Hertling et al., 2020].

Applications in NLP

In natural language processing, knowledge graph embeddings are particularly handy in setups that already exploit knowledge graphs, for example, for linking entities in text to a knowledge graph using named entity linking and named entity disambiguation. Applications of RDF2vec in the NLP field include:
  • TREC CAR is a benchmark for complex answer retrieval. The authors use pre-trained RDF2vec embeddings as one means to represent queries and answers, and for matching them onto each other. [Nanni et al., 2017a]
  • Inan and Dikenelli demonstrate the usage of RDF2vec embeddings in named entity disambiguation in the entity disambiguation frameworks DoSeR and AGDISTIS. [Inan and Dikenelli, 2017]
  • Wang et al. have used RDF2vec embeddings for analyzing entity co-occurrence in tweets [Wang et al., 2017].
  • Nanni et al. showcase the use of RDF2vec embeddings for entity aspect linking in [Nanni et al., 2018].
  • KGA-CGM is a system for describing images with captions. It uses RDF2vec embeddings for handling out-of-training entities [Mogadala et al., 2018].
  • Türker discusses the use of RDF2vec for text categorization by embedding both texts and categories [Türker, 2019].
  • Vakulenko demonstrates the use of RDF2vec in dialogue systems [Vakulenko, 2019].
  • G-Rex is a tool for relation extraction from text which leverages RDF2vec entity embeddings [Ristoski et al., 2020].
  • El Vaigh et al. show that using cosine similarity in the RDF2vec space creates a strong baseline for collective entity linking [El Vaigh et al., 2020]. This is particularly remarkable since that metric measures similarity, not relatedness, which is actually needed for the task at hand.
  • FinMatcher is a tool for named entity classification in the financial domain, developed for the FinSim-2 shared task. It uses pre-trained RDF2vec embeddings on WebIsALOD [Portisch et al., 2021].

Information Retrieval

In information retrieval, similarity and relatedness of entities can be utilized to retrieve and/or rank results for queries for a given entity. Examples for the use of RDF2vec in the information retrieval field include:
  • Nanni et al. describe a system for harvesting event collections from Wikipedia, where RDF2vec is used internally for entity ranking. [Nanni et al., 2017b]
  • Ad Hoc Table Retrieval using Semantic Similarity describes the use of pre-trained RDF2vec embeddings for retrieving Wikipedia tables. [Zhang and Balog, 2018] In a later extension, they distinguish two kinds of retrieval tasks (using either keywords or tables as queries), and show that entity embeddings with RDF2vec can be used for both scenarios. [Zhang and Balog, 2021]
  • Cyber-all-intel is an application in the computer security domain. It uses RDF2vec vectors for retrieving information on security alerts [Mittal et al., 2019].
  • The COVID-19 literature knowledge graph is a large citation network of CoViD-19 related scientific publications, derived from the CORD-19 dataset. In [Steenwinckel et al., 2020], the authors exploit RDF2vec embeddings on that graph for facilitating the retrieval of related articles, as well as for clustering the large body of literature.

Predictive Modeling

Predictive modeling was the original use case for which RDF2vec was developed. Here, external variables (which might be continuous or categorical) are predicted for a set of entities. By linking these entities to a knowledge graph, entity embeddings have been shown to be suitable representations for the downstream predictive modeling tools. Examples in this field include:
  • Hascoet et al. show how to use RDF2vec for image classification, especially for classes of images for which no training data is available, i.e., zero-shot-learning [Hascoet et al., 2017].
  • evoKGsim* combines similarity metrics and genetic programming for predicting out-of-KG relations. The framework implements RDF2vec as one source of similarity metrics. [Sousa et al., 2021]
  • Biswas et al. discuss the use of RDF2vec as a signal for predicting infobox types in Wikipedia articles [Biswas et al., 2018].
  • Egami et al. show the use case of geospatial data analytics in urban spaces by constructing a geospatial knowledge graph and computing RDF2vec embeddings thereon [Egami et al., 2018].
  • Hees discusses the use of pre-trained RDF2vec models for predicting human associations of terms [Hees, 2018].
  • The utilization of RDF2vec for content-based recommender systems is discussed in [Saeed and Prasanna, 2018], [Ristoski et al., 2019], and [Voit and Paulheim, 2021].
  • Jurgovsky demonstrates the use of RDF2vec for data augmentation on the task of credit card fraud detection [Jurgovsky, 2019].
  • Hoppe et al. demonstrate the use of RDF2vec embeddings on DBpedia for improving the classification of scientific articles [Hoppe et al., 2021].
  • [Nunes et al., 2021] show how graph embeddings on biomedical ontologies can be utilized for predicting drug-gene-interactions. They train classifiers such as random forests over the concatenated embedding vectors of the drugs and genes.

Other Applications

No matter how sophisticated your categorization schema is, you always end up with a category called "other" or "misc.". Here are examples for applications of RDF2vec in that category:
  • REMES is an entity summarization approach which uses RDF2vec to select a suitable subset of statements for describing an entity. [Gunaratna et al., 2017]
  • Similar to that, Shi et al. propose an approach for extracting semantically coherent subgraphs from a knowledge graph, which uses RDF2vec as a measure for semantic distance to guarantee semantic coherence. [Shi et al., 2021]
  • Jurisch and Igler demonstrate the utilization of RDF2vec embeddings for detecting changes in ontologies in [Jurisch and Igler, 2018].

References

These are the core publications of RDF2vec:

  1. Petar Ristoski, Heiko Paulheim: RDF2Vec: RDF Graph Embeddings for Data Mining. International Semantic Web Conference, 2016
  2. Petar Ristoski, Jessica Rosati, Tommaso Di Noia, Renato De Leone, Heiko Paulheim: RDF2Vec: RDF Graph Embeddings and Their Applications. Semantic Web Journal 10(4), 2019

Further references used above:

  1. Sareh Aghaei, Anna Fensel: Finding Similar Entities Across Knowledge Graphs. International Conference on Advances in Computer Science and Information Technology, 2021.
  2. Mehwish Alam, Diego Reforgiato Recupero, Misael Mongiovi, Aldo Gangemi, Petar Ristoski: Event-based knowledge reconciliation using frame embeddings and frame similarity. Knowledge-based Systems (135), 2017
  3. Mona Alshahrani, Mohammad Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, Robert Hoehndorf: Neuro-symbolic representation learning on biological knowledge graphs. Bioinformatics 33(17), 2017.
  4. Faisal Alshargi, Saeedeh Shekarpour, Tommaso Soru, Amit Sheth: Concept2vec: Metrics for Evaluating Quality of Embeddings for Ontological Concepts. Spring Symposium on Combining Machine Learning with Knowledge Engineering, 2019
  5. Ahmad Al Taweel, Heiko Paulheim: Towards Exploiting Implicit Human Feedback for Improving RDF2vec Embeddings. Deep Learning for Knowledge Graphs Workshop, 2020
  6. Ammar Ammar, Remzi Celebi: Fact Validation with Knowledge Graph Embeddings. International Semantic Web Conference, 2019
  7. Michael Azmy, Peng Shi, Jimmy Lin, Ihab F. Ilyas: Matching Entities Across Different Knowledge Graphs with Graph Embeddings. arxiv.org, 2019
  8. Remzi Çelebi, Erkan Yaşar, Hüseyin Uyar, Özgür Gümüş, Oguz Dikenelli, Michel Dumontier: Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction using Linked Open Data. International Conference Semantic Web Applications and Tools for Life Sciences, 2018
  9. Russa Biswas, Rima Türker, Farshad Bakhshandegan-Moghaddam, Maria Koutraki, Harald Sack: Wikipedia Infobox Type Prediction Using Embeddings. Workshop on Deep Learning for Knowledge Graphs and Semantic Technologies, 2018
  10. Jiaoyan Chen, Xi Chen, Ian Horrocks, Erik B. Myklebust, Ernesto Jiménez-Ruiz: Correcting Knowledge Base Assertions. The Web Conference, 2020
  11. Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, Ian Horrocks: OWL2Vec*: Embedding of OWL Ontologies. arxiv.org, 2020
  12. Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, Heiko Paulheim: Biased Graph Walks for RDF Graph Embeddings. International Conference on Web Intelligence, Mining, and Semantics, 2017
  13. Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, Heiko Paulheim: Global RDF Vector Space Embeddings. International Semantic Web Conference, 2017
  14. Mohnish Dubey, Debayan Banerjee, Debanjan Chaudhuri, Jens Lehmann: EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs. International Semantic Web Conference, 2018
  15. Shusaku Egami, Takahiro Kawamura, Akihiko Ohsuga: Predicting Urban Problems: A Comparison of Graph-based and Image-based Methods. Joint International Semantic Technology Conference, 2018
  16. Cheikh-Brahim El Vaigh, François Goasdoué, Guillaume Gravier, Pascale Sébillot: A Novel Path-based Entity Relatedness Measure for Efficient Collective Entity Linking. In: International Semantic Web Conference, 2020
  17. Michael Färber: The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data. International Semantic Web Conference, 2019
  18. Blerina Gkotse: Ontology-based Generation of Personalised Data Management Systems: an Application to Experimental Particle Physics. PhD Thesis at MINES ParisTech, 2020.
  19. Aditya Grover and Jure Leskovec: node2vec: Scalable Feature Learning for Networks. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
  20. Kalpa Gunaratna, Amir Hossein Yazdavar, Krishnaprasad Thirunarayan, Amit Sheth, Gong Cheng: Relatedness-based Multi-Entity Summarization. International Joint Conference on Artificial Intelligence, 2017
  21. Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi: Semantic Web and Zero-Shot Learning of Large Scale Visual Classes. International Workshop on Symbolic-Neural Learning, 2017
  22. Jörn Hees: Simulating Human Associations with Linked Data. University of Kaiserslautern, 2018
  23. Sven Hertling, Jan Portisch, Heiko Paulheim: Supervised Ontology and Instance Matching with MELT. Ontology Matching, 2020.
  24. Ole Magnus Holter, Erik B. Myklebust, Jiaoyan Chen, Ernesto Jimenez-Ruiz: Embedding OWL Ontologies with OWL2Vec. International Semantic Web Conference, 2019
  25. Fabian Hoppe, Danilo Dessì, Harald Sack: Deep Learning meets Knowledge Graphs for Scholarly Data Classification. Companion Proceedings of the Web Conference, 2021.
  26. Andreea Iana, Heiko Paulheim: More is not Always Better: The Negative Impact of A-box Materialization on RDF2vec Knowledge Graph Embeddings. Combining Symbolic and Sub-symbolic methods and their Applications (CSSA), 2020
  27. Emrah Inan, Oguz Dikenelli: Effect of Enriched Ontology Structures on RDF Embedding-Based Entity Linking. Metadata and Semantic Research, 2017
  28. Nitisha Jain, Jan-Christoph Kalo, Wolf-Tilo Balke, Ralf Krestel: Do Embeddings Actually Capture Knowledge Graph Semantics?. Extended Semantic Web Conference, 2021
  29. Johannes Jurgovsky: Context-Aware Credit Card Fraud Detection. University of Passau, 2019
  30. Md Rezaul Karim, Michael Cochez, Joao Bosco Jares, Mamtaz Uddin, Oya Beyan, Stefan Decker: Drug-drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network. ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 2019
  31. Matthias Jurisch, Bodo Igler: RDF2Vec-based Classification of Ontology Alignment Changes. Workshop on Deep Learning for Knowledge Graphs and Semantic Technologies, 2018
  32. Mayank Kejriwal, Pedro Szekely: Supervised Typing of Big Graphs using Semantic Embeddings. International Workshop on Semantic Big Data, 2017
  33. Alexander Lütke: AnyGraphMatcher Submission to the OAEI Knowledge Graph Challenge 2019. International Workshop on Ontology Matching, 2019
  34. Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean: Efficient Estimation of Word Representations in Vector Space. International Conference on Learning Representations, 2013
  35. Sudip Mittal, Anupam Joshi, Tim Finin: Cyber-All-Intel: An AI for Security related Threat Intelligence. arxiv.org, 2019
  36. Aditya Mogadala, Umanga Bista, Lexing Xie, Achim Rettinger: Knowledge Guided Attention and Inference for Describing Images Containing Unseen Objects. Extended Semantic Web Conference, 2018
  37. Michael Monych, Jan Portisch, Michael Hladik, Heiko Paulheim: DESKMatcher. Ontology Matching, 2020
  38. Michalis Mountantonakis, Yannis Tzitzikas: Applying Cross-Dataset Identity Reasoning for Producing URI Embeddings over Hundreds of RDF Datasets. International Journal of Metadata, Semantics and Ontologies, 2021
  39. Sourav Mukherjee, Tim Oates, Ryan Wright: Graph Node Embeddings using Domain-Aware Biased Random Walks. arxiv.org, 2019
  40. Federico Nanni, Bhaskar Mitra, Matt Magnusson, Laura Dietz: Benchmark for Complex Answer Retrieval. ACM International Conference on the Theory of Information Retrieval, 2017
  41. Federico Nanni, Simone Paolo Ponzetto, Laura Dietz: Building Entity-Centric Event Collections. ACM/IEEE Joint Conference on Digital Libraries, 2017
  42. Federico Nanni, Simone Paolo Ponzetto, Laura Dietz: Entity-aspect linking: providing fine-grained semantics of entities in context. International Joint Conference on Digital Libraries, 2018
  43. Finn Årup Nielsen: Wembedder: Wikidata entity embedding web service. arxiv.org, 2017
  44. Susana Nunes, Rita T. Sousa, Catia Pesquita: Predicting Gene-Disease Associations with Knowledge Graph Embeddings over Multiple Ontologies. arxiv.org, 2021
  45. Heiko Paulheim: Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods. Semantic Web 8(3), 2017
  46. Maria Angela Pellegrino, Michael Cochez, Martina Garofalo, Petar Ristoski: A Configurable Evaluation Framework for Node Embedding Techniques. Extended Semantic Web Conference, 2019
  47. Maria Angela Pellegrino, Abdulrahman Altabba, Martina Garofalo, Petar Ristoski, Michael Cochez: GEval: A Modular and Extensible Evaluation Framework for Graph Embedding Techniques. Extended Semantic Web Conference, 2020
  48. Jeffrey Pennington, Richard Socher, Christopher D. Manning: GloVe: Global Vectors for Word Representation. Empirical Methods in Natural Language Processing, 2014
  49. Bryan Perozzi, Rami Al-Rfou, Steven Skiena: DeepWalk: Online Learning of Social Representations. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2014.
  50. Alexis Pister, Ghislain Atemezing: Knowledge Graph Embedding for Triples Fact Validation. International Semantic Web Conference, 2019
  51. Jan Portisch and Heiko Paulheim: ALOD2vec Matcher. International Workshop on Ontology Matching, 2018
  52. Jan Portisch, Michael Hladik, Heiko Paulheim: KGvec2go - Knowledge Graph Embeddings as a Service. International Conference on Language Resources and Evaluation, 2020
  53. Jan Portisch, Michael Hladik, Heiko Paulheim: RDF2Vec Light – A Lightweight Approach for Knowledge Graph Embeddings. International Semantic Web Conference, Posters and Demos, 2020.
  54. Jan Portisch, Michael Hladik, Heiko Paulheim: FinMatcher at FinSim-2: Hypernym Detection in the Financial Services Domain using Knowledge Graphs. Workshop on Financial Technology on the Web (FinWeb), 2021.
  55. Petar Ristoski, Stefano Faralli, Simone Paolo Ponzetto, Heiko Paulheim: Large-scale taxonomy induction using entity and word embeddings. International Conference on Web Intelligence, 2017
  56. Petar Ristoski: Exploiting Semantic Web Knowledge Graphs in Data Mining. IOS Press, Studies on the Semantic Web (38), 2019
  57. Petar Ristoski, Anna Lisa Gentile, Alfredo Alba, Daniel Gruhl, Steven Welch: Large-scale relation extraction from web documents and knowledge graphs with human-in-the-loop. Semantic Web Journal (60), 2020
  58. Muhammad Rizwan Saeed, Viktor K. Prasanna: Extracting Entity-Specific Substructures for RDF Graph Embedding. IEEE International Conference on Information Reuse and Integration, 2018
  59. Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling: Modeling Relational Data with Graph Convolutional Networks. Extended Semantic Web Conference, 2018.
  60. Yuxuan Shi, Gong Cheng, Trung-Kien Tran, Evgeny Kharlamov, Yulin Shen: Efficient Computation of Semantically Cohesive Subgraphs for Keyword-Based Knowledge Graph Exploration. The Web Conference, 2021.
  61. Radina Sofronova, Russa Biswas, Mehwish Alam, Harald Sack: Entity Typing based on RDF2Vec using Supervised and Unsupervised Methods. Extended Semantic Web Conference, 2020.
  62. Tommaso Soru, Stefano Ruberto, Diego Moussallem, Edgard Marx, Diego Esteves, Axel-Cyrille Ngonga Ngomo: Expeditious Generation of Knowledge Graph Embeddings. European Conference on Data Analysis, 2018
  63. Rita T. Sousa, Sara Silva, Catia Pesquita: evoKGsim*: a framework for tailoring Knowledge Graph-based similarity for supervised learning. OpenReview, 2021.
  64. Bram Steenwinckel, Gilles Vandewiele, Ilja Rausch, Pieter Heyvaert, Ruben Taelman, Pieter Colpaert, Pieter Simoens, Anastasia Dimou, Filip De Turck, Femke Ongenae: Facilitating the Analysis of COVID-19 Literature Through a Knowledge Graph. International Semantic Web Conference, 2020.
  65. Rima Türker: Knowledge-Based Dataless Text Categorization. Extended Semantic Web Conference, 2019
  66. Gilles Vandewiele, Bram Steenwinckel, Pieter Bonte, Michael Weyns, Heiko Paulheim, Petar Ristoski, Filip De Turck, Femke Ongenae: Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge Graphs, arxiv.org, 2020.
  67. Svitlana Vakulenko: Knowledge-based Conversational Search. TU Wien, 2019.
  68. Michael Matthias Voit, Heiko Paulheim: Bias in Knowledge Graphs - an Empirical Study with Movie Recommendation and Different Language Editions of DBpedia. Conference on Language, Data and Knowledge, 2021
  69. Yiwei Wang, Mark James Carman, Yuan Fang Li: Using knowledge graphs to explain entity co-occurrence in Twitter. ACM Conference on Knowledge and Information Management, 2017
  70. YueQun Wang, LiYan Dong, XiaoQuan Jiang, XinTao Ma, YongLi Li, Hao Zhang: KG2Vec: A node2vec-based vectorization model for knowledge graph. PLOS ONE, 2021
  71. Tobias Weller: Learning Latent Features using Stochastic Neural Networks on Graph Structured Data. KIT, 2021.
  72. Peiran Yao and Denilson Barbosa: Typing Errors in Factual Knowledge Graphs: Severity and Possible Ways Out. The Web Conference, 2021
  73. Shuo Zhang and Krisztian Balog: Ad Hoc Table Retrieval using Semantic Similarity. The Web Conference, 2018
  74. Shuo Zhang and Krisztian Balog: Semantic Table Retrieval using Keyword and Table Queries. arxiv.org, 2021
  75. Amal Zouaq and Felix Martel: What is the schema of your knowledge graph?: leveraging knowledge graph embeddings and clustering for expressive taxonomy learning. International Workshop on Semantic Big Data, 2020.

Acknowledgements

The original development of RDF2vec was funded in the project Mine@LOD by the Deutsche Forschungsgemeinschaft (DFG) under grant number PA 2373/1-1 from 2013 to 2018.

Contact

If you are aware of any implementations, extensions, pre-trained models, or applications of RDF2vec not listed on this Web page, please get in touch with Heiko Paulheim.