Knowledge Graphs, Link Prediction and Enterprises
Hello, there! In my previous post I discussed the basics of embedding-based Link Prediction on Knowledge Graphs.
On that occasion I included a pointer to a comparative analysis that I published on the topic; in this post I’d like to borrow a few of the concepts from that work about the current status of Link Prediction research.
Quick recap: Link Prediction infers new facts in a Knowledge Graph by leveraging the ones already known. Most Link Prediction models use Machine Learning to learn embeddings of the entities and relations; the embeddings are optimized to fit a Scoring Function Φ that estimates the plausibility of individual facts. After training on many facts already known to be true, the learned embeddings should yield good Φ values for unknown true facts as well.
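To make the recap concrete, here is a minimal sketch of what a Φ function can look like, assuming a TransE-style translational scoring function; the entity names are made up for illustration, and the embeddings here are random placeholders, whereas a real model would learn them from known facts:

```python
import numpy as np

# Minimal sketch of a TransE-style Φ (just one of many possible scoring
# functions). Entities and relations are dense vectors; a triple
# (head, relation, tail) is plausible when head + relation lands close
# to tail in the embedding space.

rng = np.random.default_rng(42)
dim = 50

# Toy random embeddings for illustration only; in a real model they are
# learned by optimizing Φ over facts known to be true (and corrupted
# negative examples).
entity_emb = {e: rng.normal(size=dim) for e in ["Rome", "Italy", "Paris"]}
relation_emb = {"capital_of": rng.normal(size=dim)}

def phi(head: str, relation: str, tail: str) -> float:
    """TransE-style score: Φ(h, r, t) = -||h + r - t||. Higher = more plausible."""
    h, r, t = entity_emb[head], relation_emb[relation], entity_emb[tail]
    return -float(np.linalg.norm(h + r - t))

# After training, a true fact should outscore a false one:
print(phi("Rome", "capital_of", "Italy"))   # should be high post-training
print(phi("Paris", "capital_of", "Italy"))  # should be lower
```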
Link Prediction Trends
Since the pioneering TransE model, researchers have gone into a “Link Prediction frenzy”, creating dozens and dozens of new embedding-based models, each with a Φ function of its own. To put things into numeric perspective, these are the yearly Google Scholar counts of works citing TransE:
Damn, 2020, you just couldn’t keep the exponential trend going, could you?
This trend is pretty darn impressive. Of course, not all of these works propose new models; nonetheless, the numbers are huge – and TransE is now a bit dated, so the most recent works may not even mention it that often anymore. It would not be an exaggeration to say that hundreds of embedding-based Link Prediction models have been developed in the last 7-8 years.
With such a crowded scene, when I started my PhD on this topic I felt utterly overwhelmed. My first reaction was denial: “There’s no way all of these models are really meaningful! Most of them must just be junk!”. While this is not necessarily wrong, after a while my point of view started shifting. When you think about it, there is a special kind of beauty in the fact that knowledge can be learned in hundreds of different ways, each with its own features and quirks, even if many of them turn out to be suboptimal research-wise. I guess this is part of what makes Machine Learning so attractive, after all.
I was still puzzled, though: why are there so many Link Prediction models? Why is this topic getting this much attention, when it does not even have many practical applications yet? It took me some time to wrap my head around this, but I think the answer lies in the big shift that Knowledge Graphs have undergone in the last decade.
The Knowledge Graph Boom
In the 2000s, Knowledge Graphs (or Knowledge Bases, or Ontologies) were synonymous with Linked Open Data. The big players were open projects like Freebase or DBpedia, trying to implement Tim Berners-Lee’s vision of a Semantic Web of distributed, machine-readable concepts.
In the 2010s, for better or worse, enterprise Knowledge Graphs took over. In 2010 Google acquired Metaweb, the company behind Freebase, and built their Google Knowledge Graph (2012) on top of it to enhance their search engine with semantic knowledge (fun fact: this is where the term “Knowledge Graph” comes from!). Two years later they launched the Knowledge Vault project, combining the data obtained by a multitude of extractors with Link Prediction outcomes. In the same years, Microsoft developed their own Knowledge Graph, Satori (“understanding” in Japanese), and in 2017 they merged it with Bing into the Bing Entity Search service.
Big marketplaces like Amazon and eBay joined the game too, developing product graphs to encompass semantic knowledge about the products they sell. Amazon leverages the data in their graph to improve product recommendation, while eBay mostly uses theirs to power smart conversational agents. In 2018 both Airbnb and Uber Eats announced the creation of their own Knowledge Graphs, which they use to improve recommendations of activities and of foods/restaurants respectively.
In the meantime, the social networks did not stay dormant. In 2013 Facebook launched their Facebook Graph Search project to leverage semantic knowledge about the entities and topics that users are most invested in; it is mostly used to improve user profiling and thus provide users with better recommendations and targeted advertising. In 2016 LinkedIn announced their own Knowledge Graph with similar purposes, and in 2020 Pinterest followed the same route.
It is undeniable that such a Knowledge Graph boom has boosted many related research topics too. In the specific case of Link Prediction, I think it has become so popular because it has hit the “sweet spot” of three extremely favorable conditions:
- It applies to tools that have rapidly become very useful and profitable for giant tech companies, i.e., Knowledge Graphs;
- It attempts to tackle arguably the greatest issue that such tools suffer from, i.e., incompleteness;
- It does so with super fancy and trendy novel technologies that everybody is interested in, i.e., Machine Learning.
To come full circle, I finally tried to bring some order to this messy scenario by developing my own organization of Link Prediction models.
Link Prediction Taxonomy
This taxonomy groups embedding-based models according to the interpretation of their Φ function.
My taxonomy for Link Prediction models, with a small selection of representative examples. And with colors!
I identified three main families, each with further sub-groups (a toy sketch contrasting their Φ styles follows the list):
- Matrix Factorization Models
They mostly rely on linear algebra to combine the embeddings of heads, relations, and tails. I further split them into Bilinear Models, based on bilinear products, and Non-bilinear Models, which may employ more “esoteric” operations, e.g., Circular Correlation or Tucker decomposition.
- Geometric Models
They interpret relations as geometric operations in the embedding space. Starting from TransE, which is a purely translational model, researchers have studied smart ways of including Additional Embeddings, often mapping each entity to multiple relation-specific embeddings. Lately, though, Roto-translational operations have become the most promising research direction in this family.
- Deep Learning Models
They rely on Neural Networks, which include deep sequences of layers interspersed with non-linear activation functions. Their parameters are learned jointly with the embeddings, on which they operate in the Φ function. This family can be naturally divided into sub-groups based on the type of neural architecture: Convolutional Models, Recurrent Models, Capsule Models, etc. The use of additional parameters makes these models quite expressive, but in turn leads to longer training times and a greater risk of overfitting.
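Here is the promised toy sketch, contrasting one representative Φ per family (a DistMult-style bilinear product, a TransE-style translation, and a minimal feed-forward scorer); all weights are random placeholders rather than any model’s actual implementation, and a real deep model would use convolutional or recurrent layers instead of the tiny layer below:

```python
import numpy as np

# Toy illustrations of the three families' Φ styles. All vectors and
# weights are random placeholders; real models learn them from the graph.

rng = np.random.default_rng(0)
dim = 8
h, r, t = rng.normal(size=dim), rng.normal(size=dim), rng.normal(size=dim)

# 1) Matrix Factorization (bilinear), DistMult-style:
#    Φ(h, r, t) = Σ_i h_i * r_i * t_i
phi_bilinear = float(np.sum(h * r * t))

# 2) Geometric (translational), TransE-style: the relation acts as a
#    translation, so plausible triples satisfy h + r ≈ t.
phi_geometric = -float(np.linalg.norm(h + r - t))

# 3) Deep Learning: a tiny feed-forward scorer whose extra parameters
#    (W1, w2) would be learned jointly with the embeddings.
W1 = rng.normal(size=(dim, 3 * dim))
w2 = rng.normal(size=dim)
phi_neural = float(w2 @ np.tanh(W1 @ np.concatenate([h, r, t])))

print(phi_bilinear, phi_geometric, phi_neural)
```

The three scores take the same inputs but encode very different assumptions about how a relation acts on entities, which is exactly the distinction the taxonomy captures.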
That’s it for this post! Thank you for reading this far 🙏
As usual I will leave here a few useful references:
- Some data on enterprise Knowledge Graphs have been collected from the excellent book “Knowledge Graphs”, by Hogan et al.
- Tsinghua University has collected a list of no fewer than 50 must-read scientific papers on Link Prediction: enjoy!