“We at Google have made tremendous advances in understanding language. Our knowledge graph has been fundamental to that. The destiny of [Google’s search engine] is to become that Star Trek computer, and that’s what we are building.”

Amit Singhal, former head of Google Search

The topic of knowledge graphs as a form of knowledge representation in artificial intelligence surfaced in a discussion at a recent American Board of AI in Medicine (ABAIM) introductory course. Knowledge graphs are widely used in sectors outside medicine and healthcare, for purposes such as organizing knowledge on the internet, integrating data in enterprises, and representing the output of natural language processing and computer vision algorithms. The knowledge graph is an important concept because it closely mirrors how clinicians reason in clinical practice, and yet it is not typically covered in medical school or clinical training curricula.

A knowledge graph (also known as a semantic network) is a representation of knowledge as a set of nodes, which stand for entities, connected by edges, which stand for the relationships between those entities. Each edge connects two nodes, and these interconnections are the essential representation of how the entities relate to one another. In a biomedical knowledge graph, the entities can be real-world objects (for example, intravenous inotropic agents) or abstract concepts (such as septic shock). Of note, the Resource Description Framework (RDF) is a standard for data interchange that expresses such relationships as subject-predicate-object statements. Knowledge graphs are therefore abstractions for organizing knowledge in a structured format.
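As a rough illustration of the node-and-edge structure described above, the following Python sketch stores a tiny graph as (head, relation, tail) triples and looks up the neighbors of a node. The triples, the relation names (`is_a`, `may_be_treated_with`), and the helper functions `build_graph` and `neighbors` are hypothetical choices made for this example; they are not drawn from any real biomedical ontology and are not clinical guidance.

```python
from collections import defaultdict

# Hypothetical triples (head, relation, tail), for illustration only.
TRIPLES = [
    ("septic shock", "is_a", "shock"),
    ("septic shock", "may_be_treated_with", "intravenous inotropic agent"),
    ("dobutamine", "is_a", "intravenous inotropic agent"),
]


def build_graph(triples):
    """Index the triples by head node: node -> list of (relation, tail)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def neighbors(graph, node, relation=None):
    """Return the tail nodes reachable from `node`, optionally filtered by relation."""
    return [
        tail
        for rel, tail in graph.get(node, [])
        if relation is None or rel == relation
    ]


graph = build_graph(TRIPLES)
print(neighbors(graph, "septic shock"))
print(neighbors(graph, "dobutamine", relation="is_a"))
```

Each edge is directed and labeled, so the same pair of nodes could be linked by several different relationships. A production system would more likely use an RDF store or a graph database than an in-memory dictionary.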

These knowledge graphs are essential underpinnings of a type of learning called knowledge representation learning. This learning system comprises three parts: 1) knowledge graph data in the form of “triples” (a head entity and a tail entity joined by a relationship); 2) entity and relationship embeddings (an embedding transforms a symbolic input into a vector of numbers); and 3) a scoring component that assesses the plausibility of a triple using one of two types of models (a semantic matching model based on similarity, or a translational distance model based on distance).
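The two families of scoring models can be sketched in a few lines of Python. The example below uses TransE as one well-known translational distance model and DistMult as one well-known semantic matching model; these are illustrative choices, not the only models in each family. The two-dimensional embedding values are hand-picked assumptions so that the arithmetic is easy to follow, whereas in practice embeddings are learned from the triples and each model would train its own.

```python
import math

# Hypothetical, hand-picked 2-D embeddings (normally learned from data).
entity_vec = {
    "dobutamine": [1.0, 1.0],
    "intravenous inotropic agent": [1.0, 2.0],
    "septic shock": [-1.0, 0.5],
}
relation_vec = {"is_a": [0.0, 1.0]}


def transe_score(h, r, t):
    """Translational distance (TransE): -||h + r - t||. Closer to 0 is more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h, r, t):
    """Semantic matching (DistMult): sum of h_i * r_i * t_i. Higher is more plausible."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def score_triple(score_fn, head, relation, tail):
    """Look up the embeddings for a (head, relation, tail) triple and score it."""
    return score_fn(entity_vec[head], relation_vec[relation], entity_vec[tail])


true_triple = ("dobutamine", "is_a", "intravenous inotropic agent")
corrupted_triple = ("dobutamine", "is_a", "septic shock")

for name, fn in [("TransE", transe_score), ("DistMult", distmult_score)]:
    print(name, score_triple(fn, *true_triple), score_triple(fn, *corrupted_triple))
```

With these toy vectors, both models rank the true triple above the corrupted one. Training adjusts the embeddings so that this ordering holds across the triples in the knowledge graph.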

In summary, knowledge graphs are robust, interlinked frameworks for representing data, with considerable potential for applications in biomedicine. Recent advances in natural language processing and computer vision have further increased the relevance of knowledge graphs. The unprecedented scale of existing knowledge graphs (such as the Google Knowledge Graph, with over 500 million entities and 18 billion relationships) will inspire further work in this domain for biomedicine. Finally, both top-down and bottom-up approaches to building knowledge graphs will give clinicians ample opportunities to contribute to this effort.