Node-based Semantic Similarity Measure

AKA: Content-based Semantic Similarity Measure.
Example(s):
Counter-Example(s):
See: Semantic Similarity Measure, Semantic Similarity Neural Network, Semantic Word Similarity Measure, Gene Semantic Similarity Measure, Semantic Relatedness Measure, Similarity Matrix, Generalized Cosine-Similarity Measure (GCSM).

References

(Benabderrahmane et al., 2010) ⇒ Sidahmed Benabderrahmane, Malika Smail-Tabbone, Olivier Poch, Amedeo Napoli, and Marie-Dominique Devignes (2010). "IntelliGO: a New Vector-based Semantic Similarity Measure Including Annotation Origin". In: BMC Bioinformatics, 11(1), 1-16.
- QUOTE: Concerning the comparison between individual ontology terms, the two types of approaches reviewed by Pesquita et al. (2009) are similar to those proposed by Blanchard et al. (2008), namely the edge-based measures which rely on counting edges in the graph, and node-based measures which exploit information contained in the considered term, its descendants and its parents.
  In most edge-based measures, the Shortest Path-Length (SPL) is used as a distance measure between two terms in a graph.
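As a contrast to node-based measures, the Shortest Path-Length mentioned in the quote above can be sketched as a breadth-first search over the ontology graph. This is a minimal illustration only: the toy ontology `PARENTS`, its term names, and the helper functions are hypothetical, and treating is-a edges as undirected when counting is an assumption of this sketch, not something taken from the cited paper.

```python
from collections import deque

# Hypothetical toy ontology (assumption): child term -> list of parent terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}


def undirected_neighbors(parents):
    """Build an undirected adjacency map from the child -> parents edges."""
    adj = {term: set() for term in parents}
    for child, term_parents in parents.items():
        for parent in term_parents:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


def shortest_path_length(parents, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    adj = undirected_neighbors(parents)
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        term, dist = queue.popleft()
        if term == target:
            return dist
        for neighbor in adj[term]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None
```

In this toy graph, `shortest_path_length(PARENTS, "A1", "A2")` counts the two edges through the shared parent `A`, while `A1` and `B1` are four edges apart through `root`.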

(Pesquita et al., 2009) ⇒ Catia Pesquita, Daniel Faria, Andre O. Falcao, Phillip Lord, and Francisco M. Couto (2009). "Semantic Similarity in Biomedical Ontologies". In: PLoS Computational Biology 5(7): e1000443.
- QUOTE: Node-based approaches rely on comparing the properties of the terms involved, which can be related to the terms themselves, their ancestors, or their descendants. One concept commonly used in these approaches is information content (IC), which gives a measure of how specific and informative a term is. The IC of a term $c$ can be quantified as the negative log likelihood,
  $-\log p(c)$
  where $p(c)$ is the probability of occurrence of $c$ in a specific corpus (such as the UniProt Knowledgebase), being normally estimated by its frequency of annotation. Alternatively, the IC can also be calculated from the number of children a term has in the GO structure^[1], although this approach is less commonly used.

[1] Seco N, Veale T, Hayes J. An intrinsic information content metric for semantic similarity in WordNet. ECAI. 2004. pp. 1089–1090.
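The two ways of obtaining IC described in the quote above can be sketched as follows. This is a minimal illustration under stated assumptions: the toy ontology `PARENTS` and the annotation counts `DIRECT_COUNTS` are hypothetical, an annotation to a term is assumed to count for all of its ancestors as well, and the structure-based variant uses the form $1 - \log(\mathrm{desc}(c) + 1) / \log(N)$ over all descendants of $c$, which is one common reading of the intrinsic metric rather than a formula stated in the quote.

```python
import math

# Hypothetical toy ontology (assumption): child term -> list of parent terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}

# Hypothetical direct annotation counts per term (assumption).
DIRECT_COUNTS = {"root": 0, "A": 1, "B": 1, "A1": 2, "A2": 1, "B1": 3}


def ancestors(parents, term):
    """Return all ancestors of a term, excluding the term itself."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def information_content(parents, direct_counts):
    """IC(c) = -log p(c), with p(c) estimated from annotation frequency.

    Assumption: each annotation also counts for every ancestor of the term.
    Terms with no annotations are omitted, since -log(0) is undefined.
    """
    counts = {term: 0 for term in parents}
    for term, n in direct_counts.items():
        counts[term] += n
        for anc in ancestors(parents, term):
            counts[anc] += n
    total = sum(direct_counts.values())
    return {t: -math.log(c / total) for t, c in counts.items() if c > 0}


def intrinsic_ic(parents):
    """Structure-only IC: 1 - log(desc(c) + 1) / log(N), N = number of terms."""
    n_terms = len(parents)
    n_desc = {term: 0 for term in parents}
    for term in parents:
        for anc in ancestors(parents, term):
            n_desc[anc] += 1
    return {
        t: 1.0 - math.log(n_desc[t] + 1) / math.log(n_terms) for t in parents
    }
```

Under both variants the root term carries no information (IC of 0), and IC grows as terms become more specific. With these toy counts, `A` covers half of all annotations, so its corpus-based IC is $\log 2$.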

(Resnik, 1995) ⇒ Philip Resnik. (1995). “Using Information Content to Evaluate Semantic Similarity in a Taxonomy.” In: Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI 1995).