G-SESAME
A G-SESAME is a Hybrid Semantic Similarity Measure that determines the semantic similarity between nodes based on both their locations in the semantic graph structure and their semantic relations with their ancestor nodes.
- AKA: Wang-Du-Payattakool-Yu-Chen Semantic Similarity Measure, Gene Semantic Similarity Analysis and Measurement Tools.
- Context:
- Website: http://bioinformatics.clemson.edu/G-SESAME/
- It was initially developed by (Wang et al., 2007a).
- Example(s):
- Counter-Example(s):
- See: Semantic Similarity Measure, Semantic Similarity Neural Network, Semantic Word Similarity Measure, Gene Semantic Similarity Measure, Semantic Relatedness Measure, Similarity Matrix, Generalized Cosine-Similarity Measure (GCSM), Path Distance Similarity Measure.
References
2021
- (G-SESAME, 2021) ⇒ http://bioinformatics.clemson.edu/G-SESAME/ Retrieved: 2021-08-08.
- QUOTE: G-SESAME is a set of on-line tools to measure the semantic similarities of Gene Ontology (GO) terms and the functional similarities of gene products, and to discover biomedical knowledge through GO database. These tools are originally based on the G-SESAME paper in 2007. They were developed using MariaDB, PHP and hosted by an Apache Web server running on a Linux operating system (CentOS 7). New methods taking into account the statistical distribution of the GO database are implemented as a new features. Other state-of-the-art methods were also implemented to allow researchers to choose the best methods on their own needs.
2014
- (Song et al., 2014) ⇒ Xuebo Song, Lin Li, Pradip K. Srimani, Philip S. Yu, and James Z. Wang (2014). "Measure the Semantic Similarity of GO Terms Using Aggregate Information Content". In: IEEE/ACM Transactions on Computational Biology and Bioinformatics 11(3).
- QUOTE: In this paper we propose a novel and efficient method to measure the semantic similarity of GO terms. The proposed method addresses the limitations in existing GO term similarity measurement techniques; it computes the semantic content of a GO term by considering the information content of all of its ancestor terms in the graph. The aggregate information content (AIC) of all ancestor terms of a GO term implicitly reflects the GO term's location in the GO graph and also represents how human beings use this GO term and all its ancestor terms to annotate genes. We show that semantic similarity of GO terms obtained by our method closely matches the human perception. Extensive experimental studies show that this novel method also outperforms all existing methods in terms of the correlation with gene expression data. We have developed web services for measuring semantic similarity of GO terms and functional similarity of genes using the proposed AIC method and other popular methods. These web services are available at http://bioinformatics.clemson.edu/G-SESAME.
2009
- (Du et al., 2009) ⇒ Zhidian Du, Lin Li, Chin-Fu Chen, Philip S. Yu, and James Z. Wang (2009). "G-SESAME: Web Tools for GO-term-based Gene Similarity Analysis and Knowledge Discovery". In: Nucleic Acids Research, 37:W345-W349.
- QUOTE: We provide and maintain a set of online tools to measure the semantic similarities of Gene Ontology (GO) terms and the functional similarities of gene products, and to discover biomedical knowledge through GO database. These tools are developed based on methods and algorithms proposed in our G-SESAME article (Wang et al., 2007a) using MySQL 5.0.45 and PHP 5.1.6 and hosted by an Apache Web server (version 2.2.3) running on a Linux operating system (CentOS 5).
2007a
- (Wang et al., 2007a) ⇒ James Z. Wang, Zhidian Du, Rapeeporn Payattakool, Philip S. Yu, and Chin-Fu Chen (2007). "A new method to measure the semantic similarity of GO terms". In: Bioinformatics 23(10).
- QUOTE: To address this critical need, we proposed a novel method to encode a GO term's semantics (biological meanings) into a numeric value by aggregating the semantic contributions of their ancestor terms (including this specific term) in the GO graph and, in turn, designed an algorithm to measure the semantic similarity of GO terms. Based on the semantic similarities of GO terms used for gene annotation, we designed a new algorithm to measure the functional similarity of genes.
(...)
Given $DAG_A = (A, T_A, E_A)$ and $DAG_B = (B, T_B, E_B)$ for GO terms $A$ and $B$ respectively, the semantic similarity between these two terms, $S_{GO}(A, B)$, is defined as
$S_{GO}(A, B)=\dfrac{\displaystyle\sum_{t \in T_{A} \cap T_{B}}\left(S_{A}(t)+S_{B}(t)\right)}{SV(A)+SV(B)} \qquad (3)$
- where $S_A(t)$ is the S-value of GO term $t$ related to term $A$ and $S_B(t)$ is the S-value of GO term $t$ related to term $B$.
This formula determines the semantic similarity of two GO terms based on both the locations of these terms in the GO graph and their semantic relations with their ancestor terms, addressing the drawbacks in the existing approaches. For any term $t \in T_A \cap T_B$, $S_A(t)$ may differ from $S_B(t)$ even if term $t$ is a common term in both $DAG_A$ and $DAG_B$. This is because the locations of term $A$ and $B$ are different in the entire GO graph.
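The formula in the quote can be illustrated with a small sketch. The quoted excerpt does not define the S-values or $SV$, so the code below makes these assumptions: $S_A(A) = 1$, each ancestor's S-value is the maximum over its children in $DAG_A$ of the edge weight times the child's S-value, $SV(A)$ is the sum of all S-values in $DAG_A$, and the edge weights are 0.8 for `is_a` and 0.6 for `part_of`. The toy graph, the term names, and the function names `s_values` and `s_go` are hypothetical and do not come from the G-SESAME tools.

```python
from typing import Dict, List, Tuple

# Hypothetical toy graph: child term -> list of (parent term, relation).
# These names are illustrative only; they are not real GO identifiers.
PARENTS: Dict[str, List[Tuple[str, str]]] = {
    "A": [("C", "is_a")],
    "B": [("C", "part_of")],
    "C": [("root", "is_a")],
    "root": [],
}

# Assumed semantic contribution factors per relation type
# (not given in the quoted text).
WEIGHTS: Dict[str, float] = {"is_a": 0.8, "part_of": 0.6}


def s_values(term: str,
             parents: Dict[str, List[Tuple[str, str]]],
             weights: Dict[str, float]) -> Dict[str, float]:
    """Return {t: S_term(t)} for the term itself and all of its ancestors."""
    s = {term: 1.0}          # assumption: a term contributes 1 to itself
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in parents.get(child, []):
            candidate = weights[relation] * s[child]
            # Keep the largest contribution reaching this ancestor and
            # revisit it so the improvement propagates further up.
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)
    return s


def s_go(a: str, b: str,
         parents: Dict[str, List[Tuple[str, str]]],
         weights: Dict[str, float]) -> float:
    """Semantic similarity of terms a and b, following equation (3)."""
    s_a = s_values(a, parents, weights)
    s_b = s_values(b, parents, weights)
    common = s_a.keys() & s_b.keys()                  # T_A intersect T_B
    numerator = sum(s_a[t] + s_b[t] for t in common)
    denominator = sum(s_a.values()) + sum(s_b.values())  # SV(A) + SV(B)
    return numerator / denominator


if __name__ == "__main__":
    print(s_go("A", "B", PARENTS, WEIGHTS))
```

In this toy graph the shared ancestor `C` receives a different S-value from `A` (via `is_a`) than from `B` (via `part_of`), which is the situation the quote describes when it notes that $S_A(t)$ may differ from $S_B(t)$ for a common term.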
2007b
- (Wang et al., 2007b) ⇒ James Z. Wang, Zhidian Du, Philip S. Yu, and Chin-Fu Chen (2007). "An Efficient Online Tool to Search Top-N Genes with Similar Biological Functions in Gene Ontology Database". In: 2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007).
- QUOTE: In this paper, using a new method to measure the semantic similarity of GO terms, an efficient algorithm is proposed to find genes that have similar biological functions with a given gene. An online tool is then implemented to search the top N genes having similar biological functions with a particular gene within the same or cross different species. Furthermore, various performance enhancement techniques are utilized to reduce the user query response time of the online tool.