Avoiding the pitfalls of gene set enrichment analysis with SetRank

C Simillion, R Liechti, HEL Lischer, V Ioannidis, R Bruggmann
BMC Bioinformatics, 2017, Springer
Background
The purpose of gene set enrichment analysis (GSEA) is to find general trends in the huge lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses.
Results
Here we present SetRank, an advanced GSEA algorithm that can eliminate many false positive hits. Its key principle is to discard gene sets that were initially flagged as significant if their significance is due only to their overlap with another gene set. The algorithm is explained in detail, and its performance is compared with that of other methods using objective benchmarking criteria. Furthermore, we explore how sample source bias can affect GSEA results.
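The key principle stated above can be illustrated with a small sketch. This is not the SetRank implementation, and the abstract does not specify the statistical test used. The sketch assumes a one-sided hypergeometric test and a fixed cutoff `alpha`, and all function names (`hypergeom_pvalue`, `enrichment_pvalue`, `filter_overlap_driven`) and gene identifiers are hypothetical. A set is dropped when it loses significance once the genes it shares with another significant set are removed, while that other set stays significant without them.

```python
from math import comb


def hypergeom_pvalue(universe_size, set_size, hits_total, hits_in_set):
    """One-sided tail P(X >= hits_in_set) of a hypergeometric distribution."""
    upper = min(set_size, hits_total)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_total - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / comb(universe_size, hits_total)


def enrichment_pvalue(gene_set, hits, universe):
    """Enrichment p-value of gene_set among the hit genes (assumed test)."""
    gene_set = gene_set & universe
    hits = hits & universe
    return hypergeom_pvalue(
        len(universe), len(gene_set), len(hits), len(gene_set & hits)
    )


def filter_overlap_driven(gene_sets, hits, universe, alpha=0.05):
    """Keep significant sets whose signal is not explained by another set.

    Simplified rule (an assumption, not the published algorithm): set A is
    discarded if A minus B is no longer significant while B minus A still is.
    """
    significant = {
        name: genes
        for name, genes in gene_sets.items()
        if enrichment_pvalue(genes, hits, universe) <= alpha
    }
    kept = {}
    for name, genes in significant.items():
        discarded = False
        for other, other_genes in significant.items():
            if other == name:
                continue
            own_rest = enrichment_pvalue(genes - other_genes, hits, universe)
            other_rest = enrichment_pvalue(other_genes - genes, hits, universe)
            if own_rest > alpha and other_rest <= alpha:
                discarded = True
                break
        if not discarded:
            kept[name] = genes
    return kept


# Toy data: 100 genes, of which the first 10 are "hits".
universe = {f"g{i}" for i in range(100)}
hits = {f"g{i}" for i in range(10)}
gene_sets = {
    # Strongly enriched on its own.
    "core": {f"g{i}" for i in range(8)} | {"g50", "g51"},
    # Its only hits (g0..g3) are shared with "core".
    "passenger": {f"g{i}" for i in range(4)} | {f"g{i}" for i in range(60, 66)},
    # Enriched through genes that no other set contains.
    "independent": {"g8", "g9", "g70"},
}

kept = filter_overlap_driven(gene_sets, hits, universe)
print(sorted(kept))
```

In this toy example the "passenger" set passes the initial test but carries no hits outside its overlap with "core", so the filter removes it. A single fixed cutoff is a simplification chosen to keep the sketch short.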
Conclusions
The benchmarking results show that SetRank is a highly specific tool for GSEA. Furthermore, we show that the reliability of results can be improved by taking sample source bias into account. SetRank is available as an R package and through an online web interface.