GoTag: A case study in using a shared UK e-Science infrastructure

Moustafa Ghanem, Vasa Curcin, Yike Guo, Neil Davis, Rob Gaizauskas, Yikun Guo, Henk Harkema, Ian Roberts, Jonathan Ratcliffe

Conference or Workshop Paper
4th UK e-Science All Hands Meeting 2005
September, 2005
ISBN 1-904425-53-4

This paper describes our experience constructing GoTag, a distributed system for automatically annotating Medline documents with relevant GO (Gene Ontology) terms. The system is built on a service-based text mining infrastructure that integrates tools developed within the Discovery Net and myGrid projects. Two baseline approaches to assigning GO terms have been developed. The first assigns GO terms by directly matching GO term names and synonyms in documents. The second uses a document classifier trained on feature-vector representations of documents, where the GO terms associated with each training document are taken from the manually curated yeast genome database. We present preliminary evaluation results for both approaches and discuss proposals for enhancing each baseline, as well as for constructing a hybrid approach.
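
The first baseline, direct matching of GO term names and synonyms, can be illustrated with a minimal Python sketch. This is not the GoTag implementation: the lexicon below is a tiny hand-written sample, and the names GO_LEXICON, build_patterns and annotate are hypothetical, chosen only for illustration. It assumes case-insensitive, whole-word matching, which the abstract does not specify.

```python
import re

# Illustrative sample lexicon (an assumption, not data from the paper):
# GO identifier -> term name followed by synonyms.
GO_LEXICON = {
    "GO:0006412": ["translation", "protein biosynthesis"],
    "GO:0006260": ["DNA replication"],
}


def build_patterns(lexicon):
    """Compile one case-insensitive, whole-word regex per GO identifier."""
    patterns = {}
    for go_id, names in lexicon.items():
        # Escape each name so it is matched literally, longest name first.
        alternatives = sorted((re.escape(n) for n in names), key=len, reverse=True)
        patterns[go_id] = re.compile(
            r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE
        )
    return patterns


def annotate(text, patterns):
    """Return the sorted GO identifiers whose name or synonym occurs in text."""
    return sorted(go_id for go_id, pattern in patterns.items() if pattern.search(text))


PATTERNS = build_patterns(GO_LEXICON)

if __name__ == "__main__":
    abstract = "We studied DNA replication and protein biosynthesis in yeast."
    print(annotate(abstract, PATTERNS))
```

A literal matcher of this kind only finds terms whose exact name or synonym appears in the text, so it misses paraphrases and inflected forms. A trained classifier does not depend on the exact wording.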