Shane C Burgess
Publications
PMID: 19811692;PMCID: PMC3226197;Abstract:
Background: The horse genome is sequenced, allowing equine researchers to use high-throughput functional genomics platforms such as microarrays, next-generation sequencing for gene expression, and proteomics. However, for researchers to derive value from these functional genomics datasets, they must be able to model the data in biologically relevant ways; doing so requires that the equine genome be more fully annotated. There are two interrelated types of genomic annotation: structural and functional. Structural annotation is delineating and demarcating the genomic elements (such as genes, promoters, and regulatory elements). Functional annotation is assigning function to structural elements. The Gene Ontology (GO) is the de facto standard for functional annotation, and is routinely used as a basis for modelling and hypothesis testing with large functional genomics datasets.
Results: An Equine Whole Genome Oligonucleotide (EWGO) array with 21,351 elements was developed at Texas A&M University. This 70-mer oligoarray was designed using the approximately 7× assembled and annotated sequence of the equine genome to be one of the most comprehensive arrays available for expressed equine sequences. To assist researchers in determining the biological meaning of data derived from this array, we have structurally annotated it by mapping the elements to multiple database accessions, including UniProtKB, Entrez Gene, NRPD (Non-Redundant Protein Database) and UniGene. We next provided GO functional annotations for the gene transcripts represented on this array. Overall, we GO annotated 14,531 gene products (68.1% of the gene products represented on the EWGO array) with 57,912 annotations. GAQ (GO Annotation Quality) scores were calculated for this array both before and after we added GO annotation. The additional annotations improved the mean GAQ score 16-fold. These data are publicly available at AgBase (http://www.agbase.msstate.edu/).
Conclusion: Providing additional information about the public databases which link to the gene products represented on the array allows users more flexibility when using gene expression modelling and hypothesis-testing computational tools. Moreover, since different databases provide different types of information, users have access to multiple data sources. In addition, our GO annotation underpins functional modelling for most gene expression analysis tools and enables equine researchers to model large lists of differentially expressed transcripts in biologically relevant ways. © 2009 Bright et al; licensee BioMed Central Ltd.
Abstract:
Structural annotation of genomes is one of the major goals of genomics research. Most widely used structural annotations are produced by computational pipelines. These computational methods are well known to have shortcomings, including false identifications and incorrect identification of gene boundaries. Proteomic data can be used to confirm genes identified by computational methods and to correct mistakes. A proteogenomic mapping method has been developed that uses peptides identified by mass spectrometry for structural annotation of genomes. Spectra are matched against both a protein database and the genome database translated in all six reading frames. Peptides that match the genome but not the protein database potentially represent novel protein-coding genes or annotation errors. These short, experimentally derived peptides are used to discover potential novel protein-coding genes, called expressed Protein Sequence Tags (ePSTs), by aligning the peptides to the genomic DNA and extending the translation in the 3' and 5' directions. In this paper, an enhanced pipeline has been designed and developed for discovering and evaluating potential novel protein-coding genes: 1) a distance-based outlier detection method for validating peptides identified from MS/MS, 2) proteogenomic mapping for discovery of potential novel protein-coding genes, and 3) collection of evidence from a number of sources and automatic evaluation of potential novel protein-coding genes using machine learning techniques such as neural networks, support vector machines, and naïve Bayes.
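The mapping step described in the abstract above (match peptides against a protein database and against the genome translated in all six reading frames, then keep those found only in the genome) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`six_frame_translations`, `find_peptide`, `novel_peptides`) and the toy sequences are hypothetical, peptides are assumed to be already identified from spectra, and matching is a plain exact substring search using the standard genetic code. Extension of hits into ePSTs, outlier detection, and the machine learning evaluation are not shown.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to translated sequences."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def find_peptide(peptide, dna):
    """Return (frame, amino-acid offset) for every exact match of the peptide."""
    hits = []
    for frame, protein in sorted(six_frame_translations(dna).items()):
        start = protein.find(peptide)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(peptide, start + 1)
    return hits


def novel_peptides(peptides, dna, protein_db):
    """Keep peptides that match the genome but no protein in protein_db."""
    return [
        p for p in peptides
        if find_peptide(p, dna) and not any(p in protein for protein in protein_db)
    ]


if __name__ == "__main__":
    genome = "ATGGCCAAATTTGGGTAA"   # toy sequence, not real data
    known_proteins = ["MAKF"]       # toy protein database
    print(novel_peptides(["MAK", "AKFG", "PKFG"], genome, known_proteins))
```

In this toy run, "MAK" is discarded because it already occurs in the protein database, while "AKFG" (forward strand) and "PKFG" (reverse strand) match only the translated genome and would be candidates for follow-up. A real search would also need to handle isobaric residues such as I/L, stop codons within frames, and statistical validation of the peptide identifications.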
PMID: 19079471;Abstract:
A high spectral contrast is expected to be very important when laser-induced fluorescence (LIF) is employed for cancer diagnosis. We developed a LIF optical fiber sensor to achieve a very high spectral contrast between normal and malignant tissues. A comprehensive experimental investigation was carried out to study the role of two critically important parameters for sensor design, namely, the excitation-collection geometry and the excitation wavelength, and their effect on the autofluorescence spectral contrast. An optimum sensing configuration was determined in order to enhance the small, but consistent, spectral difference between the normal and the malignant tissue for improving the accuracy of LIF-based cancer diagnosis. With the optimum sensor configuration, we realized a spectral contrast of more than 22 times between normal and malignant tissue sample spectra. © 2008 Optical Society of America.
PMID: 19153687;Abstract:
Sequential detergent extraction of proteins from eukaryotic cells has been used to increase proteome coverage of 2D-PAGE. We have adapted sequential detergent extraction for use with the high-throughput non-electrophoretic proteomics method of liquid chromatography and electrospray ionisation tandem mass spectrometry. This method of extraction yields comprehensive proteomes that include up to twice as many membrane proteins as other published methods. Two thirds of these membrane proteins have more than one transmembrane domain and many of these have multiple transmembrane domains. Since sequential detergent extraction (SDE) separates proteins based upon their physicochemistry and sub-cellular localisation, this method also provides useful data about cellular localisation.
PMID: 19577474;Abstract:
Understanding the effects of viral infection has typically focused on specific virus-host interactions such as tissue tropism, immune responses and histopathology. However, modeling viral pathogenesis requires information about the functions of gene products from both virus and host, and how these products interact. Recent developments in the functional annotation of genomes using Gene Ontology (GO) and in modeling functional interactions among gene products, together with an increased interest in systems biology, provide an excellent opportunity to generate global interaction models for viral infection. Here, we review how the GO is being used to model viral pathogenesis, with a focus on animal viruses. © 2009 Elsevier Ltd. All rights reserved.