High-density tiling arrays provide a closer view of transcription than regular microarrays and can also be used for annotating functional elements in genomes. The identified transcripts usually have a complex overlapping architecture when compared to the existing genome annotation. Therefore, there is a need for customized tiling array data analysis tools. Since most of the initial tiling array studies were conducted in eukaryotes, existing data analysis methods are well suited for eukaryotic genomes. To use whole-genome tiling arrays to identify previously unknown transcriptional elements such as small RNA and antisense RNA in prokaryotes, existing data analysis tools need to be tailored to prokaryotic genome architecture. Furthermore, automation of such a custom data analysis workflow is necessary for biologists to apply this powerful platform for knowledge discovery. Here we describe TAAPP, a web-based package that consists of two modules for prokaryotic tiling array data analysis. The transcript generation module works on normalized data to generate transcriptionally active regions (TARs). The feature extraction and annotation module then maps TARs to existing genome annotation. This module further categorizes the transcription profile into potential novel non-coding RNA, antisense RNA, gene expression and operon structures. The implemented workflow is microarray platform independent and is presented as a web-based service. The web interface is freely available for academic use at http://lims.lsbi.mafes.msstate.edu/TAAPP-HTML/. © 2011 Beijing Genomics Institute.
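The two-step idea in this abstract (call TARs from normalized probe signal, then map them to annotation) can be illustrated with a minimal sketch. This is not TAAPP's implementation: the abstract does not state its algorithm, so the threshold rule, the `min_probes` and `max_gap` parameters, the `Gene` record, and the category labels below are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical annotation record; not TAAPP's data model."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def call_tars(probes: List[Tuple[int, float]], threshold: float,
              min_probes: int = 3, max_gap: int = 50) -> List[Tuple[int, int]]:
    """Call TARs as runs of consecutive above-threshold probes.

    probes: (position, normalized intensity) pairs for ONE strand,
    sorted by position. The threshold, min_probes and max_gap values
    are illustrative assumptions.
    """
    tars: List[Tuple[int, int]] = []
    run: List[int] = []  # positions of the current above-threshold run

    def close_run() -> None:
        # Keep the run only if it is supported by enough probes.
        if len(run) >= min_probes:
            tars.append((run[0], run[-1]))
        run.clear()

    for pos, value in probes:
        if value >= threshold:
            # A large positional gap between probes also ends the run.
            if run and pos - run[-1] > max_gap:
                close_run()
            run.append(pos)
        else:
            close_run()
    close_run()
    return tars


def classify_tar(tar: Tuple[int, int], strand: str, genes: List[Gene]) -> str:
    """Assign a coarse category by overlap with annotated genes."""
    start, end = tar
    overlapping = [g for g in genes if g.start <= end and start <= g.end]
    if any(g.strand == strand for g in overlapping):
        return "gene expression"
    if overlapping:
        return "antisense RNA candidate"
    return "novel non-coding RNA candidate"
```

A call such as `call_tars(probes, threshold=4.0)` followed by `classify_tar(tar, "+", genes)` for each returned region would reproduce the flow described above. Operon detection is left out because it needs information the abstract does not specify.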
PMID: 18021451;PMCID: PMC2204016;Abstract:
Background: The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results: We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion: We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. 
Moreover, information from one genome can be used to improve the annotation of other genomes and inform gene prediction algorithms. © 2007 Buza et al; licensee BioMed Central Ltd.
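The orthology-based transfer of GO annotations described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary-based inputs, and the identifiers used in the usage note are hypothetical, and the authors' actual pipeline, evidence codes, and quality filters are not given in the abstract.

```python
from typing import Dict, List, Set, Tuple


def transfer_go(orthologs: Dict[str, List[str]],
                partner_go: Dict[str, Set[str]],
                existing: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Propose new GO annotations for chicken proteins via orthologs.

    orthologs:  chicken protein ID -> human/mouse ortholog IDs
    partner_go: ortholog ID -> GO term IDs already assigned to it
    existing:   chicken protein ID -> GO terms it already has
    Returns (chicken ID, GO term, source ortholog) triples, skipping
    terms the chicken protein already carries.
    """
    proposed: List[Tuple[str, str, str]] = []
    for chicken_id, partner_ids in orthologs.items():
        seen = set(existing.get(chicken_id, set()))
        for partner_id in partner_ids:
            # Sort for a deterministic output order.
            for term in sorted(partner_go.get(partner_id, set())):
                if term not in seen:
                    seen.add(term)
                    proposed.append((chicken_id, term, partner_id))
    return proposed
```

With placeholder inputs such as `orthologs = {"chk1": ["hum1", "mus1"]}`, the function returns only terms that are new for `chk1`, each tagged with the ortholog it came from, so the provenance of every transferred annotation stays traceable.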
PMID: 22247176;PMCID: PMC3302605;Abstract:
The feasibility of short-read sequencing for genomic analysis was demonstrated for Fibroporia radiculosa, a copper-tolerant fungus that causes brown rot decay of wood. The effect of read quality on genomic assembly was assessed by filtering Illumina GAIIx reads from a single run of a paired-end library (75-nucleotide read length and 300-bp fragment size) at three different stringency levels and then assembling each data set with Velvet. A simple approach was devised to determine which filter stringency was "best." Venn diagrams identified the regions containing reads that were used in an assembly but were of a low-enough quality to be removed by a filter. By plotting base quality histograms of reads in this region, we judged whether a filter was too stringent or not stringent enough. Our best assembly had a genome size of 33.6 Mb, an N50 of 65.8 kb for a k-mer of 51, and a maximum contig length of 347 kb. Using GeneMark, 9,262 genes were predicted. TargetP and SignalP analyses showed that among the 1,213 genes with secreted products, 986 had motifs for signal peptides and 227 had motifs for signal anchors. Blast2GO analysis provided functional annotation for 5,407 genes. We identified 29 genes with putative roles in copper tolerance and 73 genes for lignocellulose degradation. A search for homologs of these 102 genes showed that F. radiculosa exhibited more similarity to Postia placenta than Serpula lacrymans. Notable differences were found, however, and their involvements in copper tolerance and wood decay are discussed. © 2012, American Society for Microbiology.
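For readers unfamiliar with the N50 statistic reported in this abstract, a minimal sketch of the standard definition follows: N50 is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches at least half of the total assembly length. The function name and the toy contig lengths are illustrative and are not taken from the paper.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # First contig at which the running sum covers half the assembly.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```

In the toy example the total length is 32, and the two longest contigs (8 and 8) already sum to 16, so the N50 is 8.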
PMID: 12072527;PMCID: PMC136297;Abstract:
Understanding the interactions between herpesviruses and their host cells and also the interactions between neoplastically transformed cells and the host immune system is fundamental to understanding the mechanisms of herpesvirus oncology. However, this has been difficult as no animal models of herpesvirus-induced oncogenesis in the natural host exist in which neoplastically transformed cells are also definitively identified and may be studied in vivo. Marek's disease (MD) herpesvirus (MDV) of poultry, although a recognized natural oncogenic virus causing T-cell lymphomas, is no exception. In this work, we identify for the first time the neoplastically transformed cells in MD as the CD4(+) major histocompatibility complex (MHC) class I(hi), MHC class II(hi), interleukin-2 receptor alpha-chain-positive, CD28(lo/-), phosphoprotein 38-negative (pp38(-)), glycoprotein B-negative (gB(-)), alphabeta T-cell-receptor-positive (TCR(+)) cells which uniquely overexpress a novel host-encoded extracellular antigen that is also expressed by MDV-transformed cell lines and recognized by the monoclonal antibody (MAb) AV37. Normal uninfected leukocytes and MD lymphoma cells were isolated directly ex vivo and examined by flow cytometry with MAb recognizing AV37, known leukocyte antigens, and MDV antigens pp38 and gB. CD28 mRNA was examined by PCR. Cell cycle distribution and in vitro survival were compared for each lymphoma cell population. 
We demonstrate for the first time that the antigen recognized by AV37 is expressed at very low levels by small minorities of uninfected leukocytes, whereas particular MD lymphoma cells uniquely express extremely high levels of the AV37 antigen; the AV37(hi) MD lymphoma cells fulfill the accepted criteria for neoplastic transformation in vivo (protection from cell death despite hyperproliferation, presence in all MD lymphomas, and not supportive of MDV production); the lymphoma environment is essential for AV37(+) MD lymphoma cell survival; pp38 is an antigen expressed during MDV-productive infection and is not expressed by neoplastically transformed cells in vivo; AV37(+) MD lymphoma cells have the putative immune evasion mechanism of CD28 down-regulation; AV37(hi) peripheral blood leukocytes appear early after MDV infection in both MD-resistant and -susceptible chickens; and analysis of TCR variable beta chain gene family expression suggests that MD lymphomas have polyclonal origins. Identification of the neoplastically transformed cells in MD facilitates a detailed understanding of MD pathogenesis and also improves the utility of MD as a general model for herpesvirus oncology.