The University of Arizona
TCW Summary

singleTCW contains four major programs:

  1. runAS -- annotation setup for input to runSingleTCW
    • Downloads the UniProt taxonomic and/or full SwissProt and TrEMBL .dat files.
    • Creates FASTA files from the .dat files for searching against TCW sequences.
    • Downloads the GO MySQL database and augments it with UniProt information about GO, KEGG, Pfam, and InterPro.
    Note: runSingleTCW can also take other databases as input, such as GenBank nr, but results from those databases will not have associated GO annotations.
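
The .dat-to-FASTA step above can be illustrated with a minimal Python sketch. It is not the runAS implementation; it only assumes the standard UniProt flat-file layout (an `ID` line, an `SQ` line followed by indented sequence lines, and `//` closing each entry). The function names `dat_to_fasta_records` and `format_fasta` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def dat_to_fasta_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (entry_id, sequence) pairs from UniProt flat-file lines."""
    entry_id = None
    seq_parts = []
    in_seq = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ID   "):
            # The entry name is the second whitespace-separated field.
            entry_id = line.split()[1]
        elif line.startswith("SQ   "):
            # Sequence lines follow until the '//' terminator.
            in_seq = True
        elif line.startswith("//"):
            if entry_id is not None:
                yield entry_id, "".join(seq_parts)
            entry_id, seq_parts, in_seq = None, [], False
        elif in_seq:
            # Sequence lines are indented and split into blocks by spaces.
            seq_parts.append(line.replace(" ", ""))


def format_fasta(entry_id: str, sequence: str, width: int = 60) -> str:
    """Format one record as FASTA text with fixed-width sequence lines."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + entry_id + "\n" + "\n".join(rows) + "\n"
```

A real conversion would likely also carry the description and taxonomy into the FASTA header; this sketch keeps only the entry name.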

  2. runSingleTCW -- create single TCW database
    • Input: sequences and optional counts, where any of the following are valid:
      1. Load RNA-seq transcripts and count data with optional replicates.
      2. Load protein sequences and spectra (count) data with optional replicates.
      3. Load sequences with location information (e.g. predicted genes).
      4. Assemble up to ~1M sequences, such as: transcript sets, paired-end Sanger ESTs, or a mix of transcripts and ESTs.
    • Annotation:
      1. Annotate sequences with one or more nucleotide or protein databases (called annoDBs). UniProt should be downloaded with the runAS program. Searches can be run with the fast DIAMOND or ublast programs, or with standard BLAST.
      2. If UniProt is used, GO annotations along with EC, KEGG and Pfam identifiers are extracted from the GO database and entered into the sTCW database. The GO database is set up with the runAS program.
      3. Compute ORFs and GC content.
    • All data and results are stored in a MySQL database.
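
To make the "ORFs and GC content" step concrete, here is a minimal Python sketch. It is not TCW's ORF finder: it assumes a simple ATG-to-stop definition and scans only the three forward-strand frames. The names `gc_content` and `longest_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf(seq: str) -> tuple:
    """Return (start, end, frame) of the longest ATG..stop ORF.

    Only the forward strand is scanned. 'end' is exclusive and
    includes the stop codon. Returns (0, 0, 0) if no ORF is found.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3, frame)
                start = None
    return best
```

A fuller version would also scan the reverse complement and handle ORFs that run off the end of a partial transcript.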

  3. runDE -- run Differential Expression analysis
    • An interface to several R packages (edgeR, DESeq) for calculating differential expression of sequences. It can also execute a user-provided R script for the DE calculation.
    • If UniProt is used and GO annotations are loaded, the GOseq R package can be used to compute differential GO terms.
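
As a rough illustration of the quantity a DE analysis starts from, the Python sketch below normalizes counts to counts per million and computes a log2 fold change between two conditions. This is a deliberately naive stand-in, not what edgeR or DESeq compute; those packages fit statistical models and report significance values. The function names and the pseudocount of 1 are assumptions made for the example.

```python
import math
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def log2_fold_change(cond_a: Dict[str, int], cond_b: Dict[str, int],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """log2(B/A) per sequence on CPM values; the pseudocount avoids log(0)."""
    cpm_a = counts_per_million(cond_a)
    cpm_b = counts_per_million(cond_b)
    return {
        name: math.log2((cpm_b.get(name, 0.0) + pseudocount)
                        / (cpm_a[name] + pseudocount))
        for name in cpm_a
    }
```

A positive value means the sequence is relatively more abundant in the second condition; without replicates and a statistical model, no significance can be attached to it.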

  4. viewSingleTCW -- view single TCW database
    • Query and view the results. Various filters are provided, for example filters specific to taxonomic databases, trimmed GOs, and annotation. The initial view is the Overview, which summarizes the results.

multiTCW contains two major programs:

  1. runMultiTCW -- build species comparison TCW database
    • Builds a database of singleTCW projects.
    • Runs blast to compare the sequences from the input TCWs.
    • Clusters the sequences into ortholog groups using OrthoMCL, TCW Closure, and/or TCW BBH (best bi-directional hit); user-supplied clusters can also be uploaded. The database can hold multiple ortholog clusterings for querying.
    • If the input is from DNA singleTCW projects, coding statistics are calculated. In addition, alignment files are written for input to KaKs_calculator, and the results of running KaKs_calculator are then read back into runMultiTCW.
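
The general idea behind best bi-directional hit clustering can be sketched in a few lines of Python. This is not the TCW BBH implementation: it assumes the search results have already been reduced to a score per (query, subject) pair, and the names `best_hits` and `bidirectional_best_hits` are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)
```

For example, if `a1` and `b1` are each other's top hit, they form a pair, while a sequence whose best hit prefers a different partner is left unpaired.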

  2. viewMultiTCW -- view multi TCW database
    • Query and view the results, which can be filtered on various attributes. A cluster can be viewed graphically as a MUSCLE multiple alignment or as pairwise alignments (e.g. Muscle and Pair).

Email: tcw@agcol.arizona.edu
References:
C. Soderlund, W. Nelson and S. Goff (2014) Allele Workbench: transcriptome pipeline and interactive graphics for allele-specific expression. PLOS ONE.
Describes a pipeline that can be used with TCW, plus a new GO trim algorithm.

C. Soderlund, W. Nelson, M. Willer and D. Gang (2013) TCW: Transcriptome Computational Workbench. PLOS ONE.
Describes the TCW package.

C. Soderlund, E. Johnson, M. Bomhoff and A. Descour (2009) PAVE: Program for Assembling and Viewing ESTs. BMC Genomics.
Describes the assembly algorithm.
