Changes

RNA-seq Pipeline for Known Transcripts

207 bytes added, 16:08, 8 November 2011
added section for aligning reads with bowtie
</pre>
# Filter for quality, if applicable, using the FASTX-Toolkit. The following keeps only reads in which 100% of the bases have a quality score of 25 or greater:<pre>fastq_quality_filter -v -q 25 -p 100 -i control-reads.txt -o control-reads-quality25.txt</pre>
# Trim, if applicable
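To see how many reads survive the quality filter, it can help to compare read counts before and after. The following is a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); <code>count_fastq_reads</code> is a hypothetical helper name, not part of the FASTX-Toolkit.

```shell
# Hypothetical helper: count reads in a FASTQ file.
# Assumes each record is exactly four lines (header, sequence, '+', qualities).
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Example usage with the file names from the filter step (runs only if both files exist).
if [ -f control-reads.txt ] && [ -f control-reads-quality25.txt ]; then
    echo "before: $(count_fastq_reads control-reads.txt)"
    echo "after:  $(count_fastq_reads control-reads-quality25.txt)"
fi
```

A large drop in the count may suggest that <code>-p 100</code> is too strict for the data set, since a single low-quality base is enough to discard a read.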
==Align Reads==
This can be done with either TopHat or Bowtie, so choose one of the following. The reference genomes are located in the following locations:
<pre>
/database/davebrid/RNAseq/reference-genomes/hg19
/database/davebrid/RNAseq/reference-genomes/mm9
</pre>
These reference genomes are pre-built Bowtie indexes of the UCSC genomes, downloaded from ftp://ftp.cbcb.umd.edu/pub/data/bowtie_indexes/
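Before aligning, it may be worth confirming that an index is complete. A Bowtie (version 1) index consists of six <code>.ebwt</code> files that share a base name. The sketch below checks for them; <code>check_bowtie_index</code> is a hypothetical helper, and the base name <code>hg19/hg19</code> inside the reference directory is an assumption about how the files are laid out.

```shell
# Hypothetical helper: verify that all six files of a Bowtie (version 1) index exist.
# Argument: the index base name, i.e. the path without the .N.ebwt suffix.
check_bowtie_index() {
    base=$1
    missing=0
    for suffix in 1.ebwt 2.ebwt 3.ebwt 4.ebwt rev.1.ebwt rev.2.ebwt; do
        if [ ! -f "$base.$suffix" ]; then
            echo "missing: $base.$suffix" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (assumed base name; adjust to the actual index location).
check_bowtie_index /database/davebrid/RNAseq/reference-genomes/hg19/hg19 \
    || echo "hg19 index incomplete or base name differs"
```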
===Align Reads to Reference Genome with Bowtie===
Run Bowtie to align reads to the reference genome. The following aligns the quality-filtered reads to hg19 using the <code>--best</code> flag and writes the alignment in SAM format:
<pre>
$ bowtie --sam --best /database/davebrid/RNAseq/reference-genomes/hg19/hg19 control-reads-quality25.txt control-aligned-quality25.sam
</pre>
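After the alignment finishes, a quick sanity check is to count how many SAM records are aligned versus unaligned. The following is a rough sketch that uses only <code>awk</code>; <code>sam_alignment_summary</code> is a hypothetical helper name. It relies on the SAM convention that header lines start with <code>@</code> and that bit 0x4 of the FLAG column marks an unaligned read.

```shell
# Hypothetical helper: count aligned and unaligned records in a SAM file.
# Field 2 is the FLAG; bit 0x4 set means the read is unaligned.
sam_alignment_summary() {
    awk -F '\t' '
        /^@/ { next }                                # skip header lines
        int($2 / 4) % 2 == 1 { unaligned++; next }   # FLAG bit 0x4 is set
        { aligned++ }
        END { printf "aligned=%d unaligned=%d\n", aligned, unaligned }
    ' "$1"
}

# Example usage with the output file name from the Bowtie step (runs only if it exists).
if [ -f control-aligned-quality25.sam ]; then
    sam_alignment_summary control-aligned-quality25.sam
fi
```

The counts are per SAM record, so they match per-read counts only when each read is reported at most once.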
===Align Reads to Reference Genome with Tophat===
Run tophat to align reads to the reference genome. I’ve included a pseudo command line as well as a “real” command line.
<pre>