RNA-seq Pipeline for Known Transcripts: Difference between revisions
added instructions for barcode splitting
added section for aligning reads with bowtie
==Reference Pages==
* http://seqanswers.com/wiki/How-to/RNASeq_analysis
==Sequence Quality and Trimming==
# Run FastQC to assess the quality of the reads from the sequencer and:
* Example command, where s_2_100.txt is the original file, mybarcodes.txt is the barcode file, barcodes are matched at the beginning of each read (--bol), and 2 mismatches are allowed (the default is 1). This generates one file per barcode, named /tmp/bla_BC#.txt:
<pre>
cat s_2_100.txt | /usr/local/bin/fastx_barcode_splitter.pl --bcfile mybarcodes.txt --bol --mismatches 2 --prefix /tmp/bla_ --suffix ".txt"
</pre>
# Filter for quality, if applicable
# Trim, if applicable, using the FASTX-Toolkit. The following keeps only the reads in which 100% of the bases (-p 100) have a quality score of 25 or greater (-q 25):
<pre>
fastq_quality_filter -v -q 25 -p 100 -i control-reads.txt -o control-reads-quality25.txt
</pre>
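To make the -q 25 -p 100 settings concrete, here is a minimal sketch that applies the same rule (every base must score 25 or higher) to a tiny made-up FASTQ file with plain awk. It is not how fastq_quality_filter is implemented. The file names under /tmp and the two reads are hypothetical, and the sketch assumes Phred+33 quality encoding, which may not match your sequencer output or the offset the FASTX-Toolkit expects by default.

```shell
#!/bin/sh
# Sketch only: mimics the effect of `-q 25 -p 100` on a toy FASTQ file.
# Assumption: quality characters are Phred+33 encoded.
in=/tmp/toy-reads.fastq
out=/tmp/toy-reads-quality25.fastq

# Hypothetical input: read1 is all "I" (Phred 40); read2 contains "#" (Phred 2).
printf '@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\nII#I\n' > "$in"

awk -v minq=25 -v offset=33 '
BEGIN {
    # Build a character-to-ASCII-code lookup table (awk has no ord function).
    for (i = 33; i < 127; i++) code[sprintf("%c", i)] = i
}
{
    rec[NR % 4] = $0                      # collect the four lines of one record
    if (NR % 4 == 0) {                    # fourth line is the quality string
        ok = 1
        for (i = 1; i <= length($0); i++)
            if (code[substr($0, i, 1)] - offset < minq) ok = 0
        # Print the record only if every base passed the threshold.
        if (ok) printf "%s\n%s\n%s\n%s\n", rec[1], rec[2], rec[3], rec[0]
    }
}' "$in" > "$out"

cat "$out"
```

Only read1 survives, because a single low-quality base is enough to reject read2 when -p is 100. Lowering -p would tolerate a fraction of low-quality bases per read.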
==Align Reads==
Reads can be aligned with either TopHat or Bowtie; choose one of the following. The reference genomes are stored in the following locations:
<pre>
/database/davebrid/RNAseq/reference-genomes/hg19
/database/davebrid/RNAseq/reference-genomes/mm9
</pre>
These are pre-built Bowtie indexes of the UCSC genomes, downloaded from ftp://ftp.cbcb.umd.edu/pub/data/bowtie_indexes/
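Before aligning, it can help to confirm that an index directory is complete. The sketch below assumes the downloads are Bowtie 1 indexes, which consist of six files sharing a prefix (.1.ebwt through .4.ebwt plus .rev.1.ebwt and .rev.2.ebwt). The helper name check_bowtie_index is hypothetical and not part of Bowtie, and the hg19/hg19 prefix is taken from the bowtie command on this page.

```shell
#!/bin/sh
# Sketch: check that a Bowtie 1 index prefix has all six .ebwt files.
# check_bowtie_index is a hypothetical helper, not a Bowtie command.
check_bowtie_index() {
    prefix=$1
    missing=0
    for ext in 1.ebwt 2.ebwt 3.ebwt 4.ebwt rev.1.ebwt rev.2.ebwt; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext" >&2
            missing=1
        fi
    done
    # Return 0 when every file is present, 1 otherwise.
    return "$missing"
}

# Example call against the hg19 index prefix used on this page.
if check_bowtie_index /database/davebrid/RNAseq/reference-genomes/hg19/hg19; then
    echo "hg19 index looks complete"
else
    echo "hg19 index is incomplete or not at this path"
fi
```

The same call with the mm9 directory would need the matching index prefix inside that directory, which this page does not state.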
===Align Reads to Reference Genome with Bowtie===
Run Bowtie to align reads to a reference genome. The following command aligns the quality-filtered reads to hg19, writes the alignments in SAM format (--sam), and uses --best so that the alignment reported for each read is the best one found:
<pre>
bowtie --sam --best /database/davebrid/RNAseq/reference-genomes/hg19/hg19 control-reads-quality25.txt control-aligned-quality25.sam
</pre>
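A quick sanity check on the resulting SAM file is to count how many reads aligned. The sketch below does this with awk by testing the unmapped bit (0x4) of the FLAG column. The three records and the /tmp/toy-aligned.sam path are made up for illustration and are not output from the command above.

```shell
#!/bin/sh
# Sketch: count aligned vs. unaligned reads in a SAM file via the FLAG column.
# The records below are invented for illustration only.
sam=/tmp/toy-aligned.sam
{
    printf '@HD\tVN:1.0\tSO:unsorted\n'
    printf 'read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'read2\t16\tchr2\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
} > "$sam"

# Skip header lines (they start with @); FLAG bit 0x4 set means unmapped.
summary=$(awk -F '\t' '
    /^@/ { next }
    { if (int($2 / 4) % 2 == 1) unaligned++; else aligned++ }
    END { printf "aligned=%d unaligned=%d", aligned, unaligned }
' "$sam")
echo "$summary"
```

To use it on real data, point the sam variable at control-aligned-quality25.sam and drop the block that writes the toy records.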
===Align Reads to Reference Genome with TopHat===
Run TopHat to align reads to the reference genome. I’ve included a pseudo command line as well as a “real” command line.
<pre>