RNA-seq Pipeline for Known Transcripts: Difference between revisions
m added link to FastQC report analysis
added instructions for barcode splitting
===Filter Sequences Using FastX-Toolkit===
# If samples are barcoded, use the FastX barcode splitter (see http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_barcode_splitter_usage for more details):
<pre>
/usr/local/bin/fastx_barcode_splitter.pl --bcfile FILE --prefix PREFIX [--suffix SUFFIX] [--bol|--eol] [--mismatches N] [--exact] [--partial N] [--help] [--quiet] [--debug]
</pre>
* This requires a barcode file in which each line gives a barcode identifier (BC# below, where # is the barcode number) followed by the barcode's nucleotide sequence, separated by a tab:
<pre>
#This line is a comment (starts with a 'number' sign)
BC1 GATCT
BC2 ATCGT
BC3 GTGAT
BC4 TGTCT
</pre>
* Pass this file to the --bcfile option (the FILE argument in the usage line above).
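* Example command, where s_2_100.txt is the original sequence file and mybarcodes.txt is the barcode file. Barcodes are matched at the beginning of each read (--bol) and up to 2 mismatches are allowed (the default is 1). This generates one file per barcode, named /tmp/bla_BC#.txt, plus /tmp/bla_unmatched.txt for reads that match no barcode: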
<pre>
cat s_2_100.txt | /usr/local/bin/fastx_barcode_splitter.pl --bcfile mybarcodes.txt --bol --mismatches 2 \
--prefix /tmp/bla_ --suffix ".txt"
</pre>
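* Optional: a malformed barcode file is an easy mistake to make, so it can help to check the file before splitting. The sketch below is a hypothetical helper, not part of FastX-Toolkit. It assumes the barcode file is tab-separated with exactly two fields per line and that barcodes contain only A/C/G/T; the function name check_barcodes is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of FastX-Toolkit): sanity-check a barcode
# file before passing it to --bcfile.
# Assumptions: fields are separated by a single tab, and barcodes use only
# the letters A, C, G and T. Comment lines (starting with '#') and blank
# lines are skipped.
check_barcodes() {
    awk -F '\t' '
        /^#/ || /^[ \t]*$/ { next }
        NF != 2 {
            printf "line %d: expected 2 tab-separated fields\n", NR
            bad = 1
            next
        }
        $2 !~ /^[ACGT]+$/ {
            printf "line %d: barcode \"%s\" is not A/C/G/T only\n", NR, $2
            bad = 1
        }
        END { exit bad + 0 }   # exit status 0 if every line passed, 1 otherwise
    ' "$1"
}
```

For example, running check_barcodes mybarcodes.txt prints one message per problem line and returns a non-zero exit status if any line fails, so it can be chained with && in front of the splitter command.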
# Filter for quality, if applicable
# Trim, if applicable