#file ID
#fid="CgM1"
#TIMESTAMP
date=!date +%m%d_%H%M
#working directory (parent)
#wd="/Volumes/web/cnidarian/BiGo_larvae_merge/"
#where is bsmap
#bsmap="/Users/Shared/Apps/bsmap-2.73/"
bsmap="/Volumes/Bay3/Software/BSMAP/bsmap-2.74/"
#fastq files location R1 location
R1="/Volumes/web/trilobite/Crassostrea_gigas_HTSdata/filtered_174gm_A_BiGosperm_L006_R1.fastq"
#fastq files location R2 location
#comment out if SE
R2="/Volumes/web/trilobite/Crassostrea_gigas_HTSdata/filtered_174gm_A_BiGosperm_L006_R2.fastq"
#option - number of processes
!{bsmap}bsmap -a {R1} -b {R2} -d /Volumes/web/cnidarian/oyster.v9.fa -o /Volumes/web/cnidarian/BiGo_bsmap_v9_{date}.sam -p 3
Start at: Fri Apr 4 06:42:38 2014
Input reference file: oyster.v9.fa.gz (format: gzipped FASTA)
Load in 11969 db seqs, total size 558601156 bp. 13 secs passed
total_kmers: 43046721
Create seed table. 36 secs passed
max number of mismatches: read_length * 8% max gap size: 0
kmer cut-off ratio: 5e-07
max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4
quality cutoff: 0 base quality char: '!'
min fragment size:28 max fragemt size:500
start from read #1 end at read #4294967295
additional alignment: T in reads => C in reference
mapping strand (read_1): ++,-+
mapping strand (read_2): +-,--
Pair-end alignment(8 threads)
Input read file #1: filtered_174gm_A_BiGosperm_L006_R1.fastq (format: FASTQ)
Input read file #2: filtered_174gm_A_BiGosperm_L006_R2.fastq (format: FASTQ)
Output file: bsmap_out.sam (format: SAM)
Thread #2: 100000 read pairs finished. 52 secs passed
Total number of aligned reads:
pairs: 90102067 (53%)
single a: 17033002 (9.9%)
single b: 15975991 (9.3%)
Done.
Finished at Fri Apr 4 09:02:07 2014
Total time consumed: 8369 secs
!curl -o 'BiGo_sperm_v9bsmap_out.sam' 'http://de.iplantcollaborative.org/dl/d/AE429A54-2B38-44B0-8AD7-E92DCDA2DEE8/bsmap_out.sam'
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 36.7M 0 36.7M 0 0 1301k 0 --:--:-- 0:00:28 --:--:-- 1275k
running fastqc on iplant
!python {bsmap}methratio.py -d {genome} -u -z -g -o methratio_out.txt -s {bsmap}samtools bsmap_out.sam
#command for only obtaining the context '__CG_'
!grep "[A-Z][A-Z]CG[A-Z]" <methratio_out.txt> methratio_out_CG.txt
#5x coverage
!awk '{if ($8 >= 5) print $1,$2-1,$2+1,"CpG",$5}' <methratio_out_CG.txt> filt_methratio_out_CG.igv
!tr ' ' "\t" <filt_methratio_out_CG.igv> filt_methratio_{fid}.igv