http://dx.doi.org/10.1101/012831
To execute the IPython Notebook in its entirety you will need:
The analysis was originally done on the Mac OS X operating system.
The intent is for you to download the GitHub repository locally and run the notebook from there, so that the accompanying scripts, etc. are available. Detailed instructions are provided in the GitHub repository README. Please note that the data files you will download and produce are large (>20 GB).
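Because the data volume is large, it may be worth checking free disk space before starting. The following is a minimal sketch, not part of the original analysis; it assumes Python 3.3 or later (for `shutil.disk_usage`), and the helper names `free_gb` and `has_room` are hypothetical.

```python
import shutil

# Lower bound taken from the size note in this notebook; adjust as needed.
REQUIRED_GB = 20


def free_gb(path="."):
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path=".", required_gb=REQUIRED_GB):
    """Return True if more than `required_gb` gigabytes are free at `path`."""
    return free_gb(path) > required_gb


print("Free space here: %.1f GB" % free_gb("."))
print("Enough room for the analysis:", has_room("."))
```

Run it from the directory where the data will be stored, since free space is reported for the volume that contains `path`.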
The notebook is divided into 5 sections
Please post any comments and questions in issues.
Before getting started, set the location of BSMAP on your computer and press Shift-Enter.
bsmaploc="/Users/Shared/Apps/bsmap-2.74/"
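A quick check that the path actually contains a runnable `bsmap` binary can save a failed run later. This is a hedged sketch: the helper `find_bsmap` is hypothetical, and it assumes the executable is named `bsmap` and sits directly inside the `bsmaploc` directory, as the later command `{bsmaploc}bsmap` implies.

```python
import os

# Same value as set in the cell above; change it to match your machine.
bsmaploc = "/Users/Shared/Apps/bsmap-2.74/"


def find_bsmap(location):
    """Return the path to an executable `bsmap` inside `location`, or None."""
    candidate = os.path.join(location, "bsmap")
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


path = find_bsmap(bsmaploc)
if path is None:
    print("No executable bsmap found in", bsmaploc)
else:
    print("Using", path)
```

Note that the notebook builds the command by plain string concatenation, so `bsmaploc` needs its trailing slash there even though `os.path.join` does not require one here.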
#To confirm your current directory, run this command; you should see a wd directory
!ls
BiGo_dev.ipynb README.md scripts wd
cd wd
/Users/Steven/Desktop/olson-ms-nb-master/wd
#This command downloads an archived file containing six BS-seq libraries (4.3 GB)
#!wget http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_HTSdata/BiGo_lar_fastq_mcf.tgz
!curl -O http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_HTSdata/BiGo_lar_fastq_mcf.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 4221M  100 4221M    0     0  84.2M      0  0:00:50  0:00:50 --:--:-- 85.0M
#uncompress files
!tar -zxvf BiGo_lar_fastq_mcf.tgz
x mcf_M1_R1.fastq
x mcf_M1_R2.fastq
x mcf_M3_R1.fastq
x mcf_M3_R2.fastq
x mcf_T1D3_R1.fastq
x mcf_T1D3_R2.fastq
x mcf_T1D5_R1.fastq
x mcf_T1D5_R2.fastq
x mcf_T3D3_R1.fastq
x mcf_T3D3_R2.fastq
x mcf_T3D5_R1.fastq
x mcf_T3D5_R2.fastq
#remove BiGo_lar_fastq_mcf.tgz
#!rm BiGo_lar_fastq_mcf.tgz
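Before deleting the archive, it may help to confirm that both read files are present for each of the six libraries. The sketch below is illustrative only; the helper names are hypothetical, and the file name pattern `mcf_<library>_R<1|2>.fastq` is taken from the extraction listing above.

```python
import os

# Library names as they appear in the extracted file names.
LIBRARIES = ("M1", "T1D3", "T1D5", "M3", "T3D3", "T3D5")


def expected_fastq(libraries=LIBRARIES):
    """List the paired-end FASTQ file names expected for each library."""
    return ["mcf_%s_R%d.fastq" % (lib, read)
            for lib in libraries
            for read in (1, 2)]


def missing_fastq(directory=".", libraries=LIBRARIES):
    """Return the expected FASTQ file names that are absent from `directory`."""
    return [name for name in expected_fastq(libraries)
            if not os.path.isfile(os.path.join(directory, name))]


missing = missing_fastq(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All 12 FASTQ files are present.")
```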
#Downloading the oyster genome
#!wget http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa
!curl -O http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  541M  100  541M    0     0  99.1M      0  0:00:05  0:00:05 --:--:--  100M
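One way to sanity-check the downloaded genome is to count its sequences and bases. In the run recorded in this notebook, BSMAP reports loading 7658 sequences with a total size of 557717710 bp from this file. The sketch below uses a hypothetical helper, `fasta_stats`, and assumes BSMAP counts every residue character on non-header lines; if it counts differently, the totals may not match exactly.

```python
import os

GENOME = "Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa"
# Values reported by BSMAP in the log recorded in this notebook.
BSMAP_REPORTED = (7658, 557717710)


def fasta_stats(path):
    """Return (number of sequences, total number of bases) for a FASTA file."""
    n_seqs = 0
    n_bases = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                n_seqs += 1          # header line starts a new sequence
            else:
                n_bases += len(line.strip())
    return n_seqs, n_bases


if os.path.exists(GENOME):
    stats = fasta_stats(GENOME)
    print("Sequences: %d, bases: %d" % stats)
    print("Matches BSMAP log:", stats == BSMAP_REPORTED)
else:
    print("Genome file not found in the current directory.")
```

A mismatch in the sequence count would most likely point to a truncated download.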
#BSMAP will need to be downloaded from https://code.google.com/p/bsmap/ if it is not already installed
#another option is running BSMAP on iPlant (iplantcollaborative.org)
for i in ("M1", "T1D3", "T1D5", "M3", "T3D3", "T3D5"):
    !{bsmaploc}bsmap \
    -a mcf_{i}_R1.fastq \
    -b mcf_{i}_R2.fastq \
    -d Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa \
    -o bsmap_out_{i}.sam \
    -p 8
BSMAP v2.74 Start at: Fri Jan 30 11:44:33 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 24 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_M1_R1.fastq (format: FASTQ) Input read file #2: mcf_M1_R2.fastq (format: FASTQ) Output file: bsmap_out_M1.sam (format: SAM) Thread #3: 50000 read pairs finished. 74 secs passed Thread #2: 100000 read pairs finished. 75 secs passed Thread #1: 150000 read pairs finished. 77 secs passed Thread #0: 250000 read pairs finished. 78 secs passed Thread #7: 300000 read pairs finished. 79 secs passed Thread #5: 200000 read pairs finished. 79 secs passed Thread #4: 350000 read pairs finished. 80 secs passed Thread #6: 400000 read pairs finished. 80 secs passed Thread #3: 450000 read pairs finished. 127 secs passed Thread #2: 500000 read pairs finished. 128 secs passed Thread #1: 550000 read pairs finished. 131 secs passed Thread #0: 600000 read pairs finished. 131 secs passed Thread #7: 650000 read pairs finished. 132 secs passed Thread #5: 700000 read pairs finished. 133 secs passed Thread #4: 750000 read pairs finished. 134 secs passed Thread #6: 800000 read pairs finished. 135 secs passed Thread #3: 850000 read pairs finished. 180 secs passed Thread #2: 900000 read pairs finished. 181 secs passed Thread #0: 1000000 read pairs finished. 184 secs passed Thread #7: 1050000 read pairs finished. 185 secs passed Thread #1: 950000 read pairs finished. 
185 secs passed Thread #5: 1100000 read pairs finished. 187 secs passed Thread #4: 1150000 read pairs finished. 188 secs passed Thread #6: 1200000 read pairs finished. 188 secs passed Thread #3: 1250000 read pairs finished. 233 secs passed Thread #2: 1300000 read pairs finished. 234 secs passed Thread #0: 1350000 read pairs finished. 237 secs passed Thread #7: 1400000 read pairs finished. 238 secs passed Thread #1: 1450000 read pairs finished. 239 secs passed Thread #5: 1500000 read pairs finished. 241 secs passed Thread #4: 1550000 read pairs finished. 243 secs passed Thread #6: 1600000 read pairs finished. 244 secs passed Thread #2: 1700000 read pairs finished. 289 secs passed Thread #3: 1650000 read pairs finished. 290 secs passed Thread #0: 1750000 read pairs finished. 292 secs passed Thread #7: 1800000 read pairs finished. 294 secs passed Thread #1: 1850000 read pairs finished. 295 secs passed Thread #5: 1900000 read pairs finished. 298 secs passed Thread #4: 1950000 read pairs finished. 299 secs passed Thread #6: 2000000 read pairs finished. 300 secs passed Thread #2: 2050000 read pairs finished. 345 secs passed Thread #3: 2100000 read pairs finished. 346 secs passed Thread #0: 2150000 read pairs finished. 347 secs passed Thread #7: 2200000 read pairs finished. 349 secs passed Thread #1: 2250000 read pairs finished. 352 secs passed Thread #5: 2300000 read pairs finished. 354 secs passed Thread #4: 2350000 read pairs finished. 355 secs passed Thread #6: 2400000 read pairs finished. 356 secs passed Thread #2: 2450000 read pairs finished. 407 secs passed Thread #3: 2500000 read pairs finished. 408 secs passed Thread #0: 2550000 read pairs finished. 408 secs passed Thread #7: 2600000 read pairs finished. 410 secs passed Thread #1: 2650000 read pairs finished. 416 secs passed Thread #5: 2700000 read pairs finished. 421 secs passed Thread #6: 2800000 read pairs finished. 422 secs passed Thread #4: 2750000 read pairs finished. 
422 secs passed Thread #2: 2850000 read pairs finished. 473 secs passed Thread #0: 2950000 read pairs finished. 474 secs passed Thread #3: 2900000 read pairs finished. 474 secs passed Thread #7: 3000000 read pairs finished. 475 secs passed Thread #1: 3050000 read pairs finished. 480 secs passed Thread #5: 3100000 read pairs finished. 485 secs passed Thread #6: 3150000 read pairs finished. 486 secs passed Thread #4: 3200000 read pairs finished. 486 secs passed Thread #2: 3250000 read pairs finished. 532 secs passed Thread #0: 3300000 read pairs finished. 533 secs passed Thread #3: 3350000 read pairs finished. 534 secs passed Thread #7: 3400000 read pairs finished. 534 secs passed Thread #1: 3450000 read pairs finished. 539 secs passed Thread #5: 3500000 read pairs finished. 544 secs passed Thread #6: 3550000 read pairs finished. 550 secs passed Thread #4: 3600000 read pairs finished. 554 secs passed Thread #0: 3700000 read pairs finished. 592 secs passed Thread #3: 3750000 read pairs finished. 592 secs passed Thread #7: 3800000 read pairs finished. 593 secs passed Thread #2: 3650000 read pairs finished. 593 secs passed Thread #1: 3850000 read pairs finished. 596 secs passed Thread #5: 3900000 read pairs finished. 601 secs passed Thread #6: 3950000 read pairs finished. 607 secs passed Thread #4: 4000000 read pairs finished. 611 secs passed Thread #0: 4050000 read pairs finished. 646 secs passed Thread #3: 4100000 read pairs finished. 648 secs passed Thread #7: 4150000 read pairs finished. 649 secs passed Thread #2: 4200000 read pairs finished. 649 secs passed Thread #1: 4250000 read pairs finished. 651 secs passed Thread #5: 4300000 read pairs finished. 657 secs passed Thread #6: 4350000 read pairs finished. 662 secs passed Thread #4: 4400000 read pairs finished. 669 secs passed Thread #0: 4450000 read pairs finished. 702 secs passed Thread #3: 4500000 read pairs finished. 704 secs passed Thread #7: 4550000 read pairs finished. 
705 secs passed Thread #2: 4600000 read pairs finished. 705 secs passed Thread #1: 4650000 read pairs finished. 706 secs passed Thread #5: 4700000 read pairs finished. 713 secs passed Thread #6: 4750000 read pairs finished. 717 secs passed Thread #4: 4800000 read pairs finished. 724 secs passed Thread #0: 4850000 read pairs finished. 756 secs passed Thread #3: 4900000 read pairs finished. 759 secs passed Thread #7: 4950000 read pairs finished. 760 secs passed Thread #2: 5000000 read pairs finished. 760 secs passed Thread #1: 5050000 read pairs finished. 761 secs passed Thread #5: 5100000 read pairs finished. 767 secs passed Thread #6: 5150000 read pairs finished. 771 secs passed Thread #4: 5200000 read pairs finished. 778 secs passed Thread #0: 5250000 read pairs finished. 809 secs passed Thread #3: 5300000 read pairs finished. 815 secs passed Thread #7: 5350000 read pairs finished. 816 secs passed Thread #2: 5400000 read pairs finished. 816 secs passed Thread #1: 5450000 read pairs finished. 817 secs passed Thread #5: 5500000 read pairs finished. 822 secs passed Thread #6: 5550000 read pairs finished. 826 secs passed Thread #4: 5600000 read pairs finished. 834 secs passed Thread #0: 5650000 read pairs finished. 863 secs passed Thread #3: 5700000 read pairs finished. 870 secs passed Thread #7: 5750000 read pairs finished. 871 secs passed Thread #2: 5800000 read pairs finished. 871 secs passed Thread #1: 5850000 read pairs finished. 872 secs passed Thread #5: 5900000 read pairs finished. 877 secs passed Thread #6: 5950000 read pairs finished. 881 secs passed Thread #4: 6000000 read pairs finished. 889 secs passed Thread #0: 6050000 read pairs finished. 915 secs passed Thread #3: 6100000 read pairs finished. 922 secs passed Thread #7: 6150000 read pairs finished. 923 secs passed Thread #2: 6200000 read pairs finished. 924 secs passed Thread #1: 6250000 read pairs finished. 933 secs passed Thread #5: 6300000 read pairs finished. 
936 secs passed Thread #6: 6350000 read pairs finished. 938 secs passed Thread #4: 6400000 read pairs finished. 944 secs passed Thread #0: 6450000 read pairs finished. 970 secs passed Thread #3: 6500000 read pairs finished. 978 secs passed Thread #7: 6550000 read pairs finished. 978 secs passed Thread #2: 6600000 read pairs finished. 978 secs passed Thread #1: 6650000 read pairs finished. 988 secs passed Thread #5: 6700000 read pairs finished. 991 secs passed Thread #6: 6750000 read pairs finished. 992 secs passed Thread #4: 6800000 read pairs finished. 999 secs passed Thread #0: 6850000 read pairs finished. 1023 secs passed Thread #3: 6900000 read pairs finished. 1032 secs passed Thread #7: 6950000 read pairs finished. 1033 secs passed Thread #2: 7000000 read pairs finished. 1033 secs passed Thread #1: 7050000 read pairs finished. 1042 secs passed Thread #5: 7100000 read pairs finished. 1047 secs passed Thread #6: 7150000 read pairs finished. 1049 secs passed Thread #4: 7200000 read pairs finished. 1056 secs passed Thread #0: 7250000 read pairs finished. 1078 secs passed Thread #3: 7300000 read pairs finished. 1087 secs passed Thread #7: 7350000 read pairs finished. 1088 secs passed Thread #2: 7400000 read pairs finished. 1089 secs passed Thread #1: 7450000 read pairs finished. 1098 secs passed Thread #5: 7500000 read pairs finished. 1103 secs passed Thread #6: 7550000 read pairs finished. 1104 secs passed Thread #4: 7600000 read pairs finished. 1111 secs passed Thread #0: 7650000 read pairs finished. 1132 secs passed Thread #3: 7700000 read pairs finished. 1141 secs passed Thread #7: 7750000 read pairs finished. 1142 secs passed Thread #2: 7800000 read pairs finished. 1143 secs passed Thread #1: 7850000 read pairs finished. 1147 secs passed Thread #0: 8023643 read pairs finished. 1148 secs passed Thread #5: 7900000 read pairs finished. 1149 secs passed Thread #6: 7950000 read pairs finished. 1149 secs passed Thread #4: 8000000 read pairs finished. 
1150 secs passed Total number of aligned reads: pairs: 3982551 (50%) single a: 1489442 (19%) single b: 1430646 (18%) Done. Finished at Fri Jan 30 12:03:43 2015 Total time consumed: 1150 secs BSMAP v2.74 Start at: Fri Jan 30 12:03:44 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 25 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_T1D3_R1.fastq (format: FASTQ) Input read file #2: mcf_T1D3_R2.fastq (format: FASTQ) Output file: bsmap_out_T1D3.sam (format: SAM) Thread #0: 50000 read pairs finished. 75 secs passed Thread #4: 100000 read pairs finished. 76 secs passed Thread #2: 150000 read pairs finished. 77 secs passed Thread #6: 200000 read pairs finished. 78 secs passed Thread #3: 250000 read pairs finished. 78 secs passed Thread #1: 300000 read pairs finished. 79 secs passed Thread #5: 350000 read pairs finished. 79 secs passed Thread #7: 400000 read pairs finished. 81 secs passed Thread #0: 450000 read pairs finished. 128 secs passed Thread #4: 500000 read pairs finished. 129 secs passed Thread #2: 550000 read pairs finished. 129 secs passed Thread #6: 600000 read pairs finished. 130 secs passed Thread #3: 650000 read pairs finished. 131 secs passed Thread #1: 700000 read pairs finished. 132 secs passed Thread #5: 750000 read pairs finished. 133 secs passed Thread #7: 800000 read pairs finished. 134 secs passed Thread #0: 850000 read pairs finished. 181 secs passed Thread #4: 900000 read pairs finished. 
182 secs passed Thread #2: 950000 read pairs finished. 182 secs passed Thread #6: 1000000 read pairs finished. 183 secs passed Thread #3: 1050000 read pairs finished. 183 secs passed Thread #1: 1100000 read pairs finished. 185 secs passed Thread #5: 1150000 read pairs finished. 185 secs passed Thread #7: 1200000 read pairs finished. 187 secs passed Thread #0: 1250000 read pairs finished. 234 secs passed Thread #4: 1300000 read pairs finished. 235 secs passed Thread #2: 1350000 read pairs finished. 235 secs passed Thread #6: 1400000 read pairs finished. 236 secs passed Thread #3: 1450000 read pairs finished. 236 secs passed Thread #5: 1550000 read pairs finished. 238 secs passed Thread #1: 1500000 read pairs finished. 238 secs passed Thread #7: 1600000 read pairs finished. 240 secs passed Thread #0: 1650000 read pairs finished. 286 secs passed Thread #4: 1700000 read pairs finished. 288 secs passed Thread #2: 1750000 read pairs finished. 288 secs passed Thread #6: 1800000 read pairs finished. 288 secs passed Thread #3: 1850000 read pairs finished. 289 secs passed Thread #5: 1900000 read pairs finished. 290 secs passed Thread #1: 1950000 read pairs finished. 290 secs passed Thread #7: 2000000 read pairs finished. 292 secs passed Thread #0: 2050000 read pairs finished. 339 secs passed Thread #6: 2200000 read pairs finished. 340 secs passed Thread #2: 2150000 read pairs finished. 340 secs passed Thread #4: 2100000 read pairs finished. 340 secs passed Thread #3: 2250000 read pairs finished. 341 secs passed Thread #5: 2300000 read pairs finished. 341 secs passed Thread #1: 2350000 read pairs finished. 341 secs passed Thread #7: 2400000 read pairs finished. 342 secs passed Thread #0: 2450000 read pairs finished. 383 secs passed Thread #6: 2500000 read pairs finished. 389 secs passed Thread #2: 2550000 read pairs finished. 390 secs passed Thread #4: 2600000 read pairs finished. 392 secs passed Thread #5: 2700000 read pairs finished. 
392 secs passed Thread #3: 2650000 read pairs finished. 392 secs passed Thread #1: 2750000 read pairs finished. 393 secs passed Thread #7: 2800000 read pairs finished. 393 secs passed Thread #0: 2850000 read pairs finished. 433 secs passed Thread #6: 2900000 read pairs finished. 439 secs passed Thread #2: 2950000 read pairs finished. 442 secs passed Thread #3: 3100000 read pairs finished. 444 secs passed Thread #5: 3050000 read pairs finished. 444 secs passed Thread #4: 3000000 read pairs finished. 445 secs passed Thread #1: 3150000 read pairs finished. 445 secs passed Thread #7: 3200000 read pairs finished. 445 secs passed Thread #0: 3250000 read pairs finished. 482 secs passed Thread #6: 3300000 read pairs finished. 488 secs passed Thread #2: 3350000 read pairs finished. 491 secs passed Thread #3: 3400000 read pairs finished. 494 secs passed Thread #5: 3450000 read pairs finished. 496 secs passed Thread #1: 3550000 read pairs finished. 499 secs passed Thread #4: 3500000 read pairs finished. 499 secs passed Thread #7: 3600000 read pairs finished. 499 secs passed Thread #0: 3650000 read pairs finished. 535 secs passed Thread #6: 3700000 read pairs finished. 539 secs passed Thread #2: 3750000 read pairs finished. 541 secs passed Thread #3: 3800000 read pairs finished. 544 secs passed Thread #5: 3850000 read pairs finished. 547 secs passed Thread #1: 3900000 read pairs finished. 550 secs passed Thread #4: 3950000 read pairs finished. 550 secs passed Thread #7: 4000000 read pairs finished. 553 secs passed Thread #0: 4050000 read pairs finished. 590 secs passed Thread #6: 4100000 read pairs finished. 592 secs passed Thread #2: 4150000 read pairs finished. 594 secs passed Thread #3: 4200000 read pairs finished. 596 secs passed Thread #5: 4250000 read pairs finished. 598 secs passed Thread #1: 4300000 read pairs finished. 600 secs passed Thread #4: 4350000 read pairs finished. 600 secs passed Thread #7: 4400000 read pairs finished. 
604 secs passed Thread #0: 4450000 read pairs finished. 640 secs passed Thread #6: 4500000 read pairs finished. 644 secs passed Thread #2: 4550000 read pairs finished. 648 secs passed Thread #3: 4600000 read pairs finished. 649 secs passed Thread #5: 4650000 read pairs finished. 649 secs passed Thread #1: 4700000 read pairs finished. 651 secs passed Thread #4: 4750000 read pairs finished. 652 secs passed Thread #7: 4800000 read pairs finished. 654 secs passed Thread #5: 5016400 read pairs finished. 659 secs passed Thread #0: 4850000 read pairs finished. 667 secs passed Thread #6: 4900000 read pairs finished. 668 secs passed Thread #2: 4950000 read pairs finished. 668 secs passed Thread #3: 5000000 read pairs finished. 669 secs passed Total number of aligned reads: pairs: 2717832 (54%) single a: 1117376 (22%) single b: 1178712 (23%) Done. Finished at Fri Jan 30 12:14:53 2015 Total time consumed: 669 secs BSMAP v2.74 Start at: Fri Jan 30 12:14:54 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 26 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_T1D5_R1.fastq (format: FASTQ) Input read file #2: mcf_T1D5_R2.fastq (format: FASTQ) Output file: bsmap_out_T1D5.sam (format: SAM) Thread #5: 50000 read pairs finished. 78 secs passed Thread #0: 100000 read pairs finished. 80 secs passed Thread #3: 150000 read pairs finished. 80 secs passed Thread #4: 200000 read pairs finished. 81 secs passed Thread #1: 250000 read pairs finished. 
82 secs passed Thread #2: 300000 read pairs finished. 82 secs passed Thread #6: 350000 read pairs finished. 83 secs passed Thread #7: 400000 read pairs finished. 83 secs passed Thread #5: 450000 read pairs finished. 133 secs passed Thread #0: 500000 read pairs finished. 135 secs passed Thread #3: 550000 read pairs finished. 135 secs passed Thread #4: 600000 read pairs finished. 136 secs passed Thread #1: 650000 read pairs finished. 137 secs passed Thread #2: 700000 read pairs finished. 138 secs passed Thread #6: 750000 read pairs finished. 138 secs passed Thread #7: 800000 read pairs finished. 139 secs passed Thread #5: 850000 read pairs finished. 189 secs passed Thread #0: 900000 read pairs finished. 190 secs passed Thread #3: 950000 read pairs finished. 190 secs passed Thread #4: 1000000 read pairs finished. 192 secs passed Thread #1: 1050000 read pairs finished. 192 secs passed Thread #2: 1100000 read pairs finished. 192 secs passed Thread #6: 1150000 read pairs finished. 193 secs passed Thread #7: 1200000 read pairs finished. 195 secs passed Thread #5: 1250000 read pairs finished. 243 secs passed Thread #0: 1300000 read pairs finished. 245 secs passed Thread #3: 1350000 read pairs finished. 246 secs passed Thread #4: 1400000 read pairs finished. 247 secs passed Thread #1: 1450000 read pairs finished. 247 secs passed Thread #2: 1500000 read pairs finished. 248 secs passed Thread #6: 1550000 read pairs finished. 248 secs passed Thread #7: 1600000 read pairs finished. 249 secs passed Thread #5: 1650000 read pairs finished. 297 secs passed Thread #0: 1700000 read pairs finished. 299 secs passed Thread #3: 1750000 read pairs finished. 300 secs passed Thread #4: 1800000 read pairs finished. 302 secs passed Thread #1: 1850000 read pairs finished. 304 secs passed Thread #2: 1900000 read pairs finished. 304 secs passed Thread #6: 1950000 read pairs finished. 304 secs passed Thread #7: 2000000 read pairs finished. 305 secs passed Thread #5: 2050000 read pairs finished. 
352 secs passed Thread #0: 2100000 read pairs finished. 353 secs passed Thread #3: 2150000 read pairs finished. 354 secs passed Thread #4: 2200000 read pairs finished. 357 secs passed Thread #1: 2250000 read pairs finished. 358 secs passed Thread #2: 2300000 read pairs finished. 360 secs passed Thread #6: 2350000 read pairs finished. 360 secs passed Thread #7: 2400000 read pairs finished. 361 secs passed Thread #5: 2450000 read pairs finished. 407 secs passed Thread #0: 2500000 read pairs finished. 408 secs passed Thread #3: 2550000 read pairs finished. 409 secs passed Thread #4: 2600000 read pairs finished. 412 secs passed Thread #1: 2650000 read pairs finished. 414 secs passed Thread #2: 2700000 read pairs finished. 415 secs passed Thread #6: 2750000 read pairs finished. 416 secs passed Thread #7: 2800000 read pairs finished. 416 secs passed Thread #5: 2850000 read pairs finished. 462 secs passed Thread #0: 2900000 read pairs finished. 463 secs passed Thread #3: 2950000 read pairs finished. 464 secs passed Thread #4: 3000000 read pairs finished. 468 secs passed Thread #1: 3050000 read pairs finished. 469 secs passed Thread #2: 3100000 read pairs finished. 471 secs passed Thread #6: 3150000 read pairs finished. 471 secs passed Thread #7: 3200000 read pairs finished. 472 secs passed Thread #5: 3250000 read pairs finished. 517 secs passed Thread #0: 3300000 read pairs finished. 518 secs passed Thread #3: 3350000 read pairs finished. 519 secs passed Thread #4: 3400000 read pairs finished. 523 secs passed Thread #1: 3450000 read pairs finished. 524 secs passed Thread #2: 3500000 read pairs finished. 525 secs passed Thread #6: 3550000 read pairs finished. 528 secs passed Thread #7: 3600000 read pairs finished. 529 secs passed Thread #5: 3650000 read pairs finished. 572 secs passed Thread #0: 3700000 read pairs finished. 574 secs passed Thread #3: 3750000 read pairs finished. 574 secs passed Thread #4: 3800000 read pairs finished. 
577 secs passed Thread #1: 3850000 read pairs finished. 578 secs passed Thread #2: 3900000 read pairs finished. 578 secs passed Thread #6: 3950000 read pairs finished. 581 secs passed Thread #7: 4000000 read pairs finished. 590 secs passed Thread #0: 4100000 read pairs finished. 630 secs passed Thread #3: 4150000 read pairs finished. 630 secs passed Thread #5: 4050000 read pairs finished. 630 secs passed Thread #4: 4200000 read pairs finished. 631 secs passed Thread #1: 4250000 read pairs finished. 631 secs passed Thread #2: 4300000 read pairs finished. 632 secs passed Thread #6: 4350000 read pairs finished. 633 secs passed Thread #7: 4400000 read pairs finished. 643 secs passed Thread #0: 4450000 read pairs finished. 684 secs passed Thread #3: 4500000 read pairs finished. 685 secs passed Thread #5: 4550000 read pairs finished. 685 secs passed Thread #4: 4600000 read pairs finished. 686 secs passed Thread #2: 4700000 read pairs finished. 686 secs passed Thread #1: 4650000 read pairs finished. 687 secs passed Thread #6: 4750000 read pairs finished. 687 secs passed Thread #7: 4800000 read pairs finished. 694 secs passed Thread #0: 4850000 read pairs finished. 738 secs passed Thread #3: 4900000 read pairs finished. 739 secs passed Thread #5: 4950000 read pairs finished. 740 secs passed Thread #4: 5000000 read pairs finished. 741 secs passed Thread #2: 5050000 read pairs finished. 742 secs passed Thread #6: 5150000 read pairs finished. 742 secs passed Thread #1: 5100000 read pairs finished. 742 secs passed Thread #7: 5200000 read pairs finished. 747 secs passed Thread #0: 5250000 read pairs finished. 790 secs passed Thread #3: 5300000 read pairs finished. 795 secs passed Thread #2: 5450000 read pairs finished. 797 secs passed Thread #6: 5500000 read pairs finished. 797 secs passed Thread #4: 5400000 read pairs finished. 798 secs passed Thread #5: 5350000 read pairs finished. 798 secs passed Thread #1: 5550000 read pairs finished. 
799 secs passed Thread #7: 5600000 read pairs finished. 800 secs passed Thread #0: 5650000 read pairs finished. 842 secs passed Thread #3: 5700000 read pairs finished. 846 secs passed Thread #2: 5750000 read pairs finished. 851 secs passed Thread #6: 5800000 read pairs finished. 853 secs passed Thread #4: 5850000 read pairs finished. 853 secs passed Thread #5: 5900000 read pairs finished. 854 secs passed Thread #1: 5950000 read pairs finished. 855 secs passed Thread #7: 6000000 read pairs finished. 855 secs passed Thread #6: 6179274 read pairs finished. 864 secs passed Thread #0: 6050000 read pairs finished. 867 secs passed Thread #3: 6100000 read pairs finished. 868 secs passed Thread #2: 6150000 read pairs finished. 869 secs passed Total number of aligned reads: pairs: 2725017 (44%) single a: 1704213 (28%) single b: 1415455 (23%) Done. Finished at Fri Jan 30 12:29:23 2015 Total time consumed: 869 secs BSMAP v2.74 Start at: Fri Jan 30 12:29:24 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 26 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_M3_R1.fastq (format: FASTQ) Input read file #2: mcf_M3_R2.fastq (format: FASTQ) Output file: bsmap_out_M3.sam (format: SAM) Thread #1: 50000 read pairs finished. 76 secs passed Thread #0: 100000 read pairs finished. 77 secs passed Thread #3: 150000 read pairs finished. 78 secs passed Thread #4: 200000 read pairs finished. 79 secs passed Thread #5: 300000 read pairs finished. 
80 secs passed Thread #2: 250000 read pairs finished. 80 secs passed Thread #6: 350000 read pairs finished. 81 secs passed Thread #7: 400000 read pairs finished. 81 secs passed Thread #1: 450000 read pairs finished. 128 secs passed Thread #4: 600000 read pairs finished. 129 secs passed Thread #3: 550000 read pairs finished. 129 secs passed Thread #0: 500000 read pairs finished. 130 secs passed Thread #5: 650000 read pairs finished. 131 secs passed Thread #2: 700000 read pairs finished. 133 secs passed Thread #6: 750000 read pairs finished. 134 secs passed Thread #7: 800000 read pairs finished. 134 secs passed Thread #1: 850000 read pairs finished. 181 secs passed Thread #4: 900000 read pairs finished. 183 secs passed Thread #3: 950000 read pairs finished. 183 secs passed Thread #5: 1050000 read pairs finished. 183 secs passed Thread #0: 1000000 read pairs finished. 184 secs passed Thread #2: 1100000 read pairs finished. 185 secs passed Thread #6: 1150000 read pairs finished. 185 secs passed Thread #7: 1200000 read pairs finished. 186 secs passed Thread #1: 1250000 read pairs finished. 234 secs passed Thread #4: 1300000 read pairs finished. 237 secs passed Thread #3: 1350000 read pairs finished. 238 secs passed Thread #5: 1400000 read pairs finished. 239 secs passed Thread #0: 1450000 read pairs finished. 239 secs passed Thread #2: 1500000 read pairs finished. 240 secs passed Thread #6: 1550000 read pairs finished. 240 secs passed Thread #7: 1600000 read pairs finished. 241 secs passed Thread #1: 1650000 read pairs finished. 288 secs passed Thread #4: 1700000 read pairs finished. 292 secs passed Thread #5: 1800000 read pairs finished. 293 secs passed Thread #3: 1750000 read pairs finished. 293 secs passed Thread #0: 1850000 read pairs finished. 293 secs passed Thread #6: 1950000 read pairs finished. 294 secs passed Thread #2: 1900000 read pairs finished. 295 secs passed Thread #7: 2000000 read pairs finished. 295 secs passed Thread #1: 2050000 read pairs finished. 
340 secs passed Thread #4: 2100000 read pairs finished. 345 secs passed Thread #5: 2150000 read pairs finished. 346 secs passed Thread #3: 2200000 read pairs finished. 347 secs passed Thread #0: 2250000 read pairs finished. 347 secs passed Thread #6: 2300000 read pairs finished. 348 secs passed Thread #7: 2400000 read pairs finished. 350 secs passed Thread #2: 2350000 read pairs finished. 350 secs passed Thread #1: 2450000 read pairs finished. 392 secs passed Thread #4: 2500000 read pairs finished. 399 secs passed Thread #0: 2650000 read pairs finished. 401 secs passed Thread #6: 2700000 read pairs finished. 402 secs passed Thread #3: 2600000 read pairs finished. 402 secs passed Thread #5: 2550000 read pairs finished. 403 secs passed Thread #7: 2750000 read pairs finished. 403 secs passed Thread #2: 2800000 read pairs finished. 403 secs passed Thread #1: 2850000 read pairs finished. 444 secs passed Thread #4: 2900000 read pairs finished. 450 secs passed Thread #0: 2950000 read pairs finished. 453 secs passed Thread #6: 3000000 read pairs finished. 454 secs passed Thread #3: 3050000 read pairs finished. 457 secs passed Thread #5: 3100000 read pairs finished. 457 secs passed Thread #7: 3150000 read pairs finished. 457 secs passed Thread #2: 3200000 read pairs finished. 458 secs passed Thread #1: 3250000 read pairs finished. 496 secs passed Thread #4: 3300000 read pairs finished. 501 secs passed Thread #0: 3350000 read pairs finished. 504 secs passed Thread #6: 3400000 read pairs finished. 511 secs passed Thread #7: 3550000 read pairs finished. 513 secs passed Thread #2: 3600000 read pairs finished. 513 secs passed Thread #5: 3500000 read pairs finished. 514 secs passed Thread #3: 3450000 read pairs finished. 517 secs passed Thread #1: 3650000 read pairs finished. 550 secs passed Thread #4: 3700000 read pairs finished. 554 secs passed Thread #0: 3750000 read pairs finished. 558 secs passed Thread #6: 3800000 read pairs finished. 
565 secs passed Thread #7: 3850000 read pairs finished. 566 secs passed Thread #5: 3950000 read pairs finished. 567 secs passed Thread #2: 3900000 read pairs finished. 568 secs passed Thread #3: 4000000 read pairs finished. 570 secs passed Thread #1: 4050000 read pairs finished. 603 secs passed Thread #4: 4100000 read pairs finished. 607 secs passed Thread #0: 4150000 read pairs finished. 611 secs passed Thread #6: 4200000 read pairs finished. 617 secs passed Thread #7: 4250000 read pairs finished. 621 secs passed Thread #5: 4300000 read pairs finished. 623 secs passed Thread #2: 4350000 read pairs finished. 623 secs passed Thread #3: 4400000 read pairs finished. 625 secs passed Thread #1: 4450000 read pairs finished. 658 secs passed Thread #4: 4500000 read pairs finished. 662 secs passed Thread #0: 4550000 read pairs finished. 665 secs passed Thread #6: 4600000 read pairs finished. 671 secs passed Thread #7: 4650000 read pairs finished. 675 secs passed Thread #5: 4700000 read pairs finished. 676 secs passed Thread #2: 4750000 read pairs finished. 677 secs passed Thread #3: 4800000 read pairs finished. 679 secs passed Thread #1: 4850000 read pairs finished. 710 secs passed Thread #4: 4900000 read pairs finished. 715 secs passed Thread #0: 4950000 read pairs finished. 719 secs passed Thread #6: 5000000 read pairs finished. 724 secs passed Thread #7: 5050000 read pairs finished. 728 secs passed Thread #5: 5100000 read pairs finished. 729 secs passed Thread #2: 5150000 read pairs finished. 733 secs passed Thread #3: 5200000 read pairs finished. 736 secs passed Thread #1: 5250000 read pairs finished. 765 secs passed Thread #4: 5300000 read pairs finished. 769 secs passed Thread #0: 5350000 read pairs finished. 773 secs passed Thread #6: 5400000 read pairs finished. 778 secs passed Thread #7: 5450000 read pairs finished. 782 secs passed Thread #5: 5500000 read pairs finished. 783 secs passed Thread #2: 5550000 read pairs finished. 
787 secs passed Thread #3: 5600000 read pairs finished. 790 secs passed Thread #1: 5650000 read pairs finished. 820 secs passed Thread #4: 5700000 read pairs finished. 823 secs passed Thread #0: 5750000 read pairs finished. 827 secs passed Thread #6: 5800000 read pairs finished. 832 secs passed Thread #7: 5850000 read pairs finished. 836 secs passed Thread #5: 5900000 read pairs finished. 836 secs passed Thread #2: 5950000 read pairs finished. 841 secs passed Thread #3: 6000000 read pairs finished. 842 secs passed Thread #1: 6050000 read pairs finished. 870 secs passed Thread #4: 6100000 read pairs finished. 882 secs passed Thread #0: 6150000 read pairs finished. 885 secs passed Thread #6: 6200000 read pairs finished. 886 secs passed Thread #5: 6300000 read pairs finished. 888 secs passed Thread #7: 6250000 read pairs finished. 889 secs passed Thread #2: 6350000 read pairs finished. 892 secs passed Thread #3: 6400000 read pairs finished. 894 secs passed Thread #1: 6450000 read pairs finished. 922 secs passed Thread #4: 6500000 read pairs finished. 935 secs passed Thread #0: 6550000 read pairs finished. 938 secs passed Thread #6: 6600000 read pairs finished. 939 secs passed Thread #5: 6650000 read pairs finished. 941 secs passed Thread #7: 6700000 read pairs finished. 942 secs passed Thread #2: 6750000 read pairs finished. 945 secs passed Thread #3: 6800000 read pairs finished. 947 secs passed Thread #1: 6850000 read pairs finished. 975 secs passed Thread #4: 6900000 read pairs finished. 987 secs passed Thread #0: 6950000 read pairs finished. 991 secs passed Thread #6: 7000000 read pairs finished. 994 secs passed Thread #5: 7050000 read pairs finished. 995 secs passed Thread #7: 7100000 read pairs finished. 996 secs passed Thread #2: 7150000 read pairs finished. 999 secs passed Thread #3: 7200000 read pairs finished. 1000 secs passed Thread #1: 7250000 read pairs finished. 1028 secs passed Thread #4: 7300000 read pairs finished. 
1041 secs passed Thread #0: 7350000 read pairs finished. 1045 secs passed Thread #6: 7400000 read pairs finished. 1047 secs passed Thread #5: 7450000 read pairs finished. 1048 secs passed Thread #7: 7500000 read pairs finished. 1048 secs passed Thread #2: 7550000 read pairs finished. 1051 secs passed Thread #3: 7600000 read pairs finished. 1052 secs passed Thread #7: 7867124 read pairs finished. 1061 secs passed Thread #1: 7650000 read pairs finished. 1068 secs passed Thread #4: 7700000 read pairs finished. 1072 secs passed Thread #0: 7750000 read pairs finished. 1073 secs passed Thread #6: 7800000 read pairs finished. 1073 secs passed Thread #5: 7850000 read pairs finished. 1073 secs passed Total number of aligned reads: pairs: 4545286 (58%) single a: 1331428 (17%) single b: 1233409 (16%) Done. Finished at Fri Jan 30 12:47:17 2015 Total time consumed: 1073 secs BSMAP v2.74 Start at: Fri Jan 30 12:47:18 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 27 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_T3D3_R1.fastq (format: FASTQ) Input read file #2: mcf_T3D3_R2.fastq (format: FASTQ) Output file: bsmap_out_T3D3.sam (format: SAM) Thread #3: 50000 read pairs finished. 77 secs passed Thread #4: 100000 read pairs finished. 78 secs passed Thread #2: 150000 read pairs finished. 79 secs passed Thread #6: 200000 read pairs finished. 80 secs passed Thread #5: 250000 read pairs finished. 
80 secs passed Thread #1: 300000 read pairs finished. 82 secs passed Thread #7: 400000 read pairs finished. 82 secs passed Thread #0: 350000 read pairs finished. 83 secs passed Thread #3: 450000 read pairs finished. 130 secs passed Thread #4: 500000 read pairs finished. 131 secs passed Thread #2: 550000 read pairs finished. 132 secs passed Thread #6: 600000 read pairs finished. 133 secs passed Thread #5: 650000 read pairs finished. 134 secs passed Thread #1: 700000 read pairs finished. 135 secs passed Thread #7: 750000 read pairs finished. 136 secs passed Thread #0: 800000 read pairs finished. 138 secs passed Thread #3: 850000 read pairs finished. 183 secs passed Thread #4: 900000 read pairs finished. 184 secs passed Thread #2: 950000 read pairs finished. 186 secs passed Thread #6: 1000000 read pairs finished. 186 secs passed Thread #5: 1050000 read pairs finished. 187 secs passed Thread #1: 1100000 read pairs finished. 189 secs passed Thread #7: 1150000 read pairs finished. 190 secs passed Thread #0: 1200000 read pairs finished. 193 secs passed Thread #3: 1250000 read pairs finished. 236 secs passed Thread #4: 1300000 read pairs finished. 237 secs passed Thread #2: 1350000 read pairs finished. 238 secs passed Thread #6: 1400000 read pairs finished. 239 secs passed Thread #5: 1450000 read pairs finished. 239 secs passed Thread #1: 1500000 read pairs finished. 245 secs passed Thread #7: 1550000 read pairs finished. 246 secs passed Thread #0: 1600000 read pairs finished. 248 secs passed Thread #3: 1650000 read pairs finished. 289 secs passed Thread #4: 1700000 read pairs finished. 290 secs passed Thread #2: 1750000 read pairs finished. 292 secs passed Thread #6: 1800000 read pairs finished. 293 secs passed Thread #5: 1850000 read pairs finished. 293 secs passed Thread #1: 1900000 read pairs finished. 297 secs passed Thread #7: 1950000 read pairs finished. 298 secs passed Thread #0: 2000000 read pairs finished. 302 secs passed Thread #3: 2050000 read pairs finished. 
342 secs passed Thread #4: 2100000 read pairs finished. 343 secs passed Thread #2: 2150000 read pairs finished. 345 secs passed Thread #6: 2200000 read pairs finished. 346 secs passed Thread #5: 2250000 read pairs finished. 346 secs passed Thread #1: 2300000 read pairs finished. 349 secs passed Thread #7: 2350000 read pairs finished. 351 secs passed Thread #0: 2400000 read pairs finished. 358 secs passed Thread #5: 2650000 read pairs finished. 398 secs passed Thread #6: 2600000 read pairs finished. 399 secs passed Thread #2: 2550000 read pairs finished. 399 secs passed Thread #4: 2500000 read pairs finished. 400 secs passed Thread #1: 2700000 read pairs finished. 400 secs passed Thread #3: 2450000 read pairs finished. 401 secs passed Thread #7: 2750000 read pairs finished. 402 secs passed Thread #0: 2800000 read pairs finished. 411 secs passed Thread #5: 2850000 read pairs finished. 451 secs passed Thread #6: 2900000 read pairs finished. 452 secs passed Thread #2: 2950000 read pairs finished. 453 secs passed Thread #4: 3000000 read pairs finished. 453 secs passed Thread #1: 3050000 read pairs finished. 453 secs passed Thread #3: 3100000 read pairs finished. 454 secs passed Thread #7: 3150000 read pairs finished. 455 secs passed Thread #0: 3200000 read pairs finished. 465 secs passed Thread #5: 3250000 read pairs finished. 502 secs passed Thread #6: 3300000 read pairs finished. 506 secs passed Thread #4: 3400000 read pairs finished. 506 secs passed Thread #2: 3350000 read pairs finished. 507 secs passed Thread #1: 3450000 read pairs finished. 507 secs passed Thread #3: 3500000 read pairs finished. 508 secs passed Thread #7: 3550000 read pairs finished. 508 secs passed Thread #0: 3600000 read pairs finished. 518 secs passed Thread #5: 3650000 read pairs finished. 554 secs passed Thread #6: 3700000 read pairs finished. 558 secs passed Thread #4: 3750000 read pairs finished. 559 secs passed Thread #2: 3800000 read pairs finished. 
559 secs passed Thread #1: 3850000 read pairs finished. 560 secs passed Thread #3: 3900000 read pairs finished. 561 secs passed Thread #7: 3950000 read pairs finished. 562 secs passed Thread #0: 4000000 read pairs finished. 573 secs passed Thread #5: 4050000 read pairs finished. 607 secs passed Thread #6: 4100000 read pairs finished. 610 secs passed Thread #4: 4150000 read pairs finished. 610 secs passed Thread #2: 4200000 read pairs finished. 611 secs passed Thread #1: 4250000 read pairs finished. 615 secs passed Thread #3: 4300000 read pairs finished. 616 secs passed Thread #7: 4350000 read pairs finished. 617 secs passed Thread #0: 4400000 read pairs finished. 628 secs passed Thread #5: 4450000 read pairs finished. 660 secs passed Thread #6: 4500000 read pairs finished. 663 secs passed Thread #4: 4550000 read pairs finished. 664 secs passed Thread #2: 4600000 read pairs finished. 665 secs passed Thread #1: 4650000 read pairs finished. 669 secs passed Thread #3: 4700000 read pairs finished. 669 secs passed Thread #7: 4750000 read pairs finished. 670 secs passed Thread #0: 4800000 read pairs finished. 682 secs passed Thread #5: 4850000 read pairs finished. 713 secs passed Thread #6: 4900000 read pairs finished. 716 secs passed Thread #4: 4950000 read pairs finished. 716 secs passed Thread #2: 5000000 read pairs finished. 718 secs passed Thread #1: 5050000 read pairs finished. 721 secs passed Thread #3: 5100000 read pairs finished. 722 secs passed Thread #7: 5150000 read pairs finished. 723 secs passed Thread #0: 5200000 read pairs finished. 736 secs passed Thread #2: 5400000 read pairs finished. 771 secs passed Thread #4: 5350000 read pairs finished. 772 secs passed Thread #5: 5250000 read pairs finished. 773 secs passed Thread #6: 5300000 read pairs finished. 773 secs passed Thread #1: 5450000 read pairs finished. 774 secs passed Thread #3: 5500000 read pairs finished. 774 secs passed Thread #7: 5550000 read pairs finished. 
774 secs passed Thread #0: 5600000 read pairs finished. 787 secs passed Thread #2: 5650000 read pairs finished. 824 secs passed Thread #4: 5700000 read pairs finished. 824 secs passed Thread #5: 5750000 read pairs finished. 825 secs passed Thread #6: 5800000 read pairs finished. 826 secs passed Thread #1: 5850000 read pairs finished. 827 secs passed Thread #3: 5900000 read pairs finished. 828 secs passed Thread #7: 5950000 read pairs finished. 828 secs passed Thread #0: 6000000 read pairs finished. 841 secs passed Thread #4: 6100000 read pairs finished. 877 secs passed Thread #2: 6050000 read pairs finished. 877 secs passed Thread #5: 6150000 read pairs finished. 879 secs passed Thread #1: 6250000 read pairs finished. 880 secs passed Thread #6: 6200000 read pairs finished. 880 secs passed Thread #3: 6300000 read pairs finished. 881 secs passed Thread #7: 6350000 read pairs finished. 881 secs passed Thread #0: 6400000 read pairs finished. 895 secs passed Thread #4: 6450000 read pairs finished. 929 secs passed Thread #2: 6500000 read pairs finished. 930 secs passed Thread #5: 6550000 read pairs finished. 931 secs passed Thread #1: 6600000 read pairs finished. 933 secs passed Thread #6: 6650000 read pairs finished. 934 secs passed Thread #3: 6700000 read pairs finished. 934 secs passed Thread #7: 6750000 read pairs finished. 935 secs passed Thread #7: 7109789 read pairs finished. 946 secs passed Thread #0: 6800000 read pairs finished. 948 secs passed Thread #4: 6850000 read pairs finished. 970 secs passed Thread #2: 6900000 read pairs finished. 971 secs passed Thread #5: 6950000 read pairs finished. 971 secs passed Thread #1: 7000000 read pairs finished. 971 secs passed Thread #6: 7050000 read pairs finished. 972 secs passed Thread #3: 7100000 read pairs finished. 972 secs passed Total number of aligned reads: pairs: 4047153 (57%) single a: 1332490 (19%) single b: 1205101 (17%) Done. 
Finished at Fri Jan 30 13:03:30 2015 Total time consumed: 972 secs BSMAP v2.74 Start at: Fri Jan 30 13:03:31 2015 Input reference file: Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa (format: FASTA) Load in 7658 db seqs, total size 557717710 bp. 9 secs passed total_kmers: 43046721 Create seed table. 27 secs passed max number of mismatches: read_length * 8% max gap size: 0 kmer cut-off ratio: 5e-07 max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4 quality cutoff: 0 base quality char: '!' min fragment size:28 max fragemt size:500 start from read #1 end at read #4294967295 additional alignment: T in reads => C in reference mapping strand (read_1): ++,-+ mapping strand (read_2): +-,-- Pair-end alignment(8 threads) Input read file #1: mcf_T3D5_R1.fastq (format: FASTQ) Input read file #2: mcf_T3D5_R2.fastq (format: FASTQ) Output file: bsmap_out_T3D5.sam (format: SAM) Thread #3: 50000 read pairs finished. 77 secs passed Thread #0: 100000 read pairs finished. 77 secs passed Thread #1: 150000 read pairs finished. 78 secs passed Thread #4: 200000 read pairs finished. 80 secs passed Thread #2: 250000 read pairs finished. 80 secs passed Thread #5: 300000 read pairs finished. 81 secs passed Thread #6: 350000 read pairs finished. 81 secs passed Thread #7: 400000 read pairs finished. 82 secs passed Thread #3: 450000 read pairs finished. 129 secs passed Thread #0: 500000 read pairs finished. 130 secs passed Thread #4: 600000 read pairs finished. 133 secs passed Thread #1: 550000 read pairs finished. 133 secs passed Thread #2: 650000 read pairs finished. 134 secs passed Thread #5: 700000 read pairs finished. 134 secs passed Thread #6: 750000 read pairs finished. 134 secs passed Thread #7: 800000 read pairs finished. 135 secs passed Thread #0: 900000 read pairs finished. 181 secs passed Thread #3: 850000 read pairs finished. 181 secs passed Thread #4: 950000 read pairs finished. 185 secs passed Thread #1: 1000000 read pairs finished. 
187 secs passed Thread #2: 1050000 read pairs finished. 187 secs passed Thread #5: 1100000 read pairs finished. 188 secs passed Thread #6: 1150000 read pairs finished. 188 secs passed Thread #7: 1200000 read pairs finished. 188 secs passed Thread #0: 1250000 read pairs finished. 233 secs passed Thread #3: 1300000 read pairs finished. 234 secs passed Thread #4: 1350000 read pairs finished. 237 secs passed Thread #1: 1400000 read pairs finished. 239 secs passed Thread #2: 1450000 read pairs finished. 240 secs passed Thread #5: 1500000 read pairs finished. 241 secs passed Thread #6: 1550000 read pairs finished. 242 secs passed Thread #7: 1600000 read pairs finished. 243 secs passed Thread #0: 1650000 read pairs finished. 286 secs passed Thread #3: 1700000 read pairs finished. 287 secs passed Thread #4: 1750000 read pairs finished. 290 secs passed Thread #1: 1800000 read pairs finished. 292 secs passed Thread #2: 1850000 read pairs finished. 293 secs passed Thread #5: 1900000 read pairs finished. 295 secs passed Thread #6: 1950000 read pairs finished. 296 secs passed Thread #7: 2000000 read pairs finished. 297 secs passed Thread #0: 2050000 read pairs finished. 339 secs passed Thread #3: 2100000 read pairs finished. 341 secs passed Thread #4: 2150000 read pairs finished. 343 secs passed Thread #1: 2200000 read pairs finished. 345 secs passed Thread #2: 2250000 read pairs finished. 346 secs passed Thread #5: 2300000 read pairs finished. 348 secs passed Thread #6: 2350000 read pairs finished. 349 secs passed Thread #7: 2400000 read pairs finished. 352 secs passed Thread #0: 2450000 read pairs finished. 393 secs passed Thread #3: 2500000 read pairs finished. 395 secs passed Thread #4: 2550000 read pairs finished. 396 secs passed Thread #1: 2600000 read pairs finished. 398 secs passed Thread #2: 2650000 read pairs finished. 399 secs passed Thread #5: 2700000 read pairs finished. 401 secs passed Thread #6: 2750000 read pairs finished. 
402 secs passed Thread #7: 2800000 read pairs finished. 405 secs passed Thread #0: 2850000 read pairs finished. 446 secs passed Thread #4: 2950000 read pairs finished. 450 secs passed Thread #3: 2900000 read pairs finished. 451 secs passed Thread #1: 3000000 read pairs finished. 451 secs passed Thread #2: 3050000 read pairs finished. 452 secs passed Thread #5: 3100000 read pairs finished. 453 secs passed Thread #6: 3150000 read pairs finished. 453 secs passed Thread #7: 3200000 read pairs finished. 456 secs passed Thread #0: 3250000 read pairs finished. 497 secs passed Thread #4: 3300000 read pairs finished. 501 secs passed Thread #3: 3350000 read pairs finished. 503 secs passed Thread #1: 3400000 read pairs finished. 503 secs passed Thread #2: 3450000 read pairs finished. 504 secs passed Thread #5: 3500000 read pairs finished. 504 secs passed Thread #6: 3550000 read pairs finished. 508 secs passed Thread #7: 3600000 read pairs finished. 517 secs passed Thread #0: 3650000 read pairs finished. 552 secs passed Thread #4: 3700000 read pairs finished. 555 secs passed Thread #3: 3750000 read pairs finished. 556 secs passed Thread #1: 3800000 read pairs finished. 556 secs passed Thread #2: 3850000 read pairs finished. 557 secs passed Thread #5: 3900000 read pairs finished. 557 secs passed Thread #6: 3950000 read pairs finished. 559 secs passed Thread #7: 4000000 read pairs finished. 569 secs passed Thread #0: 4050000 read pairs finished. 603 secs passed Thread #4: 4100000 read pairs finished. 606 secs passed Thread #3: 4150000 read pairs finished. 608 secs passed Thread #1: 4200000 read pairs finished. 609 secs passed Thread #2: 4250000 read pairs finished. 611 secs passed Thread #5: 4300000 read pairs finished. 612 secs passed Thread #6: 4350000 read pairs finished. 613 secs passed Thread #7: 4400000 read pairs finished. 623 secs passed Thread #0: 4450000 read pairs finished. 657 secs passed Thread #4: 4500000 read pairs finished. 
659 secs passed Thread #1: 4600000 read pairs finished. 661 secs passed Thread #3: 4550000 read pairs finished. 661 secs passed Thread #2: 4650000 read pairs finished. 664 secs passed Thread #5: 4700000 read pairs finished. 664 secs passed Thread #6: 4750000 read pairs finished. 665 secs passed Thread #7: 4800000 read pairs finished. 675 secs passed Thread #0: 4850000 read pairs finished. 707 secs passed Thread #4: 4900000 read pairs finished. 710 secs passed Thread #1: 4950000 read pairs finished. 715 secs passed Thread #3: 5000000 read pairs finished. 716 secs passed Thread #2: 5050000 read pairs finished. 718 secs passed Thread #5: 5100000 read pairs finished. 718 secs passed Thread #6: 5150000 read pairs finished. 719 secs passed Thread #7: 5200000 read pairs finished. 728 secs passed Thread #0: 5250000 read pairs finished. 761 secs passed Thread #4: 5300000 read pairs finished. 763 secs passed Thread #1: 5350000 read pairs finished. 768 secs passed Thread #3: 5400000 read pairs finished. 769 secs passed Thread #2: 5450000 read pairs finished. 770 secs passed Thread #5: 5500000 read pairs finished. 771 secs passed Thread #6: 5550000 read pairs finished. 771 secs passed Thread #7: 5600000 read pairs finished. 781 secs passed Thread #0: 5650000 read pairs finished. 811 secs passed Thread #4: 5700000 read pairs finished. 820 secs passed Thread #2: 5850000 read pairs finished. 823 secs passed Thread #5: 5900000 read pairs finished. 823 secs passed Thread #6: 5950000 read pairs finished. 824 secs passed Thread #3: 5800000 read pairs finished. 824 secs passed Thread #1: 5750000 read pairs finished. 825 secs passed Thread #7: 6000000 read pairs finished. 831 secs passed Thread #0: 6050000 read pairs finished. 863 secs passed Thread #4: 6100000 read pairs finished. 872 secs passed Thread #2: 6150000 read pairs finished. 875 secs passed Thread #5: 6200000 read pairs finished. 876 secs passed Thread #6: 6250000 read pairs finished. 
877 secs passed Thread #3: 6300000 read pairs finished. 878 secs passed Thread #1: 6350000 read pairs finished. 878 secs passed Thread #7: 6400000 read pairs finished. 886 secs passed Thread #0: 6450000 read pairs finished. 917 secs passed Thread #4: 6500000 read pairs finished. 925 secs passed Thread #2: 6550000 read pairs finished. 928 secs passed Thread #5: 6600000 read pairs finished. 929 secs passed Thread #6: 6650000 read pairs finished. 930 secs passed Thread #3: 6700000 read pairs finished. 931 secs passed Thread #1: 6750000 read pairs finished. 931 secs passed Thread #7: 6800000 read pairs finished. 938 secs passed Thread #1: 7125800 read pairs finished. 956 secs passed Thread #0: 6850000 read pairs finished. 962 secs passed Thread #4: 6900000 read pairs finished. 967 secs passed Thread #2: 6950000 read pairs finished. 968 secs passed Thread #5: 7000000 read pairs finished. 968 secs passed Thread #6: 7050000 read pairs finished. 968 secs passed Thread #3: 7100000 read pairs finished. 968 secs passed Total number of aligned reads: pairs: 4092568 (57%) single a: 1250715 (18%) single b: 1133306 (16%) Done. Finished at Fri Jan 30 13:19:39 2015 Total time consumed: 968 secs
methratio.py is a Python script distributed with BSMAP that determines the methylation level at CpG loci.
for i in ("M1","T1D3","T1D5", "M3", "T3D3", "T3D5"):
    !python {bsmaploc}methratio.py \
    -d Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa \
    -u -z -g \
    -o methratio_out_{i}.txt \
    -s {bsmaploc}samtools \
    bsmap_out_{i}.sam
@ Fri Jan 30 13:19:42 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 13:20:16 2015: reading bsmap_out_M1.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 13:24:53 2015: read 10000000 lines @ Fri Jan 30 13:25:17 2015: combining CpG methylation from both strands ... @ Fri Jan 30 13:25:41 2015: writing methratio_out_M1.txt ... @ Fri Jan 30 13:34:47 2015: done. total 8716465 valid mappings, 48671764 covered cytosines, average coverage: 1.78 fold. @ Fri Jan 30 13:34:49 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 13:35:22 2015: reading bsmap_out_T1D3.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 13:37:50 2015: combining CpG methylation from both strands ... @ Fri Jan 30 13:38:14 2015: writing methratio_out_T1D3.txt ... @ Fri Jan 30 13:43:41 2015: done. total 5759214 valid mappings, 26507296 covered cytosines, average coverage: 1.32 fold. @ Fri Jan 30 13:43:43 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 13:44:17 2015: reading bsmap_out_T1D5.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 13:48:13 2015: combining CpG methylation from both strands ... @ Fri Jan 30 13:48:37 2015: writing methratio_out_T1D5.txt ... @ Fri Jan 30 13:57:04 2015: done. total 6974209 valid mappings, 45446473 covered cytosines, average coverage: 1.54 fold. @ Fri Jan 30 13:57:07 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 13:57:40 2015: reading bsmap_out_M3.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 14:02:22 2015: read 10000000 lines @ Fri Jan 30 14:03:08 2015: combining CpG methylation from both strands ... @ Fri Jan 30 14:03:32 2015: writing methratio_out_M3.txt ... @ Fri Jan 30 14:13:17 2015: done. total 9773223 valid mappings, 53389886 covered cytosines, average coverage: 1.78 fold. 
@ Fri Jan 30 14:13:19 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 14:13:53 2015: reading bsmap_out_T3D3.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 14:18:33 2015: read 10000000 lines @ Fri Jan 30 14:18:50 2015: combining CpG methylation from both strands ... @ Fri Jan 30 14:19:14 2015: writing methratio_out_T3D3.txt ... @ Fri Jan 30 14:28:54 2015: done. total 8847902 valid mappings, 52255860 covered cytosines, average coverage: 1.65 fold. @ Fri Jan 30 14:28:56 2015: reading reference Crassostrea_gigas.GCA_000297895.1.22.dna_sm.genome.fa ... @ Fri Jan 30 14:29:29 2015: reading bsmap_out_T3D5.sam ... [samopen] SAM header is present: 7658 sequences. @ Fri Jan 30 14:34:12 2015: read 10000000 lines @ Fri Jan 30 14:34:27 2015: combining CpG methylation from both strands ... @ Fri Jan 30 14:34:51 2015: writing methratio_out_T3D5.txt ... @ Fri Jan 30 14:44:26 2015: done. total 8808414 valid mappings, 51732152 covered cytosines, average coverage: 1.69 fold.
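As a rough illustration of what a per-locus methylation level means, the sketch below computes the fraction of reads covering a cytosine that still read as C, and pools the two strands of a CpG in the spirit of the "combining CpG methylation from both strands" step in the log above. The function names and the count-based inputs are illustrative assumptions, not methratio.py's internals.

```python
def methylation_ratio(c_count, ct_count):
    """Fraction of reads covering a cytosine that still read as C (methylated)."""
    if ct_count == 0:
        return None  # no coverage, so the ratio is undefined
    return c_count / ct_count


def combine_cpg_strands(plus, minus):
    """Pool (c_count, ct_count) pairs from the + and - strand of one CpG."""
    c_total = plus[0] + minus[0]
    ct_total = plus[1] + minus[1]
    return c_total, ct_total, methylation_ratio(c_total, ct_total)


# Hypothetical counts: 3 of 4 reads methylated on +, 1 of 2 on -
print(combine_cpg_strands((3, 4), (1, 2)))
```

Pooling strands raises the effective coverage per CpG, which matters here because the average coverage reported in the log is below 2-fold for every library.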
Converting methratio files for methylKit
#First, each methratio file is filtered for CG context (grep), then for 3x coverage (mr3x.awk), and finally reformatted for methylKit (mr_gg.awk.sh).
#Because of an issue passing variables to awk, these steps use simple scripts included in the repository.
for i in ("M1","T1D3","T1D5", "M3", "T3D3", "T3D5"):
    !echo {i}
    !grep "[A-Z][A-Z]CG[A-Z]" < methratio_out_{i}.txt > methratio_out_{i}CG.txt
    !awk -f ../scripts/mr3x.awk methratio_out_{i}CG.txt > mr3x.{i}.txt
    !awk -f ../scripts/mr_gg.awk.sh mr3x.{i}.txt > mkfmt_{i}.txt
M1 T1D3 T1D5 M3 T3D3 T3D5
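The two filters applied above can be sketched in Python. The regular expression is the one passed to grep, and the 3x threshold comes from the cell comment. The record layout (a context string plus a coverage count), the "at least 3" reading of the threshold, and the function name are assumptions made for illustration, since the contents of mr3x.awk are not shown here.

```python
import re

# Same pattern as the grep call: a five-base context with CG at positions 3-4
CG_CONTEXT = re.compile(r"[A-Z][A-Z]CG[A-Z]")


def keep_locus(context, coverage, min_coverage=3):
    """Return True for CG-context loci covered by at least min_coverage reads."""
    return bool(CG_CONTEXT.search(context)) and coverage >= min_coverage


# Hypothetical (context, coverage) records
records = [("AACGT", 5), ("AACAT", 9), ("TTCGA", 2)]
kept = [r for r in records if keep_locus(*r)]
print(kept)
```

Only the first record passes both filters: the second is not in a CG context and the third has too little coverage.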
Running methylKit in R
%pylab inline
Populating the interactive namespace from numpy and matplotlib
%load_ext rpy2.ipython
%%R
#UNCOMMENT IF YOU NEED TO INSTALL PACKAGES
# dependencies
#install.packages( c("data.table","devtools"))
#source("http://bioconductor.org/biocLite.R")
#biocLite(c("GenomicRanges","IRanges"))
# install the development version from github
#library(devtools)
#install_github("al2na/methylKit",build_vignettes=FALSE)
NULL
%R library(methylKit)
array(['methylKit', 'tools', 'stats', 'graphics', 'grDevices', 'utils', 'datasets', 'methods', 'base'], dtype='|S9')
%%R
file.list <- list('mkfmt_M1.txt',
'mkfmt_T1D3.txt',
'mkfmt_T1D5.txt',
'mkfmt_M3.txt',
'mkfmt_T3D3.txt',
'mkfmt_T3D5.txt'
)
%%R
myobj=read(file.list,sample.id=list("1_sperm","1_72hpf","1_120hpf","3_sperm","3_72hpf","3_120hpf"),assembly="v9",treatment=c(0,0,0,1,1,1))
%%R
meth<-unite(myobj)
#getCorrelation(meth,plot=T)
hc<- clusterSamples(meth, dist="correlation", method="ward", plot=T)
#PCA<-PCASamples(meth)
The "ward" method has been renamed to "ward.D"; note new "ward.D2"
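clusterSamples is called with dist="correlation". A common definition of correlation distance is one minus the Pearson correlation between two samples' per-locus methylation values. Whether methylKit uses exactly this form is an assumption, and the helper name and percent-methylation vectors below are made up for illustration.

```python
from math import sqrt


def correlation_distance(x, y):
    """1 - Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1 - cov / sqrt(var_x * var_y)


# Made-up percent methylation at four shared loci
sample_a = [90.0, 10.0, 50.0, 80.0]
sample_b = [85.0, 15.0, 55.0, 75.0]
sample_c = [10.0, 90.0, 50.0, 20.0]
print(correlation_distance(sample_a, sample_b))  # similar profiles, near 0
print(correlation_distance(sample_a, sample_c))  # mirrored profiles, near 2
```

Hierarchical clustering then merges the samples with the smallest pairwise distances first.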
Determining differentially methylated loci using methylKit
%%R
#Family-specific DMLs
#note that file.list was defined in prior section
DMLobj=read(file.list,sample.id=list("M1","T1D3","T1D5","M3","T3D3","T3D5"),assembly="v9",treatment=c(1,1,1,0,0,0), context="CpG")
lin<-unite(DMLobj)
lin.pooled <- pool(lin, sample.ids=c("lin_1", "lin_3"))
lin_DML.fisher <- calculateDiffMeth(lin.pooled)
select(lin_DML.fisher, 1)
lin_DML_p <- getData(lin_DML.fisher)
lin_DML_filt <- lin_DML_p[lin_DML_p$pvalue < 0.01 & lin_DML_p$meth.diff > 25,]
write.csv(lin_DML_filt,file="lin_DML_filt")
!wc -l lin_DML_filt
190 lin_DML_filt
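The cell above tests pooled counts and then filters on p-value and methylation difference. The sketch below reproduces that logic for a single locus, assuming a two-sided Fisher's exact test on methylated and unmethylated read counts (suggested by the ".fisher" variable name) and assuming meth.diff is the difference in percent methylation between the two groups. The function names and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def is_dml(meth1, unmeth1, meth0, unmeth0, p_cut=0.01, diff_cut=25):
    """Apply the notebook's thresholds to one locus with pooled read counts."""
    pvalue = fisher_exact_two_sided(meth1, unmeth1, meth0, unmeth0)
    meth_diff = 100 * meth1 / (meth1 + unmeth1) - 100 * meth0 / (meth0 + unmeth0)
    return pvalue < p_cut and meth_diff > diff_cut


# Hypothetical pooled counts (methylated, unmethylated) for two groups
print(is_dml(18, 2, 3, 17))
print(is_dml(6, 4, 5, 5))
```

As written in the notebook, the filter keeps only loci with a positive difference greater than 25. Also note that write.csv emits a header row, so the 190 lines reported by wc correspond to 189 loci, matching the 189-line lineage_dml.bed produced later.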
Developmentally different DMLs are identified with three pairwise comparisons: males vs. day 3, males vs. day 5, and day 3 vs. day 5.
%%R
file.list <- list('mkfmt_M1.txt',
'mkfmt_T1D3.txt',
'mkfmt_M3.txt',
'mkfmt_T3D3.txt'
)
%%R
#Developmentally different DMLs (Males v Day3)
DMLobj=read(file.list,sample.id=list("M1","T1D3","M3","T3D3"), assembly="v9",treatment=c(1,0,1,0), context="CpG")
DevelMvD3<-unite(DMLobj)
DevelMvD3.pooled <- pool(DevelMvD3, sample.ids=c("Males", "Day3"))
DevelMvD3_DML.fisher <- calculateDiffMeth(DevelMvD3.pooled)
select(DevelMvD3_DML.fisher, 1)
DevelMvD3_DML_p <- getData(DevelMvD3_DML.fisher)
DevelMvD3_DML_filt <- DevelMvD3_DML_p[DevelMvD3_DML_p$pvalue < 0.01 & DevelMvD3_DML_p$meth.diff > 25,]
write.csv(DevelMvD3_DML_filt,file="DevelMvD3_DML_filt")
!wc -l DevelMvD3_DML_filt
30 DevelMvD3_DML_filt
%%R
file.list <- list('mkfmt_M1.txt',
'mkfmt_T1D5.txt',
'mkfmt_M3.txt',
'mkfmt_T3D5.txt'
)
%%R
#Developmentally different DMLs (Males v Day5)
DMLobj=read(file.list,sample.id=list("M1","T1D5","M3","T3D5"), assembly="v9",treatment=c(1,0,1,0), context="CpG")
DevelMvD5<-unite(DMLobj)
DevelMvD5.pooled <- pool(DevelMvD5, sample.ids=c("Males", "Day5"))
DevelMvD5_DML.fisher <- calculateDiffMeth(DevelMvD5.pooled)
select(DevelMvD5_DML.fisher, 1)
DevelMvD5_DML_p <- getData(DevelMvD5_DML.fisher)
DevelMvD5_DML_filt <- DevelMvD5_DML_p[DevelMvD5_DML_p$pvalue < 0.01 & DevelMvD5_DML_p$meth.diff > 25,]
write.csv(DevelMvD5_DML_filt,file="DevelMvD5_DML_filt")
!wc -l DevelMvD5_DML_filt
86 DevelMvD5_DML_filt
%%R
file.list <- list('mkfmt_T1D3.txt',
'mkfmt_T1D5.txt',
'mkfmt_T3D3.txt',
'mkfmt_T3D5.txt'
)
%%R
#Developmentally different DMLs (Day3 v Day5)
DMLobj=read(file.list,sample.id=list("T1D3","T1D5","T3D3","T3D5"), assembly="v9",treatment=c(1,0,1,0), context="CpG")
DevelD3vD5<-unite(DMLobj)
DevelD3vD5.pooled <- pool(DevelD3vD5, sample.ids=c("Day3", "Day5"))
DevelD3vD5_DML.fisher <- calculateDiffMeth(DevelD3vD5.pooled)
select(DevelD3vD5_DML.fisher, 1)
DevelD3vD5_DML_p <- getData(DevelD3vD5_DML.fisher)
DevelD3vD5_DML_filt <- DevelD3vD5_DML_p[DevelD3vD5_DML_p$pvalue < 0.01 & DevelD3vD5_DML_p$meth.diff > 25,]
write.csv(DevelD3vD5_DML_filt,file="DevelD3vD5_DML_filt")
!wc -l DevelD3vD5_DML_filt
47 DevelD3vD5_DML_filt
#removing column titles
!tail -n +2 DevelMvD5_DML_filt > DevelMvD5_DML
!tail -n +2 DevelD3vD5_DML_filt > DevelD3vD5_DML
#Concatenate all developmentally different DMLs into one file
!cat DevelMvD3_DML_filt DevelMvD5_DML DevelD3vD5_DML > Devel_DML_filt
!wc -l Devel_DML_filt
161 Devel_DML_filt
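The concatenation above keeps the header row of the first file and drops it from the other two. A minimal Python sketch of the same bookkeeping, using placeholder line lists whose lengths match the wc -l counts reported for the three files (the function name is hypothetical):

```python
def concat_keep_first_header(tables):
    """Concatenate CSV-like line lists, keeping the header row of the first only."""
    combined = list(tables[0])
    for table in tables[1:]:
        combined.extend(table[1:])  # equivalent of `tail -n +2`
    return combined


# Placeholder stand-ins with the same line counts as the three files
mvd3 = ["header"] + ["row"] * 29
mvd5 = ["header"] + ["row"] * 85
d3vd5 = ["header"] + ["row"] * 46
combined = concat_keep_first_header([mvd3, mvd5, d3vd5])
print(len(combined))
```

The combined length agrees with the 161 lines reported for Devel_DML_filt: one header plus 160 loci.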
!tail -n +2 lin_DML_filt | awk -F, '{print $2, $3, $4, "DML_lin" }' | tr -d '"' | tr ' ' "\t" > lineage_dml.bed
!wc -l lineage_dml.bed
189 lineage_dml.bed
!tail -n +2 Devel_DML_filt | awk -F, '{print $2, $3, $4, "DML_dev" }' | tr -d '"' | tr ' ' "\t" > dev_dml.bed
!wc -l dev_dml.bed
160 dev_dml.bed
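The tail, awk, and tr pipeline used for both BED files drops the header, takes fields 2 to 4, strips quotes, and appends a label. A Python sketch of the same conversion follows. The column names in the example header, the scaffold names, and the function name are assumptions for illustration only.

```python
import csv
import io


def csv_to_bed(csv_text, label):
    """Convert write.csv-style text to BED lines: chr, start, end, label."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row, like `tail -n +2`
    # Column 1 holds R row names, so fields 2-4 are taken, as in the awk call
    return ["\t".join([row[1], row[2], row[3], label]) for row in reader]


# Hypothetical two-row input; scaffold names and positions are made up
example = (
    '"","chr","start","end","strand","pvalue","qvalue","meth.diff"\n'
    '"12","scaffold1",100,100,"+",0.001,0.02,40.5\n'
    '"57","scaffold2",2500,2500,"-",0.004,0.03,31.2\n'
)
print("\n".join(csv_to_bed(example, "DML_lin")))
```

Using a CSV parser avoids the separate quote-stripping step, since quoted fields are unwrapped as they are read.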
#To locate DMLs within genomic features, oyster genome tracks are downloaded
#and intersected with the DML BED files using intersectbed (bedtools suite).
#Note: the track with all CGs is large (~977 MB).
cd genome_tracks
/Users/Steven/Desktop/olson-ms-nb-master/wd/genome_tracks
for i in ("exon","intron","TE","gene","1k5p_gene_promoter","CG"):
    !curl -O http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_{i}.gff
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 11.7M 100 11.7M 0 0 20.3M 0 --:--:-- --:--:-- --:--:-- 48.7M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 12.0M 100 12.0M 0 0 52.7M 0 --:--:-- --:--:-- --:--:-- 53.1M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 6325k 100 6325k 0 0 44.8M 0 --:--:-- --:--:-- --:--:-- 45.4M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1777k 100 1777k 0 0 32.2M 0 --:--:-- --:--:-- --:--:-- 33.3M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1848k 100 1848k 0 0 32.8M 0 --:--:-- --:--:-- --:--:-- 34.0M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 932M 100 932M 0 0 78.7M 0 0:00:11 0:00:11 --:--:-- 80.7M
for i in ("TE-TANDEMREPEAT", "TE-WUBLASTX"):
    !curl -O http://eagle.fish.washington.edu/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_{i}.gff
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 3196k 100 3196k 0 0 38.5M 0 --:--:-- --:--:-- --:--:-- 40.0M % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 3129k 100 3129k 0 0 57.3M 0 --:--:-- --:--:-- --:--:-- 59.9M
cd ..
/Users/Steven/Desktop/olson-ms-nb-master/wd
for i in ("exon","intron","TE","1k5p_gene_promoter","TE-TANDEMREPEAT", "TE-WUBLASTX"):
    !intersectbed \
    -u \
    -a lineage_dml.bed \
    -b ./genome_tracks/Cgigas_v9_{i}.gff \
    > {i}_intersect_DML_lin_u.txt
    !wc -l {i}_intersect_DML_lin_u.txt > lin{i}
!head linTE
!head linTE-WUBLASTX
!head linTE-TANDEMREPEAT
!head linintron
!head linexon
!head lin1k5p_gene_promoter
27 TE_intersect_DML_lin_u.txt
24 TE-WUBLASTX_intersect_DML_lin_u.txt
3 TE-TANDEMREPEAT_intersect_DML_lin_u.txt
42 intron_intersect_DML_lin_u.txt
25 exon_intersect_DML_lin_u.txt
8 1k5p_gene_promoter_intersect_DML_lin_u.txt
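With `-u`, intersectbed reports each feature in `-a` once if it overlaps at least one feature in `-b`, so the line counts above are numbers of DMLs rather than numbers of overlaps. A rough Python sketch of that behaviour, assuming intervals are (chrom, start, end) tuples with half-open coordinates; the function name and toy intervals are hypothetical, and the BED/GFF coordinate conversion that bedtools performs is skipped here:

```python
def intersect_unique(a_features, b_features):
    """Return each A feature once if it overlaps any B feature (like -u)."""
    by_chrom = {}
    for chrom, start, end in b_features:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = []
    for chrom, start, end in a_features:
        # Half-open intervals overlap when each starts before the other ends
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, [])):
            hits.append((chrom, start, end))
    return hits


# Toy data for illustration only
dmls = [("scaffold1", 100, 101), ("scaffold1", 500, 501), ("scaffold2", 10, 11)]
exons = [("scaffold1", 50, 150), ("scaffold1", 90, 120), ("scaffold2", 200, 300)]
hits = intersect_unique(dmls, exons)
print(len(hits))  # the analogue of wc -l on the intersect output
```

The first toy DML overlaps two exons but is still reported only once, which is the point of `-u`.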
#Concatenate counts of genomic regions into one table for lineage-specific DMLs
!cat linintron linexon lin1k5p_gene_promoter linTE-TANDEMREPEAT linTE-WUBLASTX > lintable
!awk 'FNR==NR{sum+=$1;next}; {print $0,sum}' lintable{,} > lin_total
!awk '{print $2, $1, $3, (($1/$3)*100)}' lin_total > lineage_DMLs
for i in ("exon","intron","TE-TANDEMREPEAT", "TE-WUBLASTX", "TE", "1k5p_gene_promoter"):
    !intersectbed \
    -u \
    -a dev_dml.bed \
    -b ./genome_tracks/Cgigas_v9_{i}.gff \
    > {i}_intersect_DML_dev_u.txt
    !wc -l {i}_intersect_DML_dev_u.txt > dev{i}
!head devTE
!head devTE-WUBLASTX
!head devTE-TANDEMREPEAT
!head devintron
!head devexon
!head dev1k5p_gene_promoter
20 TE_intersect_DML_dev_u.txt
11 TE-WUBLASTX_intersect_DML_dev_u.txt
9 TE-TANDEMREPEAT_intersect_DML_dev_u.txt
60 intron_intersect_DML_dev_u.txt
13 exon_intersect_DML_dev_u.txt
6 1k5p_gene_promoter_intersect_DML_dev_u.txt
#Concatenate counts of genomic regions into one table for developmentally different DMLs
!cat devintron devexon dev1k5p_gene_promoter devTE-TANDEMREPEAT devTE-WUBLASTX > devtable
!head devtable
60 intron_intersect_DML_dev_u.txt
13 exon_intersect_DML_dev_u.txt
6 1k5p_gene_promoter_intersect_DML_dev_u.txt
9 TE-TANDEMREPEAT_intersect_DML_dev_u.txt
11 TE-WUBLASTX_intersect_DML_dev_u.txt
!awk 'FNR==NR{sum+=$1;next}; {print $0,sum}' devtable{,} > dev_total
!head dev_total
60 intron_intersect_DML_dev_u.txt 99
13 exon_intersect_DML_dev_u.txt 99
6 1k5p_gene_promoter_intersect_DML_dev_u.txt 99
9 TE-TANDEMREPEAT_intersect_DML_dev_u.txt 99
11 TE-WUBLASTX_intersect_DML_dev_u.txt 99
!awk '{print $2, $1, $3, (($1/$3)*100)}' dev_total > developmental_DMLs
!head developmental_DMLs
intron_intersect_DML_dev_u.txt 60 99 60.6061
exon_intersect_DML_dev_u.txt 13 99 13.1313
1k5p_gene_promoter_intersect_DML_dev_u.txt 6 99 6.06061
TE-TANDEMREPEAT_intersect_DML_dev_u.txt 9 99 9.09091
TE-WUBLASTX_intersect_DML_dev_u.txt 11 99 11.1111
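The two awk passes above first sum the counts in column 1 and then divide each count by that sum. The denominator is therefore the total across the five region tracks (99), not the 160 DMLs in dev_dml.bed, and a DML that falls in more than one track is counted once per track. A small Python sketch of the same calculation, using the counts shown in devtable; the function name `region_percentages` is hypothetical:

```python
def region_percentages(counts):
    """counts: list of (region, n) pairs; returns (region, n, total, percent)."""
    total = sum(n for _, n in counts)  # first awk pass: sum of column 1
    # second awk pass: append the total and the percentage to every row
    return [(region, n, total, n / total * 100) for region, n in counts]


# Counts copied from the devtable output above
dev_counts = [
    ("intron", 60),
    ("exon", 13),
    ("1k5p_gene_promoter", 6),
    ("TE-TANDEMREPEAT", 9),
    ("TE-WUBLASTX", 11),
]
dev_table = region_percentages(dev_counts)
for region, n, total, pct in dev_table:
    print(region, n, total, round(pct, 4))
```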
#"TE" is included so that CGTE, which is read below, is actually created
for i in ("exon","intron","TE","TE-TANDEMREPEAT", "TE-WUBLASTX","gene","1k5p_gene_promoter"):
    !intersectbed \
    -u \
    -a ./genome_tracks/Cgigas_v9_CG.gff \
    -b ./genome_tracks/Cgigas_v9_{i}.gff \
    > {i}_intersect_CG_u.txt
    !wc -l {i}_intersect_CG_u.txt > CG{i}
!head CGintron
!head CGexon
!head CG1k5p_gene_promoter
!head CGTE
!head CGTE-TANDEMREPEAT
!head CGTE-WUBLASTX
2815997 intron_intersect_CG_u.txt
1129658 exon_intersect_CG_u.txt
593081 1k5p_gene_promoter_intersect_CG_u.txt
589509 TE_intersect_CG_u.txt
173095 TE-TANDEMREPEAT_intersect_CG_u.txt
416439 TE-WUBLASTX_intersect_CG_u.txt
#Concatenate counts of genomic regions into one table for all CGs in oyster genome
!cat CGintron CGexon CG1k5p_gene_promoter CGTE-TANDEMREPEAT CGTE-WUBLASTX > CGtable
!awk 'FNR==NR{sum+=$1;next}; {print $0,sum}' CGtable{,} > CG_total
!awk '{print $2, $1, $3, (($1/$3)*100)}' CG_total > all_CGs
!paste -d" " lineage_DMLs developmental_DMLs all_CGs > StackedBars
!awk '{print $4, $8, $12}' StackedBars | tr ' ' "\t" > StackedBars_DMLs
!head StackedBars
intron_intersect_DML_lin_u.txt 42 102 41.1765 intron_intersect_DML_dev_u.txt 60 99 60.6061 intron_intersect_CG_u.txt 2815997 5128270 54.9112
exon_intersect_DML_lin_u.txt 25 102 24.5098 exon_intersect_DML_dev_u.txt 13 99 13.1313 exon_intersect_CG_u.txt 1129658 5128270 22.0281
1k5p_gene_promoter_intersect_DML_lin_u.txt 8 102 7.84314 1k5p_gene_promoter_intersect_DML_dev_u.txt 6 99 6.06061 1k5p_gene_promoter_intersect_CG_u.txt 593081 5128270 11.5649
TE-TANDEMREPEAT_intersect_DML_lin_u.txt 3 102 2.94118 TE-TANDEMREPEAT_intersect_DML_dev_u.txt 9 99 9.09091 TE-TANDEMREPEAT_intersect_CG_u.txt 173095 5128270 3.37531
TE-WUBLASTX_intersect_DML_lin_u.txt 24 102 23.5294 TE-WUBLASTX_intersect_DML_dev_u.txt 11 99 11.1111 TE-WUBLASTX_intersect_CG_u.txt 416439 5128270 8.12046
!head StackedBars_DMLs
41.1765 60.6061 54.9112
24.5098 13.1313 22.0281
7.84314 6.06061 11.5649
2.94118 9.09091 3.37531
23.5294 11.1111 8.12046
%%R
DMLs<-as.matrix(read.table('StackedBars_DMLs', header=F))
colnames(DMLs)<-c("Lin DMLs","Devel DMLs", "All CpGs")
par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=T)
par(xpd=T, mar=par()$mar+c(0,0,0,5))
barplot(as.matrix(DMLs), col=c("#99983B", "#2F583B", "#4A7958","#8DAB96","#B34321"), ylab="Proportion of CpG within a genomic region (%)")
legend("topright",inset=c(-0.63,-0), legend=c("Intron", "Exon", "Promoter Region", "Tandem", "TE-WUblast"), pch=c(19,19,19), col=c("#99983B", "#2F583B", "#4A7958","#8DAB96","#B34321"))
#Formatting family-specific DML files for stats
!wc -l lineage_dml.bed > lineage_countstotal
!head lineage_countstotal
189 lineage_dml.bed
!wc -l ./genome_tracks/Cgigas_v9_CG.gff > CG_countstotal
!head CG_countstotal
10035701 ./genome_tracks/Cgigas_v9_CG.gff
!cat linTE lineage_countstotal > Lineage_TEs
!head Lineage_TEs
27 TE_intersect_DML_lin_u.txt
189 lineage_dml.bed
!awk '{print $1}' Lineage_TEs > Lineage_TEs_counts
!head Lineage_TEs_counts
27
189
!cat CGTE CG_countstotal > CG_TEs
!head CG_TEs
589509 TE_intersect_CG_u.txt
10035701 ./genome_tracks/Cgigas_v9_CG.gff
!awk '{print $1}' CG_TEs > CG_TEs_counts
!head CG_TEs_counts
589509
10035701
!paste Lineage_TEs_counts CG_TEs_counts > LinTEs_combined
!head LinTEs_combined
27 589509
189 10035701
!awk '{print $1, $2}' LinTEs_combined > Lineage_TEs_stats
!head Lineage_TEs_stats
27 589509
189 10035701
%%R
#Stats for TEs: family-specific
LinStats<- read.table('Lineage_TEs_stats')
chisq.test(LinStats)
Pearson's Chi-squared test with Yates' continuity correction
data: LinStats
X-squared = 18.6144, df = 1, p-value = 1.6e-05
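For readers without R, the statistic above can be reproduced by hand. The sketch below is a plain-Python version of a Yates-corrected chi-squared test on a 2x2 table, applied to the table exactly as it is laid out in Lineage_TEs_stats (first row: counts overlapping TEs; second row: total counts). The function name `chisq_2x2_yates` is hypothetical, and the p-value relies on the identity that the upper tail of a chi-squared distribution with 1 degree of freedom equals erfc(sqrt(x/2)):

```python
import math


def chisq_2x2_yates(table):
    """Pearson chi-squared with Yates' continuity correction for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            diff = abs(observed - expected)
            # Continuity correction of 0.5, never larger than |O - E| itself
            stat += (diff - min(0.5, diff)) ** 2 / expected
    # Upper-tail probability of chi-squared with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows and columns exactly as in Lineage_TEs_stats above
lin_te = [[27, 589509], [189, 10035701]]
stat, p_value = chisq_2x2_yates(lin_te)
print(round(stat, 4), p_value)
```

The same function can be pointed at the other two-by-two stats files in this section by swapping in their four counts.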
!cat linTE-WUBLASTX lineage_countstotal > Lineage_TE-WUs
!head Lineage_TE-WUs
24 TE-WUBLASTX_intersect_DML_lin_u.txt
189 lineage_dml.bed
!awk '{print $1}' Lineage_TE-WUs > Lineage_TE-WUs_counts
!head Lineage_TE-WUs_counts
24
189
!cat CGTE-WUBLASTX CG_countstotal > CG_TE_WUs
!head CG_TE_WUs
416439 TE-WUBLASTX_intersect_CG_u.txt
10035701 ./genome_tracks/Cgigas_v9_CG.gff
!awk '{print $1}' CG_TE_WUs > CG_TE_WUs_counts
!head CG_TE_WUs_counts
416439
10035701
!paste Lineage_TE-WUs_counts CG_TE_WUs_counts > LinTE_WUs_combined
!head LinTE_WUs_combined
24 416439
189 10035701
!awk '{print $1, $2}' LinTE_WUs_combined > Lineage_TE_WUs_stats
!head Lineage_TE_WUs_stats
24 416439
189 10035701
%%R
#Stats for TE-WUBLASTX: family-specific
LinStats<- read.table('Lineage_TE_WUs_stats')
chisq.test(LinStats)
Pearson's Chi-squared test with Yates' continuity correction
data: LinStats
X-squared = 27.6614, df = 1, p-value = 1.445e-07
#formatting developmental DML files for stats
!wc -l dev_dml.bed > dev_countstotal
!cat devTE-WUBLASTX dev_countstotal > Dev_TE_WUs
!awk '{print $1}' Dev_TE_WUs > Dev_TE_WUs_counts
!paste Dev_TE_WUs_counts CG_TE_WUs_counts > DevTE_WUs_combined
!awk '{print $1, $2}' DevTE_WUs_combined > Dev_TE_WUs_stats
%%R
#Stats for TE-WUBLASTX: developmentally different
DevStats<-read.table('Dev_TE_WUs_stats')
chisq.test(DevStats)
Pearson's Chi-squared test with Yates' continuity correction
data: DevStats
X-squared = 2.0779, df = 1, p-value = 0.1494
#formatting developmental DML files for stats
!wc -l dev_dml.bed > dev_countstotal
!head dev_countstotal
160 dev_dml.bed
!cat devTE-TANDEMREPEAT dev_countstotal > Dev_TE-TANDEM
!head Dev_TE-TANDEM
9 TE-TANDEMREPEAT_intersect_DML_dev_u.txt
160 dev_dml.bed
!awk '{print $1}' Dev_TE-TANDEM > Dev_TE-TANDEM_counts
!head Dev_TE-TANDEM_counts
9
160
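#CG_TE-_counts is not created above; build the tandem-repeat CG counts first
!cat CGTE-TANDEMREPEAT CG_countstotal | awk '{print $1}' > CG_TE-TANDEM_counts
!paste Dev_TE-TANDEM_counts CG_TE-TANDEM_counts > DevTEs_combined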
!awk '{print $1, $2}' DevTEs_combined > Dev_TEs_stats
%%R
#Stats for TE-TANDEMREPEAT: developmentally different
DevStats<-read.table('Dev_TEs_stats')
chisq.test(DevStats)
ERROR: Cell magic `%%R` not found.