On the last several tries, this notebook got stuck somewhere along the way and did not run to the end of the commands on its own. Here's another try at that.
#Works well when choosing the "Run All" option under the "Cell" tab.
wd="/Volumes/web/scaphapoda/Grace/Transcriptomes/mer_tst"
dircode="me"
cd {wd}
/Volumes/web/scaphapoda/Grace/Transcriptomes/mer_tst
!blastx \
-query query.fa \
-db /Volumes/Data/blast_db/uniprot_sprot \
-max_target_seqs 1 \
-max_hsps 1 \
-outfmt 6 \
-num_threads 8 \
-out blast_sprot.tab
!say hello
!wc -l blast_sprot.tab
35 blast_sprot.tab
!tr '|' "\t" < blast_sprot.tab > blast_sprot_sql.tab
!head blast_sprot_sql.tab
Mmercenaria_Contig_1	sp	P06538	DPOL_ADE12	26.09	46	34	0	141	4	100	145	6.2	28.5
Mmercenaria_Contig_2	sp	Q6DRI1	EI3EA_DANRE	75.00	68	17	0	5	208	114	181	2e-29	112
Mmercenaria_Contig_3	sp	O94823	AT10B_HUMAN	61.11	18	7	0	162	215	99	116	2.2	29.6
Mmercenaria_Contig_5	sp	P0A5H8	EFPP_MYCTU	63.16	19	7	0	117	61	20	38	0.64	31.2
Mmercenaria_Contig_6	sp	Q9WU60	ATRN_MOUSE	28.85	52	33	1	168	13	808	855	0.12	33.9
Mmercenaria_Contig_8	sp	P18547	VNCS_PAVPN	50.00	22	11	0	111	176	362	383	0.85	30.8
Mmercenaria_Contig_9	sp	A8WGF4	IF122_XENTR	67.16	67	22	0	1	201	894	960	6e-24	99.4
Mmercenaria_Contig_10	sp	Q4QK86	MUKB_HAEI8	29.79	47	33	0	16	156	262	308	1.6	30.0
Mmercenaria_Contig_11	sp	Q0AQ76	THIG_MARMM	34.09	44	27	1	210	79	84	125	3.4	28.9
Mmercenaria_Contig_12	sp	P15106	GLNA_STRCO	39.29	28	17	0	40	123	124	151	0.84	30.8
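The rows above are in BLAST's tabular format (`-outfmt 6`), whose default fields are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. Because `tr` split the `sp|accession|entry` subject ID on `|`, each line now has 14 fields instead of 12. A minimal Python 3 sketch of labeling those fields follows. The names `db`, `accession`, `entry_name`, the helper `parse_blast_rows` and the `1e-5` e-value cutoff are my own illustrative choices, not part of BLAST or of this notebook.

```python
import csv
import io

# Default -outfmt 6 columns, with sseqid replaced by the three pieces
# produced by splitting "sp|ACCESSION|ENTRY" on "|".
# "db", "accession" and "entry_name" are illustrative labels, not BLAST names.
COLUMNS = [
    "qseqid", "db", "accession", "entry_name",
    "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def parse_blast_rows(handle):
    """Yield one dict per tab-separated line of the converted BLAST table."""
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, fields))
        # Convert the fields used for filtering to numbers.
        row["pident"] = float(row["pident"])
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        yield row

# Two rows copied from the head output above.
sample = (
    "Mmercenaria_Contig_1\tsp\tP06538\tDPOL_ADE12\t26.09\t46\t34\t0\t141\t4\t100\t145\t6.2\t28.5\n"
    "Mmercenaria_Contig_2\tsp\tQ6DRI1\tEI3EA_DANRE\t75.00\t68\t17\t0\t5\t208\t114\t181\t2e-29\t112\n"
)
rows = list(parse_blast_rows(io.StringIO(sample)))

# Hypothetical cutoff: keep only hits with an e-value below 1e-5.
strong_hits = [r["qseqid"] for r in rows if r["evalue"] < 1e-5]
print(strong_hits)
```

To run it on the real file, replace the `io.StringIO(sample)` stand-in with `open("blast_sprot_sql.tab")`.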
!python /Applications/sqlshare-pythonclient-master/tools/singleupload.py \
-d {dircode}_uniprot \
blast_sprot_sql.tab
processing chunk line 0 to 35 (0.000410795211792 s elapsed)
pushing blast_sprot_sql.tab...
parsing 19413EC4...
finished me_uniprot
!python /Applications/sqlshare-pythonclient-master/tools/fetchdata.py \
-s "SELECT Column1, term, GOSlim_bin, aspect, ProteinName FROM [graceac9@washington.edu].[me_uniprot]me left join [samwhite@washington.edu].[UniprotProtNamesReviewed_yes20130610]sp on me.Column3=sp.SPID left join [sr320@washington.edu].[SPID and GO Numbers]go on me.Column3=go.SPID left join [sr320@washington.edu].[GO_to_GOslim]slim on go.GOID=slim.GO_id where aspect like 'P'" \
-f tsv \
-o {dircode}_descriptions.txt
!head {dircode}_descriptions.txt
!egrep --color "male|female|genitalia|gonad|ovarian|reproduction|estrogen|testosterone|gametogenesis|germination|ovulation|penile|prostate|vulval" {dircode}_descriptions.txt > {dircode}_reprot.txt
!wc -l {dircode}_reprot.txt
2 me_reprot.txt
!head me_reprot.txt
#now to insert a chart
pylab inline
Populating the interactive namespace from numpy and matplotlib
from pandas import *
jslim = read_table("me_reprot.txt", # name of the data file
#sep=",", # what character separates each column?
na_values=["", " "]) # what values should be considered "blank" values?
jslim.head
<bound method DataFrame.head of    Mmercenaria_Contig_35            male meiosis I  \
0  Mmercenaria_Contig_35  female gonad development

  cell cycle and proliferation  P  \
0      developmental processes  P

  Breast cancer type 2 susceptibility protein (Fanconi anemia group D1 protein)
0  Breast cancer type 2 susceptibility protein (F...

[1 rows x 5 columns]>
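Two things stand out in the output above. First, `jslim.head` without parentheses prints the bound method instead of calling it. Second, `wc -l` reported 2 lines in me_reprot.txt but the frame has only 1 row: `egrep` keeps only matching lines, so the file most likely has no header line, and pandas used the first data row as the column names. A minimal Python 3 sketch of one way around this, assuming the five columns arrive in the order of the SELECT list in the fetchdata.py query. The in-memory text is a stand-in for me_reprot.txt built from the two rows visible above.

```python
import io
import pandas as pd

# Column names follow the SELECT list in the fetchdata.py query
# (assumed to be the column order of the filtered file).
COLUMNS = ["Column1", "term", "GOSlim_bin", "aspect", "ProteinName"]

# Stand-in for me_reprot.txt: the two rows visible in the output above.
# The second protein name is truncated in that output, so it stays truncated here.
text = (
    "Mmercenaria_Contig_35\tmale meiosis I\tcell cycle and proliferation\tP\t"
    "Breast cancer type 2 susceptibility protein (Fanconi anemia group D1 protein)\n"
    "Mmercenaria_Contig_35\tfemale gonad development\tdevelopmental processes\tP\t"
    "Breast cancer type 2 susceptibility protein (F...\n"
)

# header=None stops pandas from consuming the first data row as column names;
# names= supplies the labels instead.
jslim = pd.read_csv(io.StringIO(text), sep="\t", header=None, names=COLUMNS)

print(jslim.shape)   # both rows are kept
print(jslim.head())  # note the parentheses: head is a method
```

With the real file, the first argument would be `"me_reprot.txt"` rather than the `io.StringIO` stand-in.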
#How do I do the "groupby" part of the following command? Is there a way to do it by "egrep term"?
jslim.groupby('').Column1.count().plot(kind='bar')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-43-6d81b299b700> in <module>()
----> 1 jslim.groupby('egrep term').Column1.count().plot(kind='bar')

/usr/local/bioinformatics/anaconda/lib/python2.7/site-packages/pandas/core/groupby.pyc in __getattr__(self, attr)
    296 
    297         raise AttributeError("%r object has no attribute %r" %
--> 298                              (type(self).__name__, attr))
    299 
    300     def __getitem__(self, key):

AttributeError: 'DataFrameGroupBy' object has no attribute 'Column1'
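The AttributeError most likely arises because the frame has no column named `Column1` (the first data row became the header), and `groupby` needs either an existing column name or a Series of labels, not the text 'egrep term'. One possible way to group by the egrep keyword is to derive a label for each row with `Series.str.extract` and group on that label. This is a minimal Python 3 sketch, not a definitive answer: the `keyword` column name is my own, and the small frame is a stand-in built from the two terms shown earlier.

```python
import pandas as pd

# The keywords from the egrep command above.
KEYWORDS = ["male", "female", "genitalia", "gonad", "ovarian", "reproduction",
            "estrogen", "testosterone", "gametogenesis", "germination",
            "ovulation", "penile", "prostate", "vulval"]

# Stand-in for the filtered table, with named columns.
jslim = pd.DataFrame({
    "Column1": ["Mmercenaria_Contig_35", "Mmercenaria_Contig_35"],
    "term": ["male meiosis I", "female gonad development"],
})

# Label each row with the first keyword found in its GO term.
# A term containing several keywords is counted once, under the first match.
pattern = "(" + "|".join(KEYWORDS) + ")"
jslim["keyword"] = jslim["term"].str.extract(pattern, expand=False)

# Count contigs per keyword; rows with no match (NaN) are dropped by groupby.
counts = jslim.groupby("keyword")["Column1"].count()
print(counts)

# In the notebook, counts.plot(kind="bar") would draw the bar chart.
```

Because a regex search returns the leftmost match, "female gonad development" is labeled `female` rather than `male` or `gonad`. If a term should count toward every keyword it contains, looping over `KEYWORDS` with `Series.str.contains` would be an alternative.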