The last several attempts at this got stuck partway through and did not run to the end of the commands on their own. Here's another try.

In [44]:

#Works well when the "Run All" option under the "Cell" tab is chosen.

In [22]:

wd="/Volumes/web/scaphapoda/Grace/Transcriptomes/mer_tst"
dircode="me"

In [23]:

cd {wd}

/Volumes/web/scaphapoda/Grace/Transcriptomes/mer_tst

In [24]:

!blastx \
-query query.fa \
-db /Volumes/Data/blast_db/uniprot_sprot \
-max_target_seqs 1 \
-max_hsps 1 \
-outfmt 6 \
-num_threads 8 \
-out blast_sprot.tab

In [25]:

!say hello

In [26]:

!wc -l blast_sprot.tab

      35 blast_sprot.tab

In [27]:

!tr '|' "\t" < blast_sprot.tab > blast_sprot_sql.tab

In [28]:

!head blast_sprot_sql.tab

Mmercenaria_Contig_1	sp	P06538	DPOL_ADE12	26.09	46	34	0	141	4	100	145	6.2	28.5
Mmercenaria_Contig_2	sp	Q6DRI1	EI3EA_DANRE	75.00	68	17	0	5	208	114	181	2e-29	  112
Mmercenaria_Contig_3	sp	O94823	AT10B_HUMAN	61.11	18	7	0	162	215	99	116	2.2	29.6
Mmercenaria_Contig_5	sp	P0A5H8	EFPP_MYCTU	63.16	19	7	0	117	61	20	38	0.64	31.2
Mmercenaria_Contig_6	sp	Q9WU60	ATRN_MOUSE	28.85	52	33	1	168	13	808	855	0.12	33.9
Mmercenaria_Contig_8	sp	P18547	VNCS_PAVPN	50.00	22	11	0	111	176	362	383	0.85	30.8
Mmercenaria_Contig_9	sp	A8WGF4	IF122_XENTR	67.16	67	22	0	1	201	894	960	6e-24	99.4
Mmercenaria_Contig_10	sp	Q4QK86	MUKB_HAEI8	29.79	47	33	0	16	156	262	308	1.6	30.0
Mmercenaria_Contig_11	sp	Q0AQ76	THIG_MARMM	34.09	44	27	1	210	79	84	125	3.4	28.9
Mmercenaria_Contig_12	sp	P15106	GLNA_STRCO	39.29	28	17	0	40	123	124	151	0.84	30.8
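
For reference, the 14 columns above are the 12 default fields of BLAST tabular output (`-outfmt 6`), with the subject ID (`sp|accession|entry`) split into three by the `tr` step. That is why the accession lands in the third column, the `Column3` used in the join further down. A minimal sketch of that mapping follows; the labels `db`, `accession`, and `entry_name` and the helper `parse_split_line` are illustrative names, not anything defined by BLAST or SQLShare.

```python
# Default -outfmt 6 fields, in order.
OUTFMT6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                  "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# After `tr '|' "\t"`, sseqid ("sp|P06538|DPOL_ADE12") becomes three columns.
# The three replacement labels are hypothetical names chosen for illustration.
SPLIT_FIELDS = ["qseqid", "db", "accession", "entry_name"] + OUTFMT6_FIELDS[2:]


def parse_split_line(line):
    """Map one line of blast_sprot_sql.tab to a dict keyed by field label."""
    values = [v.strip() for v in line.rstrip("\n").split("\t")]
    if len(values) != len(SPLIT_FIELDS):
        raise ValueError("expected %d columns, got %d"
                         % (len(SPLIT_FIELDS), len(values)))
    return dict(zip(SPLIT_FIELDS, values))


# First row of the `head` output above.
row = parse_split_line(
    "Mmercenaria_Contig_1\tsp\tP06538\tDPOL_ADE12\t26.09\t46\t34\t0"
    "\t141\t4\t100\t145\t6.2\t28.5"
)
print(row["accession"], row["evalue"])
```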

In [29]:

!python /Applications/sqlshare-pythonclient-master/tools/singleupload.py \
-d {dircode}_uniprot \
blast_sprot_sql.tab

processing chunk line 0 to 35 (0.000410795211792 s elapsed)
pushing blast_sprot_sql.tab...
parsing 19413EC4...
finished me_uniprot

In [30]:

!python /Applications/sqlshare-pythonclient-master/tools/fetchdata.py \
-s "SELECT Column1, term, GOSlim_bin, aspect, ProteinName FROM [graceac9@washington.edu].[me_uniprot]me left join [samwhite@washington.edu].[UniprotProtNamesReviewed_yes20130610]sp on me.Column3=sp.SPID left join [sr320@washington.edu].[SPID and GO Numbers]go on me.Column3=go.SPID left join [sr320@washington.edu].[GO_to_GOslim]slim on go.GOID=slim.GO_id where aspect like 'P'" \
-f tsv \
-o {dircode}_descriptions.txt

In [31]:

!head {dircode}_descriptions.txt

In [34]:

!egrep --color "male|female|genitalia|gonad|ovarian|reproduction|estrogen|testosterone|gametogenesis|germination|ovulation|penile|prostate|vulval" {dircode}_descriptions.txt > {dircode}_reprot.txt

In [35]:

!wc -l {dircode}_reprot.txt

       2 me_reprot.txt

In [41]:

!head me_reprot.txt

In [36]:

#now to insert a chart

In [37]:

pylab inline

Populating the interactive namespace from numpy and matplotlib

In [38]:

from pandas import *

jslim = read_table("me_reprot.txt", # name of the data file
            #sep=",", # what character separates each column?
            na_values=["", " "]) # what values should be considered "blank" values?

In [39]:

jslim.head

Out[39]:

<bound method DataFrame.head of    Mmercenaria_Contig_35            male meiosis I  \
0  Mmercenaria_Contig_35  female gonad development   

  cell cycle and proliferation  P  \
0      developmental processes  P   

  Breast cancer type 2 susceptibility protein (Fanconi anemia group D1 protein)  
0  Breast cancer type 2 susceptibility protein (F...                             

[1 rows x 5 columns]>
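
Two things stand out in the output above. `jslim.head` without parentheses prints the bound method instead of calling it. Also, `me_reprot.txt` has no header line (the `egrep` step keeps only matching rows), so `read_table` uses the first matching row as column names and reports 1 row instead of 2. Below is a minimal sketch of one way around both, assuming pandas is available. The column names are taken from the SELECT list in the query above, and the two sample rows are reconstructed from the output shown; the second protein name is truncated there and is assumed to match the first.

```python
import io

import pandas as pd

# Column names taken from the SELECT list in the SQLShare query.
COLUMNS = ["Column1", "term", "GOSlim_bin", "aspect", "ProteinName"]

# Stand-in for me_reprot.txt (assumption: two tab-separated rows, no header).
protein = ("Breast cancer type 2 susceptibility protein "
           "(Fanconi anemia group D1 protein)")
rows = [
    ["Mmercenaria_Contig_35", "male meiosis I",
     "cell cycle and proliferation", "P", protein],
    ["Mmercenaria_Contig_35", "female gonad development",
     "developmental processes", "P", protein],
]
sample = io.StringIO("".join("\t".join(r) + "\n" for r in rows))

# header=None keeps the first row as data; names= supplies the labels.
jslim = pd.read_csv(sample, sep="\t", header=None, names=COLUMNS,
                    na_values=["", " "])

print(jslim.head())  # parentheses call the method instead of printing it
print(len(jslim))    # both rows are kept
```

In the notebook itself, the file name `"me_reprot.txt"` would take the place of `sample`.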

In [42]:

#How do I do the "groupby" part of the following command? Is there a way to group by the "egrep term"?

In [43]:

jslim.groupby('egrep term').Column1.count().plot(kind='bar')

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-43-6d81b299b700> in <module>()
----> 1 jslim.groupby('egrep term').Column1.count().plot(kind='bar')

/usr/local/bioinformatics/anaconda/lib/python2.7/site-packages/pandas/core/groupby.pyc in __getattr__(self, attr)
    296 
    297         raise AttributeError("%r object has no attribute %r" %
--> 298                              (type(self).__name__, attr))
    299 
    300     def __getitem__(self, key):

AttributeError: 'DataFrameGroupBy' object has no attribute 'Column1'
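
The AttributeError arises because `jslim` has no column named `Column1` (nor one named `egrep term`); `groupby` can only group on a column that actually exists in the table. One way to get a count per search term is to record which term(s) each line matched and tally those. This is a minimal sketch using only the standard library; `count_terms` is a hypothetical helper, and the sample lines are reconstructed from the `jslim` output shown earlier rather than read from the real file.

```python
import re
from collections import Counter

# Same terms as the egrep pattern used to build me_reprot.txt.
TERMS = ["male", "female", "genitalia", "gonad", "ovarian", "reproduction",
         "estrogen", "testosterone", "gametogenesis", "germination",
         "ovulation", "penile", "prostate", "vulval"]
PATTERN = re.compile("|".join(TERMS))


def count_terms(lines):
    """Count, for each term, how many lines contain it at least once."""
    counts = Counter()
    for line in lines:
        # Matches are leftmost and non-overlapping, so "female" is
        # reported as "female" and not additionally as "male".
        for term in set(PATTERN.findall(line)):
            counts[term] += 1
    return counts


protein = ("Breast cancer type 2 susceptibility protein "
           "(Fanconi anemia group D1 protein)")
sample_lines = [
    "Mmercenaria_Contig_35\tmale meiosis I\tcell cycle and proliferation\tP\t"
    + protein,
    "Mmercenaria_Contig_35\tfemale gonad development\tdevelopmental processes\tP\t"
    + protein,
]
counts = count_terms(sample_lines)
print(sorted(counts.items()))
```

In the notebook, something like `Series(count_terms(open("me_reprot.txt"))).plot(kind='bar')` should then draw one bar per matched term. Note that this mirrors `egrep` by searching the whole line, so a term appearing only in the protein name would also be counted.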

In [ ]: