Apply the same operation to every element of a vector (array):
e.g. [1, 2, 3, 4] * 3 = [3, 6, 9, 12]
Apply the same operation to every element of a block (matrix):
e.g. [[1, 2], [3, 4]] - 1 = [[0, 1], [2, 3]]
a = np.array([1, 2, 3, 4])
a * 3
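and [x * 3 for x in a]
produce the same values.
However, the first is a numpy built-in operation: Python objects are converted only at input and output, and everything in between runs in C. In the second, every element fetch and every multiplication is a Python-level operation.
So the single most important optimization principle is: whenever a vector or block operation is available, never use a loop.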
import numpy as np
N, F = 1000, 20
A = np.random.randn(N, F)
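def sig(x, mean, std):
    # Standardize x (a scalar or an array) with the given mean and std
    return (x - mean) / std

def sig_loop():
    # Pure-Python inner loop: one sig() call per element
    svals = []
    for j in range(F):
        a = A[:, j]
        mean = np.mean(a)
        std = np.std(a)
        svals.append([sig(x, mean, std) for x in a])
    return svals

def sig_vec():
    # Vector operation: one sig() call per column
    svals = []
    for j in range(F):
        a = A[:, j]
        mean = np.mean(a)
        std = np.std(a)
        svals.append(sig(a, mean, std))
    return svals

def sig_mat():
    # Block operation: one sig() call for the whole matrix
    return sig(A, np.mean(A, 0), np.std(A, 0))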
%timeit sig_loop()
%timeit sig_vec()
%timeit sig_mat()
100 loops, best of 3: 17.8 ms per loop
100 loops, best of 3: 2.02 ms per loop
1000 loops, best of 3: 294 µs per loop
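Before trusting the timings, it is worth checking that the three variants really compute the same values. The following is a minimal sketch that restates the functions so it runs on its own; the fixed seed and the use of np.allclose are assumptions added here. Note that sig_loop and sig_vec collect one column per list entry, so their result is the transpose of what sig_mat returns.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
N, F = 1000, 20
A = rng.standard_normal((N, F))

def sig(x, mean, std):
    return (x - mean) / std

def sig_loop():
    svals = []
    for j in range(F):
        a = A[:, j]
        mean, std = np.mean(a), np.std(a)
        svals.append([sig(x, mean, std) for x in a])
    return svals

def sig_vec():
    svals = []
    for j in range(F):
        a = A[:, j]
        svals.append(sig(a, np.mean(a), np.std(a)))
    return svals

def sig_mat():
    return sig(A, np.mean(A, 0), np.std(A, 0))

loop_result = np.array(sig_loop())  # shape (F, N): one row per column of A
vec_result = np.array(sig_vec())    # shape (F, N)
mat_result = sig_mat()              # shape (N, F)

same_loop_vec = np.allclose(loop_result, vec_result)
same_vec_mat = np.allclose(vec_result.T, mat_result)
print(same_loop_vec, same_vec_mat)
```

If either comparison came out False, the speed difference would be beside the point, so a check like this is a cheap guard when replacing a loop with a block operation.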
Practically every common operation has a vectorized form:
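# Mean and standard deviation of each column of the matrix
A.mean(axis=0)
A.std(axis=0)
# Largest element of each column
A.max(axis=0)
# Largest element of each column, ignoring NaN values
np.nanmax(A, axis=0)
# Indices of the 50 smallest elements of an array
a.argsort()[:50]
# If the order among those 50 does not matter, this is faster
a.argpartition(50)[:50]
# Pairwise distances between all rows of the matrix
import scipy.spatial.distance
scipy.spatial.distance.pdist(A)
# Index mapping between the upper triangle and the full square matrix
# (k is the diagonal offset)
np.triu_indices(N, k)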
So before writing any loop, first check whether numpy/scipy already provides a matching vector or block operation.
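To make two of the entries above concrete, here is a small sketch that uses only numpy. It checks that argpartition selects the same set of smallest elements as argsort, and it uses np.triu_indices to pull the strict upper triangle out of a full distance matrix, which is the same row-by-row order that pdist uses for its condensed output. The array sizes, the seed, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) argsort vs argpartition: same set of indices, different internal order
a = rng.standard_normal(1000)
n_smallest = 50
by_sort = a.argsort()[:n_smallest]
by_partition = a.argpartition(n_smallest)[:n_smallest]
same_set = set(by_sort.tolist()) == set(by_partition.tolist())

# 2) Full pairwise Euclidean distance matrix via broadcasting
X = rng.standard_normal((5, 3))
diff = X[:, None, :] - X[None, :, :]      # shape (5, 5, 3)
D = np.sqrt((diff ** 2).sum(axis=-1))     # shape (5, 5), symmetric, zero diagonal

# Strict upper triangle (k=1 skips the diagonal) gives the condensed form
rows, cols = np.triu_indices(5, k=1)
condensed = D[rows, cols]                 # length 5 * 4 / 2

# Map the condensed vector back to the square matrix
D_rebuilt = np.zeros_like(D)
D_rebuilt[rows, cols] = condensed
D_rebuilt = D_rebuilt + D_rebuilt.T

print(same_set, condensed.shape, np.allclose(D, D_rebuilt))
```

The broadcasting step builds an N x N x F intermediate array, so for large matrices pdist is likely the better choice; the manual version is only here to show what the index mapping does.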
from mongoengine import connect, Document, ListField
class Test(Document):
    g = ListField()

t = Test(g=['3']*20)

class Test2:
    g = ['3']*20

t2 = Test2()
%timeit t.g
%timeit t2.g
10000 loops, best of 3: 164 µs per loop
10000000 loops, best of 3: 50.8 ns per loop
If you really must go through mongoengine, at least read the raw value like this:
%timeit t._data['g']
10000000 loops, best of 3: 85.8 ns per loop
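Since version 2.6, MongoDB supports bulk insertion of a batch of unordered documents, which cuts down on round trips and speeds things up.
In pymongo, the insert method accepts an iterable of documents (PyMongo 3 deprecated insert in favor of insert_many, and PyMongo 4 removed it):
pymongo.Collection.insert([doc1, doc2, ...])
In mongoengine, bulk insertion is available at the class level through the queryset:
Test.objects.insert([doc1, doc2, ...])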
import pymongo
from mongoengine import connect, Document, ListField
connect('test')
db = pymongo.MongoClient().test
class Test(Document):
    g = ListField()

docs1 = [Test(g=['3']*20) for _ in range(10000)]
docs2 = [{'g':['3']*20} for _ in range(10000)]
db.drop_collection('test')
%time Test.objects.insert(docs1)
db.drop_collection('test')
%time x=[doc.save() for doc in docs1]
CPU times: user 2.5 s, sys: 24.5 ms, total: 2.53 s
Wall time: 2.7 s
CPU times: user 3.83 s, sys: 218 ms, total: 4.05 s
Wall time: 4.92 s
db.drop_collection('test')
%time db.test.insert(docs2)
db.drop_collection('test')
%time x=[db.test.insert(doc) for doc in docs2]
CPU times: user 301 ms, sys: 3.25 ms, total: 304 ms
Wall time: 414 ms
CPU times: user 1.82 s, sys: 209 ms, total: 2.03 s
Wall time: 2.88 s
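On current PyMongo the bulk path is Collection.insert_many, and passing ordered=False asks the server to treat the batch as unordered. The sketch below is an illustration under stated assumptions, not the code that was timed here: insert_in_batches, batch_size, and the FakeCollection stand-in are hypothetical names, and the stand-in exists only so the batching logic can be exercised without a running server.

```python
from itertools import islice

def insert_in_batches(collection, docs, batch_size=1000):
    """Send docs with one insert_many call per batch; return the number of calls.

    `collection` is anything exposing insert_many(documents, ordered=...),
    such as a pymongo Collection. batch_size is an assumed tuning knob.
    """
    it = iter(docs)
    calls = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        # ordered=False: the batch is sent as an unordered bulk insert
        collection.insert_many(batch, ordered=False)
        calls += 1
    return calls

class FakeCollection:
    """Hypothetical stand-in that records batches instead of talking to MongoDB."""
    def __init__(self):
        self.batches = []

    def insert_many(self, documents, ordered=True):
        self.batches.append(list(documents))

fake = FakeCollection()
docs = ({'g': ['3'] * 20} for _ in range(2500))  # a generator works too
n_calls = insert_in_batches(fake, docs, batch_size=1000)
print(n_calls, [len(b) for b in fake.batches])  # 3 [1000, 1000, 500]
```

With a real connection, the same function could be called as insert_in_batches(db.test, docs2), using the db handle and the docs2 list from the timing cells. Batching keeps memory bounded when the documents come from a generator, while still paying one round trip per batch rather than one per document.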