compute_distances_no_loops
by @yorko

We need to compute the matrix $D$ that stores all pairwise distances between test and training samples:

$$\Large D_{ij} = {\lVert X_{i,:} - X^{(tr)}_{j,:}\rVert }_2$$

where $X_{i,:}$ denotes the $i$-th row of matrix $X$.
Further, $$\Large D^2_{ij} = \sum_{k=1}^d {(X_{ik} - X^{(tr)}_{jk})} ^ 2$$
where $D^2_{ij}$ is the square of the element of $D$ at row $i$ and column $j$, and $k$ iterates over the $d$ features of the $i$-th test sample and the $j$-th training sample.
$$\large D^2_{ij} = \sum_{k=1}^d (X^2_{ik} - 2 X_{ik} X^{(tr)}_{jk} + {X^{(tr)}_{jk}}^2) = \sum_{k=1}^d X^2_{ik} - 2\sum_{k=1}^d X_{ik} X^{(tr)}_{jk} + \sum_{k=1}^d {X^{(tr)}_{jk}}^2$$

From here, we get a closed-form vectorized expression for $D * D$:

$$\Large D * D = {(X * X)}_{:,\Sigma} - 2X{X^{(tr)}}^{\text{T}} + {(X^{(tr)} * X^{(tr)})}_{:,\Sigma}^{\text{T}}$$

where $*$ denotes element-wise multiplication and $A_{:,\Sigma}$ stands for a vector obtained from matrix $A$ by summing the values in each row.
Lastly, for all dimensions to match, we assume broadcasting here; that is, ${(X * X)}_{:,\Sigma}$ consists of $num\_train$ identical column vectors stacked horizontally, and ${(X^{(tr)} * X^{(tr)})}_{:,\Sigma}^{\text{T}}$ consists of $num\_test$ identical row vectors stacked vertically. Thus, each term in the above formula has shape $num\_test \times num\_train$.
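Putting the formula into code, a minimal sketch of a vectorized distance function might look as follows (the argument names and shapes here are assumptions: X of shape $num\_test \times d$, X_tr of shape $num\_train \times d$):

import numpy as np

def compute_distances_no_loops(X, X_tr):
    # row-wise squared norms, kept as a column / row vector so that broadcasting applies
    test_sq = (X * X).sum(axis=1, keepdims=True)            # shape (num_test, 1)
    train_sq = (X_tr * X_tr).sum(axis=1, keepdims=True).T   # shape (1, num_train)
    cross = X.dot(X_tr.T)                                    # shape (num_test, num_train)
    return np.sqrt(test_sq - 2 * cross + train_sq)

Keeping keepdims=True preserves the column shape of the test-norm term, so each of the three terms broadcasts to the required $num\_test \times num\_train$ matrix.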
Simple broadcasting example
$a$ is a column-vector of length 3, $b$ is a row-vector of length 4. $a - b$ is a matrix of shape $3 \times 4$.
import numpy as np

# a has shape (3, 1) and is filled with 3's; b has shape (1, 4) and is filled with 4's
a, b = 3 * np.ones((3, 1)), 4 * np.ones((1, 4))
a
b
a - b
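The result of a - b is a $3 \times 4$ matrix with every entry equal to $3 - 4 = -1$: both operands are broadcast to shape $3 \times 4$ before the subtraction.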
Testing the derived formula
# two test samples with d = 4 features each
X = np.array([[4, 3, 2, 1],
              [1, 1, 1, 1]])
# three training samples with d = 4 features each
X_tr = np.array([[1, 2, 3, 4],
                 [0, 1, 2, 3],
                 [1, 1, 1, 1]])

# pairwise distances via the vectorized formula; the result has shape (2, 3)
np.sqrt((X * X).sum(axis=1, keepdims=True) - 2 * X.dot(X_tr.T) +
        (X_tr * X_tr).sum(axis=1, keepdims=True).T)
Double-check by calculating the same distances with two loops.
for test_id in range(X.shape[0]):
    for train_id in range(X_tr.shape[0]):
        print("Distance between test sample {} and train sample {}: {}".format(
            test_id + 1, train_id + 1,
            np.linalg.norm(X[test_id, :] - X_tr[train_id, :])))
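As a programmatic sanity check (a small sketch reusing X and X_tr from above), we can compare the vectorized result with the loop-based one via np.allclose:

# recompute the distances both ways and compare them element-wise
D_vectorized = np.sqrt((X * X).sum(axis=1, keepdims=True) - 2 * X.dot(X_tr.T) +
                       (X_tr * X_tr).sum(axis=1, keepdims=True).T)
D_loops = np.array([[np.linalg.norm(X[i, :] - X_tr[j, :])
                     for j in range(X_tr.shape[0])]
                    for i in range(X.shape[0])])
print(np.allclose(D_vectorized, D_loops))  # expected output: True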