Matplotlib is the "standard" Python plotting package (although there are a few other good contenders).
Let's get on to the all-important step of visualizing data. We'll start by plotting the function $f(x) = x^2$.
First, let's generate the numbers using NumPy:
import numpy as np
x = np.arange(0, 10, 0.4) # evenly spaced values from 0 up to (but not including) 10
f = x ** 2
Now let's create a noisy version of $f$ by adding some random Gaussian noise:
mu, sigma = 0, 5 # mean and standard deviation
f_noisy = f + np.random.normal(mu, sigma, len(f))
f_noisy
array([ 3.46691539, 4.64636688, 0.59870757, -0.72098424, 3.03366575, -0.64962963, 3.470377 , -0.16024531, 10.31670909, 17.26267781, 12.06994539, 11.29261335, 22.07280293, 28.90578863, 31.00590777, 34.54253274, 39.84082801, 48.96084072, 51.45467624, 48.04451804, 61.52306571, 69.52322718, 78.19145804, 83.18332532, 94.00811677])
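The values above will differ on every run because the noise is random. As a minimal sketch, the same construction could be made reproducible with NumPy's `default_rng` generator interface (assuming NumPy 1.17 or later). The seed value 42 and the names `rng`, `f_noisy_seeded` and `f_noisy_again` are illustrative and not part of the original example:

```python
import numpy as np

x = np.arange(0, 10, 0.4)
f = x ** 2

mu, sigma = 0, 5  # mean and standard deviation of the noise
rng = np.random.default_rng(42)  # arbitrary seed, chosen only for illustration
f_noisy_seeded = f + rng.normal(mu, sigma, len(f))

# Re-creating the generator with the same seed reproduces the same noise
rng_again = np.random.default_rng(42)
f_noisy_again = f + rng_again.normal(mu, sigma, len(f))
print(np.array_equal(f_noisy_seeded, f_noisy_again))
```

Fixing the seed is mostly useful when a plot or a number needs to be regenerated exactly, for example in a report.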
To plot the data, first import the pyplot module of matplotlib:
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(x, f, '-k', label="f(x)") # plot the function using a black line
plt.plot(x, f_noisy, 'ob', label="noisy f") # plot the noisy version using blue circles
[<matplotlib.lines.Line2D at 0x123cb5f60>]
[<matplotlib.lines.Line2D at 0x123217f98>]
We can add further features to the plot, such as a legend and axis labels. Note the use of semicolons to suppress the text output of the commands:
plt.plot(x, f, '-k', label="f(x)"); # plot the function using a black line
plt.plot(x, f_noisy, 'ob', label="noisy f"); # plot the noisy version using blue circles
plt.xlabel("x", size="xx-large");
plt.ylabel("f(x)", size="xx-large");
plt.ylim(-5, 100);
plt.legend(loc="upper left");
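The same figure can also be built through Matplotlib's object-oriented interface, where the labels and limits are set on an explicit axes object rather than through `plt`. The following is a sketch of one possible equivalent, not the only way to write it; the names `fig` and `ax` are conventional but chosen here for illustration, and the data is regenerated so the snippet runs on its own:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10, 0.4)
f = x ** 2
f_noisy = f + np.random.normal(0, 5, len(f))  # Gaussian noise, mean 0, std 5

fig, ax = plt.subplots()                    # one figure containing one axes
ax.plot(x, f, '-k', label="f(x)")           # black line
ax.plot(x, f_noisy, 'ob', label="noisy f")  # blue circles
ax.set_xlabel("x", size="xx-large")
ax.set_ylabel("f(x)", size="xx-large")
ax.set_ylim(-5, 100)
ax.legend(loc="upper left")
```

This style tends to pay off once a figure has several subplots, since each axes object is addressed by name.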
Let's generate two datasets from a normal distribution and plot their histograms:
mu = 100 # mean of distribution
sigma = 15 # standard deviation of distribution
x1 = mu + sigma * np.random.randn(10000)
x2 = mu + 10 + sigma * np.random.randn(10000)
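Multiplying standard normal samples by `sigma` and adding `mu` scales and shifts them, so the first dataset should have a sample mean near 100 and a sample standard deviation near 15. A quick sanity check along those lines might look like this; the seeded generator and the name `samples` are assumptions made for the sketch:

```python
import numpy as np

mu = 100    # mean of distribution
sigma = 15  # standard deviation of distribution

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
samples = mu + sigma * rng.standard_normal(10000)

# With 10000 samples both estimates should land close to mu and sigma
print(samples.mean(), samples.std())
```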
Next, we'll plot histograms of the two datasets, showing some features of Matplotlib's hist function. The 'density' flag normalizes the bin heights so that each histogram integrates to 1 and represents a probability density (it replaces the older 'normed' flag, which has been removed from recent Matplotlib releases). 'alpha' sets the opacity.
num_bins = 50
# the histogram of the data
plt.hist(x1, num_bins, density=True, facecolor='green', alpha=0.4, label='first');
plt.hist(x2, num_bins, density=True, facecolor='blue', alpha=0.4, label='second');
plt.xlabel('x',size="xx-large");
plt.ylabel('normalized counts', size="xx-large");
plt.legend(loc='upper right');
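To see what density normalization means in numbers, the bin heights can be computed directly with `np.histogram`, which offers the same kind of `density` option. In this sketch the sum of height times bin width comes out at 1, up to floating-point rounding. The seed and the names `data`, `heights`, `edges`, `widths` and `area` are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for illustration only
data = 100 + 15 * rng.standard_normal(10000)

# density=True rescales the counts so the histogram integrates to 1
heights, edges = np.histogram(data, bins=50, density=True)
widths = np.diff(edges)  # width of each bin

area = np.sum(heights * widths)  # total area under the histogram
print(area)
```

Note that the individual heights are densities, not probabilities, so they can exceed 1 when the bins are narrow.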