You hypothesise that there may be a link between temperature and the level of greenhouse gases in the atmosphere. As part of your investigation to establish whether there is a correlation, you analyse ice core data taken from Vostok Station in Antarctica. The data file that you will be using, VostokStation.csv, contains reconstructed temperature, CO2 concentration and CH4 concentration stretching back 160,000 years.
Remember to label the x-axis and y-axis.
HINT: Take, for example, CO2 vs temperature. When you use linear regression, you fit a straight line $y = mx+c$, where $m$ is the slope and $c$ is the intercept. The variation in CO2 around the straight-line model is therefore:
co2_variation = co2 - (m*temperature+c)
where co2_variation, co2 and temperature are all NumPy arrays.
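As a minimal sketch of the hint above, the snippet below fits a straight line with scipy.stats.linregress and computes the variation around it. The arrays are synthetic stand-ins generated for illustration, not the Vostok record, and the slope, intercept and noise level used to build them are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the Vostok record): a linear trend plus noise.
rng = np.random.default_rng(0)
temperature = np.linspace(-8.0, 2.0, 50)
co2 = 10.0 * temperature + 260.0 + rng.normal(0.0, 5.0, temperature.size)

# Fit co2 = m*temperature + c by ordinary least squares.
fit = stats.linregress(temperature, co2)
m, c = fit.slope, fit.intercept

# Variation of co2 around the fitted straight line (the residuals).
co2_variation = co2 - (m * temperature + c)
```

With the real data, temperature and co2 would instead be the arrays read from VostokStation.csv.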
Using the appropriate correlation test in each case, determine whether there is a correlation between: temperature and CO2 concentration; temperature and CH4 concentration. Explain both your choice of correlation statistic and your conclusion.
HINT: Use scipy.stats.normaltest to check whether the samples are normally distributed.
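One possible reading of the two hints is sketched below: test the variation around the fitted line for normality, then use Pearson's correlation if normality is not rejected and Spearman's rank correlation otherwise. The helper name correlation_test, the 0.05 significance level and the synthetic data are all assumptions made for illustration. Applying the normality test to the residuals rather than to the raw samples is also a choice here, not something the exercise prescribes.

```python
import numpy as np
from scipy import stats

def correlation_test(x, y, alpha=0.05):
    """Hypothetical helper: return (test name, statistic, p-value)."""
    # Variation of y around the fitted straight line y = m*x + c.
    fit = stats.linregress(x, y)
    residuals = y - (fit.slope * x + fit.intercept)

    # Null hypothesis of normaltest: the sample comes from a normal distribution.
    _, p_normal = stats.normaltest(residuals)

    if p_normal > alpha:
        # Normality not rejected: use Pearson's linear correlation.
        r, p = stats.pearsonr(x, y)
        return "pearson", r, p
    # Normality rejected: fall back to Spearman's rank correlation.
    rho, p = stats.spearmanr(x, y)
    return "spearman", rho, p

# Synthetic stand-in data (NOT the Vostok record).
rng = np.random.default_rng(1)
x = np.linspace(-8.0, 2.0, 200)
y = 10.0 * x + 260.0 + rng.normal(0.0, 5.0, x.size)

name, stat, p_value = correlation_test(x, y)
print(name, stat, p_value)
```

A small p-value from the chosen test indicates that a correlation this strong would be unlikely if the two quantities were unrelated. It does not by itself show that one causes the other.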