Lecturer:José Pedro Silva1 - silva_at_math.uni-wuppertal.de
From their website "pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language."
from IPython.display import VimeoVideo
VimeoVideo('59324550',width=900,height=768)
%matplotlib inline
import os
from pandas import *
import pandas as pd
import pandas.io.data as web
import datetime
import matplotlib
import matplotlib.pyplot as plt
import statsmodels as sm
import seaborn
seaborn.set()
pd.__version__
'0.15.1'
startdate = datetime.datetime(2000,1,1)
enddate = datetime.datetime.today()
df = web.DataReader(['AAPL','GOOGL','TSLA','YNDX'],'yahoo',start=startdate,end=enddate)
normvol = df.Volume/df.Volume.YNDX.max()
normvol.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f935022c690>
goog = df.Close.GOOGL.dropna()
yndx = df.Close.YNDX.dropna()
googret = goog.pct_change()
yndxret = yndx.pct_change()
pd.rolling_corr(googret,yndxret,20).dropna().plot()
pd.rolling_corr(googret,yndxret,250).dropna().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f93426df950>
from matplotlib.pyplot import *
goog.plot(alpha=0.45)
pd.rolling_mean(goog,50).plot(color='k')
pd.rolling_mean(goog,250).plot(color='k',linestyle='--')
ax = twinx()
pd.rolling_kurt(goog,250).plot(color='r')
<matplotlib.axes._subplots.AxesSubplot at 0x7f9342627f10>
%matplotlib inline
matplotlib.rcParams['figure.dpi'] = 300
Let's list all the data contained in the folder data/sicherung_eex_daten/energiespot
data_dir = './data/sicherung_eex_daten/energiespot/'
for filename in os.listdir(data_dir):
print filename
energy_spot_historie_2010.xls energy_spot_historie_2005.xls energy_spot_historie_2003.xls energy_intraday_history_2009.xls energy_spot_historie_2012.xls energy_spot_historie_2008.xls energy_intraday_history_2007.xls swiss_power_spot_market_2011.xls energy_intraday_history_2006.xls energy_spot_historie_2006.xls swiss_power_spot_market_2008.xls energy_intraday_history_2010.xls energy_spot_historie_end_20020731_xetra.xls energy_spot_historie_2004.xls swiss_power_spot_market_2009.xls energy_intraday_history_2012.xls energy_intraday_history_2011 - Konflikt.xls swiss_power_spot_market_2007.xls Phelix_Quarterly.xls energy_spot_historie_2011.xls energy_spot_historie_2012 - Konflikt.xls swiss_power_spot_market_2012.xls energy_spot_historie_2002.xls swiss_power_spot_market_2006.xls energy_spot_historie_2007.xls swiss_power_spot_market_2010.xls energy_intraday_history_2011.xls energy_intraday_history_2008.xls energy_spot_historie_2009.xls
We now read the xls file which contains intraday data from 2012 for energy prices. We use the read_excel method from pandas to read xls files
df = pd.read_excel(data_dir+'energy_intraday_history_2012.xls',sheetname='Intraday-Spot')
df.head()
EPEX Spot Intraday-Strom-Handel / EPEX Spot Intraday-Energy-Trading | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | |
---|---|---|---|---|---|---|---|---|---|
0 | Delivery Day | Hour\nfrom | Hour\nto | Volume\nMW | Volume (OTC)\nMW | Low Price\nEUR | High Price\nEUR | Last Price\nEUR | Average Price\nEUR |
1 | 2012-12-27 00:00:00 | 23:00 | 00:00 | 968.5 | NaN | 1 | 35 | 12 | 21.11 |
2 | 2012-12-27 00:00:00 | 22:00 | 23:00 | 1640.2 | NaN | 1 | 45 | 25 | 30.16 |
3 | 2012-12-27 00:00:00 | 21:00 | 22:00 | 1072.3 | NaN | 1 | 42.5 | 11 | 27.41 |
4 | 2012-12-27 00:00:00 | 20:00 | 21:00 | 1011.3 | NaN | 1 | 43 | 26 | 35.96 |
df = pd.read_excel(data_dir+'energy_intraday_history_2012.xls',sheetname='Intraday-Spot',header=1,index_col=0)
df.head()
Hour from | Hour to | Volume MW | Volume (OTC) MW | Low Price EUR | High Price EUR | Last Price EUR | Average Price EUR | |
---|---|---|---|---|---|---|---|---|
Delivery Day | ||||||||
2012-12-27 | 23:00 | 00:00 | 968.5 | NaN | 1 | 35.0 | 12.0 | 21.11 |
2012-12-27 | 22:00 | 23:00 | 1640.2 | NaN | 1 | 45.0 | 25.0 | 30.16 |
2012-12-27 | 21:00 | 22:00 | 1072.3 | NaN | 1 | 42.5 | 11.0 | 27.41 |
2012-12-27 | 20:00 | 21:00 | 1011.3 | NaN | 1 | 43.0 | 26.0 | 35.96 |
2012-12-27 | 19:00 | 20:00 | 2207.2 | NaN | 1 | 56.0 | 40.5 | 42.25 |
df.columns
Index([u'Hour\nfrom', u'Hour\nto', u'Volume\nMW', u'Volume (OTC)\nMW', u'Low Price\nEUR', u'High Price\nEUR', u'Last Price\nEUR', u'Average Price\nEUR'], dtype='object')
df.columns = [column.replace(' ','').replace('\n','') for column in df.columns]
df.columns
Index([u'Hourfrom', u'Hourto', u'VolumeMW', u'Volume(OTC)MW', u'LowPriceEUR', u'HighPriceEUR', u'LastPriceEUR', u'AveragePriceEUR'], dtype='object')
df = pd.read_excel(data_dir+'energy_intraday_history_2012.xls',sheetname='Intraday-Spot',\
header=1, parse_dates = [['Delivery Day','Hour\nfrom']],index_col=0)
try:
del df['Hour\nto']
except:
pass
df.columns = [column.replace(' ','').replace('\n','') for column in df.columns]
df.head()
VolumeMW | Volume(OTC)MW | LowPriceEUR | HighPriceEUR | LastPriceEUR | AveragePriceEUR | |
---|---|---|---|---|---|---|
Delivery Day_Hour from | ||||||
2012-12-27 23:00:00 | 968.5 | NaN | 1 | 35.0 | 12.0 | 21.11 |
2012-12-27 22:00:00 | 1640.2 | NaN | 1 | 45.0 | 25.0 | 30.16 |
2012-12-27 21:00:00 | 1072.3 | NaN | 1 | 42.5 | 11.0 | 27.41 |
2012-12-27 20:00:00 | 1011.3 | NaN | 1 | 43.0 | 26.0 | 35.96 |
2012-12-27 19:00:00 | 2207.2 | NaN | 1 | 56.0 | 40.5 | 42.25 |
df.VolumeMW.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27621c9550>
df.index.get_duplicates()
<class 'pandas.tseries.index.DatetimeIndex'> [2012-10-28 02:00:00] Length: 1, Freq: None, Timezone: None
df.ix[df.index.get_duplicates()]
VolumeMW | Volume(OTC)MW | LowPriceEUR | HighPriceEUR | LastPriceEUR | AveragePriceEUR | |
---|---|---|---|---|---|---|
Delivery Day_Hour from | ||||||
2012-10-28 02:00:00 | 625 | 150 | 25 | 40 | 40 | 30.30 |
2012-10-28 02:00:00 | 752 | 150 | 18 | 36 | 33 | 26.67 |
dfgby = df.groupby(df.index).first()
dfgby.ix['2012-10-28 02:00']
VolumeMW 625.0 Volume(OTC)MW 150.0 LowPriceEUR 25.0 HighPriceEUR 40.0 LastPriceEUR 40.0 AveragePriceEUR 30.3 Name: 2012-10-28 02:00:00, dtype: float64
def wavg(group):
w = group['VolumeMW']*group['AveragePriceEUR']
d = group
return (d*w).sum()/w.sum()
grouped = df.groupby(df.index).apply(wavg)
grouped.ix['2012-10-28 02:00']
VolumeMW 690.321198 Volume(OTC)MW 150.000000 LowPriceEUR 21.399619 HighPriceEUR 37.942639 LastPriceEUR 36.399619 AveragePriceEUR 28.432945 Name: 2012-10-28 02:00:00, dtype: float64
df = grouped
df.VolumeMW.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2762f19310>
df.AveragePriceEUR.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27618721d0>
ts = df.VolumeMW
ts['10/2012'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27681926d0>
ts['10-2012'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2761fe5350>
ts['09-09-2012'].plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27618ce150>
df.sort_index(inplace=True)
df.VolumeMW.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f276218c890>
ts = df.VolumeMW
with plt.xkcd():
ts.asfreq(freq='W').plot()
ts.asfreq(freq='W').plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2762a793d0>
ts.describe()
count 8687.000000 mean 1510.502834 std 911.039358 min 43.000000 25% 840.000000 50% 1325.000000 75% 1960.000000 max 7369.000000 Name: VolumeMW, dtype: float64
ts.resample('W-FRI').plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27626b8a50>
ts.resample('W-SUN',how=['mean','max','min']).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2761aade50>
resampled = ts.resample('30t')
resampled
Delivery Day_Hour\nfrom 2012-01-01 00:00:00 1161.0 2012-01-01 00:30:00 NaN 2012-01-01 01:00:00 791.0 2012-01-01 01:30:00 NaN 2012-01-01 02:00:00 911.0 2012-01-01 02:30:00 NaN 2012-01-01 03:00:00 666.0 2012-01-01 03:30:00 NaN 2012-01-01 04:00:00 694.0 2012-01-01 04:30:00 NaN 2012-01-01 05:00:00 730.0 2012-01-01 05:30:00 NaN 2012-01-01 06:00:00 587.9 2012-01-01 06:30:00 NaN 2012-01-01 07:00:00 1077.7 ... 2012-12-27 16:00:00 4034.7 2012-12-27 16:30:00 NaN 2012-12-27 17:00:00 3861.6 2012-12-27 17:30:00 NaN 2012-12-27 18:00:00 4029.4 2012-12-27 18:30:00 NaN 2012-12-27 19:00:00 2207.2 2012-12-27 19:30:00 NaN 2012-12-27 20:00:00 1011.3 2012-12-27 20:30:00 NaN 2012-12-27 21:00:00 1072.3 2012-12-27 21:30:00 NaN 2012-12-27 22:00:00 1640.2 2012-12-27 22:30:00 NaN 2012-12-27 23:00:00 968.5 Freq: 30T, Name: VolumeMW, Length: 17375
resampled.interpolate()
Delivery Day_Hour\nfrom 2012-01-01 00:00:00 1161.00 2012-01-01 00:30:00 976.00 2012-01-01 01:00:00 791.00 2012-01-01 01:30:00 851.00 2012-01-01 02:00:00 911.00 2012-01-01 02:30:00 788.50 2012-01-01 03:00:00 666.00 2012-01-01 03:30:00 680.00 2012-01-01 04:00:00 694.00 2012-01-01 04:30:00 712.00 2012-01-01 05:00:00 730.00 2012-01-01 05:30:00 658.95 2012-01-01 06:00:00 587.90 2012-01-01 06:30:00 832.80 2012-01-01 07:00:00 1077.70 ... 2012-12-27 16:00:00 4034.70 2012-12-27 16:30:00 3948.15 2012-12-27 17:00:00 3861.60 2012-12-27 17:30:00 3945.50 2012-12-27 18:00:00 4029.40 2012-12-27 18:30:00 3118.30 2012-12-27 19:00:00 2207.20 2012-12-27 19:30:00 1609.25 2012-12-27 20:00:00 1011.30 2012-12-27 20:30:00 1041.80 2012-12-27 21:00:00 1072.30 2012-12-27 21:30:00 1356.25 2012-12-27 22:00:00 1640.20 2012-12-27 22:30:00 1304.35 2012-12-27 23:00:00 968.50 Freq: 30T, Name: VolumeMW, Length: 17375
df.resample('D').AveragePriceEUR.plot(style='b')
df.resample('D').VolumeMW.plot(secondary_y=True, style='r', alpha = 0.4)
<matplotlib.axes._subplots.AxesSubplot at 0x7f27617e1cd0>
ax1 = df.resample('D').AveragePriceEUR.plot(style='b')
ax2 = df.resample('D').VolumeMW.plot(secondary_y=True, style='r', alpha = 0.4)
ax1.set_ylabel('Average Price (EUR)')
ax2.set_ylabel('Volume (MW)')
<matplotlib.text.Text at 0x7f276125c310>
rolling_std(df.AveragePriceEUR.resample('1D'), window = 20).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f276115fcd0>
corr_vol_avgp20 = rolling_corr(df.VolumeMW, df.AveragePriceEUR, window=20)
corr_vol_avgp200 = rolling_corr(df.VolumeMW, df.AveragePriceEUR, window=200)
corr_vol_avgp20.plot(alpha=0.7,label='20day correlation')
corr_vol_avgp200.plot(style='red',label='200day correlation')
<matplotlib.axes._subplots.AxesSubplot at 0x7f2761066550>
rolling_mean(corr_vol_avgp20,window=100).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2761078110>
Merging Series
!ls $data_dir
energy_intraday_history_2006.xls energy_spot_historie_2004.xls energy_spot_historie_end_20020731_xetra.xls energy_intraday_history_2007.xls energy_spot_historie_2005.xls Phelix_Quarterly.xls energy_intraday_history_2008.xls energy_spot_historie_2006.xls swiss_power_spot_market_2006.xls energy_intraday_history_2009.xls energy_spot_historie_2007.xls swiss_power_spot_market_2007.xls energy_intraday_history_2010.xls energy_spot_historie_2008.xls swiss_power_spot_market_2008.xls energy_intraday_history_2011 - Konflikt.xls energy_spot_historie_2009.xls swiss_power_spot_market_2009.xls energy_intraday_history_2011.xls energy_spot_historie_2010.xls swiss_power_spot_market_2010.xls energy_intraday_history_2012.xls energy_spot_historie_2011.xls swiss_power_spot_market_2011.xls energy_spot_historie_2002.xls energy_spot_historie_2012 - Konflikt.xls swiss_power_spot_market_2012.xls energy_spot_historie_2003.xls energy_spot_historie_2012.xls
dseries = {}
for filename in os.listdir(data_dir):
if 'Konflikt' not in filename and 'energy_intraday' in filename:
dseries[filename.split('_')[-1][:4]] = pd.read_excel(data_dir+filename,sheetname='Intraday-Spot',\
header=1, parse_dates = [['Delivery Day','Hour\nfrom']],index_col=0)
df = concat(dseries.values())
list(df.index.get_duplicates())
[Timestamp('2006-10-29 02:00:00'), Timestamp('2007-10-28 02:00:00'), Timestamp('2008-10-26 02:00:00'), Timestamp('2009-10-25 02:00:00'), Timestamp('2010-10-31 02:00:00'), Timestamp('2011-10-30 02:00:00'), Timestamp('2012-10-28 02:00:00')]
df = df.groupby(df.index).first()
df.columns = [column.replace(' ','').replace('\n','') for column in df.columns]
from matplotlib.pyplot import *
df.VolumeMW.asfreq('W-FRI').plot()
df.VolumeMW.asfreq('W-MON').plot()
df.VolumeMW.asfreq('W-SUN').plot()
legend(['FRI','MON','SUN'],loc='best')
<matplotlib.legend.Legend at 0x7f2760872e90>
df.ix[:,['VolumeMW','HighPriceEUR']].plot(subplots=True)
array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f2760802190>, <matplotlib.axes._subplots.AxesSubplot object at 0x7f2760748650>], dtype=object)
Meteorological historical data for Düsseldorf
!head './data/kl_10400_00_akt_txt.txt'
KL01192200001010000101701101861102201101924 691 251 444 38 1 511 641 271 424 41 6 60 6 24 6 754 914 714 794 864 944 954 924 861 951 95116 3124 2120 11 204 81-99-99 81-99-99 81-99-99 804 001 71 41 21 11 11 11 11 11 11 001 2161 161 2261 001 001 601-9999-99999-99999 KL01192200001020000102371102331102251102324 791 271 524 2 1 391 731 741 654 36 6 59 6 63 6 774 834 884 834 954 814 854 874 961 801 85116 3116 2116 31 274 41-99-99 71-99-99 71-99-99 604 711 61 81 81 11 11 11 01 11 01 001 001 001 001 001 001 701-9999-99999-99999 KL01192200001030000101981101771101581101784 911 691 224 67 1 781 871 711 774 61 6 70 6 54 6 834 894 784 834 784 794 784 784 781 791 78116 3120 5116 31 374 81-99-99 81-99-99 61-99-99 734 001 81 81 81 11 12 11 11 01 01 001 061 001 2861 001 0011401-9999-99999-99999 KL01192200001040000101301100891100971101054 921 661 264 58 1 681 771 831 784 63 6 72 6 64 6 924 984 834 914 934 934 764 874 931 931 76120 3116 3120 41 334 81-99-99 81-99-99 61-99-99 734 001 61 61 81 21 21 21 11 01 31 2861 6861 2161 9761 001 0011701-9999-99999-99999 KL01192200001050000101651101581101421101554 861 41 824 -22 1 71 801 611 524 2 6 55 6 47 6 594 744 764 704 914 684 814 804 921 681 80120 3116 3116 31 304 11-99-99 71-99-99 31-99-99 374 6511 81 81 81 11 11 11 01 11 01 861 001 001 001 001 0011101-9999-99999-99999 KL01192200001060000101221101091101351101224 1041 431 614 25 1 501 971 781 764 40 6 74 6 70 6 754 884 954 864 864 734 904 834 851 731 89116 4116 4116 31 374 31-99-99 71-99-99 81-99-99 604 2811 81 81 61 11 11 11 11 11 01 001 001 1561 1861 001 0011201-9999-99999-99999 KL01192200001070000101771101731101611101704 871 471 404 42 1 551 871 611 664 49 6 64 6 43 6 834 814 714 784 914 724 754 794 911 721 75120 3120 3116 31 304 71-99-99 51-99-99 71-99-99 634 2511 61 81 81 11 11 11 11 11 01 361 001 001 001 001 001 901-9999-99999-99999 KL01192200001080000101341101361101531101414 791 541 254 45 1 591 681 681 664 39 6 61 6 62 6 674 904 914 834 724 914 924 854 731 901 92116 4116 3120 21 304 71-99-99 81-99-99 81-99-99 774 001 83 53 73 11 11 11 11 01 12 001 261 1061 1261 001 0011001-9999-99999-99999 KL01192200001090000101701102051102541102104 681 51 634 4 1 261 431 161 254 23 6 40 6 12 6 704 794 644 714 954 954 934 944 951 951 94120 1132 21 0 01 104 41-99-99 71-99-99 71-99-99 604 001 51 51 31 11 11 11 11 11 11 061 001 001 061 001 001 501-9999-99999-99999 KL01192200001100000102991103171103261103144 241 41 204 10 1 81 121 121 114 6 6 10 6 9 6 624 644 634 634 964 974 954 964 961 961 95116 1124 1120 21 134 81-99-99 81-99-99 81-99-99 804 001 21 31 61 11 11 11 11 01 11 061 001 001 001 001 001 401-9999-99999-99999
from IPython.display import display, HTML
HTML('http://www.dwd.de/bvbw/generator/DWDWWW/Content/Oeffentlichkeit/KU/KU2/KU21/klimadaten/german/format__kl,templateId=raw,property=publicationFile.html/format_kl.html')
| ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Ke_Ind | Kennung | Label | Beschreibung | Einheit | Code-Tabellen | Zusatzinfo | Typ | Pos | Erlaubt | Fehlk |
KL | KE | KE | Kennung fuer das Datenkollektiv | Standardformat: Klimadaten aus Klimaroutine des DWD (3 Termine: 07,14,21 MOZ, ab 01.01.1987 07:30,14:30,21:30 MEZ) und Tageswerte jeweils nach Beobachteranleitung fuer Klimastationen (BAK). nebenamtl.Stationen auch teilweise nach 03.2001). Satzlaenge: 282 Zeichen | X(2) | 1 | siehe KE_IND | |||
KL | STAT | ST | Stationsnummer | CODE | STATIONSLISTE | 9(5) | 3 | 00001-99999 | ||
KL | JA | JA | Jahr | 9(4) | 8 | 1800-2100 | ||||
KL | MO | MO | Monat | 9(2) | 12 | 01-12 | ||||
KL | TA | TA | Tag | 9(2) | 14 | 01-31 | ||||
KL | NULL | numerisches Leerfeld (0) | 9(4) | 16 | 0000 | |||||
KL | P0 | P1 | Luftdruck in Stationshoehe | 0.1 hpa | 07:30 MEZ (bis 1986 MOZ) | 9(5) | 20 | 0-11000 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 25 | 0-9 | 9 | |
KL | P0 | P2 | Luftdruck in Stationshoehe | 0.1 hpa | 14:30 MEZ (bis 1986 MOZ) | 9(5) | 26 | 0-11000 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 31 | 0-9 | 9 | |
KL | P0 | P3 | Luftdruck in Stationshoehe | 0.1 hpa | 21:30 MEZ (bis 1986 MOZ) | 9(5) | 32 | 0-11000 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 37 | 0-9 | 9 | |
KL | PM | PM | Tagesmittel des Luftdruckes in Stationshoehe | 0.1 hpa | arithm.Mittel aus 3 Terminwerten | 9(5) | 38 | 0-11000 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 43 | 0-9 | 9 | |
KL | TXK | TXK | Tagesmaximum der Temperatur der Luft in 2m Hoehe | 0.1 ° C | 21:30 VT. - 21:30 MEZ (bis 1986 MOZ) | 9(4) | 44 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 48 | 0-9 | 9 | |
KL | TNK | TNK | Tagesminimum der Temperatur der Luft in 2m Hoehe | 0.1 ° C | 21:30 VT. - 21:30 MEZ (bis 1986 MOZ) | 9(4) | 49 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 53 | 0-9 | 9 | |
KL | TRK | TRK | Tagesschwankung der Temperatur der Luft in 2m Hoehe | 0.1 ° C | TXK-TNK | 9(3) | 54 | 500 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 57 | 0-9 | 9 | |
KL | TGK | TGK | Tagesminimum der Temperatur der Luft am Erdboden | 0.1 ° C | 21:30 VT. - 07:30 MEZ (bis 1986 MOZ) | 9(4) | 58 | -500-500 | -999 | |
KL | STG | S | Schnee ueber dem TG-Thermometer | CODE | S: ja, Blank: nein | 07:30 MEZ (bis 1986 MOZ) | X(1) | 62 | SP,S | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 63 | 0-9 | 9 | |
KL | TT | T1 | Temperatur der Luft in 2m Hoehe | 0.1 ° C | 07:30 MEZ (bis 1986 MOZ) | 9(4) | 64 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 68 | 0-9 | 9 | |
KL | TT | T2 | Temperatur der Luft in 2m Hoehe | 0.1 ° C | 14:30 MEZ (bis 1986 MOZ) | 9(4) | 69 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 73 | 0-9 | 9 | |
KL | TT | T3 | Temperatur der Luft in 2m Hoehe | 0.1 ° C | 21:30 MEZ (bis 1986 MOZ) | 9(4) | 74 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 78 | 0-9 | 9 | |
KL | TMK | TMK | Tagesmittel der Temperatur der Luft in 2m Hoehe | 0.1 ° C | TMK=(TT1+TT2+(TT3*2))/4 | 9(4) | 79 | -500-500 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 83 | 0-9 | 9 | |
KL | TF | TF1 | Feuchttemperatur | 0.1 ° C | 07:30 MEZ (bis 1986 MOZ) | 9(4) | 84 | -1000-1000 | -999 | |
KL | ETF | ETF1 | Eisansatz bei der Messung der Feuchttemperatur | CODE | E:ja, sonst blank | 07:30 MEZ (bis 1986 MOZ) | 9(1) | 88 | SP,E | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 89 | 0-9 | 9 | |
KL | TF | TF2 | Feuchttemperatur | 0.1 ° C | 14:30 MEZ (bis 1986 MOZ) | 9(4) | 90 | -1000-1000 | -999 | |
KL | ETF | ETF2 | Eisansatz bei der Messung der Feuchttemperatur | CODE | E:ja, sonst blank | 14:30 MEZ (bis 1986 MOZ) | 9(1) | 94 | SP,E | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 95 | 0-9 | 9 | |
KL | TF | TF3 | Feuchttemperatur | 0.1 ° C | 21:30 MEZ (bis 1986 MOZ) | 9(4) | 96 | -1000-1000 | -999 | |
KL | ETF | ETF3 | Eisansatz bei der Messung der Feuchttemperatur | CODE | E:ja, sonst blank | 21:30 MEZ (bis 1986 MOZ) | 9(1) | 100 | SP,E | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 101 | 0-9 | 9 | |
KL | VP | VP1 | Dampfdruck | 0.1 hpa | 07:30 MEZ (bis 1986 MOZ) | 9(3) | 102 | 9(3) | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 105 | 0-9 | 9 | |
KL | VP | VP2 | Dampfdruck | 0.1 hpa | 14:30 MEZ (bis 1986 MOZ) | 9(3) | 106 | 9(3) | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 109 | 0-9 | 9 | |
KL | VP | VP3 | Dampfdruck | 0.1 hpa | 21:30 MEZ (bis 1986 MOZ) | 9(3) | 110 | 9(3) | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 113 | 0-9 | 9 | |
KL | VPM | VPM | Tagesmittel des Dampfdrucks | 0.1 hpa | arithm.Mittel aus 3 Terminwerten | 9(3) | 114 | 0-999 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 117 | 0-9 | 9 | |
KL | UP | UP1 | relative Feuchte (aus Feuchttemperatur) | % | 07:30 MEZ (bis 1986 MOZ) | 9(3) | 118 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 121 | 0-9 | 9 | |
KL | UP | UP2 | relative Feuchte (aus Feuchttemperatur) | % | 14:30 MEZ (bis 1986 MOZ) | 9(3) | 122 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 125 | 0-9 | 9 | |
KL | UP | UP3 | relative Feuchte (aus Feuchttemperatur) | % | 21:30 MEZ (bis 1986 MOZ) | 9(3) | 126 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 129 | 0-9 | 9 | |
KL | UPM | UPM | Tagesmittel der relativen Feuchte | % | arithm.Mittel aus 3 Terminwerten | 9(3) | 130 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 133 | 0-9 | 9 | |
KL | UR | UR1 | relative Feuchte (aus Registrierung) | % | 07:30 MEZ (bis 1986 MOZ) | 9(3) | 134 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 137 | 0-9 | 9 | |
KL | UR | UR2 | relative Feuchte (aus Registrierung) | % | 14:30 MEZ (bis 1986 MOZ) | 9(3) | 138 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 141 | 0-9 | 9 | |
KL | UR | UR3 | relative Feuchte (aus Registrierung) | % | 21:30 MEZ (bis 1986 MOZ) | 9(3) | 142 | 0-110 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 145 | 0-9 | 9 | |
KL | DK | D1 | Windrichtung (in 32-teiliger Windrose) | CODE | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 146 | 0-32 | -9 | |
KL | FK | FK1 | Windstaerke | Bft | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 148 | 0-12 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 150 | 0-9 | 9 | |
KL | DK | D2 | Windrichtung (in 32-teiliger Windrose) | CODE | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 151 | 0-32 | -9 | |
KL | FK | FK2 | Windstaerke | Bft | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 153 | 0-12 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 155 | 0-9 | 9 | |
KL | DK | D3 | Windrichtung (in 32-teiliger Windrose) | CODE | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 156 | 0-32 | -9 | |
KL | FK | FK3 | Windstaerke | Bft | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 158 | 0-12 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 160 | 0-9 | 9 | |
KL | FMK | FMK | Tagesmittel der Windstaerke | 0.1 Bft | arithm.Mittel aus 3 Terminwerten | 9(3) | 161 | 0-120 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 164 | 0-9 | 9 | |
KL | N | N1 | Bedeckungsgrad mit allen Wolken | Achtel | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 165 | -1-9 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 167 | 0-9 | 9 | |
KL | CD | C1 | Wolkendichte | CODE | TAB_CD | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 168 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 170 | 0-9 | 9 | |
KL | WK | W1 | Wettererscheinungen in Klimameldung | CODE | TAB_WK | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 171 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 173 | 0-9 | 9 | |
KL | N | N2 | Bedeckungsgrad mit allen Wolken | Achtel | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 174 | -1-9 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 176 | 0-9 | 9 | |
KL | CD | C2 | Wolkendichte | CODE | TAB_CD | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 177 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 179 | 0-9 | 9 | |
KL | WK | W2 | Wettererscheinungen in Klimameldung | CODE | TAB_WK | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 180 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 182 | 0-9 | 9 | |
KL | N | N3 | Bedeckungsgrad mit allen Wolken | Achtel | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 183 | -1-9 | -9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 185 | 0-9 | 9 | |
KL | CD | C3 | Wolkendichte | CODE | TAB_CD | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 186 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 188 | 0-9 | 9 | |
KL | WK | W3 | Wettererscheinungen in Klimameldung | CODE | TAB_WK | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 189 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 191 | 0-9 | 9 | |
KL | NM | NM | Tagesmittel des Bedeckungsgrades | 0.1 Achtel | arithm.Mittel aus 3 Terminwerten | 9(3) | 192 | 0-100 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 195 | 0-9 | 9 | |
KL | SDK | SDK | Tagessumme der Sonnenscheindauer | 0.1 Std | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(3) | 196 | 0-240 | -99 | |
KL | SDJ | SDJ | Kennung, ob Sonne geschienen hat | CODE | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(1) | 199 | 0,1 | 9 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 200 | 0-9 | 9 | |
KL | VK | V1 | Sichtweite | CODE | TAB_VK | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 201 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 203 | 0-9 | 9 | |
KL | VK | V2 | Sichtweite | CODE | TAB_VK | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 204 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 206 | 0-9 | 9 | |
KL | VK | V3 | Sichtweite | CODE | TAB_VK | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 207 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 209 | 0-9 | 9 | |
KL | E | E1 | Erdbodenzustand | CODE | TAB_E | 07:30 MEZ (bis 1986 MOZ) | 9(2) | 210 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 212 | 0-9 | 9 | |
KL | E | E2 | Erdbodenzustand | CODE | TAB_E | 14:30 MEZ (bis 1986 MOZ) | 9(2) | 213 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 215 | 0-9 | 9 | |
KL | E | E3 | Erdbodenzustand | CODE | TAB_E | 21:30 MEZ (bis 1986 MOZ) | 9(2) | 216 | 0-9 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 218 | 0-9 | 9 | |
KL | VAK | VAK | Kennung fuer Art des gefallenen Niederschlags | CODE | TAB_VAK | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(2) | 219 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 221 | 0-9 | 9 | |
KL | VBK | VBK | Kennung fuer Art des abgesetzten Niederschlags | CODE | TAB_VBK | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(2) | 222 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 224 | 0-9 | 9 | |
KL | VCK | VCK | Kennung fuer besondere Wettererscheinungen | CODE | TAB_VCK | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(2) | 225 | 0-99 | -9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 227 | 0-9 | 9 | |
KL | RK | R1 | Niederschlagshoehe zwischen den Hauptterminen | 0.1 mm | 21:30 VT. - 07:30 MEZ (bis 1986 MOZ) | 9(4) | 228 | 0-5000 | -999 | |
KL | RKF | RF1 | Kennung fuer die Art des Niederschlags | CODE | TAB_RF | 21:30 VT. - 07:30 MEZ (bis 1986 MOZ) | 9(1) | 232 | 0,1,4,6,7,8 | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 233 | 0-9 | 9 | |
KL | RK | R2 | Niederschlagshoehe zwischen den Hauptterminen | 0.1 mm | 07:30 - 14:30 MEZ (bis 1986 MOZ) | 9(4) | 234 | 0-5000 | -999 | |
KL | RKF | RF2 | Kennung fuer die Art des Niederschlags | CODE | TAB_RF | 07:30 - 14:30 MEZ (bis 1986 MOZ) | 9(1) | 238 | 0,1,4,6,7,8 | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 239 | 0-9 | 9 | |
KL | RK | R3 | Niederschlagshoehe zwischen den Hauptterminen | 0.1 mm | 14:30 - 21:30 MEZ (bis 1986 MOZ) | 9(4) | 240 | 0-5000 | -999 | |
KL | RKF | RF3 | Kennung fuer die Art des Niederschlags | CODE | TAB_RF | 14:30 - 21:30 MEZ (bis 1986 MOZ) | 9(1) | 244 | 0,1,4,6,7,8 | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 245 | 0-9 | 9 | |
KL | RSK | RSK | Tagessumme des Niederschlags (24Std.) | 0.1 mm | 07:30 - 07:30 FT. MEZ (bis 1986 MOZ) | 9(4) | 246 | 0-5000 | -999 | |
KL | RSKF | RSKF | Kennung fuer die Art des Niederschlags (24Std.) | CODE | TAB_RF | 07:30 - 07:30 FT. MEZ (bis 1986 MOZ) | 9(1) | 250 | 0,1,4,6,7,8 | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 251 | 0-9 | 9 | |
KL | SHK | SHK | Schneehoehe | cm | 07:30 MEZ (bis 1986 MOZ) | 9(3) | 252 | 0-999 | -99 | |
KL | SA | SA | Art der Schneedecke | CODE | TAB_SA | 07:30 MEZ (bis 1986 MOZ) | X(1) | 255 | 0,1,D,F,R | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 256 | 0-9 | 9 | |
KL | NSH | NSH | Neuschneehoehe | cm | 21:30 VT. - 07:30 MEZ (bis 1986 MOZ) | 9(3) | 257 | 0-999 | -99 | |
KL | NSHJ | NSHJ | Kennung ob Neuschnee gefallen ist | CODE | 1:ja, 0:nein | 21:30 VT. - 07:30 MEZ (bis 1986 MOZ) | 9(1) | 260 | 0,1 | 9 |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 261 | 0-9 | 9 | |
KL | FXK | FXK | Tagesmaximum der Windgeschwindigkeit | 0.1 m/s | 00:00-24:00 MEZ (bis 1986 MOZ) | 9(3) | 262 | 0-999 | -99 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 265 | 0-9 | 9 | |
KL | ASH | ASH | Hoehe des ausgestochenen Schnees | cm | 07:30 MEZ (bis 1986 MOZ) | 9(4) | 266 | 0-999 | -999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 270 | 0-9 | 9 | |
KL | WAAS | WAAS | Wasseraequivalent des ausgestochenen Schnees (ASH) | 0.1 mm | 07:30 MEZ (bis 1986 MOZ) | 9(5) | 271 | 0-9999 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 276 | 0-9 | 9 | |
KL | WASH | WASH | Wasseraequivalent der Gesamtschneedecke (SH) | 0.1 mm | 07:30 MEZ (bis 1986 MOZ) | 9(5) | 277 | 0-99999 | -9999 | |
KL | Q | Q | Qualitaetsbyte | CODE | TAB_Q | 9(1) | 282 | 0-9 | 9 |
table_description = pd.read_html('http://www.dwd.de/bvbw/generator/DWDWWW/Content/Oeffentlichkeit/KU/KU2/KU21/klimadaten/german/format__kl,templateId=raw,property=publicationFile.html/format_kl.html',skiprows=1,header=1)[0]
table_description.head()
KL | KE | KE.1 | Kennung fuer das Datenkollektiv | Unnamed: 4 | Unnamed: 5 | Standardformat: Klimadaten aus Klimaroutine des DWD (3 Termine: 07,14,21 MOZ, ab 01.01.1987 07:30,14:30,21:30 MEZ) und Tageswerte jeweils nach Beobachteranleitung fuer Klimastationen (BAK). nebenamtl.Stationen auch teilweise nach 03.2001). Satzlaenge: 282 Zeichen | X(2) | 1 | siehe KE_IND | Unnamed: 10 | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | KL | STAT | ST | Stationsnummer | CODE | STATIONSLISTE | NaN | 9(5) | 3 | 00001-99999 | NaN |
1 | KL | JA | JA | Jahr | NaN | NaN | NaN | 9(4) | 8 | 1800-2100 | NaN |
2 | KL | MO | MO | Monat | NaN | NaN | NaN | 9(2) | 12 | 01-12 | NaN |
3 | KL | TA | TA | Tag | NaN | NaN | NaN | 9(2) | 14 | 01-31 | NaN |
4 | KL | NaN | NaN | numerisches Leerfeld (0) | NaN | NaN | NaN | 9(4) | 16 | 0000 | NaN |
widths = table_description.iloc[:,8].diff().values
widths
array([ nan, 5., 4., 2., 2., 4., 5., 1., 5., 1., 5., 1., 5., 1., 4., 1., 4., 1., 3., 1., 4., 1., 1., 4., 1., 4., 1., 4., 1., 4., 1., 4., 1., 1., 4., 1., 1., 4., 1., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 3., 1., 2., 2., 1., 2., 2., 1., 2., 2., 1., 3., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 3., 1., 3., 1., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 2., 1., 4., 1., 1., 4., 1., 1., 4., 1., 1., 4., 1., 1., 3., 1., 1., 3., 1., 1., 3., 1., 4., 1., 5., 1., 5.])
col_names = table_description.ix[:,3]
widths=[2,5,4,2,2,4,5,1,5,1,5,1,5,1,4,1,4,1,3,1,4,1,1,4,1,4,1,4,1,4,1,4,1,1,4,1,1,4,1,1,3,1,3,1,3,1,3,1,3,1,3,1,3,1,3,1,3,1,3,1,3,1]
df_temp = pd.read_fwf('./data/kl_10400_00_akt_txt.txt',widths=widths)
df_temp.head()
KL | 01192 | 2000 | 01 | 01.1 | 0000 | 10170 | 1 | 10186 | 1.1 | ... | 95 | 4.9 | 92 | 4.10 | 86.1 | 1.9 | 95.1 | 1.10 | 95.2 | 1.11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | KL | 1192 | 2000 | 1 | 2 | 0 | 10237 | 1 | 10233 | 1 | ... | 85 | 4 | 87 | 4 | 96 | 1 | 80 | 1 | 85 | 1 |
1 | KL | 1192 | 2000 | 1 | 3 | 0 | 10198 | 1 | 10177 | 1 | ... | 78 | 4 | 78 | 4 | 78 | 1 | 79 | 1 | 78 | 1 |
2 | KL | 1192 | 2000 | 1 | 4 | 0 | 10130 | 1 | 10089 | 1 | ... | 76 | 4 | 87 | 4 | 93 | 1 | 93 | 1 | 76 | 1 |
3 | KL | 1192 | 2000 | 1 | 5 | 0 | 10165 | 1 | 10158 | 1 | ... | 81 | 4 | 80 | 4 | 92 | 1 | 68 | 1 | 80 | 1 |
4 | KL | 1192 | 2000 | 1 | 6 | 0 | 10122 | 1 | 10109 | 1 | ... | 90 | 4 | 83 | 4 | 85 | 1 | 73 | 1 | 89 | 1 |
5 rows × 62 columns
df_temp = pd.read_fwf('./data/kl_10400_00_akt_txt.txt',widths=widths,header=None,parse_dates = [[2,3,4]], index_col = 0)
df_temp.ix[:5,[14,16,29]]
14 | 16 | 29 | |
---|---|---|---|
2_3_4 | |||
2000-01-01 | 69 | 25 | 42 |
2000-01-02 | 79 | 27 | 65 |
2000-01-03 | 91 | 69 | 77 |
2000-01-04 | 92 | 66 | 78 |
2000-01-05 | 86 | 4 | 52 |
df_temp = df_temp[[14,16,29]].apply(lambda x: x/10.)
df_temp.head()
14 | 16 | 29 | |
---|---|---|---|
2_3_4 | |||
2000-01-01 | 6.9 | 2.5 | 4.2 |
2000-01-02 | 7.9 | 2.7 | 6.5 |
2000-01-03 | 9.1 | 6.9 | 7.7 |
2000-01-04 | 9.2 | 6.6 | 7.8 |
2000-01-05 | 8.6 | 0.4 | 5.2 |
df_temp.columns = ['HighTemp','LowTemp','MeanTemp']
df_temp.head()
HighTemp | LowTemp | MeanTemp | |
---|---|---|---|
2_3_4 | |||
2000-01-01 | 6.9 | 2.5 | 4.2 |
2000-01-02 | 7.9 | 2.7 | 6.5 |
2000-01-03 | 9.1 | 6.9 | 7.7 |
2000-01-04 | 9.2 | 6.6 | 7.8 |
2000-01-05 | 8.6 | 0.4 | 5.2 |
df_temp.index.name = 'Date'
df_temp.head()
HighTemp | LowTemp | MeanTemp | |
---|---|---|---|
Date | |||
2000-01-01 | 6.9 | 2.5 | 4.2 |
2000-01-02 | 7.9 | 2.7 | 6.5 |
2000-01-03 | 9.1 | 6.9 | 7.7 |
2000-01-04 | 9.2 | 6.6 | 7.8 |
2000-01-05 | 8.6 | 0.4 | 5.2 |
df2 = df.join(df_temp, how='left')
from statsmodels.tsa.api import *
df2['AmpTemp'] = df2.HighTemp-df2.LowTemp
data = df2[['AveragePriceEUR','VolumeMW','AmpTemp']].asfreq('D')
model = VAR(data,missing='drop') # NaN will produce LinalgError, hence the missing='drop'
results= model.fit()
results.plot()
results.plot_acorr()
lag_order = results.k_ar
results.forecast(data.values[-lag_order:],2)
array([[ 25.47955327, 844.24536267, 5.46105252], [ 31.77881602, 731.28635887, 6.682358 ]])
results.plot_forecast(2)
legend(loc='best')
<matplotlib.legend.Legend at 0x7f274ffa3a10>
model = pd.ols(y=df2.AveragePriceEUR, x = df2[['AmpTemp','VolumeMW']])
print(model)
-------------------------Summary of Regression Analysis------------------------- Formula: Y ~ <AmpTemp> + <VolumeMW> + <intercept> Number of Observations: 2011 Number of Degrees of Freedom: 3 R-squared: 0.0013 Adj R-squared: 0.0003 Rmse: 17.0065 F-stat (2, 2008): 1.3320, p-value: 0.2642 Degrees of Freedom: model 2, resid 2008 -----------------------Summary of Estimated Coefficients------------------------ Variable Coef Std Err t-stat p-value CI 2.5% CI 97.5% -------------------------------------------------------------------------------- AmpTemp 0.1553 0.0954 1.63 0.1039 -0.0318 0.3423 VolumeMW 0.0002 0.0006 0.27 0.7891 -0.0010 0.0013 intercept 35.0410 0.9885 35.45 0.0000 33.1035 36.9784 ---------------------------------End of Summary---------------------------------
model.y_fitted.plot()
model.y.plot(style='k')
model.y_predict.plot(style='red')
<matplotlib.axes._subplots.AxesSubplot at 0x7f275416e710>
%reload_ext version_information
%version_information numpy, scipy, matplotlib, pandas, statsmodels
Software | Version |
---|---|
Python | 2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Aug 21 2014, 18:22:21) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] |
IPython | 2.3.0 |
OS | posix [linux2] |
numpy | 1.9.1 |
scipy | 0.14.0 |
matplotlib | 1.4.2 |
pandas | 0.15.0 |
statsmodels | 0.5.0 |
Thu Nov 13 10:39:56 2014 CET |
df = pd.read_html('lista.html')[6]
df.head()
Stations-Kennziffer | Klima-Kennung | ICAO-Kennung | Stationsname | Stationshöhe in Metern | geogr. Breite | geogr. Länge | Automat für Lufttemperatur seit: | Beginn Klimareihe | |
---|---|---|---|---|---|---|---|---|---|
0 | 10501 | 2205 | NaN | Aachen | 202 | 50° 47' | 06° 05' | 01.07.1993 | 1891 |
1 | 10505 | 2206 | NaN | Aachen-Orsbach | 231 | 50° 47' | 06° 01' | NaN | 2011 |
2 | 10291 | 3058 | NaN | Angermünde | 54 | 53° 01' | 13° 59' | NaN | 1947 |
3 | 10091 | 3005 | NaN | Arkona | 42 | 54° 40' | 13° 26' | NaN | 1947 |
4 | 10852 | 4128 | EDMA | Augsburg | 462 | 48° 25' | 10° 56' | 10.11.1996 | 1947 |
from IPython.core.display import HTML
def css_styling():
styles = open("./styles/custom.css", "r").read()
return HTML(styles)
css_styling()