I like watching the Phillies. I do not have cable. Some Phillies games are broadcast on national television. This is how I made a list of those games.
Pandas is a data analysis tool for the Python programming language. It can do a tremendous amount of really powerful data analysis and visualization. It's a gun in this CSV knife fight.
import pandas as pd
A downloadable CSV schedule is available from mlb.com. Here is a direct link to the Phillies schedule.
The CSV schedule will be used to instantiate a Pandas DataFrame object.
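# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments
# matches its old defaults (first column as a parsed date index).
schedule = pd.read_csv("phillies.csv", index_col=0, parse_dates=True)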
schedule.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 162 entries, 2014-03-31 00:00:00 to 2014-09-28 00:00:00
Data columns (total 16 columns):
START_TIME          162 non-null object
START_TIME_ET       162 non-null object
SUBJECT             162 non-null object
LOCATION            162 non-null object
DESCRIPTION         162 non-null object
END_DATE            162 non-null object
END_DATE_ET         162 non-null object
END_TIME            162 non-null object
END_TIME_ET         162 non-null object
REMINDER_OFF        162 non-null bool
REMINDER_ON         162 non-null bool
REMINDER_DATE       162 non-null object
REMINDER_TIME       162 non-null object
REMINDER_TIME_ET    162 non-null object
SHOWTIMEAS_FREE     162 non-null object
SHOWTIMEAS_BUSY     162 non-null object
dtypes: bool(2), object(14)
The schedule holds 162 games, with 16 columns of data for each game.
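The loading step can be tried without downloading anything by feeding pandas a small in-memory CSV. This is only a sketch: the two rows reuse games shown in this post, but the four-column layout and the MM/DD/YY date format of `sample_csv` are assumptions, not the real file's layout.

```python
import io

import pandas as pd

# Two-row stand-in for phillies.csv. The column subset and the MM/DD/YY date
# format are assumptions made for this sketch.
sample_csv = io.StringIO(
    "START_DATE,START_TIME,SUBJECT,DESCRIPTION\n"
    "03/31/14,02:05 PM,Phillies at Rangers,"
    "Local TV: CSN ----- Local Radio: 94 WIP -- SBP 1480 -- 1210 WPHT\n"
    "04/02/14,08:05 PM,Phillies at Rangers,"
    "Local TV: ESPN2 -- TCN ----- Local Radio: 94 WIP -- 1210 WPHT\n"
)

# index_col=0 makes the first column the index; parse_dates=True asks pandas
# to parse that index as dates.
sample = pd.read_csv(sample_csv, index_col=0, parse_dates=True)

print(type(sample.index).__name__)
print(sample.shape)
```

With the date column parsed into the index, each game can later be looked up by its date.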
schedule.head()
START_DATE | START_TIME | START_TIME_ET | SUBJECT | LOCATION | DESCRIPTION | END_DATE | END_DATE_ET | END_TIME | END_TIME_ET | REMINDER_OFF | REMINDER_ON | REMINDER_DATE | REMINDER_TIME | REMINDER_TIME_ET | SHOWTIMEAS_FREE | SHOWTIMEAS_BUSY |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2014-03-31 | 02:05 PM | 02:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: CSN ----- Local Radio: 94 WIP -- SBP... | 03/31/14 | 03/31/14 | 05:05 PM | 05:05 PM | False | True | 03/31/14 | 01:05 PM | 01:05 PM | FREE | BUSY |
2014-04-01 | 08:05 PM | 08:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: TCN ----- Local Radio: 94 WIP -- SBP... | 04/01/14 | 04/01/14 | 11:05 PM | 11:05 PM | False | True | 04/01/14 | 07:05 PM | 07:05 PM | FREE | BUSY |
2014-04-02 | 08:05 PM | 08:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: ESPN2 -- TCN ----- Local Radio: 94 W... | 04/02/14 | 04/02/14 | 11:05 PM | 11:05 PM | False | True | 04/02/14 | 07:05 PM | 07:05 PM | FREE | BUSY |
2014-04-04 | 02:20 PM | 02:20 PM | Phillies at Cubs | Wrigley Field | Local TV: MLBN -- CSN ----- Local Radio: 94 WI... | 04/04/14 | 04/04/14 | 05:20 PM | 05:20 PM | False | True | 04/04/14 | 01:20 PM | 01:20 PM | FREE | BUSY |
2014-04-05 | 02:20 PM | 02:20 PM | Phillies at Cubs | Wrigley Field | Local TV: TCN ----- Local Radio: 94 WIP -- 121... | 04/05/14 | 04/05/14 | 05:20 PM | 05:20 PM | False | True | 04/05/14 | 01:20 PM | 01:20 PM | FREE | BUSY |
5 rows × 16 columns
The DESCRIPTION column contains the broadcast information. Less interesting columns can be removed.
schedule.drop(["REMINDER_OFF",
"REMINDER_ON",
"START_TIME_ET",
"END_DATE",
"END_DATE_ET",
"END_TIME",
"END_TIME_ET",
"REMINDER_TIME",
"REMINDER_TIME_ET",
"SHOWTIMEAS_FREE",
"SHOWTIMEAS_BUSY",
"REMINDER_DATE"], axis=1, inplace=True)
schedule.head()
START_DATE | START_TIME | SUBJECT | LOCATION | DESCRIPTION |
---|---|---|---|---|
2014-03-31 | 02:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: CSN ----- Local Radio: 94 WIP -- SBP... |
2014-04-01 | 08:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: TCN ----- Local Radio: 94 WIP -- SBP... |
2014-04-02 | 08:05 PM | Phillies at Rangers | Globe Life Park in Arlington | Local TV: ESPN2 -- TCN ----- Local Radio: 94 W... |
2014-04-04 | 02:20 PM | Phillies at Cubs | Wrigley Field | Local TV: MLBN -- CSN ----- Local Radio: 94 WI... |
2014-04-05 | 02:20 PM | Phillies at Cubs | Wrigley Field | Local TV: TCN ----- Local Radio: 94 WIP -- 121... |
5 rows × 4 columns
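When only a handful of columns survive, listing the columns to keep can be shorter than listing the ones to drop. A minimal sketch of that alternative, using a hypothetical one-row `frame` that reproduces only a few of the 16 columns:

```python
import pandas as pd

# Hypothetical one-row frame standing in for the schedule; only a few of the
# 16 columns are reproduced here.
frame = pd.DataFrame(
    {
        "START_TIME": ["02:05 PM"],
        "START_TIME_ET": ["02:05 PM"],
        "SUBJECT": ["Phillies at Rangers"],
        "LOCATION": ["Globe Life Park in Arlington"],
        "DESCRIPTION": [
            "Local TV: CSN ----- Local Radio: 94 WIP -- SBP 1480 -- 1210 WPHT"
        ],
        "REMINDER_ON": [True],
    },
    index=pd.to_datetime(["2014-03-31"]),
)

keep = ["START_TIME", "SUBJECT", "LOCATION", "DESCRIPTION"]
trimmed = frame[keep]  # returns a new frame; `frame` itself is unchanged

print(list(trimmed.columns))
```

Unlike `drop(..., inplace=True)`, this leaves the original frame untouched, which makes it easier to re-run a notebook cell.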
The DESCRIPTION column is nice because it mentions the stations that games are broadcast on. Sometimes a game is broadcast on two channels at once. There is also radio broadcast information that I'm not interested in right now.
schedule.DESCRIPTION.head()
START_DATE
2014-03-31    Local TV: CSN ----- Local Radio: 94 WIP -- SBP...
2014-04-01    Local TV: TCN ----- Local Radio: 94 WIP -- SBP...
2014-04-02    Local TV: ESPN2 -- TCN ----- Local Radio: 94 W...
2014-04-04    Local TV: MLBN -- CSN ----- Local Radio: 94 WI...
2014-04-05    Local TV: TCN ----- Local Radio: 94 WIP -- 121...
Name: DESCRIPTION, dtype: object
DESCRIPTION
Thankfully, the DESCRIPTION column data is parseable. Getting a list of television broadcast stations for each game is not difficult. Picking a game that is broadcast on multiple channels should cover all cases.
description = schedule.DESCRIPTION.iloc[2]  # third game, selected by position
print(description)
Local TV: ESPN2 -- TCN ----- Local Radio: 94 WIP -- 1210 WPHT
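def tv_stations_from_description(description):
    """Return a list of television stations embedded in the given description."""
    # "Local TV: A -- B ----- Local Radio: ..." -> take the text after the first
    # colon, cut it at the "-----" separator, then split the stations on "--".
    tv_part = description.split(":")[1].split("-----")[0]
    return [station.strip() for station in tv_part.split("--")]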
result = tv_stations_from_description(description)
print(result)
assert len(result) == 2
['ESPN2', 'TCN']
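The split-based parser assumes every description starts with a "Local TV:" segment. A more defensive variant could look like the sketch below. The name `tv_stations_or_empty` and the regular expression are my own illustration, based only on the two description strings printed in this post, and have not been run against the full schedule.

```python
import re

# Capture the text between "Local TV:" and the "-----" separator (or the end
# of the string). The pattern is an assumption based on the descriptions
# printed in this post.
TV_PATTERN = re.compile(r"Local TV:(.*?)(?:-----|$)")


def tv_stations_or_empty(description):
    """Return the TV stations in `description`, or [] if none are listed."""
    match = TV_PATTERN.search(description)
    if match is None:
        return []
    # Split the captured segment on "--" and drop empty fragments.
    return [name.strip() for name in match.group(1).split("--") if name.strip()]


print(tv_stations_or_empty(
    "Local TV: ESPN2 -- TCN ----- Local Radio: 94 WIP -- 1210 WPHT"))
print(tv_stations_or_empty("Local Radio: 94 WIP -- 1210 WPHT"))
```

A radio-only description returns an empty list here, where the split-based version would return the radio stations instead.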
A game broadcast on a single channel tests the other case for the parsing function.
description = schedule.DESCRIPTION.iloc[0]  # first game, selected by position
print(description)
result = tv_stations_from_description(description)
print(result)
assert len(result) == 1
Local TV: CSN ----- Local Radio: 94 WIP -- SBP 1480 -- 1210 WPHT
['CSN']
Applying this function to the DESCRIPTION column yields a Series holding, for each game, the list of television stations on which the Phillies are broadcast this season.
stations_series = schedule.DESCRIPTION.apply(tv_stations_from_description)
stations_series
START_DATE
2014-03-31            [CSN]
2014-04-01            [TCN]
2014-04-02     [ESPN2, TCN]
2014-04-04      [MLBN, CSN]
2014-04-05            [TCN]
2014-04-06            [CSN]
2014-04-08         [NBC 10]
2014-04-09            [TCN]
2014-04-10      [TCN, MLBN]
2014-04-11            [TCN]
2014-04-12         [NBC 10]
2014-04-13            [TCN]
2014-04-14      [TCN, ESPN]
2014-04-15      [MLBN, CSN]
2014-04-16            [TCN]
...
2014-09-13            [CSN]
2014-09-14            [CSN]
2014-09-15            [CSN]
2014-09-16            [CSN]
2014-09-17            [CSN]
2014-09-18            [CSN]
2014-09-19            [CSN]
2014-09-20            [CSN]
2014-09-21            [CSN]
2014-09-23            [CSN]
2014-09-24            [CSN]
2014-09-25            [CSN]
2014-09-26            [CSN]
2014-09-27            [CSN]
2014-09-28            [CSN]
Name: DESCRIPTION, Length: 162
Creating a set of stations from that Series will yield a concise list of distinct television broadcast stations.
set([station for stations in stations_series.values for station in stations])
{'CSN', 'ESPN', 'ESPN2', 'FOX', 'MLBN', 'NBC 10', 'TCN'}
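The same flattening idea can also count how many games each station carries instead of only listing the distinct names. A small sketch with `collections.Counter`, limited to the first five games from the Series output above, so the counts are not season totals:

```python
from collections import Counter

# Station lists for the first five games, copied from the Series output above.
station_lists = [["CSN"], ["TCN"], ["ESPN2", "TCN"], ["MLBN", "CSN"], ["TCN"]]

# Flatten the per-game lists and tally each station name.
games_per_station = Counter(
    station for stations in station_lists for station in stations
)

print(games_per_station.most_common())
```

On the real data, passing the flattened values of `stations_series` to `Counter` in the same way should give the per-station totals for the whole season.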
The 162 regular season Phillies games are broadcast on 7 television channels. Unfortunately, only 2 of those 7 stations are available without a cable television subscription. This means that I can only watch games on NBC 10 and FOX.
Filtering the DESCRIPTION column to those two broadcast stations yields only the games which I can watch over the air with my HD antenna.
schedule[(schedule.DESCRIPTION.str.contains("NBC 10")) |
         (schedule.DESCRIPTION.str.contains("FOX"))]
START_DATE | START_TIME | SUBJECT | LOCATION | DESCRIPTION |
---|---|---|---|---|
2014-04-08 | 04:05 PM | Brewers at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-04-12 | 07:05 PM | Marlins at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-05-23 | 07:05 PM | Dodgers at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-06-06 | 07:10 PM | Phillies at Reds | Great American Ball Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-06-18 | 12:10 PM | Phillies at Braves | Turner Field | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-07-05 | 04:05 PM | Phillies at Pirates | PNC Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-07-12 | 07:15 PM | Nationals at Phillies | Citizens Bank Park | Local TV: FOX ----- Local Radio: 94 WIP -- 121... |
2014-07-19 | 07:10 PM | Phillies at Braves | Turner Field | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-07-26 | 07:05 PM | D-backs at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-08-02 | 07:05 PM | Phillies at Nationals | Nationals Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-08-09 | 07:05 PM | Mets at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-08-22 | 07:05 PM | Cardinals at Phillies | Citizens Bank Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
2014-09-05 | 07:05 PM | Phillies at Nationals | Nationals Park | Local TV: NBC 10 ----- Local Radio: 94 WIP -- ... |
13 rows × 4 columns
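Substring matching works for this schedule, but it would also match any longer station name that happens to contain "FOX". If that ever became a concern, the filter could compare exact station names against the parsed lists instead. A sketch under that assumption, with a three-game stand-in Series and a hypothetical `OVER_THE_AIR` set:

```python
import pandas as pd

# Stations I can receive with an antenna (taken from the text above).
OVER_THE_AIR = {"NBC 10", "FOX"}

# Station lists for three games taken from the outputs in this post.
stations = pd.Series(
    [["CSN"], ["NBC 10"], ["FOX"]],
    index=pd.to_datetime(["2014-03-31", "2014-04-08", "2014-07-12"]),
)

# True where a game's station list shares at least one entry with OVER_THE_AIR.
mask = stations.apply(lambda names: bool(OVER_THE_AIR.intersection(names)))

print(stations[mask])
```

Because `mask` shares the date index, the same boolean Series could be used to index the full schedule frame as well.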
This means that I can watch 13 of the 162 regular season Phillies games this season, which is roughly 8%.
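The percentage is plain arithmetic on the two counts above, shown here as a quick check:

```python
watchable_games = 13   # rows in the filtered schedule
total_games = 162      # rows in the full schedule

share = watchable_games / total_games * 100
print(f"{share:.1f}%")  # prints 8.0%
```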