To drop all empty columns (but still keeping the headers) using the Python Pandas library we can use the following 4-line script to read in the csv file, drop the columns where all the elements are missing, and save the data to a new csv file.

In [1]:

from pandas.io.parsers import read_csv
data = read_csv('remove-empty-columns.csv')
filtered_data = data.dropna(axis='columns', how='all')
filtered_data.to_csv('empty-columns-removed.csv', index=False)

As shown below, the sample data included in the csv file has 3 columns which contain missing values.

The second column, labeled bar, is completely empty except the header; columns like this should be dropped. The other columns contain data, but should not be dropped even though they contain some missing values.

In [2]:

data

Out[2]:

	foo	bar	baz
0	a	NaN	1.0
1	b	NaN	2.0
2	c	NaN	NaN
3	NaN	NaN	4.0
4	e	NaN	5.0

Using the pandas.DataFrame.dropna() function with the columns axis we can drop any column where all the entries are NaN (missing values).

In [3]:

filtered_data = data.dropna(axis='columns', how='all')
filtered_data

Out[3]:

	foo	baz
0	a	1.0
1	b	2.0
2	c	NaN
3	NaN	4.0
4	e	5.0