Tag Archives: pandas

Import CSV into Python using Pandas

One of the features I like about R is when you read in a CSV file into a data frame you can access columns using names from the header file. The Python Data Analysis Library (pandas) aims to provide a similar data frame structure to Python and also has a function to read a CSV. Once pandas has been installed a CSV file can be read using:

import pandas
data_df = pandas.read_csv('in_data.csv')

To get the names of the columns use:

print(data_df.columns)

And to access columns use:

colHH = data_df['colHH']

Or if the column name is a valid Python variable name:

colHH = data_df.colHH

This is only a tiny part of pandas, there are lots of features available (which I’m just getting into). One interesting one is the ability to create pivot table reports from a data frame, similar to Excel.