Import CSV into Python using Pandas

One of the features I like about R is that when you read a CSV file into a data frame, you can access columns using the names from the header row. The Python Data Analysis Library (pandas) aims to provide a similar data frame structure in Python and also has a function to read a CSV file. Once pandas has been installed, a CSV file can be read using:

import pandas
data_df = pandas.read_csv('in_data.csv')
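To try this without a file on disk, here is a minimal sketch that reads CSV text from an in-memory string instead. The column names and values are made up for illustration; only `pandas.read_csv` and `io.StringIO` are real.

```python
import io
import pandas

# Hypothetical CSV content standing in for 'in_data.csv'
csv_text = "colHH,colHV\n0.12,0.03\n0.15,0.04\n0.11,0.02\n"

# read_csv accepts a file path or any file-like object;
# by default the first row is used as the column names
data_df = pandas.read_csv(io.StringIO(csv_text))

# Number of (rows, columns) read
print(data_df.shape)
```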

To get the names of the columns use:

print(data_df.columns)

And to access columns use:

colHH = data_df['colHH']

Or if the column name is a valid Python variable name:

colHH = data_df.colHH
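The two forms return the same column, but attribute access has limits worth knowing about. The sketch below uses made-up column names to show two cases where only the bracket form works: a name containing a space, and a name that clashes with an existing data frame method (here `count`).

```python
import io
import pandas

# Hypothetical data: one plain name, one that clashes with a
# DataFrame method ('count'), and one containing a space
csv_text = "colHH,count,col HV\n0.12,5,0.03\n0.15,7,0.04\n"
data_df = pandas.read_csv(io.StringIO(csv_text))

# Both forms return the same column for a plain name
same = data_df['colHH'].equals(data_df.colHH)

# A name with a space is not a valid attribute, so use brackets
col_hv = data_df['col HV']

# 'count' is a valid variable name, but data_df.count is the
# DataFrame.count method, not the column; brackets get the column
count_col = data_df['count']
count_attr_is_method = callable(data_df.count)

print(same, count_attr_is_method)
```

For this reason the bracket form is the safer default when column names come from a file you do not control.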

This is only a tiny part of pandas; there are many more features available, which I’m just getting into. One interesting feature is the ability to create pivot table reports from a data frame, similar to Excel.
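As a rough sketch of what a pivot table report could look like, the example below builds a small data frame by hand and summarises it with `pivot_table`. The data and the column names (`region`, `product`, `sales`) are invented for illustration.

```python
import pandas

# Made-up example data: one row per region/product combination
sales_df = pandas.DataFrame({
    'region': ['north', 'north', 'south', 'south'],
    'product': ['a', 'b', 'a', 'b'],
    'sales': [10, 20, 30, 40],
})

# Rows are regions, columns are products, cells hold summed sales
report = sales_df.pivot_table(index='region', columns='product',
                              values='sales', aggfunc='sum')

print(report)
```

Individual cells of the report can then be looked up by label, for example `report.loc['north', 'a']`.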

5 thoughts on “Import CSV into Python using Pandas”

  1. Daniela Forero

    Thank you! Really helpful!
    When I print the headers, it shows: [u’date[]’, u’clock_time[]’….
    Why does it show ‘u’, and how can I remove it?
    The CSV file doesn’t have that ‘u’ at the beginning of each header.
