One of the features I like about R is that when you read a CSV file into a data frame, you can access columns using the names from the header row. The Python Data Analysis Library (pandas) aims to provide a similar data frame structure in Python and also has a function to read a CSV file. Once pandas has been installed, a CSV file can be read using:
import pandas
data_df = pandas.read_csv('in_data.csv')
To get the names of the columns use:
data_df.columns
And to access columns use:
colHH = data_df['colHH']
Or, if the column name is a valid Python variable name (and does not clash with an existing data frame attribute or method):
colHH = data_df.colHH
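To try the steps above without a file on disk, here is a minimal sketch that reads CSV text from memory. The file contents and the second column name (colHV) are made up for illustration; only colHH appears in the examples above.

```python
import io
import pandas

# Hypothetical CSV contents standing in for in_data.csv
csv_text = "colHH,colHV\n0.12,0.03\n0.15,0.04\n0.11,0.02\n"

# read_csv accepts a file-like object as well as a file path
data_df = pandas.read_csv(io.StringIO(csv_text))

# Column names taken from the header row
column_names = list(data_df.columns)
print(column_names)

# The two access styles return the same column
colHH_by_key = data_df['colHH']
colHH_by_attr = data_df.colHH
same = colHH_by_key.equals(colHH_by_attr)
print(same)
```

The bracket form works for any column name, so it is probably the safer default; the attribute form is just a convenience when the name allows it.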
This is only a tiny part of pandas; there are lots of other features available (which I’m just getting into). One interesting one is the ability to create pivot table reports from a data frame, similar to Excel.
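As a rough idea of what a pivot table looks like, here is a small sketch using pandas.pivot_table. The data frame, its site and year columns, and the values are all invented for illustration, and the index/columns keyword names assume a reasonably recent pandas release.

```python
import pandas

# Made-up data for illustration only
data_df = pandas.DataFrame({
    'site': ['A', 'A', 'B', 'B'],
    'year': [2012, 2013, 2012, 2013],
    'colHH': [0.10, 0.20, 0.30, 0.50],
})

# One row per site, one column per year, mean colHH in each cell
pivot_df = pandas.pivot_table(
    data_df,
    values='colHH',
    index='site',
    columns='year',
    aggfunc='mean',
)
print(pivot_df)
```

Each cell here holds a single value, so the mean just reproduces it; with repeated site and year combinations the cells would be averaged.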