# HIDDEN
# Clear previously defined variables
%reset -f
# Set directory for data loading to work properly
import os
os.chdir(os.path.expanduser('~/notebooks/03'))
Working with Tabular Data
Tabular data, like the datasets we have worked with in Data 8, is one of the
most common and useful forms of data for analysis. We introduce tabular data
manipulation using pandas, the standard Python library for working with
tabular data. Although pandas syntax is more challenging to use than that of
the datascience package used in Data 8, pandas provides significant performance
improvements and is the current tool of choice in both industry and academia
for working with tabular data.
It is more important that you understand the types of useful operations on data
than the exact details of pandas syntax. For example, knowing when to use a
group or a join is more useful than knowing how to call the pandas function
to group data. It is relatively easy to look up the function you need once you
know the right operation to use. All of the table manipulations in this chapter
will also appear again in a new syntax when we cover SQL, so it will help you
to understand them now.
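To make the distinction between a group and a join concrete, here is a minimal sketch. The two small tables (sales and stores) and their column names are hypothetical, invented only for illustration. The sketch uses the pandas groupby and merge methods, which is one common way to express these operations and not the only one.

```python
import pandas as pd

# Hypothetical data, made up for this illustration only
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "C"],
    "amount": [10, 20, 5, 15, 30],
})
stores = pd.DataFrame({
    "store": ["A", "B", "C"],
    "city": ["Oakland", "Berkeley", "Albany"],
})

# Group: collapse rows that share a key into one summary row per key
totals = sales.groupby("store", as_index=False)["amount"].sum()

# Join: attach columns from another table by matching on a shared key
totals_with_city = totals.merge(stores, on="store", how="left")

print(totals_with_city)
```

The group answers "what is the total per store?", while the join answers "which city does each store belong to?". Deciding which question you are asking comes first. The method names can then be looked up.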
Because we will cover only the most important pandas functions in this
textbook, you should bookmark the pandas documentation for reference
when you conduct your own data analyses.