# HIDDEN
# Clear previously defined variables
%reset -f

# Set directory for data loading to work properly
import os
os.chdir(os.path.expanduser('~/notebooks/08'))

Working with Text

A great quantity of data resides not as numbers in CSVs but as free-form text in books, documents, blog posts, and Internet comments. While numerical and categorical data are often collected from physical phenomena, textual data arise from human communication and expression. As with most types of data, the techniques for working with text are numerous enough to fill multiple books. In this chapter, we introduce a small subset of these techniques that supports a variety of useful operations: Python string manipulation and regular expressions.
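As a brief preview of the two techniques named above, the following is a minimal sketch rather than a definitive recipe. The sample string `comment` and the phone-number pattern are hypothetical, invented here for illustration. The sketch uses only Python's built-in string methods and the standard `re` module.

```python
import re

# Hypothetical sample text, invented for illustration
comment = "  Contact: Jane DOE, phone 555-0134  "

# String manipulation: trim surrounding whitespace and normalize case
cleaned = comment.strip().lower()

# Regular expression: find three digits, a hyphen, then four digits
match = re.search(r"\d{3}-\d{4}", cleaned)
phone = match.group() if match else None

print(cleaned)
print(phone)
```

String methods handle simple, fixed transformations such as trimming and lowercasing, while a regular expression describes a pattern to search for when the exact text is not known in advance.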