# HIDDEN
# Clear previously defined variables
%reset -f

# Set directory for data loading to work properly
import os
os.chdir(os.path.expanduser('~/notebooks/14'))

Feature Engineering

Feature engineering is the practice of creating new features from the existing data and adding them to the dataset so that our models can capture more complex relationships.

So far we have only conducted linear regression using numerical features as input: we used the (numeric) total bill to predict the tip amount. However, the tip dataset also contains categorical data, such as the day of the week and the meal type. Feature engineering allows us to convert categorical variables into numerical features that a linear regression model can use.
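One common way to perform this conversion is one-hot encoding, in which each category becomes its own 0/1 indicator column. The sketch below is a minimal illustration, assuming pandas is available; the rows and the column names `total_bill`, `day`, and `time` are made up for the example and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical rows shaped like the tip data; the values are invented for illustration.
tips = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01, 23.68],
    'day': ['Sun', 'Sat', 'Sun', 'Thur'],
    'time': ['Dinner', 'Dinner', 'Lunch', 'Lunch'],
})

# Replace each categorical column with one 0/1 indicator column per category.
encoded = pd.get_dummies(tips, columns=['day', 'time'], dtype=int)

print(encoded.columns.tolist())
```

Every column of `encoded` is now numeric, so the table can serve as the input to a linear regression.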

Feature engineering also allows us to use our linear regression model to conduct polynomial regression: we add powers of existing features to the dataset as new variables, and the model remains linear in its weights.
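As a rough sketch of this idea, the code below adds a squared feature and fits the result with ordinary least squares using NumPy. The numbers are invented for illustration, and the names `total_bill`, `tip`, and `theta` are assumptions made for this example rather than variables defined earlier.

```python
import numpy as np

# Made-up total bills and tips, for illustration only.
total_bill = np.array([10.0, 20.0, 30.0, 40.0])
tip = np.array([1.5, 3.2, 4.1, 4.4])

# Design matrix with columns [1, x, x^2]. The squared column is the
# engineered feature; the model is still linear in the weights theta.
X = np.column_stack([np.ones_like(total_bill), total_bill, total_bill ** 2])

# Ordinary least squares fit of tip on the expanded feature matrix.
theta, *_ = np.linalg.lstsq(X, tip, rcond=None)

# Predictions follow a degree-2 polynomial in total_bill.
predictions = X @ theta
```

The fitting procedure is the same as for ordinary linear regression; only the feature matrix has changed.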