Preprocessing
General Info
Useful Snippets

1.0.0 Use a Scikit-Learn Transformer

Example:

# Create an example dataframe

import pandas as pd
DF = pd.DataFrame()

# Now create new columns and set them equal to your lists
DF['c_one']   = [1,2,3,6,45,4,4,4,5,3,2,2,3,2,4,4,2,3,4,3,3,2,3,4,3,2,6,6,7,6,5,4,4,3,2,4,8,8,6,5,5]
DF['c_two']   = [2,3,6,45,4,4,4,5,3,2,2,3,2,4,4,2,3,4,3,3,2,3,4,3,2,6,6,7,6,5,4,4,3,2,4,8,8,6,5,5,4]
DF['c_three'] = [3,6,45,4,4,4,5,3,2,2,3,2,4,4,2,3,4,3,3,2,3,4,3,2,6,6,7,6,5,4,4,3,2,4,8,8,6,5,5,2,3]

########################################
# Libraries
from sklearn.preprocessing import StandardScaler

# Option 1:  #######
# 1. Instantiate the transformer (here, a StandardScaler)
scale_my_column = StandardScaler()

# 2. Fit the transformer to the column you want to transform
#    (the double brackets select a 2-D DataFrame, which scikit-learn expects)
scale_my_column.fit(DF[['c_three']])

# 3. Use the fitted transformer to transform the column, and save the result as a new column
DF[['transformed_c_three']] = scale_my_column.transform(DF[['c_three']])


# Option 2:  ########
# Or do steps 2 and 3 at the same time with .fit_transform

# 1. Instantiate the StandardScaler preprocessor
scale_my_column = StandardScaler()

# 2 & 3. Fit the transformer and transform the column in a single call, and save the result as a new column
DF[['transformed_c_three']] = scale_my_column.fit_transform(DF[['c_three']])
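Sanity check (a minimal sketch, not part of the original snippet): the code below uses a small made-up column, with the hypothetical names `df` and `new_rows`, to show that Option 1 and Option 2 produce the same values, that those values match the manual formula (x - mean) / std using the population standard deviation (ddof=0), and that a fitted scaler can be reused on new rows without refitting.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Small made-up column for illustration (not the data from the example above)
df = pd.DataFrame({"c_three": [3.0, 6.0, 45.0, 4.0, 4.0, 2.0]})

# Option 1: fit, then transform
scaler = StandardScaler()
scaler.fit(df[["c_three"]])
two_step = scaler.transform(df[["c_three"]])

# Option 2: fit and transform in one call
one_step = StandardScaler().fit_transform(df[["c_three"]])

# Manual standardization with the population standard deviation (ddof=0)
mean = df["c_three"].mean()
std = df["c_three"].std(ddof=0)
manual = ((df["c_three"] - mean) / std).to_numpy().reshape(-1, 1)

same_result = np.allclose(two_step, one_step)
matches_manual = np.allclose(two_step, manual)
print(same_result, matches_manual)

# The fitted scaler keeps the learned statistics in mean_ and scale_,
# so it can transform new rows using the original column's mean and std
new_rows = pd.DataFrame({"c_three": [5.0, 10.0]})
new_scaled = scaler.transform(new_rows)
```

Keeping fit and transform separate (Option 1) is mainly useful when the same fitted scaler has to be applied to data it was not fitted on, as with `new_rows` here. When only one dataset is involved, `.fit_transform` is the shorter equivalent.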