10 TuringBot Advanced Mode Tips

This tutorial covers the following points:

  • 00:11 – fitting a line to your data
  • 00:21 – fitting a polynomial
  • 00:41 – using base formulas and constants
  • 00:52 – choosing input variables for each term
  • 01:05 – using all variables as input except for one
  • 01:23 – setting a custom left side of the equation
  • 01:42 – classification problems
  • 02:02 – calculating an average
  • 02:13 – command line usage
  • 02:38 – fine-tuning a function

How to normalize data using NumPy or Pandas

In machine learning, training on normalized data often leads to faster convergence. Here we show how to normalize a dataset in Python using either NumPy or Pandas.

NumPy

To normalize a NumPy array, you can use:

import numpy as np

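# Load a whitespace-delimited text file into a 2D float array
data = np.loadtxt('data.txt')

# Standardize each column in place: subtract its mean, then divide by its standard deviation
for col in range(data.shape[1]):
    data[:,col] -= np.average(data[:,col])
    data[:,col] /= np.std(data[:,col])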

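Here data.shape[1] is the number of columns in the dataset. The loop subtracts each column's mean and divides by its standard deviation, so every column ends up with mean 0 and standard deviation 1 (this is often called standardization or z-score normalization). Note that this assumes the file has at least two rows and two columns: for a single-column or single-row file, np.loadtxt returns a 1D array and data.shape[1] does not exist.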
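
The same per-column operation can also be written without an explicit loop by reducing along axis 0. The following is a minimal sketch, not part of the original recipe: the function name standardize_columns, the small example array, and the guard for constant columns are assumptions added for illustration.

```python
import numpy as np

def standardize_columns(data):
    """Return a copy of a 2D array with each column scaled to mean 0 and std 1."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)   # one mean per column
    std = data.std(axis=0)     # one population standard deviation per column
    # Assumed safeguard: a constant column has std 0, so divide by 1 instead
    # and leave that column at 0 rather than producing NaN.
    std[std == 0] = 1.0
    return (data - mean) / std

# Made-up example data: 3 rows, 2 columns
example = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normalized = standardize_columns(example)
```

Unlike the loop, this version returns a new array and leaves the input untouched, which may be preferable when the raw values are still needed later.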

Pandas

Normalizing a Pandas dataframe is even easier:

import pandas as pd

df = pd.read_csv('data.csv')

df = (df-df.mean())/df.std()

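This normalizes each column of the dataframe to mean 0 and standard deviation 1, assuming every column is numeric. Note that df.std() computes the sample standard deviation (ddof=1) by default, whereas np.std computes the population standard deviation (ddof=0), so the Pandas result differs slightly from the NumPy one, most noticeably on small datasets.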
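
If the Pandas result needs to match the NumPy recipe exactly, one option is to pass ddof=0 to df.std(). The sketch below illustrates this on a small made-up dataframe; the column names and variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up example data
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0], 'y': [10.0, 20.0, 40.0, 80.0]})

# Default Pandas behaviour: sample standard deviation (ddof=1)
sample_scaled = (df - df.mean()) / df.std()

# Population standard deviation (ddof=0), the default used by np.std
population_scaled = (df - df.mean()) / df.std(ddof=0)

# Reference result computed directly with NumPy
values = df.to_numpy()
numpy_scaled = (values - values.mean(axis=0)) / values.std(axis=0)

matches_numpy = np.allclose(population_scaled.to_numpy(), numpy_scaled)
```

Either convention is reasonable for preprocessing; what matters is applying the same one consistently to training data and to any new data.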