In this third part of the series I am going to complete the description, started in part 2, of the data preparation process for training and evaluating an LSTM model for time series forecasting. The data set is the same one used in part 1 and part 2. As in all the posts of this series, the code refers to Python 3.

In part 2 we learned how to reframe the time series as a supervised learning problem. That isn't enough yet to feed our LSTM model: two more transformations are needed.

The first thing to do is to transform the time series data in order to make it stationary. The input time series used for these posts has values that depend on time. Looking at its plot in part 1, we can notice an increasing trend up to January 2006, then a decreasing trend up to February 2010 and finally a second increasing trend from there to date. Making this data stationary makes life easier: we can remove the trends from the observed values before training and then add them back to the forecasts in order to return the predictions to the original scale. Trends can be removed by differencing the data: we subtract the value at time t-1 from the current value (at time t). pandas provides a method, diff(), for this purpose, but here we implement the step by hand. We need two functions, one which returns the differenced time series:

```python
from pandas import Series

def difference(timeseries, interval=1):
    # Subtract the value `interval` steps back from each observation
    diff_list = list()
    for idx in range(interval, len(timeseries)):
        diff_value = timeseries[idx] - timeseries[idx - interval]
        diff_list.append(diff_value)
    return Series(diff_list)
```

and another one to invert the process and bring a forecast back to the original scale:

```python
def invert_difference(historical, inverted_scale_ts, interval=1):
    # Add back the observation `interval` steps from the end of the history
    return inverted_scale_ts + historical[-interval]
```
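As a quick illustration of the idea, the sketch below differences a small series with the pandas `diff()` method and then adds each difference back to the observation that precedes it. The toy values are hypothetical and are not the data set used in this series.

```python
import pandas as pd

# Hypothetical toy series with an upward trend (not the data set of this series).
toy = pd.Series([10.0, 12.0, 15.0, 19.0, 24.0])

# First-order differencing: value at time t minus value at time t-1.
# pandas leaves a NaN in the first position, so it is dropped here.
diffed = toy.diff(periods=1).dropna().reset_index(drop=True)

# Inverting: add each difference back to the observation that precedes it.
restored = [toy.iloc[i] + diffed.iloc[i] for i in range(len(diffed))]

print(diffed.tolist())  # [2.0, 3.0, 4.0, 5.0]
print(restored)         # the original values from the second one onwards
```

Note that the differenced series is one element shorter than the original, because the first observation has no predecessor to subtract.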

The last thing to do is to rescale the input time series observations to a specific range. The neural network model we are going to use is an LSTM. The default activation function for LSTMs is the hyperbolic tangent, whose output values lie between -1 and 1, so a natural choice is to scale the time series to that same range. The min and max scaling coefficients need to be calculated on the training data set only and then used to scale the test data set too, so that no information from the test data leaks into training. The scikit-learn package comes with a specific class for this, MinMaxScaler. We need to implement two functions, one to calculate the scaling coefficients and apply them:

```python
from sklearn.preprocessing import MinMaxScaler

def scale_data_set(train_data, test_data):
    # Work on the underlying NumPy arrays (both inputs are pandas DataFrames)
    train_data = train_data.values.reshape(train_data.shape[0], train_data.shape[1])
    test_data = test_data.values.reshape(test_data.shape[0], test_data.shape[1])
    # Compute the scaling coefficients on the training data only
    min_max_scaler = MinMaxScaler(feature_range=(-1, 1))
    min_max_scaler = min_max_scaler.fit(train_data)
    # Apply the same coefficients to both sets
    scaled_train_data = min_max_scaler.transform(train_data)
    scaled_test_data = min_max_scaler.transform(test_data)
    return min_max_scaler, scaled_train_data, scaled_test_data
```

and a second one to invert the scaling for the forecasted values:

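```python
import numpy

def invert_scale(min_max_scaler, scaled_array, value):
    # Rebuild a full row (input values plus forecast), as the scaler expects
    row = [elem for elem in scaled_array] + [value]
    array = numpy.array(row)
    array = array.reshape(1, len(array))
    inverted_array = min_max_scaler.inverse_transform(array)
    # The forecast sits in the last column
    return inverted_array[0, -1]
```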
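To make the behaviour of the scaler concrete, here is a minimal sketch on hypothetical toy arrays (not the data set of this series). It fits a `MinMaxScaler` on the training rows only, applies it to a test row and then inverts the transformation.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: two columns, three training rows and one test row.
train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
test = np.array([[12.0, 25.0]])

# The coefficients (per-column min and max) come from the training rows only.
scaler = MinMaxScaler(feature_range=(-1, 1))
scaler.fit(train)

train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)

# Inverting the transformation brings the test row back to the original scale.
restored = scaler.inverse_transform(test_scaled)

print(train_scaled)
print(test_scaled)
print(restored)
```

In this example the first test value (12.0) is larger than the training maximum of its column (10.0), so its scaled value falls slightly above 1. This is expected when the coefficients are computed on the training data only, and the inversion still works as usual.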

Now we can put it all together. We first make the input data stationary:

```python
raw_values = series.values
diff_values = difference(raw_values, 1)
```

then reframe the data as a supervised learning problem:

`supervised = tsToSupervised(diff_values, 1)`

and finally split the data for training (years from 1992 to 2010) and test (years from 2011 to date):

`train, test = supervised[0:-98], supervised[-98:]`

and scale both sets, using coefficients calculated on the training data:

`scaler, train_scaled, test_scaled = scale_data_set(train, test)`
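Before moving on, it can be useful to check that the whole chain is reversible. The following is only a sketch under stated assumptions: the toy values are hypothetical, the two-column frame is a stand-in for the supervised framing of part 2 (previous difference as input, current difference as target), and the true scaled targets are used in place of model forecasts. It undoes the scaling first and the differencing second, which is the order needed to bring a forecast back to the original scale.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy values, not the data set used in this series.
raw_values = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0, 45.0])

# 1. Remove the trend with first-order differencing.
diff_values = np.diff(raw_values)

# 2. Stand-in for the supervised framing of part 2 (assumed layout:
#    previous difference as input "x", current difference as target "y").
frame = pd.DataFrame({
    "x": pd.Series(diff_values).shift(1).fillna(0.0),
    "y": diff_values,
})

# 3. Chronological split: the last two rows play the role of the test set.
n_test = 2
train, test = frame.values[:-n_test], frame.values[-n_test:]

# 4. Fit the scaler on the training rows only, then scale both sets.
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(train)
train_scaled, test_scaled = scaler.transform(train), scaler.transform(test)

# 5. Undo both transformations. The true scaled targets stand in for the
#    forecasts, so the original observations should be recovered.
recovered = []
for i, row in enumerate(test_scaled):
    # Undo the scaling; the target is the last column of the row.
    unscaled = scaler.inverse_transform(row.reshape(1, -1))[0, -1]
    # Undo the differencing by adding the observation that precedes the target.
    previous = raw_values[-(n_test + 1 - i)]
    recovered.append(unscaled + previous)

print(np.round(recovered, 6))
print(raw_values[-n_test:])
```

The two printed arrays should match up to floating point error. With a trained model, the scaled forecast would take the place of the last column of each test row.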

At this stage we can build and train our LSTM, which will be the core topic of the next post of this series.

The complete example will be released as a Jupyter notebook at the end of the first part of this series.