Machine Learning Classification Strategy In Python

By Ishan Shah

Now, let’s implement the Machine Learning classification strategy in Python.

Step 1: Import the libraries

In this step, we will import the libraries needed to create the strategy.

# Machine learning classification

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# For data manipulation

import pandas as pd
import numpy as np

# To plot

import matplotlib.pyplot as plt
import seaborn

# To fetch data

from pandas_datareader import data as pdr

Step 2: Fetch data

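We will download data for SPY, an ETF that tracks the S&P500, from Google Finance using pandas_datareader. (Google Finance support has since been dropped from pandas_datareader, so with a current release this call needs to be pointed at another data source.)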

After that, we will drop the missing values from the data and plot the S&P500 close price series.

Df = pdr.get_data_google('SPY', start="2012-01-01", end="2017-10-01")

Df = Df.dropna()

Df.Close.plot(figsize=(10,5))

plt.ylabel("S&P500 Price")

plt.show()

Step 3: Determine the target variable

The target variable is the variable which the machine learning classification algorithm will predict. In this example, the target variable is whether S&P500 price will close up or close down on the next trading day.

We will first determine the actual trading signal using the following logic: if the next trading day’s close price is greater than today’s close price, we will buy the S&P500 index; otherwise, we will sell it. We will store +1 for the buy signal and -1 for the sell signal.

y = np.where(Df['Close'].shift(-1) > Df['Close'],1,-1)
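To make the labelling rule concrete, here is a minimal sketch on a handful of made-up closing prices (the values are hypothetical, not market data). It also shows a side effect worth knowing: the last row has no next-day close, so the comparison is False and it is labelled -1 by default.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([100.0, 101.0, 100.5, 102.0, 101.0])

# +1 if the next day's close is higher than today's close, else -1.
# shift(-1) leaves NaN in the last row, and NaN > x is False, so the
# last label falls through to -1.
signal = np.where(close.shift(-1) > close, 1, -1)

print(signal.tolist())  # [1, -1, 1, -1, -1]
```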

Step 4: Create the predictor variables

X is a dataset that holds the predictor variables used to predict the target variable, ‘y’. Here, X consists of two variables, ‘Open – Close’ and ‘High – Low’. These can be understood as indicators based on which the algorithm will predict the trading signal.

Df['Open-Close'] = Df.Open - Df.Close

Df['High-Low'] = Df.High - Df.Low

X = Df[['Open-Close', 'High-Low']]
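As a quick illustration of what the two predictor columns contain, the sketch below builds them on a tiny made-up price table. The DataFrame name `toy` and all of its values are hypothetical.

```python
import pandas as pd

# Hypothetical OHLC rows, for illustration only
toy = pd.DataFrame({
    "Open":  [100.0, 101.0, 102.0],
    "High":  [102.0, 103.0, 102.5],
    "Low":   [99.0, 100.5, 100.0],
    "Close": [101.0, 102.0, 100.5],
})

# Intraday move (negative when the day closed above its open)
toy["Open-Close"] = toy.Open - toy.Close
# Intraday range (always non-negative)
toy["High-Low"] = toy.High - toy.Low

X_toy = toy[["Open-Close", "High-Low"]]
print(X_toy)
```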

Later in the code, the machine learning classification algorithm will use the predictors and the target variable in the training phase to create the model, and then predict the target variable in the test dataset.

Step 5: Test and train dataset split

In this step, we will split data into the train dataset and the test dataset.

  1. The first 80% of the data is used for training and the remaining 20% for testing.
  2. X_train and y_train form the train dataset.
  3. X_test and y_test form the test dataset.

split_percentage = 0.8

split = int(split_percentage*len(Df))

# Train data set

X_train = X[:split]

y_train = y[:split]

# Test data set

X_test = X[split:]

y_test = y[split:]
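Note that the split above is chronological: the rows are not shuffled, so the model is trained on the earlier period and tested on the later one. A minimal sketch of the same slicing wrapped in a helper is shown below; the function name `chronological_split` and the demo arrays are hypothetical and only illustrate the idea.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling: the first rows train, the last rows test."""
    split = int(train_fraction * len(X))
    return X[:split], X[split:], y[:split], y[split:]

# Hypothetical data: 10 observations with 2 predictors each
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([1, -1, 1, 1, -1, 1, -1, -1, 1, 1])

X_tr, X_te, y_tr, y_te = chronological_split(X_demo, y_demo)
print(len(X_tr), len(X_te))  # 8 2
```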

Step 6: Create the machine learning classification model using the train dataset

We will create the machine learning classification model based on the train dataset. This model will later be used to predict the trading signal in the test dataset.

cls = SVC().fit(X_train, y_train)
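To see the fit and predict calls in isolation, here is a small sketch that trains an SVC with its default settings on made-up, well-separated points. The arrays `X_toy` and `y_toy` are hypothetical and stand in for the predictors and signals; real market data is far noisier than this.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, well-separated toy predictors (not market data)
X_toy = np.array([[-2.0, -2.0], [-2.0, -1.0], [-1.0, -2.0],
                  [2.0, 2.0], [2.0, 1.0], [1.0, 2.0]])
y_toy = np.array([-1, -1, -1, 1, 1, 1])

# Fit the classifier, then predict the signal for two unseen points
toy_model = SVC().fit(X_toy, y_toy)
toy_pred = toy_model.predict(np.array([[-1.5, -1.5], [1.5, 1.5]]))

print(toy_pred)
```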

Step 7: The classification model accuracy

We will compute the accuracy of the classification model on the train and test dataset, by comparing the actual values of the trading signal with the predicted values of the trading signal. The function accuracy_score() will be used to calculate the accuracy.

Syntax: accuracy_score(target_actual_value,target_predicted_value)

  1. target_actual_value: correct signal values
  2. target_predicted_value: predicted signal values

accuracy_train = accuracy_score(y_train, cls.predict(X_train))

accuracy_test = accuracy_score(y_test, cls.predict(X_test))

print('\nTrain Accuracy:{: .2f}%'.format(accuracy_train*100))

print('Test Accuracy:{: .2f}%'.format(accuracy_test*100))

Since the target has only two classes, a test accuracy meaningfully above 50% suggests that the classification model does better than a random guess.
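One way to put the accuracy number in context is to compare it with a naive baseline that always predicts the most frequent class. The sketch below is an assumption-laden illustration, not part of the original strategy: the helper names and the toy labels are hypothetical, and `accuracy` simply computes the fraction of matching labels for these inputs.

```python
import numpy as np

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    actual = np.asarray(actual)
    predicted = np.asarray(predicted)
    return float(np.mean(actual == predicted))

def majority_baseline_accuracy(actual):
    """Accuracy obtained by always predicting the most frequent label."""
    _, counts = np.unique(np.asarray(actual), return_counts=True)
    return float(counts.max() / len(actual))

# Hypothetical actual and predicted signals
y_true = [1, 1, -1, 1, -1, 1, 1, -1]
y_pred = [1, -1, -1, 1, 1, 1, 1, 1]

model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline_accuracy(y_true)
print(model_acc, baseline_acc)  # 0.625 0.625
```

In this toy case the model is right 62.5% of the time, but so is a rule that always says "buy", which is why the baseline is a useful companion to the raw accuracy.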

Step 8: Prediction

We will predict the signal (buy or sell) using the cls.predict() method. Then, we will compute the strategy returns based on the signal predicted by the model, save them in the column ‘Strategy_Return’, and plot the cumulative strategy returns for the test dataset.

Df['Predicted_Signal'] = cls.predict(X)

# Calculate log returns

Df['Return'] = np.log(Df.Close.shift(-1) / Df.Close)*100

Df['Strategy_Return'] = Df.Return * Df.Predicted_Signal

Df.Strategy_Return.iloc[split:].cumsum().plot(figsize=(10,5))

plt.ylabel("Strategy Returns (%)")

plt.show()
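The return calculation above can be traced by hand on a few made-up prices. In this sketch the prices and predicted signals are hypothetical; it shows that a -1 signal flips the sign of the next-day log return, and that the last row has no next-day close and therefore stays NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices and predicted signals
close = pd.Series([100.0, 110.0, 99.0, 108.9])
predicted_signal = pd.Series([1, -1, 1, 1])

# Next-day log return in percent, aligned with the day the signal is generated
next_day_return = np.log(close.shift(-1) / close) * 100

# A -1 (sell) signal earns the negative of the market return
strategy_return = next_day_return * predicted_signal

# Log returns add up over time, so the cumulative return is a running sum
cumulative = strategy_return.cumsum()
print(cumulative)
```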

As seen from the graph, the machine learning classification strategy generates a return of around 15% in the test data set.

Next Step

If you want to learn various aspects of Algorithmic trading then check out the Executive Programme in Algorithmic Trading (EPAT™). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT™ equips you with the required skill sets to be a successful trader. Enroll now!

This post first appeared on Best Algo Trading Platforms Used In Indian Market.