pynip/analysis-and-prediction-of-stock-price-using-LSTM

Analysis and Prediction of stock price using LSTM

Stock price prediction means understanding the factors that can influence a stock's price and building a model on those factors to predict it. Such a model can help individuals and institutions anticipate the price trend and decide whether to buy or short the stock. Machine learning and time series methods are used to estimate the future value of a stock or another financial asset traded on an exchange. The goal of the analysis and prediction is to support profitable trading decisions.


Focus areas for Analysis:
  • The change in closing price of the stock over time.
  • Visualization of Candlestick Monthly data.
  • The % daily return of the stock.
  • The moving average of various stocks.

Prediction:

  • We will be predicting future stock behaviour by predicting the closing price of the stock using LSTM.

#Import Libraries

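# datetime is part of the Python standard library and does not need to be installed
!pip install numpy pandas yfinance mplfinance seaborn matplotlib scikit-learn tensorflow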
import datetime
import numpy as np
import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt

Exploratory Data Analysis (EDA) using ELT (Extract, Load, Transform)

Dataset

I have taken the stock price data of United Breweries Limited (ticker UBL.NS) from Yahoo Finance, covering 15th Jan 2020 to 31st Dec 2023.

Time Period of Data: Define the timeframe for which you want to fetch data.

start_date = datetime.datetime(2020, 1, 15)
end_date = datetime.datetime(2023, 12, 31)

Loading Data from Yahoo Finance

df = yf.download('UBL.NS', start_date, end_date)

View Dataframe

df
image

Check index

print(df.index)
image

Reset Index

df1 = df.reset_index()
df1['Date'] = pd.to_datetime(df1['Date'])
df1
image

Converting from Daily to Monthly Frequency data

monthly_data = df.resample('M').agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
monthly_data.head()
image
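To make the aggregation rules concrete, here is a minimal sketch on made-up daily bars (the prices and volumes are invented for illustration, not UBL data). It groups by calendar month with `to_period('M')` instead of `resample`, which is a swapped-in but equivalent way to apply the same per-column rules: first open, highest high, lowest low, last close, summed volume.

```python
import pandas as pd

# Hypothetical daily bars spanning two months (values are made up).
idx = pd.to_datetime(["2023-01-30", "2023-01-31", "2023-02-01", "2023-02-02"])
daily = pd.DataFrame(
    {
        "Open": [100.0, 102.0, 101.0, 105.0],
        "High": [103.0, 104.0, 106.0, 107.0],
        "Low": [99.0, 100.0, 100.0, 103.0],
        "Close": [102.0, 101.0, 105.0, 104.0],
        "Volume": [10, 20, 30, 40],
    },
    index=idx,
)

# One row per calendar month, using the same column rules as the step above.
monthly = daily.groupby(daily.index.to_period("M")).agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)
print(monthly)
```

Each monthly row then describes one candlestick: the month opens at the first day's open and closes at the last day's close.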

Plot - Line and Frequency -Daily closing Price

# Plotting
plt.figure(figsize=(10, 6))
#plt.plot(df['Date'], df['Open'], label='Open')
#plt.plot(df['Date'], df['High'], label='High')
#plt.plot(df['Date'], df['Low'], label='Low')
plt.plot(df.index, df['Close'], label='Close')

plt.title('UBL Stock Prices Over Time')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.legend()
plt.grid(True)
plt.show()
image

Plot - Candlestick and Frequency - Monthly OHLC Volume Data

import mplfinance as mpf  # needed for mpf.plot below

# Plotting monthly candlestick chart with a separate volume plot with MA(20)
#mpf.plot(monthly_data, type='candle', style='charles', volume=True, mav=(20), show_nontrading=True, addplot=mpf.make_addplot(monthly_data['Volume'], panel=1, ylabel='Volume'),tight_layout=True, figratio=(16, 9), scale_width_adjustment=dict(volume=0.7, candle=1))

# Plotting monthly candlestick chart with a separate volume plot
mpf.plot(monthly_data, type='candle', style='charles', volume=True, show_nontrading=True, tight_layout=True, figratio=(16, 9), scale_width_adjustment=dict(volume=0.7, candle=1))
image

Total Rows & Columns

df.shape
image

Data Information

df.info()
image

Data Quality Check

Duplicate Values

len(df[df.duplicated()])
image

Missing Values/Null Values

print(df.isnull().sum())
image

Variable Information

# Columns
df.columns
image
# Describe
df.describe()
image
# Check unique values for each variable
for i in df.columns.tolist():
    print("No. of unique values in", i, "is", df[i].nunique(), ".")
image

Analysis

Plotting moving averages (50-day and 200-day) is a simple technical analysis technique that smooths out price data.

ma_day = [50, 200]

plt.figure(figsize=(10, 6))

# Plot Close price
plt.plot(df.index, df['Close'], label='Close')

# Plot Moving Averages
for ma in ma_day:
    column_name = f"MA for {ma} days"
    df[column_name] = df['Close'].rolling(ma).mean()
    plt.plot(df.index, df[column_name], label=column_name)

plt.title('UBL Daily Close Prices and Moving Averages')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid(True)
plt.show()

image
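As a small illustration of what `rolling(...).mean()` computes, the sketch below uses a made-up five-value series and a 3-period window (both chosen only to keep the numbers easy to check by hand). Each output value is the mean of the current value and the two before it, so the first two entries have no full window and come out as NaN.

```python
import pandas as pd

# Made-up closing prices, for illustration only.
close = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])

# 3-period simple moving average: mean of the current and two previous values.
ma3 = close.rolling(3).mean()
print(ma3.tolist())
```

The same effect explains why the 200-day line in the chart starts later than the 50-day line: a window of n days yields no value for the first n - 1 rows.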

Average Daily Returns

# Calculate the daily return (pct_change returns a fraction; multiply by 100 for percent)
df['Daily Return'] = df['Close'].pct_change()

plt.figure(figsize=(10, 6))
plt.plot(df.index, df['Daily Return'], linestyle='--', marker='o', label='Daily Return')
plt.title('Daily Return Percentage')
plt.xlabel('Date')
plt.ylabel('Percentage')
plt.legend()
plt.grid(True)
plt.show()
image
plt.figure(figsize=(12, 9))
df['Daily Return'].hist(bins=50, alpha=0.5, label='UBL')

plt.xlabel('Daily Return')
plt.ylabel('Counts')
plt.title('Daily Return of UBL using histogram')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
image
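For reference, the daily return used above is the change from the previous close divided by the previous close. A minimal sketch on three made-up prices shows that `pct_change()` matches this formula and returns fractions rather than percentages:

```python
import pandas as pd

# Made-up closing prices, for illustration only.
close = pd.Series([100.0, 110.0, 99.0])

# pct_change: (today - yesterday) / yesterday, NaN for the first row.
daily_return = close.pct_change()

# The same quantity written out by hand.
manual = (close - close.shift(1)) / close.shift(1)

# Multiply by 100 when a percentage is wanted.
daily_return_pct = daily_return * 100
print(daily_return_pct.round(2).tolist())
```

Here the price rises by 10% and then falls by 10%, yet ends below where it started, which is one reason returns are usually examined as a distribution rather than summed.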

2. Prediction using LSTM

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) architecture well suited to sequence prediction problems.


2.0 Before Prediction

plt.figure(figsize=(16,6))
plt.title('Close Price History')
plt.plot(df['Close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.show()
image

2.1 Data Preparation

# Create a new dataframe with only the 'Close' column
data = df.filter(['Close'])

# Convert the dataframe to a numpy array because ML/DL libraries require numpy arrays as inputs
dataset = data.values

# Get the number of rows to train the model on (the first 80% of the data)
training_data_len = int(np.ceil(len(dataset) * .80))

training_data_len

image

2.2 Data Scaling

# Scale the data
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0,1))
scaled_data = scaler.fit_transform(dataset)

scaled_data

image
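With `feature_range=(0, 1)`, min-max scaling maps each value x to (x - min) / (max - min), and the inverse transform undoes that mapping. The sketch below reproduces the arithmetic with plain NumPy on made-up prices so the numbers can be checked by hand. It illustrates the formula and is not a replacement for `MinMaxScaler`.

```python
import numpy as np

# Made-up close prices as a single-column array, like `dataset` above.
prices = np.array([[1500.0], [1600.0], [1750.0], [2000.0]])

# Column-wise minimum and maximum.
lo, hi = prices.min(axis=0), prices.max(axis=0)

# Forward transform: smallest value maps to 0, largest to 1.
scaled = (prices - lo) / (hi - lo)

# Inverse transform: recover the original price scale.
restored = scaled * (hi - lo) + lo
print(scaled.ravel())
```

A common variation is to compute the minimum and maximum from the training rows only, so that the test range does not influence the scaling.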

2.3 Creating Training Data

# Create the scaled training data set
train_data = scaled_data[0:int(training_data_len), :]

# Split the data into x_train and y_train data sets
x_train = []
y_train = []

# Each sample is the previous 60 scaled closes; the target is the next close
for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, 0])
    y_train.append(train_data[i, 0])
    if i <= 61:
        print(x_train)
        print(y_train)
        print()

# Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)

# Reshape the data to (samples, time steps, features), as the LSTM layer expects
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
# x_train.shape
image
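The windowing step is easier to follow on a tiny series. In the sketch below, `make_windows` is a hypothetical helper written for this illustration (it is not part of the original notebook), and the window length is 3 instead of 60 so the result fits on screen.

```python
import numpy as np

def make_windows(series, window):
    """Build (samples, window, 1) inputs and next-step targets from a 1-D series."""
    x, y = [], []
    for i in range(window, len(series)):
        x.append(series[i - window:i])  # the `window` values before position i
        y.append(series[i])             # the value to predict
    x = np.array(x).reshape(-1, window, 1)
    return x, np.array(y)

# Toy series 0, 1, ..., 5 standing in for scaled close prices.
series = np.arange(6, dtype=float)
x, y = make_windows(series, window=3)

print(x.shape)       # number of samples, time steps, features
print(x[0].ravel())  # first input window
print(y)             # targets aligned with each window
```

A series of length n with window w yields n - w samples, which is why the training set above has 60 fewer samples than rows in `train_data`.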

2.4 Model Building

from keras.models import Sequential
from keras.layers import Dense, LSTM

# Build the LSTM model
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

2.5 Model Training

# Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)
image

2.6 Creating Testing Data

# Create the testing data set
# Take the scaled values from 60 rows before the end of the training range to the end of the data,
# so that the first test sample has a full 60-day window
test_data = scaled_data[training_data_len - 60: , :]

# Create the data sets x_test and y_test (y_test holds the actual, unscaled close prices)
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, 0])

# Convert the data to a numpy array
x_test = np.array(x_test)

# Reshape the data to (samples, time steps, features)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

2.7 Making Predictions

# Get the model's predicted price values and convert them back to the original price scale
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
image

2.8 Model Evaluations

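# Get the root mean squared error (RMSE)
rmse = np.sqrt(np.mean((predictions - y_test) ** 2))
print("Root Mean Squared Error (RMSE):", rmse)

# Calculate accuracy percentage as 100 * (1 - RMSE / mean actual close price).
# y_test is used here because `valid` is not defined until section 2.9;
# it holds the same values as valid['Close'].
accuracy_percentage = (1 - (rmse / y_test.mean())) * 100
print("Accuracy Percentage:", accuracy_percentage)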
image
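As a hand-checkable illustration of these two metrics, the sketch below uses three made-up actual and predicted prices. The errors are 1, -1 and 2, so the mean squared error is 2 and the RMSE is the square root of 2. The "accuracy percentage" is this document's own convention (RMSE relative to the mean actual price), not a standard classification accuracy.

```python
import numpy as np

# Made-up actual and predicted prices, both shaped (n, 1) like y_test and predictions.
actual = np.array([[100.0], [102.0], [104.0]])
predicted = np.array([[101.0], [101.0], [106.0]])

# Matching shapes matter: an (n, 1) array minus an (n,) array would broadcast to (n, n).
assert actual.shape == predicted.shape

# Root mean squared error, in the same units as the prices.
rmse = np.sqrt(np.mean((predicted - actual) ** 2))

# RMSE relative to the mean actual price, expressed as the document's accuracy percentage.
accuracy_percentage = (1 - rmse / actual.mean()) * 100

print(round(float(rmse), 4))
print(round(float(accuracy_percentage), 2))
```

Because the metric divides by the price level, a high percentage mainly says that the errors are small relative to the price, so it is worth reading alongside the RMSE itself.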

2.9 Visualization

# Plot the data
train = data[:training_data_len]
valid = data[training_data_len:].copy()  # copy to avoid pandas' SettingWithCopyWarning
valid['Predictions'] = predictions

# Visualize the data
plt.figure(figsize=(16,6))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price (INR)', fontsize=18)  # UBL.NS is quoted in Indian rupees
plt.plot(train['Close'])
plt.plot(valid[['Close', 'Predictions']])
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()
image
