Analysis and Prediction of stock price using LSTM

Stock price prediction means understanding the factors in the stock market that can influence a stock's price and, based on those factors, building a model to predict that price. This can help individuals and institutions anticipate the price trend and decide whether to buy or short the stock to maximize their profit. Machine learning and time series methods help us estimate the future value of a particular stock or of other financial assets traded on an exchange. The whole idea of analysis and prediction is to gain significant profits.

Focus areas for Analysis:

The change in closing price of the stock over time.
Visualization of Candlestick Monthly data.
The % daily return of the stock.
The moving average of various stocks.

Prediction:

We will be predicting future stock behaviour by predicting the closing price of the stock using LSTM.

#Import Libraries

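!pip install numpy pandas yfinance seaborn matplotlib mplfinance
import datetime  # standard library, no pip install needed
import numpy as np
import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt
import mplfinance as mpf  # used for the monthly candlestick chart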

Exploratory Data Analysis (EDA) using ELT (Extract, Load, Transform)

Dataset

I have taken the stock price data of United Breweries Holdings Limited (ticker UBL.NS) from Yahoo Finance from 15th Jan 2020 to 31st Dec 2023.

Time Period of Data: Define the timeframe for which you want to fetch data.

start_date = datetime.datetime(2020, 1, 15)
end_date = datetime.datetime(2023, 12, 31)

Loading Data from Yahoo Finance

df = yf.download('UBL.NS', start_date, end_date)

View Dataframe

df

Check index

print(df.index)

Reset Index

df1 = df.reset_index()
df1['Date'] = pd.to_datetime(df1['Date'])
df1

Converting from Daily to Monthly Frequency data

monthly_data = df.resample('M').agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
monthly_data.head()
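
The monthly conversion above follows the usual OHLCV rule: first open, highest high, lowest low, last close, summed volume. A minimal sketch of that rule on made-up daily data is given here. The `daily` frame and its numbers are invented for illustration, and the grouping uses `to_period('M')` instead of `resample('M')`, an assumption made only so the sketch does not depend on the month-end alias, which newer pandas releases spell `'ME'`.

```python
import numpy as np
import pandas as pd

# Synthetic daily OHLCV data (made-up numbers, for illustration only)
days = pd.date_range("2023-01-01", "2023-02-28", freq="D")
close = np.arange(1.0, len(days) + 1.0)
daily = pd.DataFrame(
    {
        "Open": close - 0.5,
        "High": close + 1.0,
        "Low": close - 1.0,
        "Close": close,
        "Volume": 100,
    },
    index=days,
)

# Same aggregation rules as the monthly_data cell above
ohlcv_rules = {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}

# Group rows by calendar month; the result is indexed by monthly periods
monthly = daily.groupby(daily.index.to_period("M")).agg(ohlcv_rules)
print(monthly)
```

Each monthly row takes its open from the first trading day and its close from the last one, which is why a monthly candle can look very different from any single day inside it.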

Plot - Line and Frequency - Daily Closing Price

# Plotting
plt.figure(figsize=(10, 6))
#plt.plot(df['Date'], df['Open'], label='Open')
#plt.plot(df['Date'], df['High'], label='High')
#plt.plot(df['Date'], df['Low'], label='Low')
plt.plot(df.index, df['Close'], label='Close')


plt.title('UBL Stock Prices Over Time')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.legend()
plt.grid(True)
plt.show()

Plot - Candlestick and Frequency - Monthly OHLC Volume Data

#Plotting monthly candlestick chart with a separate volume plot with MA(20)
#mpf.plot(monthly_data, type='candle', style='charles', volume=True, mav=(20), show_nontrading=True, addplot=mpf.make_addplot(monthly_data['Volume'], panel=1, ylabel='Volume'),tight_layout=True, figratio=(16, 9), scale_width_adjustment=dict(volume=0.7, candle=1))


#Plotting monthly candlestick chart with a separate volume plot
mpf.plot(monthly_data, type='candle', style='charles', volume=True, show_nontrading=True, tight_layout=True, figratio=(16, 9), scale_width_adjustment=dict(volume=0.7, candle=1))

Total Rows & Columns

df.shape

Data Information

df.info()

Data Quality Check

Duplicate Values

len(df[df.duplicated()])

Missing Values/Null Values

print(df.isnull().sum())

Variable Information

# Columns
df.columns

#Describe
df.describe()

# Check unique values for each variable
for i in df.columns.tolist():
  print("No. of unique values in ",i,"is",df[i].nunique(),".")

Analysis

Plotting moving averages (50 and 200 days) is a simple technical analysis technique that smooths out price data.

ma_day = [50, 200]
plt.figure(figsize=(10, 6))
#Plot Close price
plt.plot(df.index, df['Close'], label='Close')
#Plot Moving Averages
for ma in ma_day:
    column_name = f"MA for {ma} days"
    df[column_name]=df['Close'].rolling(ma).mean()
    plt.plot(df.index, df[column_name], label=column_name)


plt.title('UBL Daily Close Prices and Moving Averages')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid(True)
plt.show()
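
To see what `rolling(ma).mean()` does to the series, here is a small sketch on made-up prices with a window of 3 instead of 50 or 200. The `prices` values are invented for illustration only.

```python
import pandas as pd

# Made-up closing prices, for illustration only
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# A window of 3 needs 3 observations, so the first 2 entries are NaN
ma3 = prices.rolling(3).mean()
print(ma3.tolist())

# The number of leading NaN values is window - 1
leading_nans = int(ma3.isna().sum())
print(leading_nans)
```

The same effect explains why the 200-day line in the chart only starts after the first 199 trading days: until a full window is available, the moving average is undefined.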

Average Daily Returns

#Calculate daily return percentage (pct_change() returns a fraction, so multiply by 100)
df['Daily Return'] = df['Close'].pct_change() * 100


plt.figure(figsize=(10, 6))
plt.plot(df.index, df['Daily Return'], linestyle='--', marker='o', label='Daily Return')


plt.title('Daily Return Percentage')
plt.xlabel('Date')
plt.ylabel('Percentage')
plt.legend()
plt.grid(True)
plt.show()

plt.figure(figsize=(12, 9))
df['Daily Return'].hist(bins=50, alpha=0.5, label='UBL')


plt.xlabel('Daily Return')
plt.ylabel('Counts')
plt.title('Daily Return of UBL using histogram')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
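
The daily return is the change in close relative to the previous close, r_t = P_t / P_(t-1) - 1. A minimal sketch on made-up prices, showing that `pct_change()` matches this formula and returns a fraction rather than a percentage:

```python
import pandas as pd

# Made-up closing prices, for illustration only
prices = pd.Series([100.0, 110.0, 99.0])

# pct_change() returns fractional returns; the first value has no predecessor
returns = prices.pct_change()

# The same calculation written out by hand
manual = prices / prices.shift(1) - 1

# Multiply by 100 to express the returns in percent
returns_pct = returns * 100
print(returns_pct.round(2).tolist())
```

A rise from 100 to 110 is a 10% gain, while the fall from 110 back to 99 is a 10% loss, so equal percentage moves up and down do not cancel out in price terms.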

2. Prediction using LSTM: Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture well suited to sequence prediction problems.

2.0 Before Prediction

plt.figure(figsize=(16,6))
plt.title('Close Price History')
plt.plot(df['Close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price', fontsize=18)
plt.show()

2.1 Data Preparation

#Create a new dataframe with only the 'Close' column
data = df.filter(['Close'])


#Convert the dataframe to a numpy array because ML/DL libraries require numpy arrays as inputs
dataset = data.values


#Get the number of rows to train the model on
training_data_len = int(np.ceil( len(dataset) * .80 ))
training_data_len

2.2 Data Scaling

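#Scale the data
from sklearn.preprocessing import MinMaxScaler


#Fit the scaler on the training rows only so the test period does not leak into the scaling
scaler = MinMaxScaler(feature_range=(0,1))
scaler.fit(dataset[:training_data_len])
scaled_data = scaler.transform(dataset)
scaled_data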
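
With `feature_range=(0,1)`, min-max scaling maps each value to (x - min) / (max - min), using the minimum and maximum learned when the scaler is fitted. The sketch below reproduces that arithmetic by hand in NumPy on made-up prices; it is an illustration of the formula, not a replacement for `MinMaxScaler`.

```python
import numpy as np

# Made-up closing prices with shape (n, 1), like dataset above
prices = np.array([[100.0], [150.0], [200.0]])

# Minimum and maximum per column, as learned during fitting
lo, hi = prices.min(axis=0), prices.max(axis=0)

# Forward transform: map [lo, hi] onto [0, 1]
scaled = (prices - lo) / (hi - lo)
print(scaled.ravel())

# Inverse transform: recover the original prices
restored = scaled * (hi - lo) + lo

# A price outside the fitted range lands outside [0, 1]
new_scaled = (np.array([[250.0]]) - lo) / (hi - lo)
print(new_scaled.ravel())
```

The inverse step is what `scaler.inverse_transform` relies on later to turn the model's scaled outputs back into prices.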

2.3 Creating Training Data

#Create the scaled training data set
train_data = scaled_data[0:int(training_data_len), :]
# Split the data into x_train and y_train data sets
x_train = []
y_train = []


for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, 0])
    y_train.append(train_data[i, 0])
    if i<= 61:
        print(x_train)
        print(y_train)
        print()


# Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)


# Reshape the data
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))


# x_train.shape
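
The loop above builds sliding windows: each sample holds the previous 60 scaled closes and the target is the close that follows. A minimal sketch of the same idea with a lookback of 3 on a short made-up series; `make_windows` is a hypothetical helper written for this illustration, not part of the notebook.

```python
import numpy as np

def make_windows(series, lookback):
    """Return (x, y) where x[k] holds `lookback` consecutive values and y[k] is the next one."""
    x, y = [], []
    for i in range(lookback, len(series)):
        x.append(series[i - lookback:i])
        y.append(series[i])
    # LSTM layers expect input shaped (samples, timesteps, features)
    x = np.array(x).reshape(-1, lookback, 1)
    return x, np.array(y)

# Made-up series 0, 1, ..., 9 so the windows are easy to read
series = np.arange(10, dtype=float)
x, y = make_windows(series, lookback=3)

print(x.shape)        # samples, timesteps, features
print(x[0, :, 0], y[0])
```

A series of length n yields n - lookback samples, so with a lookback of 60 the first 60 training rows serve only as history and never as targets.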

2.4 Model Building

from keras.models import Sequential
from keras.layers import Dense, LSTM


#Build the LSTM model
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape= (x_train.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))


#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
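
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and plain fully connected Dense layers. `lstm_params` and `dense_params` are hypothetical helpers for this illustration, and the totals have not been checked here against `model.summary()`.

```python
def lstm_params(units, input_dim):
    # 4 gates (input, forget, cell candidate, output), each with
    # input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input per unit, plus one bias per unit
    return units * input_dim + units

layers = [
    ("LSTM(128)", lstm_params(128, 1)),    # 1 feature per timestep
    ("LSTM(64)", lstm_params(64, 128)),    # fed by the 128 outputs of the first LSTM
    ("Dense(25)", dense_params(25, 64)),
    ("Dense(1)", dense_params(1, 25)),
]

for name, count in layers:
    print(name, count)

total = sum(count for _, count in layers)
print("Total", total)
```

Almost all of the parameters sit in the two LSTM layers, and the count does not depend on the 60-step window length, because the same weights are reused at every timestep.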

2.5 Model Training

#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)

2.6 Creating Testing Data

#Create the testing data set
#Create a new array containing scaled values from 60 rows before the end of the training data to the end of the data
test_data = scaled_data[training_data_len - 60: , :]


#Create the data sets x_test and y_test
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, 0])


#Convert the data to a numpy array
x_test = np.array(x_test)


#Reshape the data
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1 ))
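
The test slice starts 60 rows before the end of the training data so that the first test target still has a full window of history. A small sketch with made-up numbers (100 rows, values 0 to 99) shows how the windows line up with `y_test`; the sizes here are assumptions chosen for readability, not the notebook's actual row counts.

```python
import numpy as np

# Made-up data: 100 rows whose value equals their row number
n_rows, lookback = 100, 60
dataset = np.arange(n_rows, dtype=float).reshape(-1, 1)

# Same 80/20 split rule as in the data preparation step
training_data_len = int(np.ceil(n_rows * 0.80))

# Start `lookback` rows early so the first test target has a full window
test_data = dataset[training_data_len - lookback:, :]
y_test = dataset[training_data_len:, :]

x_test = np.array([test_data[i - lookback:i, 0] for i in range(lookback, len(test_data))])

print(x_test.shape, y_test.shape)
print(x_test[0, -1], y_test[0, 0])   # last input value and the target that follows it
```

The number of windows equals the number of test targets, and the last value in each window is the row immediately before its target, so predictions and actual prices can be compared one to one.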

2.7 Making Predictions

#Get the model's predicted price values
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)

2.8 Model Evaluations

#Get the root mean squared error (RMSE)
rmse = np.sqrt(np.mean(((predictions - y_test) ** 2)))
print("Root Mean Squared Error (RMSE):", rmse)


#Calculate accuracy percentage (100 minus RMSE as a percentage of the mean actual close)
#y_test is used here because valid is only defined in the next section
accuracy_percentage = (1 - (rmse / y_test.mean())) * 100
print("Accuracy Percentage:", accuracy_percentage)
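
RMSE is expressed in the same units as the price, and dividing it by the mean actual price gives a relative error. The "accuracy percentage" above is 100 minus that relative error, which is a rough summary and not an accuracy in the classification sense. A worked sketch on made-up numbers, including a shape pitfall worth checking for:

```python
import numpy as np

# Made-up values with shape (n, 1), like y_test and predictions above
actual = np.array([[100.0], [102.0], [104.0]])
predicted = np.array([[101.0], [101.0], [106.0]])

# Errors are 1, -1, 2, so the mean squared error is (1 + 1 + 4) / 3 = 2
rmse = np.sqrt(np.mean((predicted - actual) ** 2))
relative_error_pct = rmse / actual.mean() * 100
print(rmse, relative_error_pct)

# Pitfall: subtracting a (n,) array from a (n, 1) array broadcasts to (n, n),
# which silently produces a wrong RMSE instead of raising an error
wrong = predicted - actual.ravel()
print(wrong.shape)
```

Checking that `predictions.shape` and `y_test.shape` match before computing the error is a cheap guard against the broadcasting problem.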

2.9 Visualization

#Plot the data
train = data[:training_data_len]
valid = data[training_data_len:].copy()  #copy so adding a column does not trigger a SettingWithCopyWarning
valid['Predictions'] = predictions


#Visualize the data
plt.figure(figsize=(16,6))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price (INR)', fontsize=18)
plt.plot(train['Close'])
plt.plot(valid[['Close', 'Predictions']])
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()
