
Predicting Exam marks according to the study hours




Hi! This is another machine learning exercise. This time we will predict exam marks from the number of hours studied.

The machine learning technique we are going to use is linear regression.

This method falls under supervised machine learning and is one of its basic foundations.

Let's quickly recap what linear regression is.

Linear Regression

  • Linear regression comes under the category of supervised machine learning.
  • Simple linear regression is useful for finding the relationship between two continuous variables.
  • It looks for a statistical relationship, not a deterministic one.
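
To make the last point concrete, here is a small sketch of the difference. The function names and numbers are made up for illustration and have nothing to do with our dataset: a deterministic relationship always returns the same output for the same input, while a statistical one follows a trend with some random scatter around it.

```python
import random

def celsius_to_fahrenheit(c):
    # Deterministic: the same input always gives exactly the same output.
    return c * 9 / 5 + 32

def simulated_score(hours, rng):
    # Statistical: a trend (10 marks per hour, an invented figure)
    # plus random scatter around it.
    return 10 * hours + rng.gauss(0, 5)

rng = random.Random(42)
print(celsius_to_fahrenheit(20), celsius_to_fahrenheit(20))  # same value twice
print(simulated_score(5, rng), simulated_score(5, rng))      # two different values
```

Linear regression is meant for the second kind of data: it finds the line that best describes the trend, even though no single point has to lie exactly on it.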


Our main task for this project is:

What marks will I get if I make a habit of studying 9.25 hours every day?


Our task will be completed in 6 steps.

1. Collecting all data and importing required libraries  

2. Observing or Wrangling Data

3. Preparing Data

4. Creating the model

5. Checking accuracy

6. Plotting model 



1. Collecting all data and importing required libraries  

First of all, we will import all the libraries and modules that we are going to need for our task.

For this task, the modules we need are pandas, NumPy, sklearn, and matplotlib.

After importing the modules, read the data from the dataset (CSV file).

The dataset consists of 25 rows and 2 columns.


code  


import pandas as pd

import numpy as np

from matplotlib import pyplot as plt

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_absolute_error  

read=pd.read_csv('studentperformance.csv')

read.head(5),read.shape   





2. Observing or Wrangling Data  

In this section, we will observe the data and try to draw insights from it, and also remove null values if there are any.

We will also look at the correlation between the variables.

The correlation tells us how strongly the variables are related to each other.


code 


read.plot(x='hours',y='scores',style='o')

plt.title('Hours vs Scores',color='red')

plt.xlabel('Hours of study',color='green')

plt.ylabel('marks from class test',color='green')

plt.show() 


read.corr()
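
If you are curious what the correlation number means, here is a minimal sketch that computes the Pearson correlation coefficient by hand in plain Python. pandas' corr() uses the Pearson method by default, so this is roughly the same calculation. The helper name pearson_r and the sample values are my own for illustration; they are not taken from the dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    # Pearson r = covariance(x, y) / (spread of x * spread of y)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up sample values, only to show the call
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [12.0, 25.0, 29.0, 44.0, 50.0]
print(round(pearson_r(hours, scores), 3))
```

A value close to 1 means the marks rise almost in a straight line with the hours, which is exactly the situation where a linear model is a sensible choice. For the null values mentioned above, read.isnull().sum() counts the missing entries per column and read.dropna() drops the rows that contain them.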



3. Preparing Data

In this section, we will prepare the data for our model.

We will use scikit-learn's train_test_split function to create the training and test sets.

With test_size=0.2, the 25 rows are split into 20 training samples and 5 test samples.


code  


x=np.array(read[['hours']])

y=np.array(read[['scores']])

x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=0)

x_train.size,y_train.size,x_test.size,y_test.size  
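
To see where the 20 and 5 come from, here is a rough sketch of a shuffle-and-slice split in plain Python. The helper simple_split is hypothetical and is not how scikit-learn implements train_test_split internally; it only shows the idea of shuffling the rows with a fixed seed and holding back a fraction of them.

```python
import random

def simple_split(rows, test_fraction=0.2, seed=0):
    # Illustrative only: shuffle a copy of the rows with a fixed seed,
    # then slice off the first test_fraction of them as the test set.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(25))          # stand-in for the 25 dataset rows
train, test = simple_split(rows)
print(len(train), len(test))    # 20 5
```

The fixed seed plays the same role as random_state=0 in the code above: it makes the split repeatable, so you get the same train and test rows every time you run it.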


4. Creating the model  

In this section, we will create our machine learning model, train it, and test it.

We know that the equation of a line is

               y = mx + c,

where m = slope or gradient

           c = intercept on the y-axis

           x and y are the variables

After fitting, we will read off m (model.coef_) and c (model.intercept_), which the model uses to make predictions.


code  

model=LinearRegression()

model.fit(x_train,y_train)

predict=model.predict(x_test)

model.coef_        

model.intercept_   
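
For a single input variable, the best-fit m and c can also be computed directly with the ordinary least squares formulas. Here is a minimal sketch in plain Python. The helper fit_line and the toy numbers are my own and are not the dataset values; the point is only to show what kind of calculation produces the slope and intercept.

```python
def fit_line(xs, ys):
    # Ordinary least squares for one variable:
    #   m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    #   c = mean_y - m * mean_x
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    c = mean_y - m * mean_x
    return m, c

# Toy data lying exactly on y = 2x + 1 (made up, not the dataset)
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
m, c = fit_line(hours, scores)
print(m, c)            # 2.0 1.0
print(m * 9.25 + c)    # 19.5, the prediction for 9.25 hours on this toy line
```

On real data the points do not sit exactly on a line, so m and c describe the line with the smallest total squared error.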


5. Checking accuracy

Here we will calculate the R squared value and the mean absolute error.

The R squared value is also known as the coefficient of determination. It is a statistical measure of how much of the variance of the dependent variable is explained by the independent variable.

Mean absolute error is the average absolute difference between paired observations of the same phenomenon, here the actual and the predicted marks.


code   


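# R squared on the training data (pass x_test, y_test for a held-out score)
r_square=model.score(x_train,y_train)

r_square


# Use a new name so the imported mean_absolute_error function is not overwritten
mae=mean_absolute_error(y_test,predict)

mae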
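
Both metrics are simple enough to compute by hand. Here is a small sketch in plain Python, with helper names and sample numbers that I made up for illustration (they are not results from our model).

```python
def r_squared(actual, predicted):
    # R^2 = 1 - (sum of squared residuals) / (total sum of squares)
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mean_abs_error(actual, predicted):
    # Average of |actual - predicted| over all pairs
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up values, only to show the arithmetic
actual = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]
print(r_squared(actual, predicted))       # 0.98
print(mean_abs_error(actual, predicted))  # 1.5
```

An R squared close to 1 means the line explains most of the variation in the marks, and the mean absolute error tells you how many marks the predictions are off by on average.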


6. Plotting model

Now let's plot our model.


code 


# Regression line: y = m*x + c
line=model.coef_*x + model.intercept_

plt.scatter(x,y,label='score')

plt.plot(x,line,color='red',label='regression line')

plt.title('Hours vs Scores',color='red')

plt.xlabel('Hours of study',color='green')

plt.ylabel('marks from class test',color='green')

# Labels are set on each plot call so the legend cannot get them mixed up
plt.legend()

plt.show()



 Now it's time to find the answer to our question.

What marks will I get if I make a habit of studying 9.25 hours every day?


code 

hours=9.25

my_predict=model.predict([[hours]])

my_predict[0][0]

 


Our result is 93.691173 marks.








Follow me on: 

Linkedin - https://www.linkedin.com/in/somen-das-6a933115a/  

Instagram - https://www.instagram.com/somen912/?hl=en

And don't forget to subscribe to the blog.

so...

Thanks for your time and stay creative...