Predicting Exam Marks from Study Hours
The machine learning technique we are going to use is linear regression.
This method comes under supervised machine learning. It is one of the basic foundations of machine learning.
Let's quickly recap what linear regression is.
Linear Regression
- Linear regression comes under the category of supervised machine learning.
- Simple linear regression is useful for finding the relationship between two continuous variables.
- It looks for a statistical relationship, not a deterministic one.
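To make the difference between a statistical and a deterministic relationship concrete, here is a minimal sketch. The numbers are made up for illustration (they are not the dataset used in this post), and `np.polyfit` is used only as a quick way to fit a straight line.

```python
import numpy as np

# Hypothetical data, made up for illustration (not the blog's dataset).
rng = np.random.default_rng(0)
hours = np.linspace(1.0, 9.0, 25)

# Deterministic relationship: every score is fixed exactly by the hours.
deterministic_scores = 10.0 * hours + 2.0

# Statistical relationship: same trend, plus random scatter around it.
statistical_scores = 10.0 * hours + 2.0 + rng.normal(0.0, 5.0, size=hours.size)

# Fit a straight line (degree-1 polynomial) to each set of scores.
slope_det, intercept_det = np.polyfit(hours, deterministic_scores, 1)
slope_stat, intercept_stat = np.polyfit(hours, statistical_scores, 1)

# Residuals are the gaps between the observed scores and the fitted line.
residuals_det = deterministic_scores - (slope_det * hours + intercept_det)
residuals_stat = statistical_scores - (slope_stat * hours + intercept_stat)

print("max residual, deterministic:", np.abs(residuals_det).max())
print("max residual, statistical:  ", np.abs(residuals_stat).max())
```

In the deterministic case the line passes through every point, so the residuals are essentially zero. In the statistical case the line only captures the overall trend, which is the situation linear regression is meant for.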
Our main task for this project is to answer this question:
What marks will I get if I make a habit of studying 9.25 hours every day?
We will complete the task in 6 steps.
1. Collecting all data and importing required libraries
2. Observing or Wrangling Data
3. Preparing Data
4. Creating the model
5. Checking accuracy
6. Plotting model
1. Collecting all data and importing required libraries
First of all, we will import all the libraries and modules we are going to need for our task.
For this task, the modules we need are pandas, NumPy, sklearn, and matplotlib.
After importing the modules, we read the data from the dataset (a CSV file).
The dataset consists of 25 rows and 2 columns.
code
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
read=pd.read_csv('studentperformance.csv')
read.head(5),read.shape
2. Observing or Wrangling Data
In this section, we will observe the data, try to draw insights from it, and remove any null values if there are any.
In this section, we will also observe the correlation between the variables.
Through the correlation, we are able to understand how strongly the variables are related to each other.
code
read.plot(x='hours',y='scores',style='o')
plt.title('Hours vs Scores',color='red')
plt.xlabel('Hours of study',color='green')
plt.ylabel('marks from class test',color='green')
plt.show()
read.corr()
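The text above mentions removing null values and reading the correlation, so here is a rough sketch of both. The small DataFrame is hypothetical and only stands in for the CSV; the column names `hours` and `scores` are taken from the plotting code above. The manual formula is Pearson's correlation, which is the default method of pandas' `corr()`.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-dataset for illustration; the real data is in the CSV.
df = pd.DataFrame({
    "hours": [1.5, 3.0, np.nan, 5.5, 7.0, 8.5],
    "scores": [20.0, 30.0, 45.0, 60.0, np.nan, 85.0],
})

# Count missing values per column, then drop any row that has one.
null_counts = df.isnull().sum()
clean = df.dropna()

# Pearson correlation by hand:
# co-variation of x and y divided by the spread of x times the spread of y.
h = clean["hours"].to_numpy()
s = clean["scores"].to_numpy()
r_manual = ((h - h.mean()) * (s - s.mean())).sum() / np.sqrt(
    ((h - h.mean()) ** 2).sum() * ((s - s.mean()) ** 2).sum()
)

# pandas computes the same number (Pearson is the default method).
r_pandas = clean.corr().loc["hours", "scores"]

print(null_counts)
print(r_manual, r_pandas)
```

A value close to 1 means the scores rise almost in a straight line with the hours, which is what makes a linear model a reasonable choice.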
3. Preparing Data
In this section, we will prepare the data for our model.
We will use the train_test_split function from sklearn to create the train and test data.
With test_size=0.2 on 25 rows, we get 20 training samples and 5 test samples.
code
x=np.array(read[['hours']])
y=np.array(read[['scores']])
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=0)
x_train.size,y_train.size,x_test.size,y_test.size
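Here is a small sketch of what the split does, using a hypothetical 25-row array in place of the real dataset. It also shows why fixing `random_state` is useful: the same value gives the same split every time.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the dataset: 25 rows, one feature column.
x = np.arange(25, dtype=float).reshape(-1, 1)
y = 2.0 * x + 1.0

# test_size=0.2 keeps 20% of the rows (5 of 25) aside for testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
print(x_train.shape, x_test.shape)  # (20, 1) (5, 1)

# The same random_state reproduces exactly the same split.
x_train_again, x_test_again, _, _ = train_test_split(
    x, y, test_size=0.2, random_state=0
)
print(np.array_equal(x_test, x_test_again))  # True

# No row ends up in both the train and the test set.
overlap = set(x_train.ravel()) & set(x_test.ravel())
print(len(overlap))  # 0
```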
4. Creating the model
In this section, we will create our machine learning model and also train and test the model.
We know that the equation of a line is
y=mx+c,
where m = slope or gradient
c = intercept on the y-axis
x and y are the variables
After training, we will also read the values of m and c from the model, since these are what it uses to predict.
code
model=LinearRegression()
model.fit(x_train,y_train)
predict=model.predict(x_test)
model.coef_
model.intercept_
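To see where m and c come from, here is a minimal sketch that computes them with the standard least-squares formulas for one feature and compares them with what `LinearRegression` finds. The five data points are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data for illustration (not the blog's dataset).
x_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([[12.0], [19.0], [33.0], [38.0], [52.0]])

# Closed-form least squares for a single feature:
#   m = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
#   c = y_mean - m * x_mean
x_flat, y_flat = x_train.ravel(), y_train.ravel()
m = ((x_flat - x_flat.mean()) * (y_flat - y_flat.mean())).sum() / (
    (x_flat - x_flat.mean()) ** 2
).sum()
c = y_flat.mean() - m * x_flat.mean()

model = LinearRegression()
model.fit(x_train, y_train)

# With a 2-D y, coef_ has shape (1, 1) and intercept_ has shape (1,).
print(m, model.coef_[0][0])
print(c, model.intercept_[0])
```

Both approaches should agree up to floating-point rounding, so model.coef_ plays the role of m and model.intercept_ plays the role of c in y=mx+c.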
5. Checking accuracy
Here we will calculate the R squared value (on the training data) and the mean absolute error (on the test data).
The R squared value is also known as the coefficient of determination. It is a statistical measure of how much of the variance of the dependent variable is explained by the independent variable.
Mean absolute error is the average absolute difference between paired observations of the same phenomenon, here the actual scores and the predicted scores.
code
r_square=model.score(x_train,y_train)
r_square
# use a new name so the imported mean_absolute_error function is not overwritten
mae=mean_absolute_error(y_test,predict)
mae
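Both metrics are simple enough to compute by hand, which helps to see what the numbers mean. The sketch below uses hypothetical actual and predicted scores and checks the manual results against sklearn's `r2_score` and `mean_absolute_error`; `model.score` on a LinearRegression returns the same R squared quantity.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical actual and predicted scores for illustration.
y_true = np.array([20.0, 27.0, 69.0, 30.0, 62.0])
y_pred = np.array([17.0, 34.0, 75.0, 27.0, 60.0])

# R squared = 1 - (residual sum of squares / total sum of squares).
ss_res = ((y_true - y_pred) ** 2).sum()
ss_tot = ((y_true - y_true.mean()) ** 2).sum()
r2_manual = 1.0 - ss_res / ss_tot

# Mean absolute error = average size of the prediction errors.
mae_manual = np.abs(y_true - y_pred).mean()

print(r2_manual, r2_score(y_true, y_pred))
print(mae_manual, mean_absolute_error(y_true, y_pred))
```

An R squared close to 1 means the line explains most of the variation in the scores, while the mean absolute error tells us, in marks, how far off a typical prediction is.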
6. Plotting model
Now let's plot our model.
code
line=model.coef_*x + model.intercept_
plt.scatter(x,y,label='score')
plt.plot(x,line,color='red',label='regression line')
plt.title('Hours vs Scores',color='red')
plt.xlabel('Hours of study',color='green')
plt.ylabel('marks from class test',color='green')
plt.legend()
plt.show()
Now it's time to find the answer to our question.
What marks will I get if I make a habit of studying 9.25 hours every day?
code
hours=9.25
my_predict=model.predict([[hours]])
my_predict[0][0]
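The prediction is nothing more than the line equation y=mx+c evaluated at x=9.25. The sketch below checks this on a hypothetical four-point dataset, so the printed value is not the answer for the real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data for illustration (not the blog's dataset).
x_train = np.array([[2.0], [4.0], [6.0], [8.0]])
y_train = np.array([[25.0], [41.0], [63.0], [80.0]])

model = LinearRegression().fit(x_train, y_train)

hours = 9.25
# predict expects a 2-D array: one row per sample, one column per feature.
by_model = model.predict([[hours]])[0][0]

# The same value from the line equation y = m*x + c.
m = model.coef_[0][0]
c = model.intercept_[0]
by_equation = m * hours + c

print(by_model, by_equation)
```

Note that 9.25 hours may lie outside the range of hours seen in training, so the model is extrapolating along the same straight line.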
Follow me on:
Linkedin - https://www.linkedin.com/in/somen-das-6a933115a/
Instagram - https://www.instagram.com/somen912/?hl=en
And don't forget to subscribe to the blog.
so...
Thanks for your time and stay creative...