

Predicting the sales from advertising budget using Linear Regression model



Hi, in this project I built a prediction model for sales analysis. We feed the model an advertising budget and it predicts the likely sales. The machine learning method I used is linear regression, and I wrote the code in a Jupyter notebook.

To train and test the model, I used a dataset from an advertising agency. It contains records of the budgets for TV, radio and newspaper advertising, along with the corresponding sales records.

The process for the model building will be completed in the following steps:

1. Importing all the required modules for the model

2. Extracting all the data from the dataset

3. Data cleaning and wrangling

4. Preparing training data for model

5. Preparing testing data for model

6. Creating, training and testing the model

7. Checking the accuracy of the model

8. Plotting the model graph to analyse





1. Importing all required modules.    

The Python modules I used here are pandas, seaborn, sklearn, and matplotlib.


code

import pandas as pd

import seaborn as sns

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error              

from matplotlib import pyplot as plt


2. Extracting data from the dataset  

The dataset is a CSV file, so I used the read_csv function from the pandas module. Displaying read_data shows that it consists of four columns: TV advertising budget, radio advertising budget, newspaper advertising budget and sales.


code

read_data=pd.read_csv("tvradioadvertising.csv")

read_data


3. Data cleaning and Wrangling 

After extracting the data, it's time to check how clean it is. Look for missing or null values hidden in the dataset and remove them, which generally helps the accuracy of the model.

To check the dataset I use the heatmap function from the seaborn module on read_data.isnull(). The heatmap shows which columns contain missing values.


code

sns.heatmap(read_data.isnull(),yticklabels=False,cmap='viridis')  


Luckily my dataset is already clean, so the heatmap shows no missing values. But let me show you what to do if there are any.



Here is a heatmap of another dataset. In the rightmost column of this heatmap you can see the missing values, which are shown in yellow.
To remove these null values you just need to write the following code:

read_data.dropna(inplace=True)

This removes every row that contains a null value from the dataset.

If you plot the heatmap again after that, the yellow marks will be gone.
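If you prefer numbers to a picture, the same check can be sketched as follows. This is only a minimal illustration: the small toy DataFrame and its values are made up for the example and are not taken from my dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a dataset with missing values
toy = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5],
    "Radio": [37.8, 39.3, np.nan, 41.3],
    "Sales": [22.1, 10.4, 9.3, np.nan],
})

# Count the missing values in each column (a numeric alternative to the heatmap)
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Drop every row that contains at least one missing value
cleaned = toy.dropna()
print(cleaned.shape)
```

Here dropna() returns a new DataFrame and leaves toy unchanged, while dropna(inplace=True), as used above, modifies the DataFrame directly.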


4. Preparing data from the dataset for the training of the model  

OK, now it's time to prepare the training data for the machine learning model. The dataset consists of 200 rows. I take the first 100 rows of the TV column and store them in the tvadvertising_train variable.

The same is done for the radio, newspaper and sales columns.


code

tvadvertising_train=read_data.iloc[0:100,0:1]

radioadvertising_train=read_data.iloc[0:100,1:2]

newsadvertising_train=read_data.iloc[0:100,2:3]

sales_train=read_data.iloc[0:100,3:4]    


5. Creating test data for the model 

After creating the training data, it's time to prepare the testing data for the model. For this I take the remaining 100 rows as test data; the model will predict sales values from them.

The same is done for the radio and newspaper columns.

 

code

tvadvertising_test=read_data.iloc[100:200,0:1] 

radioadvertising_test=read_data.iloc[100:200,1:2]

newsadvertising_test=read_data.iloc[100:200,2:3]   
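The code above keeps only the budget columns of the test rows. If you also want to compare predictions with the real sales of those rows, you could keep the sales column of the same 100 rows too. The sketch below shows the idea on made-up data; the demo DataFrame, its column names and the sales_test variable are assumptions for illustration and are not part of my notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for read_data: 200 rows, four columns in the same order
demo = pd.DataFrame({
    "TV": rng.uniform(0, 300, 200),
    "Radio": rng.uniform(0, 50, 200),
    "Newspaper": rng.uniform(0, 100, 200),
})
demo["Sales"] = 7 + 0.05 * demo["TV"] + rng.normal(0, 2, 200)

# First 100 rows for training, remaining 100 rows for testing
tv_train = demo.iloc[0:100, 0:1]
tv_test = demo.iloc[100:200, 0:1]
sales_train = demo.iloc[0:100, 3:4]
sales_test = demo.iloc[100:200, 3:4]  # held-out sales for the test rows

print(tv_train.shape, tv_test.shape, sales_test.shape)
```

Slicing with 0:1 or 3:4 instead of a single column number keeps the result as a one-column DataFrame, which is the two-dimensional shape that sklearn expects for the input features.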


6. Creating, training, and testing the model 

Now it's time to create the machine learning model, using the LinearRegression class that we imported from the sklearn module earlier. To train the model, fit it on the TV advertising training data and the sales training data.

After training the model, it's time to test it and get the predicted values. For that, we feed it the TV advertising test data and collect the predictions.

The predicted values are stored in prediction_tv.


code

model=LinearRegression()

model.fit(tvadvertising_train,sales_train)

prediction_tv=model.predict(tvadvertising_test)
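A fitted linear regression model is just a straight line, sales = intercept + slope * budget. As a rough sketch, you can read both numbers from the coef_ and intercept_ attributes of the model. The budget and sales values below are invented so that the line is known in advance; they are not from my dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical budgets and sales that lie exactly on sales = 7 + 0.05 * budget
budget = np.array([[10.0], [50.0], [100.0], [200.0]])
sales = 7.0 + 0.05 * budget.ravel()

line = LinearRegression()
line.fit(budget, sales)

slope = line.coef_[0]        # change in sales per unit of budget
intercept = line.intercept_  # predicted sales at a budget of zero

# Predict sales for a new budget of 150
predicted = line.predict(np.array([[150.0]]))
print(f"sales = {intercept:.2f} + {slope:.3f} * budget")
print(predicted)
```

In this sketch the target is a one-dimensional array. In my code sales_train is a one-column DataFrame, so there coef_ and intercept_ come back with one extra dimension.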



7. Checking the accuracy of the model 

So, our model is created; now let's check its accuracy. For that we use the R-squared score, which the sklearn model provides through its score method. Note that the code below computes the score on the training rows.


code 

r2_score = model.score(tvadvertising_train,sales_train)

f"your R square score is {r2_score*100} %" 

 

We can see that the R-squared score of our TV advertising model is 81.950 %, and I think that's pretty good.
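To make clear what this score means, here is a small sketch on made-up data. It computes R squared on the training rows and on held-out test rows, and then recomputes the test score by hand as 1 - SS_res / SS_tot. All the data and variable names here are assumptions for the example, so the numbers will not match my results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)

# Hypothetical budgets and noisy sales, 200 rows like the real dataset
x = rng.uniform(0, 300, size=(200, 1))
y = 7.0 + 0.05 * x.ravel() + rng.normal(0, 2.0, size=200)

# First 100 rows for training, remaining 100 rows for testing
x_train, x_test = x[:100], x[100:]
y_train, y_test = y[:100], y[100:]

reg = LinearRegression().fit(x_train, y_train)

train_r2 = reg.score(x_train, y_train)  # score on data the model has seen
test_r2 = reg.score(x_test, y_test)     # score on held-out data

# The same test score computed by hand
pred = reg.predict(x_test)
ss_res = np.sum((y_test - pred) ** 2)           # residual sum of squares
ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # total sum of squares
manual_r2 = 1 - ss_res / ss_tot

print(f"train R squared: {train_r2 * 100:.3f} %")
print(f"test R squared: {test_r2 * 100:.3f} %")
```

The score on held-out rows is usually the more honest estimate of how the model behaves on new budgets, because the model never saw those rows during training.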


8. Plotting the model. 

Let's plot our model to see how it looks. For plotting the graph I will use the matplotlib module.


code

plt.scatter(tvadvertising_train,sales_train)

plt.plot(tvadvertising_test,prediction_tv,color='red')

plt.xlabel("TV advertising budget")

plt.ylabel("sales record")

plt.show()  

Here is what the graph for the linear regression model looks like.
In this graph, the scattered dots are the training data, and the red model line is drawn from our test data and the predicted output. We can observe that most of the dots lie close to the line, so we can say that the model has found a good best-fit line.


Predicting sales for Radio advertising 

The same process is repeated with the radio and sales columns to create the model, get the predicted values and plot them.


code

model.fit(radioadvertising_train,sales_train)

prediction_radio=model.predict(radioadvertising_test)  



r2_score = model.score(radioadvertising_train,sales_train)

f"your R square score is {r2_score*100} %"  

 

plt.scatter(radioadvertising_train,sales_train)

plt.plot(radioadvertising_test,prediction_radio,color='red')

plt.xlabel("radio advertising budget")

plt.ylabel("sales record")

plt.show()


Predicting sales for News advertising  

Similarly, the same process is done with the newspaper column and the sales column.


code

model.fit(newsadvertising_train,sales_train)
prediction_news=model.predict(newsadvertising_test)  



r2_score = model.score(newsadvertising_train,sales_train)
f"your R square score is {r2_score*100} %"   


plt.scatter(newsadvertising_train,sales_train)
plt.plot(newsadvertising_test,prediction_news,color='red')
plt.xlabel("news advertising budget")
plt.ylabel("sales record")
plt.show()


From the three models we observed:

accuracy (R-squared score) for the TV advertising model is 81.950 %
accuracy (R-squared score) for the Radio advertising model is 15.583 %
accuracy (R-squared score) for the News advertising model is 1.286 %

Hence the model for TV advertising is the best for sales prediction.
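The three fit-and-score steps can also be written as one loop. The sketch below is only an illustration on made-up data where sales depend mostly on the TV budget; the demo DataFrame and its column names are assumptions. Unlike my code above, it scores each model on the held-out rows instead of the training rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 200

# Hypothetical data: sales driven mainly by TV, a little by radio, not by newspaper
demo = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "Radio": rng.uniform(0, 50, n),
    "Newspaper": rng.uniform(0, 100, n),
})
demo["Sales"] = 7 + 0.05 * demo["TV"] + 0.1 * demo["Radio"] + rng.normal(0, 2, n)

# First 100 rows for training, remaining 100 rows for testing
train, test = demo.iloc[:100], demo.iloc[100:]

scores = {}
for column in ["TV", "Radio", "Newspaper"]:
    reg = LinearRegression()
    reg.fit(train[[column]], train["Sales"])                   # one feature at a time
    scores[column] = reg.score(test[[column]], test["Sales"])  # R squared on test rows

best = max(scores, key=scores.get)
for column, score in scores.items():
    print(f"{column}: R squared = {score * 100:.3f} %")
print("best single feature:", best)
```

A new LinearRegression object is created for every column, so each fitted model stays separate instead of being overwritten by the next fit.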



Follow me on: 

Linkedin - https://www.linkedin.com/in/somen-das-6a933115a/  

Instagram - https://www.instagram.com/somen912/?hl=en

And don't forget to subscribe to the blog.

so...

Thanks for your time and stay creative...





