
The future of AI in business and its potential to transform industries

Making Data clusters for iris dataset using k-means algorithm



Hi... In this project, we will build a machine learning model that groups Iris flowers into clusters, one for each category of flower.

The dataset we are going to use is the Iris dataset.

The category of a flower is decided by four features: sepal length, sepal width, petal length, and petal width.

But first, let's quickly recap what k-means is.

K-means 

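  • K-means comes under the category of unsupervised machine learning: it works on data without labels.
  • The k-means algorithm clusters data by separating the samples into k groups of equal variance, minimizing a criterion known as inertia (the within-cluster sum of squared distances from each sample to its nearest centroid).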
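
To make the two points above concrete, here is a minimal NumPy sketch of the basic k-means loop (assign each sample to its nearest centroid, then move each centroid to the mean of its samples) together with the inertia it ends up with. The function name `kmeans_sketch`, its parameters, and the toy data are made up for illustration; this is a teaching sketch, not how sklearn implements KMeans.

```python
import numpy as np

def kmeans_sketch(X, k, n_iter=100, seed=0):
    """Basic k-means loop for illustration only (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # start from k distinct samples chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # squared distance from every sample to every center, shape (n_samples, k)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # move each center to the mean of its samples (keep it if its cluster is empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # final assignment and inertia: sum of squared distances to the nearest center
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = d2.min(axis=1).sum()
    return labels, centers, inertia

# two well-separated toy blobs (made-up data)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers, inertia = kmeans_sketch(X, k=2)
print(labels, inertia)
```

The lower the inertia, the tighter the clusters sit around their centroids, which is the quantity the elbow method looks at later in this post.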


We will cover this project in the following steps.

1. Importing all required libraries 

2. Loading and preparing data

3. Using Elbow Method

4. Creating a K-means clustering model

5. Plotting graph for Iris Dataset and observe

6. Using Principal component analysis (PCA)

7. Plotting graph for Classified dataset


1. Importing all required libraries

 First of all, we will import all the required modules and libraries for our project. In this project, we will use pandas, NumPy, matplotlib, and sklearn.


code.  

import pandas as pd

import numpy as np

from matplotlib import pyplot as plt

from sklearn.datasets import load_iris

from sklearn.decomposition import PCA  

from sklearn.cluster import KMeans



2. Loading and preparing data

In this section, we will use the built-in iris dataset from the sklearn module. We will also look at its features and arrange the data in a data frame.


code.

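iris=load_iris()

# first five rows of the feature matrix
iris['data'][:5]

# first five species codes (0, 1 or 2)
iris['target'][:5]

iris["feature_names"]

# put the four features and the target side by side in one data frame
data=pd.DataFrame(data=np.c_[iris['data'],iris['target']], columns=iris['feature_names']+['target'])

data.head()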
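
As a side note, newer versions of sklearn (0.23 and later) can hand back the same table directly. The sketch below is only an alternative to the code above; the names `iris_frame`, `data_alt`, and the extra `species` column are my own choices for illustration.

```python
from sklearn.datasets import load_iris

# as_frame=True asks sklearn (0.23 or newer) to return pandas objects
iris_frame = load_iris(as_frame=True)
data_alt = iris_frame.frame  # four feature columns plus 'target'

# map the numeric target codes to species names for readability
data_alt['species'] = data_alt['target'].map(dict(enumerate(iris_frame.target_names)))
print(data_alt.head())
```

The rest of this post keeps using the `data` frame built above, so this variant is optional.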



3. Using Elbow Method

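The elbow method is used to choose a good value of K for K-means.

The elbow method runs k-means clustering on the dataset for a range of values of K (say from 1 to 10) and, for each value of K, computes a score for the resulting clusters. Here the score is the inertia, the sum of squared distances from each sample to its nearest centroid. The score generally keeps falling as K grows, so we look for the "elbow": the point after which adding more clusters gives only a small improvement.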


code. 


sse=[]
# try K = 1 to 10
k_range=range(1,11)

for k in k_range:
    kmeans=KMeans(n_clusters=k)
    kmeans.fit(iris['data'])
    # inertia_ is the sum of squared distances of samples to their nearest centroid
    sse.append(kmeans.inertia_)

plt.plot(k_range,sse)
plt.title("Elbow Method")
plt.xlabel("no. of clusters")
plt.ylabel("inertia (sum of squared distances)")
plt.show()

In the above graph, the elbow is at K=3. Hence we will create 3 clusters.
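
The elbow is normally read off the plot by eye. As a rough numerical cross-check, we can also print how much the inertia drops each time one more cluster is added. This is only a sketch: `n_init=10` and `random_state=0` are my own settings to make the run repeatable, and the variable names are made up.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

X = load_iris()['data']
k_values = list(range(1, 11))
# n_init and random_state are set here only to make the run repeatable
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in k_values]

# how much the inertia falls when one more cluster is added
for k, prev, cur in zip(k_values[1:], inertias, inertias[1:]):
    print(f"k={k}: inertia {cur:.1f}, drop {prev - cur:.1f}")
```

The drops shrink quickly after the first few values of K, which is the same flattening that shows up as the elbow in the plot.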


4. Creating a K-means clustering model  

 Now it's time to create the K-means clustering model. For that, we will create a KMeans model, fit it on the iris features, and read the cluster label it assigns to each sample.

After getting the cluster labels, we store them in the target column of the dataset, replacing the original species codes. Then we collect the rows with target = 0, 1, and 2 into separate data frames.

Finally, we read the centroids from the model's cluster centres.


code  

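model=KMeans(n_clusters=3)
model.fit(iris['data'])

# overwrite the species codes with the cluster labels (0, 1, 2) found by k-means
data['target']=model.labels_

# one data frame per cluster
df1=data[data.target==0]
df2=data[data.target==1]
df3=data[data.target==2]

# centroid coordinates: one row per cluster, one column per feature
model.cluster_centers_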
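
One thing to keep in mind: KMeans numbers its clusters arbitrarily, so cluster 0 is not automatically setosa. A quick way to see which cluster lines up with which species is a cross-table of cluster labels against the true species. The sketch below refits its own model so it runs on its own; `n_init=10`, `random_state=0`, and the variable names are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris()
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(iris['data'])

# true species name for every sample, and the cluster k-means put it in
species = pd.Series(iris['target_names'][iris['target']], name='species')
clusters = pd.Series(model.labels_, name='cluster')

# rows: cluster number, columns: true species, cells: number of flowers
table = pd.crosstab(clusters, species)
print(table)
```

Each row of the table shows which species dominates that cluster, which tells you how to name the clusters in the plots that follow.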


5. Plotting graph for Iris Dataset and observe  

Let's plot the clustered dataset and see how it looks.


code

plt.figure(figsize=(12,4))

# note: KMeans numbers its clusters arbitrarily, so the species names in the
# legends may need to be reordered to match the clusters from your run

# left plot: sepal length vs sepal width
plt.subplot(1,2,1)
plt.scatter(df1['sepal length (cm)'],df1['sepal width (cm)'],color="red",label="setosa")
plt.scatter(df2['sepal length (cm)'],df2['sepal width (cm)'],color="green",label="versicolor")
plt.scatter(df3['sepal length (cm)'],df3['sepal width (cm)'],color="blue",label="virginica")
plt.scatter(model.cluster_centers_[:,0],model.cluster_centers_[:,1],color="cyan",label="centroid")
plt.title('sepal length vs sepal width')
plt.legend()

# right plot: petal length vs petal width
plt.subplot(1,2,2)
plt.scatter(df1['petal length (cm)'],df1['petal width (cm)'],color="red",label="setosa")
plt.scatter(df2['petal length (cm)'],df2['petal width (cm)'],color="green",label="versicolor")
plt.scatter(df3['petal length (cm)'],df3['petal width (cm)'],color="blue",label="virginica")
plt.scatter(model.cluster_centers_[:,2],model.cluster_centers_[:,3],color="cyan",label="centroid")
plt.title('petal length vs petal width')
plt.legend()

plt.show()



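From the above graphs, we can observe that the data points of two of the clusters overlap each other, most visibly in the sepal plot. Because KMeans numbers its clusters arbitrarily, the species names in the legend may not match the true species in your run; in the Iris data it is versicolor and virginica that overlap, while setosa is well separated. Each plot also shows only two of the four features, so next we will combine all four features into a single 2D view.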

6. Using Principal component analysis (PCA) 

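PCA is an unsupervised, non-parametric statistical technique primarily used for dimensionality reduction in machine learning. Here we use it to project the four iris features onto two components so that the clusters can be drawn in a single 2D plot.

PCA can also be used to filter noise out of a dataset.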


code   

pca=PCA(n_components=2)

xp=pca.fit_transform(iris['data'])

xp


newdf=pd.DataFrame(xp,columns=['principle_component1','principle_component2'])

newdf


newdf['target']=model.labels_

newdf
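
Before trusting a 2D picture, it helps to check how much of the original information the two components keep. The sketch below prints the share of the total variance captured by each component; the variable names `X` and `X_2d` are my own.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris()['data']
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# fraction of the total variance captured by each component
print(pca.explained_variance_ratio_)
# fraction captured by both components together
print(pca.explained_variance_ratio_.sum())
```

If the combined share is close to 1, the 2D plot is a fair summary of the four original features.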


7. Plotting graph for Classified dataset


Finally, let's plot our prepared data.

code 

df1=newdf[newdf.target==0]
df2=newdf[newdf.target==1]
df3=newdf[newdf.target==2] 


plt.scatter(df1.principle_component1,df1.principle_component2,color="red",label="setosa",marker='>')
plt.scatter(df2.principle_component1,df2.principle_component2,color="green",label="versicolor",marker='D')
plt.scatter(df3.principle_component1,df3.principle_component2,color="blue",label="virginica")
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.title('Data clusters of setosa, versicolor and virginica')

plt.legend()
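
The centroids from step 4 live in the original four-feature space, so they cannot be drawn on this plot as they are. One possible way to show them is to project them with the same fitted PCA. This is a sketch that refits its own model and PCA so it runs on its own; `n_init=10`, `random_state=0`, and the name `centroids_2d` are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X = load_iris()['data']
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
pca = PCA(n_components=2).fit(X)

# project the 4-feature centroids onto the same two components as the data
centroids_2d = pca.transform(model.cluster_centers_)
print(centroids_2d)
```

The two columns of `centroids_2d` can then be passed to plt.scatter as one more series, the same way the centroids were drawn in step 5.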



Thanks for your time and stay creative...

