Python - Clustering - t-Distributed Stochastic Neighbor Embedding (t-SNE)
Data:
Questionnaire data from mall visitors, containing sex, age, salary & shopping score (200 rows).
Mission:
How to group visitors into clusters by age & salary, visualize the clusters with t-SNE, and predict the cluster of a new visitor from a given age & salary
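To make the prediction step concrete: once K-Means has found its centroids, a new visitor is assigned to the cluster whose centroid is nearest in (age, salary) space. The sketch below illustrates that rule with NumPy only. The centroid values and the helper name `nearest_cluster` are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Hypothetical centroids (age in years, salary in millions); NOT fitted from the real data.
centroids = np.array([
    [25.0, 30.0],
    [45.0, 60.0],
    [60.0, 90.0],
])

def nearest_cluster(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = np.linalg.norm(centroids - np.asarray(point, dtype=float), axis=1)
    return int(np.argmin(distances))

# A 27-year-old with a salary of 35 million lands nearest the first centroid.
label = nearest_cluster([27, 35], centroids)
print("Predicted cluster:", label)
```

With real data the centroids would come from a fitted model (`kmeans.cluster_centers_`), and `kmeans.predict` applies the same nearest-centroid rule.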
Library used:
- Pandas
- Numpy
- Seaborn
- Matplotlib
- Scikit-learn
Code:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
url = 'https://raw.githubusercontent.com/kokocamp/vlog101/master/vlog101.csv'
vlog128 = pd.read_csv(url)
vlog128.info()
X = vlog128[['Usia','Gaji (juta)']]
y = vlog128['Skor Belanja (1-100)']
sns.scatterplot(x="Usia",y="Gaji (juta)",data=vlog128,color="red",alpha=0.5)
n_random = 0
n_kom = 2
perplex = 50
X = np.array(X)
# Fix the seed so the embedding is reproducible between runs
tsne = TSNE(n_components=n_kom, perplexity=perplex, random_state=n_random).fit_transform(X)
#print(tsne)
n_clusters = 5
kmeans = KMeans(n_clusters=n_clusters, random_state=n_random)
kmeans.fit(X)
print(kmeans.cluster_centers_)
nc = []
for i in range(n_clusters):
    nc.append(i)
print(kmeans.labels_)
vlog128["kluster"] = kmeans.labels_
vlog128.head()
fig, ax = plt.subplots()
sct = ax.scatter(tsne[:,0],tsne[:,1], c = vlog128.kluster, marker = "+", alpha = 0.5)
plt.title("t-SNE Clustering Result")
# The axes are t-SNE embedding dimensions, not age and salary
plt.xlabel("t-SNE dimension 1")
plt.ylabel("t-SNE dimension 2")
plt.show()
usia = input("Age (years): ")
usia = int(usia)
gaji = input("Salary (millions): ")
gaji = int(gaji)
data = [usia,gaji]
hasil = kmeans.predict([data])
print("Predicted cluster (0-4): ", hasil)
data = np.array([data])
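# t-SNE cannot project unseen points, so re-embed the data together with the new point
xx = np.append(X,data,axis=0)
tsne2 = TSNE(n_components=n_kom, perplexity=perplex, random_state=n_random).fit_transform(xx)
fig, ax = plt.subplots()
# Original visitors, colored by their K-Means cluster
sct = ax.scatter(tsne2[:-1,0],tsne2[:-1,1], c = vlog128.kluster, marker = "+", alpha = 0.5)
# The new visitor (last row of xx), highlighted in red
ax.scatter(tsne2[-1,0],tsne2[-1,1], marker = "o", c = "red")
plt.title("t-SNE Clustering Result with the New Data Point")
plt.xlabel("t-SNE dimension 1")
plt.ylabel("t-SNE dimension 2")
plt.show()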
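Scikit-learn's `TSNE` offers `fit_transform` but no separate `transform` for unseen points, so one way to place a new visitor on the map is to embed the new point together with the existing data. The following is a minimal sketch of that idea, not a definitive implementation. It uses synthetic data in place of the CSV, and the names `X_demo`, `kmeans_demo`, and `new_point` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Synthetic stand-in for the (age, salary) columns; NOT the real questionnaire data.
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.integers(18, 70, size=200),   # age in years
    rng.integers(15, 140, size=200),  # salary in millions
]).astype(float)

# Cluster in the original 2-D feature space.
kmeans_demo = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_demo)

# New visitor: predict the cluster from the raw features.
new_point = np.array([[30.0, 50.0]])
new_label = int(kmeans_demo.predict(new_point)[0])

# Re-embed the old rows and the new row in a single t-SNE run.
X_all = np.vstack([X_demo, new_point])
emb = TSNE(n_components=2, perplexity=50, random_state=0).fit_transform(X_all)

# The last row of the embedding belongs to the new visitor.
old_emb, new_emb = emb[:-1], emb[-1]
print("Predicted cluster:", new_label)
print("Embedding shapes:", old_emb.shape, new_emb.shape)
```

Note that the cluster label comes from K-Means on the raw features; t-SNE is used only to draw the points, and its coordinates change whenever the embedding is refit.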
I walk through this scenario in a YouTube video.
Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out other algorithms. Thank you for visiting this blog & subscribing to my channel.