Python - Dimension Reduction – Anomaly Detection (Outlier Detection)
Data:
Employee records from the time they applied for the job (40 rows)
Mission:
How to find and interpret anomalies in the data from the graphical output
Libraries used:
Pandas
Numpy
Matplotlib
Seaborn
Scikit-learn
PyOD
Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA as SklearnPCA  # not used below; aliased so it does not clash with PyOD's PCA
from sklearn.preprocessing import StandardScaler
!pip install pyod
from pyod.utils.data import get_outliers_inliers
from pyod.models.pca import PCA
from pyod.utils.data import evaluate_print
from pyod.utils.example import visualize
url = 'https://raw.githubusercontent.com/kokocamp/vlog119/main/vlog119.csv'
vlog134 = pd.read_csv(url)
vlog134.describe()
X = vlog134[['gpa', 'gmat','work_experience']]
y = vlog134['admitted']
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0)
X_train = pd.DataFrame(X_train)
X_train['y'] = y_train
X_test = pd.DataFrame(X_test)
X_test['y'] = y_test
X_train.head()
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
df_train = pd.DataFrame(X_train)
X_test = sc.transform(X_test)  # reuse the scaling fitted on the training set; do not refit on test data
df_test = pd.DataFrame(X_test)
sns.scatterplot(x=df_train[0], y=df_train[1], hue=df_train[3], data=df_train)
plt.title('Ground Truth')
pca = PCA(n_components=3)
pca.fit(X_train)
y_train_pred = pca.labels_
y_train_scores = pca.decision_scores_
sns.scatterplot(x=df_train[0], y=df_train[1], hue=y_train_scores, data=df_train, palette='RdBu_r');
plt.title('PCA Anomaly Scores');
axes = df_train.plot(subplots=True, figsize=(16, 8), title='Simulated Anomaly Data for Training');
plt.show()
axes = df_test.plot(subplots=True, figsize=(16, 8), title='Simulated Anomaly Data for Test');
plt.show()
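To make the meaning of the anomaly scores and the 0/1 labels more concrete, here is a minimal sketch that uses only NumPy. It is an illustration under assumptions, not PyOD's exact method: it scores each row by its PCA reconstruction error (PyOD's own scoring differs in detail), and the helper names `pca_reconstruction_scores` and `flag_outliers`, the synthetic data, and the `contamination` value are all hypothetical and chosen for this example.

```python
import numpy as np

def pca_reconstruction_scores(X, n_components):
    """Score each row by its squared reconstruction error after
    projecting onto the top principal components (a simplified
    stand-in for a PCA-based anomaly score)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    # Rows of Vt are the principal directions of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    reconstructed = (Xc @ components.T) @ components
    return ((Xc - reconstructed) ** 2).sum(axis=1)

def flag_outliers(scores, contamination=0.1):
    """Label the highest-scoring fraction of rows as outliers (1)."""
    threshold = np.percentile(scores, 100 * (1 - contamination))
    return (scores > threshold).astype(int), threshold

# Synthetic data (an assumption for this sketch): 38 rows close to a
# line in 3D, followed by 2 rows placed far away from that line.
rng = np.random.default_rng(0)
t = rng.normal(size=(38, 1))
inliers = np.hstack([t, 2 * t, -t]) + rng.normal(scale=0.05, size=(38, 3))
outliers = np.array([[3.0, -3.0, 3.0], [-3.0, 3.0, 3.0]])
X_demo = np.vstack([inliers, outliers])

scores = pca_reconstruction_scores(X_demo, n_components=1)
labels, threshold = flag_outliers(scores, contamination=0.05)

print("threshold:", threshold)
print("rows flagged as outliers:", np.where(labels == 1)[0])
```

The idea carries over to the plots above: rows with a high score sit far from the structure that the principal components capture, and the label simply records whether a score is above the cut-off implied by the assumed share of outliers.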
I walk through this scenario in the YouTube video below.
Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out other algorithms. Thank you for visiting this blog and subscribing to my channel.
Labels: Programming, Python