Python - Dimension Reduction - Principal Component Analysis (PCA)

Data: 

Job applicants' records (40 rows): GPA, GMAT score, work experience, and whether they were admitted.

Mission:

How to predict the probability that an applicant will be accepted, given their GPA, GMAT score, and work experience.

Library used:

Pandas

Scikit-learn

Seaborn

Code:

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression

from sklearn import metrics

import seaborn as sn

 

from sklearn.decomposition import PCA

from sklearn.preprocessing import StandardScaler

 

url = 'https://raw.githubusercontent.com/kokocamp/vlog119/main/vlog119.csv'

vlog133 = pd.read_csv(url)

vlog133.describe()

 

X = vlog133[['gpa', 'gmat','work_experience']]

y = vlog133['admitted']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
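With test_size=0.25 on a 40-row dataset, the split should leave roughly 30 rows for training and 10 for testing. A quick, optional sanity check (my addition, not part of the original listing):

print(X_train.shape, X_test.shape)  # optional check: expect roughly (30, 3) and (10, 3)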

 

sn.scatterplot(x="gpa",y="admitted",data=vlog133,color="red",alpha=0.5)
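Note that in a Jupyter notebook the scatter plot renders inline automatically; if you run this as a plain .py script instead (an assumption about your setup, not part of the original listing), you will also need matplotlib's show() call:

import matplotlib.pyplot as plt  # only needed when running as a script
plt.show()                       # display the seaborn scatter plot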

 

sc = StandardScaler()

X_train = sc.fit_transform(X_train)

X_test = sc.transform(X_test)
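If you want to verify the standardization, every feature column of the transformed training set should now have roughly zero mean and unit variance. An optional check of my own:

print(X_train.mean(axis=0).round(3))  # optional check: should be ~0 for each feature
print(X_train.std(axis=0).round(3))   # optional check: should be ~1 for each feature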

 

pca = PCA(n_components=3)

X_train = pca.fit_transform(X_train)

X_test = pca.transform(X_test)
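With n_components=3, PCA keeps all three directions, so nothing is discarded yet; the data is only rotated onto uncorrelated axes. To see how much of the variance each principal component carries (and whether you could safely drop one), you can inspect explained_variance_ratio_. This is an optional check I added:

print(pca.explained_variance_ratio_)           # variance share of each principal component
print(pca.explained_variance_ratio_.cumsum())  # cumulative share kept as components are added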

 

logistic_regression = LogisticRegression()

logistic_regression.fit(X_train, y_train)

y_pred = logistic_regression.predict(X_test)

 

confusion_matrix = pd.crosstab(y_test, y_pred, rownames=['Actual'], colnames=['Predicted'])

sn.heatmap(confusion_matrix, annot=True)

 

print('Accuracy: ',metrics.accuracy_score(y_test, y_pred))
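Accuracy alone can hide how the model behaves on each class; if you want per-class precision, recall, and F1, scikit-learn's classification_report works on the same predictions. This is an optional addition of mine:

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))  # optional: per-class precision/recall/F1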

 

print(X_test)  # test dataset

print(y_pred)  # predicted values

 

new_candidates = {'gpa': [2,3.7,3.3,2.3,3],

                  'gmat': [590,740,680,610,710],

                  'work_experience': [3,4,6,1,5]

                  }

 

df2 = pd.DataFrame(new_candidates, columns=['gpa', 'gmat', 'work_experience'])

df2 = sc.transform(df2)

df2 = pca.transform(df2)

y_pred = logistic_regression.predict(df2)

 

print(df2)     # new candidates after scaling and PCA

print(y_pred)  # predicted admission result for each new candidate
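The mission asks for the probability of acceptance, while predict() only returns the 0/1 class label. As an optional extra (not in the original code), predict_proba on the same transformed candidates gives the underlying probabilities:

y_proba = logistic_regression.predict_proba(df2)
print(y_proba[:, 1].round(3))  # estimated probability of admission for each new candidate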

I walk through the scenario in a YouTube video below.



Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out other algorithms. Thank you for visiting this blog and subscribing to my channel.
