Python - Regression - Linear

Data: 

scikit-learn diabetes dataset (442 rows)

Mission: 

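How to predict a person's diabetes progression score (the dataset's target, a quantitative measure of disease progression one year after baseline) from a single feature, body mass index (BMI), using linear regression.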

Libraries used:

  1. Matplotlib
  2. NumPy
  3. pandas
  4. scikit-learn

Code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.metrics import r2_score

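# Load the diabetes dataset as a feature matrix X (442 rows, 10 columns) and a target vector y
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Show summary statistics for all features
pd.options.display.float_format = '{:,.4f}'.format
data = pd.DataFrame(data=diabetes_X)
print(data.describe())

# Keep only the feature at column index 2 (BMI), as a 2-D column
diabetes_X = diabetes_X[:, np.newaxis, 2]

data = pd.DataFrame(data=diabetes_X)
print(data.describe())

# Hold out the last 20 rows for testing and train on the rest
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

data = pd.DataFrame(data=diabetes_X_train)
print(data.describe())

data = pd.DataFrame(data=diabetes_X_test)
print(data.describe())

# Split the target the same way
diabetes_y_train = diabetes_y[:-20]
diabetes_y_test = diabetes_y[-20:]

# Fit a linear regression model on the training set
regr = linear_model.LinearRegression()
regr.fit(diabetes_X_train, diabetes_y_train)

# Predict on the held-out test set
diabetes_y_pred = regr.predict(diabetes_X_test)

# Coefficient of determination: 1 is a perfect prediction
print('Coefficient of determination (r2): %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))

# Training points in green, test points in black, fitted line in blue
plt.scatter(diabetes_X_train, diabetes_y_train, color='green')
plt.scatter(diabetes_X_test, diabetes_y_test, color='black')
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue', linewidth=3)

plt.show()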
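
To make the fit and the r2 score less of a black box, here is a minimal sketch of the same idea in plain Python: a one-feature least-squares line and the coefficient of determination, 1 - SS_res / SS_tot. The helper names fit_line and r2 and all the numbers are made up for illustration; they are not the diabetes data and not scikit-learn's internals.

```python
# Minimal sketch: one-feature ordinary least squares and R^2 in plain Python.
# The numbers below are made-up illustration data, not the diabetes dataset.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r2(ys_true, ys_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys_true) / len(ys_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(ys_true, ys_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in ys_true)
    return 1 - ss_res / ss_tot

# Hypothetical training data lying exactly on y = 2x + 1
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]

# Hypothetical held-out data that deviates a little from the line
x_test = [5.0, 6.0, 7.0]
y_test = [11.0, 12.0, 16.0]

slope, intercept = fit_line(x_train, y_train)
y_pred = [slope * x + intercept for x in x_test]
score = r2(y_test, y_pred)

print('slope=%.2f intercept=%.2f r2=%.2f' % (slope, intercept, score))
```

As in the post's code, the score is computed on the held-out points only, so it tells you how well the line generalizes rather than how well it fits the training data.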


I walk through this scenario in the YouTube video below.


Thank you for visiting this blog and subscribing to my channel.



The Algorithms of Artificial Intelligence

I'm new to Python programming. I only started in the middle of 2020, for a specific purpose at the company where I worked. I've spent a couple of months exploring what is what in Python and, since I found no single best-fit guide to machine learning methodologies, I summarize here the 17 common algorithms I came across.

A. Supervised Learning

Split into 2 methods:

  1. Regression
    • Linear
    • Logistic
    • Polynomial
  2. Classification
    • K-Nearest Neighbors (KNN)
    • Decision Tree (DT)
    • Naive Bayes (NB)
    • Support Vector Machine (SVM)

B. Unsupervised Learning 

Split into 3 methods with 2 models (ML & DL):

  • Machine Learning
    1. Clustering
      • K-Means
      • Hierarchical Clustering
      • t-SNE Clustering
      • DBSCAN
    2. Dimension Reduction
      • Principal Component Analysis
      • Anomaly Detection
      • Auto-Encoder
      • Hebbian Learning
  • Deep Learning
    1. Generative Models
      • Generative Adversarial Network
      • Self Organizing Maps

I summarized this post in a video I published on YouTube:

Thank you for reading and subscribing. I'll update this post as soon as I find any new algorithms.


Importing CSV from Various Data Sources in Python

In the previous post (https://paparadit.blogspot.com/2020/11/python-common-libraries.html), we learned how to work with libraries in Python. In this post, I'll show you how to import CSV data using Python. I'll update this post as I discover other ways to do it:

1. CSV from local computer

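import io

import pandas as pd
from google.colab import files

# Opens Colab's file picker; returns a dict of {file name: file bytes}
uploaded = files.upload()

# Wrap the uploaded bytes in a file-like object so pandas can read them
vlog96 = pd.read_csv(io.BytesIO(uploaded['vlog96.csv']))
vlog96.head()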
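
The io.BytesIO step works because files.upload() hands back the raw bytes of each file, keyed by file name, and pd.read_csv accepts any file-like object. The sketch below mimics that outside Colab with a hand-made dict and the standard csv module, so it runs anywhere. The column names and values are made up for illustration and are not the real contents of vlog96.csv.

```python
import csv
import io

# Stand-in for the dict returned by files.upload() in Colab:
# keys are file names, values are the raw file bytes.
# The file content here is made-up illustration data.
uploaded = {'vlog96.csv': b'name,views\nintro,120\nsetup,85\n'}

# Wrap the bytes in a file-like object, then decode to text for the csv module.
raw = io.BytesIO(uploaded['vlog96.csv'])
text = io.TextIOWrapper(raw, encoding='utf-8', newline='')

# Each row becomes a dict keyed by the header line.
rows = list(csv.DictReader(text))

print(rows[0]['name'], rows[0]['views'])
```

In Colab you would pass the same io.BytesIO object straight to pd.read_csv, as in the code above.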

2. CSV from GitHub

import pandas as pd
url = 'https://raw.githubusercontent.com/kokocamp/vlog96/master/vlog96.csv'
vlog96 = pd.read_csv(url)
vlog96.head()

Watch my video below to practice importing CSV files from a local computer and from GitHub.


3. CSV from Google Drive (#1)

The long way:

import pandas as pd

# Code to read a CSV file into Colaboratory:
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# The file ID is the part of the share link after 'id='
link = 'https://drive.google.com/open?id=1-2lhLcwe9QNT8okL69QL_tV-O4BN9um6'
fluff, file_id = link.split('=')  # renamed from id to avoid shadowing the built-in
print(file_id)

# Download the Drive file into the Colab session, then read it with pandas
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('vlog98.csv')
vlog98 = pd.read_csv('vlog98.csv')
vlog98.head()
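
Splitting the link on '=' only works while the link contains exactly one '='. If the share link carries extra query parameters, the two-name unpacking raises a ValueError. As a hedged alternative, the sketch below reads the id parameter with the standard urllib.parse module. The helper name drive_file_id is made up for this example, and the sketch assumes the link uses the ?id=... form shown above.

```python
from urllib.parse import urlparse, parse_qs

def drive_file_id(link):
    """Return the value of the 'id' query parameter from a Drive share link."""
    query = parse_qs(urlparse(link).query)
    ids = query.get('id')
    if not ids:
        raise ValueError('no id parameter found in link: ' + link)
    return ids[0]

link = 'https://drive.google.com/open?id=1-2lhLcwe9QNT8okL69QL_tV-O4BN9um6'
file_id = drive_file_id(link)
print(file_id)
```

The resulting file_id can be passed to drive.CreateFile({'id': file_id}) exactly as in the code above.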

4. CSV from Google Drive (#2)

The simple way: 

import pandas as pd

from google.colab import drive
drive.mount('/content/drive')

path = '/content/drive/My Drive/data/vlog96.csv'
vlog96 = pd.read_csv(path)
vlog96.head()

I cover both ways in the video below:


For the GitHub source, you can use the CSV from my account (https://github.com/kokocamp), or you can create your own.
