How to Create Text to Speech in Google Colab Using the gTTS Library

Google Text-to-Speech (gTTS) is one of several text-to-speech libraries available on Google Colab. Its code is simple and reliable, and it provides a language parameter so the generated speech matches the language you choose (for example, lang='id' for Indonesian). Don't forget to import an audio player as well, since gTTS creates an audio file on the fly and you need a player to actually hear it. Check the compact code below and run it in Google Colab:

!pip install gTTS
from gtts import gTTS
from IPython.display import Audio

# Generate Indonesian speech (lang='id')
tts = gTTS('Selamat datang di channel youtube eko wahyudiharto', lang='id')

# gTTS always produces MP3 data, so save it with an .mp3 extension
tts.save('1.mp3')
sound_file = '1.mp3'
Audio(sound_file, autoplay=True)
 

I wrapped the code above into a 5-minute video on my YouTube channel below (in Bahasa Indonesia). Please subscribe if you haven't already, give the video a thumbs up, or share it if you like it.


PS: I still haven't figured out the cleanest way to play a list of strings in a 'for' loop. Can anyone help? Share your thoughts in the comment box below. Thank you!
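For what it's worth, one approach that seems to work in Colab is to call display() on each Audio object inside the loop, since a notebook cell only auto-displays its last expression. A minimal sketch (the sentence list is just an example I made up):

from gtts import gTTS
from IPython.display import Audio, display

kalimat = ['Selamat pagi', 'Selamat siang', 'Selamat malam']  # example sentences

for i, teks in enumerate(kalimat):
    tts = gTTS(teks, lang='id')
    nama_file = str(i) + '.mp3'
    tts.save(nama_file)
    # display() is needed inside the loop; a bare Audio(...) would only show the last player
    display(Audio(nama_file, autoplay=False))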


Tutorial Indonesian Natural Language Processing using Sastrawi | Google Colab Python

Definition of terms:

Sastrawi is a simple library, originally written in PHP and also available as a Python package, that allows you to reduce inflected words in Indonesian (Bahasa Indonesia) to their basic form (stem).

Cleansing is the activity of systematically cleaning up data using certain algorithms, for example removing numbers, punctuation, and stray whitespace before analysis.

Stemming is the process of reducing affixed words to their root words.

Tokenizing is the process of splitting text, which can be a sentence, a paragraph, or a whole document, into certain tokens or parts. Tokenization is often used in linguistics, and the resulting tokens are useful for further text analysis.

Stop words are common words that appear in large numbers but carry little meaning on their own. Stop-word lists are commonly used in information retrieval tasks, including by Google.
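To make these terms concrete before scraping a whole web page, here is a minimal sketch that stems, tokenizes, and filters a single hand-written sentence (the sentence and the tiny stop-word set are only for illustration; the full pipeline below uses NLTK's Indonesian stop-word list instead):

!pip install Sastrawi
from Sastrawi.Stemmer.StemmerFactory import StemmerFactory

stemmer = StemmerFactory().create_stemmer()

kalimat = 'Perekonomian Indonesia sedang dalam pertumbuhan yang membanggakan'

# Stemming: affixed words are reduced to their roots,
# e.g. 'perekonomian' -> 'ekonomi', 'pertumbuhan' -> 'tumbuh'
hasil_stem = stemmer.stem(kalimat)
print(hasil_stem)

# Tokenizing: split the stemmed text into individual tokens
tokens = hasil_stem.split()
print(tokens)

# Stop-word removal, using a tiny illustrative stop-word set
stop_words = {'sedang', 'dalam', 'yang'}
print([t for t in tokens if t not in stop_words])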

Source:

import requests
import string
import re

from bs4 import BeautifulSoup
import nltk
from nltk.corpus import stopwords

!pip install Sastrawi
from Sastrawi.Stemmer.StemmerFactory import StemmerFactory

web = requests.get('https://wartakota.tribunnews.com/').text
soup = BeautifulSoup(web, 'html.parser')
for s in soup(['script', 'style']):
        s.decompose()
teks = ' '.join(soup.stripped_strings)
print (teks)

teks = teks.lower()
teks = re.sub(r"\d+", "", teks) #remove number
teks = teks.translate(str.maketrans("","",string.punctuation)) #remove punctuation
teks = teks.strip() #remove empty character

factory = StemmerFactory()
stemmer = factory.create_stemmer()
output   = stemmer.stem(teks)
print (output)

tokens = [t for t in output.split()]
print(tokens)

nltk.download('stopwords') # download the NLTK stop-word corpus, which includes an Indonesian list
clean_tokens = tokens[:]
for token in tokens:
  if token in stopwords.words('indonesian'):
      clean_tokens.remove(token)

freq = nltk.FreqDist(clean_tokens)
for key,val in freq.items():
  print(str(key) + ':' + str(val))

freq.plot(30)
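If you only want the top of the frequency list instead of printing every token, FreqDist also provides most_common(); for example:

# The 10 most frequent non-stop-word tokens as (word, count) pairs
print(freq.most_common(10))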

And I wrapped it all into a single video below:


Please support this blog and my video channel by hitting the subscribe button, or by liking & sharing if you enjoy it.


Python - Dimension Reduction – Self-Organizing Maps (SOM)

Data:

Job applicant records with GPA, GMAT score, work experience, and admission status (40 rows)

 

Mission:

How to visualize a data mapping using the SOM algorithm

 

Library used:

Pandas

Scikit

SimpSOM

 

Code:

import pandas as pd
from sklearn.preprocessing import StandardScaler

!pip install SimpSOM
import SimpSOM as sps

url = 'https://raw.githubusercontent.com/kokocamp/vlog119/main/vlog119.csv'
vlog139 = pd.read_csv(url)

X = vlog139[['gpa','gmat','work_experience','admitted']]
y = vlog139['admitted']

# Standardize the features (zero mean, unit variance)
scaler = StandardScaler()
data = scaler.fit_transform(pd.DataFrame(X))
labels = scaler.fit_transform(pd.DataFrame(y))

print(data)

# Build a 50x50 self-organizing map from the scaled data
net = sps.somNet(50, 50, data)

# Train the network for 1000 epochs with an initial learning rate of 0.01
net.train(0.01, 1000)

# Print a map of the network nodes, coloured by the first feature (column number 0) of the dataset,
# and then by the distance between each node and its neighbours
net.nodes_graph()
net.diff_graph()

# Project the datapoints onto the new 2D network map
net.project(data, labels=labels)

# Cluster the datapoints according to the Quality Threshold algorithm
net.cluster(data)
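As a side note, the StandardScaler step above rescales every column to zero mean and unit variance, which keeps a feature like gmat (in the hundreds) from dominating gpa (single digits) in the SOM's distance calculations. A tiny standalone sketch with made-up numbers shows the effect:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Two made-up columns on very different scales (think gpa vs gmat)
contoh = np.array([[2.0, 590.0],
                   [3.0, 680.0],
                   [3.7, 740.0]])

scaler = StandardScaler()
print(scaler.fit_transform(contoh))
# Each column now has mean 0 and standard deviation 1,
# so both contribute comparably to the map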

 

I wrapped the scenario in a Youtube video below.


 

Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out the other algorithms. Thank you for visiting this blog, and please subscribe to my channel.


Python - Dimension Reduction – Generative Adversarial Network (GAN)

Data: 

Salary (Gaji) by job level (11 rows)

 

Mission:

How to generate new, synthetic data points from the existing data

 

Library used:

Matplotlib

Numpy

Pandas

PyTorch

 

Code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from torch import nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(2, 256),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        output = self.model(x)
        return output

discriminator = Discriminator()

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(2, 16),
            nn.ReLU(),
            nn.Linear(16, 32),
            nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        output = self.model(x)
        return output

generator = Generator()

url = 'https://raw.githubusercontent.com/kokocamp/vlog120/main/vlog120.csv'
vlog138 = pd.read_csv(url)
print(vlog138)

X = vlog138['Level']
y = vlog138['Gaji']

tensor_X = torch.from_numpy(X.values)
tensor_y = torch.from_numpy(y.values)

# Build the real training set as (Level, Gaji) pairs
train_data_length = 10
train_data = torch.zeros((train_data_length, 2))
train_data[:, 0] = tensor_X
train_data[:, 1] = tensor_y
train_labels = torch.zeros(train_data_length)
train_set = [
    (train_data[i], train_labels[i]) for i in range(train_data_length)
]
plt.plot(train_data[:, 0], train_data[:, 1], ".")
#==============================
batch_size = 10
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=batch_size, shuffle=True
)
#==============================
lr = 0.001
num_epochs = 10
loss_function = nn.BCELoss()

optimizer_discriminator = torch.optim.Adam(discriminator.parameters(), lr=lr)
optimizer_generator = torch.optim.Adam(generator.parameters(), lr=lr)

for epoch in range(num_epochs):
    for n, (real_samples, _) in enumerate(train_loader):
        # Data for training the discriminator
        real_samples_labels = torch.ones((batch_size, 1))
        latent_space_samples = torch.randn((batch_size, 2))
        generated_samples = generator(latent_space_samples)
        generated_samples_labels = torch.zeros((batch_size, 1))
        all_samples = torch.cat((real_samples, generated_samples))
        all_samples_labels = torch.cat(
            (real_samples_labels, generated_samples_labels)
        )

        # Training the discriminator
        discriminator.zero_grad()
        output_discriminator = discriminator(all_samples)
        loss_discriminator = loss_function(output_discriminator, all_samples_labels)
        loss_discriminator.backward()
        optimizer_discriminator.step()

        # Data for training the generator
        latent_space_samples = torch.randn((batch_size, 2))

        # Training the generator
        generator.zero_grad()
        generated_samples = generator(latent_space_samples)
        output_discriminator_generated = discriminator(generated_samples)
        loss_generator = loss_function(
            output_discriminator_generated, real_samples_labels
        )

        loss_generator.backward()
        optimizer_generator.step()

        # Show loss on the last batch of every 10th epoch
        if epoch % 10 == 0 and n == len(train_loader) - 1:
            print(f"Epoch: {epoch} Loss D.: {loss_discriminator}")
            print(f"Epoch: {epoch} Loss G.: {loss_generator}")
#==============================
latent_space_samples = torch.randn(10, 2)
generated_samples = generator(latent_space_samples)

generated_samples = generated_samples.detach()
plt.plot(generated_samples[:, 0], generated_samples[:, 1], ".")
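If you would rather inspect the generated (Level, Gaji) pairs as a table than as a scatter plot, here is a small follow-up sketch reusing the generated_samples tensor from the last step above:

import pandas as pd

# Convert the detached tensor into a DataFrame with the same column names as the CSV
hasil = pd.DataFrame(generated_samples.numpy(), columns=['Level', 'Gaji'])
print(hasil)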

 

I wrapped the scenario in a Youtube video below.


 

Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out the other algorithms. Thank you for visiting this blog, and please subscribe to my channel.


Python - Dimension Reduction – Hebbian Learning

Data: 

Job applicant records with GPA, GMAT score, work experience, and admission status (40 rows)

 

Mission: 

How to predict whether a candidate will be admitted, given their gpa, gmat & work_experience

 

Library used: 

Numpy

Pandas

Scikit

Neupy

 

Code:

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

!pip install neupy
from neupy import algorithms

url = 'https://raw.githubusercontent.com/kokocamp/vlog119/main/vlog119.csv'
vlog136 = pd.read_csv(url)
vlog136.describe()

X = vlog136[['gpa','gmat','work_experience']]
y = vlog136['admitted']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Standardize the features (zero mean, unit variance)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Hebbian learning network: 3 inputs (gpa, gmat, work_experience), 1 output (admitted)
hebbnet = algorithms.HebbRule(
    n_inputs=3,
    n_outputs=1,
    n_unconditioned=1,
    step=0.1,
    decay_rate=0.2,
    verbose=True
)
hebbnet.train(X_train, epochs=10)
y_pred = hebbnet.predict(X_test)

print(y_pred)

print('Accuracy : ' + str(accuracy_score(y_test, y_pred)))

# Score five new (hypothetical) candidates
new_candidates = {'gpa': [2,3.7,3.3,2.3,3],
                  'gmat': [590,740,680,610,710],
                  'work_experience': [3,4,6,1,5]
                  }

df2 = pd.DataFrame(new_candidates, columns=['gpa','gmat','work_experience'])
df2 = sc.transform(df2)
y_pred = hebbnet.predict(df2)

print(df2)
print(y_pred)
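To read those last two print-outs side by side, one option is to put the hypothetical candidates and their predictions into a single table (a small sketch reusing new_candidates and y_pred from above):

import numpy as np
import pandas as pd

# Pair each candidate with the network's prediction
hasil = pd.DataFrame(new_candidates)
hasil['predicted_admitted'] = np.ravel(y_pred)
print(hasil)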

I wrapped the scenario in a Youtube video below.


 

Click this link (http://paparadit.blogspot.com/2020/11/the-algorithms-of-machine-learning.html) if you want to check out the other algorithms. Thank you for visiting this blog, and please subscribe to my channel.
