Digits Recognizer using Python and React. Train the model.

published April 4, 2018

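Today's topic is an introduction to machine learning combined with computer vision. I will show you how to fit a kNN (k-nearest neighbors) model to handwritten digits data and assess its accuracy. The data is the digits dataset bundled with scikit-learn, a small set of 8x8 images in the same spirit as the MNIST Database.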

#Environment setup

I chose Anaconda as my environment, so I'll skip the dependency installation step.

#Import dependencies

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
import cv2

#Prepare data

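A handwritten digits dataset is available in the datasets module of sklearn. Note that load_digits returns 1,797 small 8x8 images derived from the UCI handwritten digits data rather than the full 28x28 MNIST Database. Let's start by loading the data.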

digits = datasets.load_digits()
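Before going further, it can help to look at what was loaded. The snippet below is a small sketch that only inspects attributes of the object returned by load_digits; nothing in it is specific to this article beyond the `digits` name.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

# Flattened feature matrix: one row per image, one column per pixel.
print(digits.data.shape)    # (1797, 64)

# The same images kept as 8x8 arrays, handy for plotting.
print(digits.images.shape)  # (1797, 8, 8)

# The distinct labels found in the target vector.
labels = np.unique(digits.target)
print(labels)
```

Each row of `digits.data` is simply the corresponding 8x8 image flattened into 64 values, which is the format the classifier expects.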

We need two separate subsets: one for training our model and one for testing it. Here 25% of the samples are held out for testing, and a fixed random_state makes the split reproducible.

(X_train, X_test, y_train, y_test) = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)
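To confirm the split did what we asked, we can check the sizes of the two subsets. This is a minimal sketch; the `test_fraction` variable is introduced here only for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)

# Number of samples and features in each subset.
print(X_train.shape, X_test.shape)

# Fraction of all samples that ended up in the test set.
test_fraction = len(X_test) / len(digits.data)
print(f"held-out fraction: {test_fraction:.2f}")
```

The test set is never touched while choosing k, so it can give an honest estimate of accuracy at the end.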

#Fit the model

Let's find the best value for the parameter k. We can't just guess it, so let's estimate the accuracy of each candidate k with 5-fold cross-validation on the training data.

ks = np.arange(2, 10)
scores = []
for k in ks:
    model = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation on the training set only
    score = cross_val_score(model, X_train, y_train, cv=5)
    scores.append(score.mean())
# Plot the mean cross-validated accuracy for each k
plt.plot(ks, scores)
plt.xlabel('k')
plt.ylabel('accuracy')
plt.show()

The output is a plot of the mean cross-validated accuracy for each value of k.

The chart shows that the best accuracy was reached at k=3, so from now on we'll use k=3 for our model.
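Reading the best k off a chart works, but it can also be picked programmatically. The sketch below repeats the same cross-validation and selects the k with the highest mean score using np.argmax; the `best_k` name is introduced here for illustration, and the exact value you get may depend on your scikit-learn version.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)

ks = np.arange(2, 10)
# Mean 5-fold cross-validated accuracy for each candidate k.
scores = np.array([
    cross_val_score(KNeighborsClassifier(n_neighbors=k), X_train, y_train, cv=5).mean()
    for k in ks
])

# np.argmax returns the first index of the maximum, so ties go to the smaller k.
best_k = int(ks[np.argmax(scores)])
print(f"best k = {best_k}, mean CV accuracy = {scores.max():.3f}")
```

scikit-learn also offers GridSearchCV for this kind of search, which may be more convenient once there is more than one parameter to tune.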

#Evaluate the model on the test data

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

z = model.predict(X_test)

Let's now create a classification report to see the precision, recall and F1-score for each digit, along with the overall accuracy.

print(classification_report(y_test, z))
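If you want a single number and a view of which digits get confused with each other, accuracy_score and confusion_matrix from sklearn.metrics can complement the report. This is a sketch that rebuilds the same model so it can run on its own; the `accuracy` and `cm` names are illustrative.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
z = model.predict(X_test)

# Fraction of test samples that were classified correctly.
accuracy = accuracy_score(y_test, z)
print(f"accuracy: {accuracy:.3f}")

# Rows are true digits, columns are predicted digits.
cm = confusion_matrix(y_test, z)
print(cm)
```

Off-diagonal entries in the matrix show which pairs of digits the model mixes up.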

Amazing! We've reached 99% accuracy on the test data!

Originally published at teimurjan.github.io.

Teimur Gasanov

Python/Go/Javascript full stack developer
