Digit recognition

Problem Statement

Aim: To recognize a single digit saved as an image on a PC and display the result.

Demo

The image on the PC is sent to the TIVA microcontroller.
The image being recognized is displayed on the NOKIA LCD display.
The predicted digit is displayed on the 16×2 LCD display.

Demo video

Data transfer

The image is saved on the PC as an 8-bit 28×28 grayscale image and sent to the TIVA microcontroller by UART file transfer. A write starts with the character ‘s’, followed by the 784 pixel values in column-major order. The TIVA receives the image in its UART interrupt service routine. Once the transfer is complete, the image is displayed on the NOKIA LCD, and the prediction algorithm starts as soon as the image is displayed. When prediction finishes, the predicted digit is sent to the 16×2 LCD display. To read back the currently loaded image, the PC sends the character ‘r’ over UART; the microcontroller responds with the 784 pixel values.
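The framing described above can be sketched on the PC side as follows. This is a minimal Python sketch, not the project's Octave script. The function names build_frame and parse_readback are hypothetical, and it assumes each pixel travels as one raw byte.

```python
IMAGE_SIDE = 28                         # the image is 28x28 pixels
PIXEL_COUNT = IMAGE_SIDE * IMAGE_SIDE   # 784 pixel values per image


def build_frame(image):
    """Return the bytes to write to the UART: 's' then 784 pixels, column-major.

    `image` is a list of 28 rows, each a list of 28 integers in 0..255.
    Assumption: one raw byte per pixel.
    """
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError("expected a 28x28 image")
    # Column-major order: walk down each column before moving to the next one.
    pixels = [image[row][col]
              for col in range(IMAGE_SIDE)
              for row in range(IMAGE_SIDE)]
    return b"s" + bytes(pixels)


def parse_readback(payload):
    """Rebuild the 28x28 image from the 784 bytes returned after sending 'r'."""
    if len(payload) != PIXEL_COUNT:
        raise ValueError("expected 784 pixel values")
    return [[payload[col * IMAGE_SIDE + row] for col in range(IMAGE_SIDE)]
            for row in range(IMAGE_SIDE)]
```

The bytes returned by build_frame could be written to the serial port with any serial library. parse_readback assumes the readback uses the same column-major layout as the write.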

Learning Objectives

Interfacing NOKIA LCD display
Neural Network training and prediction
UART file transfer
Serial port access using Octave

Hardware connections

Figure: Hardware connections

16×2 LCD Interfacing

The 16×2 LCD interface consists of 8 data lines and 3 control lines: Enable, Register Select and Read/Write. Read/Write is always held in the write state. Register Select is driven low to address the command register and high to address the data register. The command or data is latched on the falling edge of the Enable pin.
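The write sequence can be modelled as a short list of pin states. This is an illustrative Python sketch, not the TIVA firmware. The name lcd_write_steps and the tuple layout are assumptions, and the timing delays a real display needs are left out.

```python
def lcd_write_steps(value, is_data):
    """Return the pin states for one 8-bit write as (rs, rw, enable, data) tuples.

    rs: 0 selects the command register, 1 selects the data register.
    rw: always 0, because the display is only ever written to.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit on 8 data lines")
    rs = 1 if is_data else 0
    return [
        (rs, 0, 0, value),  # set up RS, R/W and the data lines with Enable low
        (rs, 0, 1, value),  # raise Enable
        (rs, 0, 0, value),  # lower Enable to complete the pulse
    ]
```

For example, lcd_write_steps(ord("7"), is_data=True) gives the three steps needed to show the character 7, with the data lines held steady across the whole Enable pulse.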

Figure: 16×2 LCD interfacing

NOKIA LCD Interfacing

The NOKIA 6100 LCD is driven over SPI, implemented by bit banging. Each command/data transfer contains 9 bits, sent MSB first on the MOSI pin. The MSB is the command/data flag: logic low indicates a command and logic high indicates data. The remaining 8 bits carry the actual command or data. Each bit is sampled on the low to high edge of the clock. Chip select must be driven low before each transaction.
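The order in which the bits of one transfer leave the MOSI pin can be sketched as below: one command/data flag bit followed by the 8 bits of the command or data, MSB first. The helper name nokia_frame_bits is hypothetical, and the caller passes the flag value explicitly. Clock and chip-select toggling are not modelled.

```python
def nokia_frame_bits(flag_bit, value):
    """Return the 9 bits of one transfer, in the order they are shifted out.

    flag_bit: the leading command/data flag (0 or 1).
    value: the 8-bit command or data byte that follows the flag.
    """
    if flag_bit not in (0, 1):
        raise ValueError("flag_bit must be 0 or 1")
    if not 0 <= value <= 0xFF:
        raise ValueError("value must be an 8-bit quantity")
    word = (flag_bit << 8) | value  # the flag occupies the MSB of the 9-bit word
    # Walk from bit 8 down to bit 0 so the MSB is emitted first.
    return [(word >> shift) & 1 for shift in range(8, -1, -1)]
```

A bit-banging routine would place each of these bits on MOSI in turn and pulse the clock once per bit.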

Training and prediction

Neural networks are used to recognize the digit. Training is done on the PC using Octave. The sigmoid neuron is the basic building block of the network. A neuron takes several inputs (x1, x2, …) and produces a single output. Each input can take any value between 0 and 1. The neuron has a weight for each input (w1, w2, …) and an overall bias, b. Its output is σ(w·x + b), where σ is the sigmoid function, σ(z) = 1/(1 + e^(−z)).
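A single sigmoid neuron can be written in a few lines. This is a minimal Python sketch for illustration; the function names are hypothetical, and it uses the standard logistic definition of the sigmoid, 1/(1 + e^(−z)).

```python
import math


def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, inputs, bias):
    """Output of one sigmoid neuron: sigmoid(w . x + b)."""
    if len(weights) != len(inputs):
        raise ValueError("one weight per input is required")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(weighted_sum + bias)
```

Whatever the weights and bias, the output always lies strictly between 0 and 1, so it can feed the next layer directly.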

The neural network used for digit recognition has 3 layers: an input layer of 784 (28×28) neurons, a hidden layer of 20 neurons and an output layer of 10 neurons.

Figure: Neural network for digit recognition

The network is trained with 60000 images of handwritten digits from the MNIST database (http://yann.lecun.com/exdb/mnist/). The goal of training is to minimize the cost function, which is the average norm of the error between the predicted value and the target value. The cost function is minimized using the stochastic gradient descent algorithm, updating the weights by back propagation after every image prediction. Gradient descent relies on the fact that the gradient of a multi-dimensional function points in the direction of maximum increase of the function, with its magnitude being the maximum rate of change, so a step in the opposite direction reduces the cost. Each iteration therefore reduces the error provided the step taken is small enough. At the same time, the step should be large enough for training to converge quickly. There are 784*20 weights for the hidden layer and 20*10 weights for the output layer.
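The effect of the step size can be seen on a one-dimensional toy problem. The sketch below is not the project's training code: the cost C(w) = w^2 is a stand-in for the network's cost function, and all names and learning rates are chosen only for illustration.

```python
def gradient_descent_step(weight, gradient, learning_rate):
    """Move the weight against the gradient, scaled by the learning rate."""
    return weight - learning_rate * gradient


def toy_cost(weight):
    """Toy cost C(w) = w^2, standing in for the network's cost function."""
    return weight * weight


def toy_gradient(weight):
    """Derivative of the toy cost: dC/dw = 2w."""
    return 2.0 * weight


start = 1.0
# A small step moves towards the minimum at w = 0 and lowers the cost.
small_step = gradient_descent_step(start, toy_gradient(start), learning_rate=0.1)
# An oversized step jumps past the minimum and raises the cost instead.
large_step = gradient_descent_step(start, toy_gradient(start), learning_rate=1.5)
```

Here toy_cost(small_step) is lower than toy_cost(start), while toy_cost(large_step) is higher. That is the trade-off between a step small enough to reduce the error and one large enough to converge quickly.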

Code used for Training: https://github.com/gitofsachin/digitrecognition/blob/master/sourcecode/PCTraining/training.o
Output of training: Weights for hidden layer and output layer in separate files (hiddenweights1 & outputweights1)

The weights are obtained with an error of 0.06 after 60000 iterations of back propagation. These weights are ported to the TIVA microcontroller and used to compute the output of the neural network. The microcontroller predicts the digit from the 784 pixel values received over UART and the ported weights. The output vector is found using the following equations:
Output of the hidden layer (input to the output layer), [hiddenOutput]_20×1 = sigmoid([hiddenWeights]_20×784 * [inputVector]_784×1)
Output of the output layer, [outputVector]_10×1 = sigmoid([outputWeights]_10×20 * [hiddenOutput]_20×1)

The index of the output neuron with the highest output value is the predicted digit.
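The two equations and the final selection step can be sketched as follows. This is an illustrative Python version, not the C code that runs on the TIVA. The function names are hypothetical, weight matrices are assumed to be stored as lists of rows, and any scaling of the pixel values is left out.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, inputs):
    """sigmoid(W * x) for a weight matrix stored as a list of rows."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def predict_digit(hidden_weights, output_weights, input_vector):
    """Return the index of the output neuron with the highest value.

    For the network described here, hidden_weights would be 20x784,
    output_weights 10x20, and input_vector would hold 784 pixel values.
    """
    hidden_output = layer_output(hidden_weights, input_vector)
    output_vector = layer_output(output_weights, hidden_output)
    return max(range(len(output_vector)), key=lambda i: output_vector[i])
```

The same functions work for any layer sizes, so they can be tried on a tiny network before loading the real weight files.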

Code used for Prediction in TIVA: https://github.com/gitofsachin/digitrecognition/tree/master/sourcecode/TIVA

The training and prediction algorithms for digit recognition are explained at http://neuralnetworksanddeeplearning.com/

Components Used

TIVA microcontroller
NOKIA LCD Display
16×2 LCD Display

Software used

Eclipse IDE
TivaWare Peripheral Driver Library
gcc
Octave

References

16×2 LCD Datasheet – https://www.sparkfun.com/datasheets/LCD/ADM1602K-NSW-FBS-3.3v.pdf
MNIST digit database – http://yann.lecun.com/exdb/mnist/
Neural network notes – http://neuralnetworksanddeeplearning.com/

Source code

All the source code can be found at https://github.com/gitofsachin/digitrecognition

Future scope

Training implementation on TIVA
Image read from SD card
Letter and multi-character recognition

Team Members

Sachin S
Sajna Remi Clere
