Let’s come to the trickiest (and most challenging) part: the training itself.
I’m using TensorFlow with Keras (for a reason that will become clear below).
What do we need for training?
First: a lot (and I mean: a lot) of images for training and validation. In this example I’m using about 5500 images, 28×28 pixels, grayscale. I have 36 classes (characters), so I have about 150 images of each character (letter or number). How can you generate such an amount of images? I’ll get to that later…
Second: we need our (OSX) Terminal and a lot of Python modules.
To make life easier, I’m using Anaconda to set up a virtual environment for Python.
Let’s start:
Download and install Anaconda.
After that you have to open a new Terminal shell…
OK.
Stay in your home directory and type (you can paste the line):
    conda create -n ocrTraining python=2.7
This will create a fresh Python (2.7) environment named “ocrTraining”.
Let’s go to this environment:
    cd anaconda2/envs/ocrTraining/
Now we have to activate this environment:
    source activate ocrTraining
Now let’s install all the necessary modules, one after the other:
    conda install tensorflow
    conda install numpy
    conda install pillow
    conda install h5py
    conda install pandas
    conda install scikit-learn
Some modules won’t install via ‘conda’, so I’m using ‘pip’ instead:
    pip install keras==2.0.6
We’re using Keras 2.0.6 because of Apple’s coremltools (that’s also why we’re using Python 2.7).
    pip install coremltools
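Before we go on, it’s worth checking that everything imports cleanly inside the activated environment. A minimal check (just an illustration, not part of the training script) could look like this:

    # sanity_check.py - verify that all required modules import cleanly
    import tensorflow
    import numpy
    import PIL          # installed as "pillow"
    import h5py
    import pandas
    import sklearn      # installed as "scikit-learn"
    import keras
    import coremltools

    print("TensorFlow " + tensorflow.__version__)
    print("Keras " + keras.__version__)

If this runs without an ImportError (and Keras reports 2.0.6), the environment is ready.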
Let’s go ahead:
As I mentioned, you need a lot of training images; you can download my dataset from my repository at GitHub.
Extract the content (a lot of PNGs and one CSV) into
    anaconda2/envs/ocrTraining/train_data_inconsolata
(this folder should contain ONLY the PNG files and the one CSV file)
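If you’re curious how such a folder of PNGs plus one CSV can be turned into training arrays, here is a rough sketch using pandas and Pillow. The CSV file name and its column names (‘filename’, ‘label’) are assumptions for illustration; the real loading code lives in the training script from the next step.

    # -*- coding: utf-8 -*-
    # Sketch: load 28x28 grayscale PNGs listed in a CSV into numpy arrays.
    # The CSV name "labels.csv" and its columns are assumed for illustration.
    import os
    import numpy as np
    import pandas as pd
    from PIL import Image

    data_dir = "train_data_inconsolata"
    df = pd.read_csv(os.path.join(data_dir, "labels.csv"))

    images = []
    for name in df["filename"]:
        img = Image.open(os.path.join(data_dir, name)).convert("L")   # grayscale
        images.append(np.asarray(img, dtype=np.float32) / 255.0)      # scale to 0..1

    X = np.stack(images).reshape(-1, 28, 28, 1)    # (num_samples, 28, 28, 1)
    y = pd.get_dummies(df["label"]).values         # one-hot labels, 36 classes
    print(X.shape, y.shape)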
Now download the Python script from my repository at GitHub
and put it into
    anaconda2/envs/ocrTraining
Great!
Let’s start training. Type:
    python2 train_OCR.py
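If you want a rough idea of what the core of such a training script looks like, here is a sketch of a small CNN for 28×28 grayscale characters with 36 classes. The layer sizes and hyperparameters are illustrative assumptions, not the exact contents of train_OCR.py; X and y are the arrays from the loading sketch above.

    # -*- coding: utf-8 -*-
    # Sketch: a small CNN for 28x28 grayscale characters, 36 classes.
    # Layer sizes and hyperparameters are illustrative, not taken from train_OCR.py.
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation="relu"))
    model.add(Dropout(0.5))
    model.add(Dense(36, activation="softmax"))     # one output per character class

    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])

    # X and y come from the loading sketch above
    model.fit(X, y, batch_size=128, epochs=5, validation_split=0.1)
    model.save("OCR.h5")    # hypothetical intermediate file for the conversion sketch at the end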
On my MacBook Pro (late 2016) one epoch (and we run five) takes only about 5 seconds. OK, it’s not a huge number of images, BUT:
the accuracy is 0.999272726622, which means 99.92 %!!!
When it’s finished (and it should finish without big errors, maybe some harmless warnings), you’ll find a file OCR.mlmodel in your folder, which you can use directly in your project (as in part 1 and part 2).
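And if you want to see roughly how that OCR.mlmodel is produced at the end, the conversion with coremltools looks something like this. It is not a copy of the code in train_OCR.py; the file name OCR.h5, the class labels and the image scaling are assumptions for illustration.

    # -*- coding: utf-8 -*-
    # Sketch: convert a trained Keras model to Core ML.
    # "OCR.h5", the class labels and image_scale are assumptions, not taken from train_OCR.py.
    import coremltools
    from keras.models import load_model

    model = load_model("OCR.h5")    # hypothetical file saved by the training sketch above

    output_labels = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

    coreml_model = coremltools.converters.keras.convert(
        model,
        input_names="image",
        image_input_names="image",      # treat the input as an image in Core ML
        output_names="output",
        class_labels=output_labels,
        image_scale=1.0 / 255.0)        # assumed preprocessing: pixels scaled to 0..1

    coreml_model.save("OCR.mlmodel")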