Part 3 – Train your own model (using TensorFlow)

Let’s get to the trickiest (and most challenging) part – the training itself.

I’m using TensorFlow with Keras (for reasons that will become clear below).

What do we need for training?

First: a lot (and I mean: a lot) of images for training and validation. In this example I’m using about 5,500 images, 28×28 pixels, grayscale. I have 36 classes (characters), so I have about 150 images of each character (letter or digit). How can you generate such an amount of images? I’ll tell you later…

Second, we need our (macOS) Terminal and a lot of Python modules.

To make life easier, I’m using Anaconda to get a virtual environment for Python.

Let’s start:

Download and install Anaconda.

After that, open a new Terminal shell…


Stay in your home directory and type (you can paste the line):
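Assuming Anaconda is installed with its defaults, the command is:

```shell
# create a fresh Python 2.7 environment named "ocrTraining"
conda create -n ocrTraining python=2.7
```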


This will make a fresh Python (2.7) environment named “ocrTraining“.

Let’s go to this environment:


Now we have to activate this environment:
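With Anaconda’s (old-style) activation script this is:

```shell
# activate the environment in the current shell
source activate ocrTraining
```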


Now let’s install all the necessary modules, one after the other…

Some modules won’t install via ‘conda‘ – so I’m using ‘pip‘ instead.

We’re using Keras v2.0.6 because of Apple’s CoreMLTools… (that’s also why we’re using Python 2.7).
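The exact module list here is my assumption – at minimum you need TensorFlow, the pinned Keras and Apple’s coremltools, plus a few helpers:

```shell
# installable via conda
conda install numpy h5py pillow

# these go in via pip – Keras is pinned to 2.0.6 for coremltools
pip install tensorflow
pip install keras==2.0.6
pip install coremltools
```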

Let’s go ahead:

As I mentioned, you need a lot of training images; you can download my dataset from my repository on GitHub.


Extract the content (a lot of PNGs and one CSV file) into

(this folder should contain ONLY PNG files and the one CSV file)

Now download the Python script from my repository on GitHub

and put it into

Great!

Let’s start the training –


On my MacBook Pro (late 2016) one epoch (and we have five) takes only about 5 seconds – OK, it’s not that many images, BUT:

the accuracy is 0.999272726622, which means roughly 99.93 %!

When it’s finished (and it should finish without big errors – maybe some syntax warnings), you’ll find a file OCR.mlmodel in your folder, which you can use directly in your project (as in Part 1 and Part 2).
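The real script is in the repository; as a rough sketch (layer sizes and the conversion call are illustrative assumptions, written against Keras 2.0.6 / the coremltools of that era), it does something like this:

```python
# Minimal sketch of such a training script (hypothetical layout).
# Assumes 28x28 grayscale PNGs and a CSV mapping file names to labels.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

NUM_CLASSES = 36          # 26 uppercase letters + 10 digits

def build_model():
    # small convnet: one conv/pool stage, then two dense layers
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(NUM_CLASSES, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model

# X: (n, 28, 28, 1) float array, y: one-hot labels, five epochs as above:
# model.fit(X, y, epochs=5, batch_size=64)

# after training, convert to Core ML (needs the pinned versions):
# import coremltools
# coreml_model = coremltools.converters.keras.convert(
#     model, input_names='image', image_input_names='image',
#     class_labels=[chr(c) for c in range(65, 91)] + [str(d) for d in range(10)])
# coreml_model.save('OCR.mlmodel')
```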


Part 1 – How simple is it? OCR without Tesseract? On iOS? Yeah!!

Let’s start with a simple design:

Open Xcode and create a new Single View App. (If you don’t know how to do this… hmm.)

You’ll see this (depending on the name of your app):


Not so much …

Let’s add some functions:

First we need some imports:


That’s because we are using CoreML and Vision.
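At the top of the view controller that means:

```swift
import UIKit
import CoreML
import Vision
```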

Now we have to create a request:


… and a completion handler:
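Both pieces together might look like this – the generated model class OCR comes from the OCR.mlmodel file; the property and method names here are my own:

```swift
// lazily create a Vision request backed by our Core ML model
lazy var ocrRequest: VNCoreMLRequest = {
    // "OCR" is the class Xcode generates from OCR.mlmodel
    guard let model = try? VNCoreMLModel(for: OCR().model) else {
        fatalError("Could not load OCR.mlmodel")
    }
    return VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
}()

// completion handler: pick the best classification and print it
func handleClassification(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNClassificationObservation],
          let best = observations.first else {
        print("No results")
        return
    }
    print("Recognized: \(best.identifier) (confidence \(best.confidence))")
}
```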


If you get compile errors: you don’t have the (trained) model yet – you’ll find it in my repository.

Now let’s load an image (from the app’s resources) and feed it to our request – we do this in viewDidLoad():
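A sketch of that, assuming the request from the previous step lives in a property I call ocrRequest, and “test” is a placeholder name for whatever PNG you added to the bundle:

```swift
override func viewDidLoad() {
    super.viewDidLoad()

    // load a sample character image from the app bundle
    guard let uiImage = UIImage(named: "test"),
          let cgImage = uiImage.cgImage else { return }

    // feed it to Vision; the completion handler prints the result
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try handler.perform([self.ocrRequest])
    } catch {
        print(error)
    }
}
```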


If you run this (the Simulator is enough), you will see the recognized character in your debug view:

Simple – huh?!


Some remarks:

The trained model (by me) is ONLY for the font “Inconsolata” – it’s a free font from Google (downloadable here or here).

I trained the characters “A”..“Z” (uppercase only) and the digits “0”..“9”.

Later I’ll show you how to train it yourself (but that’s not so simple – for several reasons).

You can download the project from my repository on GitHub.

The next part is about text detection as a source for OCR.

Stay tuned!