
The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy Test. Little work was done on it between 1995 and 2006, but it is probably still one of the most accurate open-source OCR engines available. It reads a binary, grey, or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones. The steps below build Tesseract as a set of static libraries for iOS.

1. Download the source code from http://code.google.com/p/tesseract-ocr/

2. Create a .sh file inside the Tesseract source code folder.

3. Paste the following code into the .sh file:

Code Starts


#!/bin/bash
# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009. Updated 24.9.2010
# build an arm / i386 lib of standard linux project
#
# initially configured for tesseract-ocr v2.0.4
# updated for tesseract prerelease v3

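# Output folder for the per-architecture and fat libraries
outdir=outdir
mkdir -p "$outdir/arm" "$outdir/i386"

# Source folders and the matching library name suffixes, in the same order
# (note that the ccmain folder produces libtesseract_main.a)
libdirs=( api ccutil ccmain ccstruct classify cutil dict image textord training viewer wordrec )
libs=( api ccutil main ccstruct classify cutil dict image textord training viewer wordrec )
count=${#libdirs[@]}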

# Clean any previous build, then build for the device (armv6)
make distclean
unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD

export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS4.1.sdk
export CFLAGS="-arch armv6 -pipe -no-cpp-precomp -isysroot $SDKROOT -miphoneos-version-min=3.0 -I$SDKROOT/usr/include/"
export CPPFLAGS="$CFLAGS"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-L$SDKROOT/usr/lib/"
export LD="$DEVROOT/usr/bin/ld"
export CPP="$DEVROOT/usr/bin/cpp-4.2"
export CXX="$DEVROOT/usr/bin/g++-4.2"
export CC="$DEVROOT/usr/bin/gcc-4.2"
./configure --host=arm-apple-darwin
make -j3

# Copy the armv6 static libraries into $outdir/arm
index=0
while [ "$index" -lt "$count" ]
do
    cp "${libdirs[index]}/.libs/libtesseract_${libs[index]}.a" "$outdir/arm/libtesseract_${libs[index]}_armv6.a"
    ((index++))
done

# Clean up, then build for the simulator (i386)
make distclean
unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD

export DEVROOT=/Developer/Platforms/iPhoneSimulator.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneSimulator4.1.sdk
export CFLAGS="-arch i386 -pipe -no-cpp-precomp -isysroot $SDKROOT -miphoneos-version-min=3.0 -I$SDKROOT/usr/include/"
export CPPFLAGS="$CFLAGS"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-L$SDKROOT/usr/lib/"
export LD="$DEVROOT/usr/bin/ld"
export CPP="$DEVROOT/usr/bin/cpp-4.2"
export CXX="$DEVROOT/usr/bin/g++-4.2"
export CC="$DEVROOT/usr/bin/gcc-4.2"
./configure
make -j3

# Copy the i386 static libraries into $outdir/i386
index=0
while [ "$index" -lt "$count" ]
do
    cp "${libdirs[index]}/.libs/libtesseract_${libs[index]}.a" "$outdir/i386/libtesseract_${libs[index]}_i386.a"
    ((index++))
done

# Merge each armv6/i386 pair into a single fat library with lipo
# are the fat libs making the bundle too big?
index=0
while [ "$index" -lt "$count" ]
do
    /usr/bin/lipo -arch armv6 "$outdir/arm/libtesseract_${libs[index]}_armv6.a" -arch i386 "$outdir/i386/libtesseract_${libs[index]}_i386.a" -create -output "$outdir/libtesseract_${libs[index]}.a"
    ((index++))
done

 

unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD

Code Ends

Source: http://robertcarlsen.net/2010/09/24/compiling-tesseract-v3-for-iphone-1299

5. Open Terminal and navigate to the Tesseract source code folder.

6. Run bash ./yourfilename.sh from Terminal (the script uses bash arrays). It configures and builds the libraries for iOS.

7. When the build finishes, look in the outdir folder inside the Tesseract source code folder. The fat libraries required for iOS sit at its top level, with the per-architecture copies in the arm and i386 subfolders.
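
To confirm that the build produced usable output, each library in outdir can be checked for both architectures. The sketch below assumes it runs on the Mac that did the build, where lipo is on the PATH. check_fat_libs is a hypothetical helper name, not part of the original script, and it relies on lipo's -verify_arch option, which exits with status 0 only when every listed architecture is present.

```shell
# check_fat_libs DIR
# Hypothetical helper (not part of the original build script).
# Succeeds only if DIR holds at least one .a file and every .a file
# contains both an armv6 and an i386 slice.
check_fat_libs() {
    cfl_dir=${1:-outdir}
    cfl_status=0
    cfl_found=0
    for cfl_lib in "$cfl_dir"/*.a; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$cfl_lib" ] || continue
        cfl_found=1
        if lipo "$cfl_lib" -verify_arch armv6 i386; then
            echo "ok:      $cfl_lib"
        else
            echo "missing: $cfl_lib"
            cfl_status=1
        fi
    done
    if [ "$cfl_found" -eq 0 ]; then
        echo "no .a files found in $cfl_dir" >&2
        return 1
    fi
    return "$cfl_status"
}
```

Called from the Tesseract source code folder as check_fat_libs outdir, it prints one line per library. A non-zero exit status means at least one library lacks a slice, or no libraries were found at all.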

iOS implementation of Tesseract OCR: https://github.com/rcarlsen/Pocket-OCR