OCR Android App Using Tesseract

I am trying to create an OCR application on Android using Tesseract, but the application crashes when the image is saved.

I built the photo capture part with the Simple Android Photography tutorial and the OCR part with the tutorial Creating a Simple OCR Application for Android using Tesseract.

This is the code I'm using:

    package com.mmm.pitter;

    import java.io.File;
    import java.io.IOException;

    import com.mmm.pitter.R;
    import com.googlecode.tesseract.android.*;
    import com.googlecode.leptonica.android.*;

    import android.app.Activity;
    import android.content.Intent;
    import android.graphics.Bitmap;
    import android.graphics.BitmapFactory;
    import android.graphics.Matrix;
    import android.media.ExifInterface;
    import android.net.Uri;
    import android.os.Bundle;
    import android.os.Environment;
    import android.provider.MediaStore;
    import android.util.Log;
    import android.view.View;
    import android.widget.Button;
    import android.widget.ImageView;
    import android.widget.TextView;

    public class PitterActivity extends Activity {

        protected Button _button;
        protected ImageView _image;
        protected TextView _field;
        protected String _path;
        protected boolean _taken;

        protected static final String PHOTO_TAKEN = "photo_taken";

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.main);

            _image = (ImageView) findViewById(R.id.image);
            _field = (TextView) findViewById(R.id.field);
            _button = (Button) findViewById(R.id.button);
            _button.setOnClickListener(new ButtonClickHandler());

            _path = Environment.getExternalStorageDirectory() + "/images/make_machine_example.jpg";
        }

        public class ButtonClickHandler implements View.OnClickListener {
            public void onClick(View view) {
                Log.i("MakeMachine", "ButtonClickHandler.onClick()");
                startCameraActivity();
            }
        }

        protected void startCameraActivity() {
            Log.i("MakeMachine", "startCameraActivity()");
            File file = new File(_path);
            Uri outputFileUri = Uri.fromFile(file);

            Intent intent = new Intent(android.provider.MediaStore.ACTION_IMAGE_CAPTURE);
            intent.putExtra(MediaStore.EXTRA_OUTPUT, outputFileUri);

            startActivityForResult(intent, 0);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            Log.i("MakeMachine", "resultCode: " + resultCode);
            switch (resultCode) {
                case 0:
                    Log.i("MakeMachine", "User cancelled");
                    break;
                case -1:
                    try {
                        onPhotoTaken();
                    } catch (IOException e) {
                        // TODO Auto-generated catch block
                        e.printStackTrace();
                    }
                    break;
            }
        }

        protected void onPhotoTaken() throws IOException {
            Log.i("MakeMachine", "onPhotoTaken");

            _taken = true;

            BitmapFactory.Options options = new BitmapFactory.Options();
            options.inSampleSize = 4;

            Bitmap bitmap = BitmapFactory.decodeFile(_path, options);
            _image.setImageBitmap(bitmap);
            _field.setVisibility(View.GONE);

            // _path = path to the image to be OCRed
            ExifInterface exif = new ExifInterface(_path);
            int exifOrientation = exif.getAttributeInt(
                    ExifInterface.TAG_ORIENTATION,
                    ExifInterface.ORIENTATION_NORMAL);

            int rotate = 0;
            switch (exifOrientation) {
                case ExifInterface.ORIENTATION_ROTATE_90:
                    rotate = 90;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_180:
                    rotate = 180;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_270:
                    rotate = 270;
                    break;
            }

            if (rotate != 0) {
                int w = bitmap.getWidth();
                int h = bitmap.getHeight();

                // Setting pre rotate
                Matrix mtx = new Matrix();
                mtx.preRotate(rotate);

                // Rotating Bitmap & convert to ARGB_8888, required by tess
                bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
                bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);
            }

            TessBaseAPI baseApi = new TessBaseAPI();
            // DATA_PATH = Path to the storage
            // lang for which the language data exists, usually "eng"
            baseApi.init("sdcard/tesseract/tessdata", "eng");
            baseApi.setImage(bitmap);
            String recognizedText = baseApi.getUTF8Text();
            System.out.println(recognizedText);
            baseApi.end();
        }

        @Override
        protected void onRestoreInstanceState(Bundle savedInstanceState) {
            Log.i("MakeMachine", "onRestoreInstanceState()");
            if (savedInstanceState.getBoolean(PitterActivity.PHOTO_TAKEN)) {
                try {
                    onPhotoTaken();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        }

        @Override
        protected void onSaveInstanceState(Bundle outState) {
            outState.putBoolean(PitterActivity.PHOTO_TAKEN, _taken);
        }
    }

And this is the log:

    10-13 23:13:51.191: I/MakeMachine(29787): ButtonClickHandler.onClick()
    10-13 23:13:51.191: I/MakeMachine(29787): startCameraActivity()
    10-13 23:13:51.851: D/CLIPBOARD(29787): Hide Clipboard dialog at Starting input: finished by someone else... !
    10-13 23:13:51.866: W/IInputConnectionWrapper(29787): showStatusIcon on inactive InputConnection
    10-13 23:14:07.431: I/MakeMachine(29787): onRestoreInstanceState()
    10-13 23:14:07.431: I/MakeMachine(29787): resultCode: -1
    10-13 23:14:07.431: I/MakeMachine(29787): onPhotoTaken
    10-13 23:14:07.431: I/System.out(29787): Not a DRM File, opening notmally
    10-13 23:14:07.436: E/JHEAD(29787): can't open
    10-13 23:14:07.436: D/dalvikvm(29787): Trying to load lib /data/data/com.mmm.pitter/lib/liblept.so 0x4154e9a0
    10-13 23:14:07.436: D/dalvikvm(29787): Added shared lib /data/data/com.mmm.pitter/lib/liblept.so 0x4154e9a0
    10-13 23:14:07.446: D/dalvikvm(29787): Trying to load lib /data/data/com.mmm.pitter/lib/libtess.so 0x4154e9a0
    10-13 23:14:07.456: D/dalvikvm(29787): Added shared lib /data/data/com.mmm.pitter/lib/libtess.so 0x4154e9a0
    10-13 23:14:07.471: D/AndroidRuntime(29787): Shutting down VM
    10-13 23:14:07.471: W/dalvikvm(29787): threadid=1: thread exiting with uncaught exception (group=0x40c5b1f8)
    10-13 23:14:07.476: E/AndroidRuntime(29787): FATAL EXCEPTION: main
    10-13 23:14:07.476: E/AndroidRuntime(29787): java.lang.RuntimeException: Unable to resume activity {com.mmm.pitter/com.mmm.pitter.PitterActivity}: java.lang.RuntimeException: Failure delivering result ResultInfo{who=null, request=0, result=-1, data=null} to activity {com.mmm.pitter/com.mmm.pitter.PitterActivity}: java.lang.IllegalArgumentException: Data path must contain subfolder tessdata!
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.performResumeActivity(ActivityThread.java:2456)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:2484)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:1998)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.handleRelaunchActivity(ActivityThread.java:3363)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.access$700(ActivityThread.java:127)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1163)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.os.Handler.dispatchMessage(Handler.java:99)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.os.Looper.loop(Looper.java:137)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.main(ActivityThread.java:4507)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at java.lang.reflect.Method.invokeNative(Native Method)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at java.lang.reflect.Method.invoke(Method.java:511)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:790)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:557)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at dalvik.system.NativeStart.main(Native Method)
    10-13 23:14:07.476: E/AndroidRuntime(29787): Caused by: java.lang.RuntimeException: Failure delivering result ResultInfo{who=null, request=0, result=-1, data=null} to activity {com.mmm.pitter/com.mmm.pitter.PitterActivity}: java.lang.IllegalArgumentException: Data path must contain subfolder tessdata!
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.deliverResults(ActivityThread.java:2992)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.performResumeActivity(ActivityThread.java:2443)
    10-13 23:14:07.476: E/AndroidRuntime(29787): ... 13 more
    10-13 23:14:07.476: E/AndroidRuntime(29787): Caused by: java.lang.IllegalArgumentException: Data path must contain subfolder tessdata!
    10-13 23:14:07.476: E/AndroidRuntime(29787): at com.googlecode.tesseract.android.TessBaseAPI.init(TessBaseAPI.java:178)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at com.mmm.pitter.PitterActivity.onPhotoTaken(PitterActivity.java:146)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at com.mmm.pitter.PitterActivity.onActivityResult(PitterActivity.java:88)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.Activity.dispatchActivityResult(Activity.java:4649)
    10-13 23:14:07.476: E/AndroidRuntime(29787): at android.app.ActivityThread.deliverResults(ActivityThread.java:2988)
    10-13 23:14:07.476: E/AndroidRuntime(29787): ... 14 more
    10-13 23:19:32.376: I/Process(29787): Sending signal. PID: 29787 SIG: 9
+4
3 answers

You need to put the data files in a directory named tessdata and pass the parent of that tessdata directory to your init() method:

 baseApi.init("/mnt/sdcard/tesseract", "eng"); 
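
Before calling init(), it can help to confirm that the expected layout is actually on the device. The following is only a sketch: TessDataCheck and hasTrainedData are hypothetical names, not part of the Tesseract Android library, and it assumes the usual layout of parent/tessdata/eng.traineddata. It uses only java.io.File, so it runs on a desktop JVM as well.

```java
import java.io.File;

/** Hypothetical helper (not part of the Tesseract library): checks a data path before init(). */
final class TessDataCheck {
    private TessDataCheck() {}

    /**
     * Returns true if parentDir/tessdata/lang.traineddata exists.
     * parentDir is the directory you would pass to init(), that is, the PARENT of tessdata.
     */
    static boolean hasTrainedData(File parentDir, String lang) {
        File tessdata = new File(parentDir, "tessdata");
        File trained = new File(tessdata, lang + ".traineddata");
        return tessdata.isDirectory() && trained.isFile();
    }
}
```

If hasTrainedData(new File("/mnt/sdcard/tesseract"), "eng") returns false, the data files are missing or in the wrong place, and init() should not be expected to succeed with that path.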
+11

Replace

 baseApi.init("sdcard/tesseract/tessdata", "eng");

with

 baseApi.init("sdcard/tesseract/", "eng");

The tesseract folder must contain the tessdata folder, because the library itself appends "tessdata" to the path you pass in, with the line

 File tessdata = new File(datapath + "tessdata"); 

in the init() function. As for why the path should end with a slash ("/"), the following comment explains it:

The datapath must be the name of the parent directory of tessdata and must end in /. Any name after the last / will be stripped. The language is (usually) an ISO 639-3 string, or null will default to eng. It is entirely safe (and eventually will be efficient too) to call Init multiple times on the same instance to change language, or just to reset the classifier.
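
The rule in that comment can be applied mechanically. The sketch below is an illustration only: TessDataPath and toInitPath are hypothetical names, and it assumes "/" as the separator. It takes a path that may point at tessdata itself, or may lack the trailing slash, and returns the parent directory ending in "/".

```java
/** Hypothetical helper (not part of the Tesseract library): normalizes a path for init(). */
final class TessDataPath {
    private TessDataPath() {}

    /** Returns the parent-of-tessdata form of the given path, always ending in "/". */
    static String toInitPath(String path) {
        String p = path;
        // Drop trailing slashes so the last path segment can be inspected.
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        // If the path points at tessdata itself, step up to its parent.
        if (p.endsWith("/tessdata")) {
            p = p.substring(0, p.length() - "/tessdata".length());
        }
        // Make sure the result ends in "/" so that appending "tessdata" yields a valid path.
        return p.endsWith("/") ? p : p + "/";
    }
}
```

With the concatenation quoted above, a path without the trailing slash would produce a name like "tesseracttessdata", which is why normalizing the path first is a reasonable precaution.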

You can find this comment in the documentation for the init() function. Hope this helps.

+1

The Tesseract library is available as a .zip for Windows users and as a .tar.gz for Linux users.

 baseApi.init("/mnt/sdcard/tesseract/tessdata/eng.traineddata", "eng"); 

Please tell me if it works for you.

-2

Source: https://habr.com/ru/post/1439584/

