Node.js 20x slower than browser (Safari) with Tesseract.js

New to JS and very new to Node. Running Tesseract.js (text recognition software: http://tesseract.projectnaptha.com ) in Safari takes about 10 seconds and starts displaying progress immediately.
Node (v6.9.1), whether launched from the terminal or via Electron, pegs the CPU at 100% for 4 minutes 20 seconds before it starts writing to the console. Once output starts, it finishes in about the same time.

What are the recommended troubleshooting steps? Is this common with Node?
The only difference I see in the logs is that Safari reports "found in cache eng.traineddata". Clearing and disabling the cache barely affected the time. I tried several .JPG and .PNG files (300-600 KB) with the same result, but a BMP (3.7 MB) gave a quick response after 17 seconds, followed by a stream of errors. (Is this a "next tick" problem?)

    var Tesseract = require('tesseract.js');
    var image = "./images/sample.jpg";

    function tesseract() {
      Tesseract.recognize(image)
        .progress(function (message) { console.log(message) })
        .then(result => console.log(result.text))
    }

    tesseract();

(editor formats output as code)
NODE console.log

    bash-3.2$ node JustTess.js
    *Waits 4+ min and then*
    { status: 'loading tesseract core' }
    { status: 'loaded tesseract core' }
    { status: 'initializing tesseract', progress: 0 }
    pre-main prep time: 108 ms
    { status: 'initializing tesseract', progress: 1 }
    { status: 'loading eng.traineddata', progress: 0 }
    { status: 'loading eng.traineddata', progress: 1 }
    { status: 'initializing api', progress: 0 }
    { status: 'initializing api', progress: 0.3 }
    { status: 'initializing api', progress: 0.6 }
    { status: 'initializing api', progress: 1 }
    { status: 'recognizing text', progress: 0 }
    { status: 'recognizing text', progress: 0.014285714 }...

SAFARI console.log

    [Log] – {status: "loading tesseract core"}
    [Log] – {status: "loaded tesseract core"}
    [Log] – {status: "initializing tesseract api"}
    [Log] pre-main prep time: 115 ms (index.js, line 10)
    [Log] – {status: "initialized tesseract api"}
    [Log] – {status: "found in cache eng.traineddata"}
    [Log] – {status: "loaded eng.traineddata"}
    [Log] – {status: "initialized with language"}
    [Log] – {status: "recognizing text", progress: 0}
    [Log] – {status: "recognizing text", progress: 0.0142}...

NODE with BMP

    bash-3.2$ node JustTess.js
    *After 17 sec*
    { status: 'initializing tesseract', progress: 0 }
    pre-main prep time: 118 ms
    { status: 'initializing tesseract', progress: 1 }
    { status: 'loading eng.traineddata', progress: 0 }
    { status: 'loading eng.traineddata', progress: 1 }
    { status: 'initializing api', progress: 0 }
    { status: 'initializing api', progress: 0.3 }
    { status: 'initializing api', progress: 0.6 }
    Error in pixRemoveColormap: pixs must be {1,2,4,8} bpp
    Error in pixGetDepth: pix not defined
    Error in pixGetWpl: pix not defined
    Error in pixCreateHeader: depth must be {1, 2, 4, 8, 16, 24, 32}
    Error in pixCreateNoInit: pixd not made
    Error in pixCreateTemplateNoInit: pixd not made
    Error in pixCreateTemplate: pixd not made
    Error in pixCopy: pixd not made
    { status: 'initializing api', progress: 1 }
    3 3
    /Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:4
    function f(a){throw a;}var h=void 0,i=!0,j=null,k=!1;function aa(){return function(){}}function ba(a){return function(){return a}}var n,Module;Module||(Module=eval("(function() { try { return TesseractCore || {} } catch(e) { return {} } })()"));var ca={},da;for(da in Module)Module.hasOwnProperty(da)&&(ca[da]=Module[da]);var ea=i,fa=!ea&&i;
    ^
    abort(3) at Error
        at Error (native)
        at Na (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:32:26)
        at ka (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:507:108)
        at Array.JHa (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:402:25808)
        at xd (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:382:924)
        at R.TesseractCore.V.Begin (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js-core/index.js:511:288)
        at DumpLiterallyEverything (/Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js/src/common/dump.js:13:8)
        at /Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js/src/common/worker.js:121:22
        at /Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js/src/common/worker.js:92:9
        at /Users/brent/Library/Mobile Documents/com~apple~CloudDocs/Programming/GitHub/ba/node_modules/tesseract.js/src/node/lang.js:14:25
    If this abort() is unexpected, build with -s ASSERTIONS=1 which can give more information.
2 answers

I can't answer the question directly, but the other answers don't shed much light on it either. See http://www.jsbenchmarks.com/?anywhichway/lookup/master/benchmark.js/ for an example of how widely Node.js and browsers can differ in performance. Note: although the browser results on that site come from many visitors and the Node results come from a single server, tests in an isolated environment show the same thing.
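As a first troubleshooting step for a gap like this, it may help to timestamp each progress message so the slow stage becomes visible. The sketch below is only an illustration: `makeTimedProgress` is a hypothetical helper, not part of Tesseract.js, and it assumes progress messages shaped like the ones in the question's logs (`status` plus an optional numeric `progress`).

```javascript
// Hypothetical helper (not part of Tesseract.js): returns a progress
// callback that prefixes each message with the seconds elapsed since
// the helper was created.
// `log` and `now` are injectable so the helper can be tested in isolation.
function makeTimedProgress(log, now) {
  log = log || console.log;
  now = now || Date.now;
  var start = now();
  return function (message) {
    var elapsed = ((now() - start) / 1000).toFixed(1);
    var percent = typeof message.progress === 'number'
      ? ' ' + Math.round(message.progress * 100) + '%'
      : '';
    log('[' + elapsed + 's] ' + message.status + percent);
  };
}

// Assumed usage with the script from the question:
// Tesseract.recognize(image)
//   .progress(makeTimedProgress())
//   .then(function (result) { console.log(result.text); });
```

If the first timestamp is already several minutes in, the delay happens before any progress is reported, which would match the behaviour described in the question.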


This issue was resolved by updating Tesseract.js.
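To confirm which version is actually installed before and after updating, something like the following may help. It is a rough sketch: `installedVersion` is a hypothetical helper, and it assumes the package lets you `require` its `package.json` (packages that restrict their exports will make it return `null`).

```javascript
// Hypothetical helper: returns the installed version of a package,
// or null if the package (or its package.json) cannot be resolved.
function installedVersion(name) {
  try {
    return require(name + '/package.json').version;
  } catch (err) {
    return null;
  }
}

console.log('tesseract.js version:', installedVersion('tesseract.js'));
```

Running `npm ls tesseract.js` in the project directory reports the same information from the command line.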


Source: https://habr.com/ru/post/1259404/

