How do font identification algorithms work?

I was wondering how automatic font identification services work (the kind like WhatTheFont, as opposed to question-based ones like Identifont). In the simplest case, you upload an image containing text and the service returns the name of the font used. How is this done, and how is it done quickly enough to be practical? I am new to this, but here is my understanding so far:

  • Perhaps some pre-processing to reduce noise. I am not particularly interested in this.
  • First, the image is run through OCR to extract the text, which is straightforward.
  • Then you go through each of the tens or hundreds of thousands of fonts in your database, render the extracted text in each one, and check whether the result is close to the original, adjusting size, alignment, kerning, weight, italics, etc. How can this be fast enough to be practical?

Is this correct?

Please give some idea of how this is done and how it is done efficiently.
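To make the brute-force idea from the list above concrete, here is a minimal sketch of the "render in every font and compare" loop. It is only an illustration under strong assumptions: the fonts `ToySans` and `ToySerif`, their 3x3 bitmaps, and the helpers `render`, `mismatch`, and `identify` are all hypothetical, and the OCR step is assumed to have already produced the text. A real service would rasterize glyphs from actual font files and would not necessarily work this way.

```python
# Toy "font database": each hypothetical font maps a character to a 3x3 bitmap.
# A real system would rasterize glyphs from font files instead.
FONTS = {
    "ToySans": {
        "I": ["010", "010", "010"],
        "L": ["100", "100", "111"],
    },
    "ToySerif": {
        "I": ["111", "010", "111"],
        "L": ["110", "100", "111"],
    },
}


def render(text, font):
    """Render text by placing the glyph bitmaps side by side, row by row."""
    glyphs = [font[ch] for ch in text]
    return ["".join(g[row] for g in glyphs) for row in range(3)]


def mismatch(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def identify(image, text, fonts=FONTS):
    """Return (font_name, mismatch) pairs sorted from best to worst match."""
    scores = [(name, mismatch(image, render(text, glyphs)))
              for name, glyphs in fonts.items()]
    return sorted(scores, key=lambda s: s[1])


# Input "image": the text "LI" in ToySerif with one noisy pixel flipped.
image = ["110111", "100010", "111101"]
ranking = identify(image, "LI")
print(ranking)
```

The cost of this loop grows linearly with the number of fonts, and it grows further with every size, weight, or kerning variant tried per font, which is exactly the scaling concern raised above.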

1 answer

Suppose you are doing a match in a raster representation (and not in a vectorized outline).
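As a rough illustration of what matching in a raster representation can look like, the sketch below crops each glyph bitmap to its inked bounding box, rescales it to a fixed grid, and scores two glyphs by the overlap of their inked pixels. This is an assumed, simplified scheme rather than the method any particular service uses; the function names `crop`, `resize`, `normalize`, and `overlap` and the 4x4 grid size are made up for the example, and `crop` assumes the bitmap contains at least one inked pixel.

```python
def crop(bitmap):
    """Trim empty rows and columns around the inked ("1") pixels."""
    rows = [r for r, line in enumerate(bitmap) if "1" in line]
    cols = [c for c in range(len(bitmap[0]))
            if any(line[c] == "1" for line in bitmap)]
    return [line[cols[0]:cols[-1] + 1]
            for line in bitmap[rows[0]:rows[-1] + 1]]


def resize(bitmap, size):
    """Nearest-neighbour rescale to a size x size grid."""
    h, w = len(bitmap), len(bitmap[0])
    return ["".join(bitmap[y * h // size][x * w // size] for x in range(size))
            for y in range(size)]


def normalize(bitmap, size=4):
    """Remove position and scale differences before comparing."""
    return resize(crop(bitmap), size)


def overlap(a, b):
    """Intersection over union of inked pixels; 1.0 means identical shapes."""
    pairs = [(pa, pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    inter = sum(pa == "1" and pb == "1" for pa, pb in pairs)
    union = sum(pa == "1" or pb == "1" for pa, pb in pairs)
    return inter / union if union else 1.0


# The same "L" shape at two sizes and positions, plus a solid block.
small_l = ["10", "11"]
large_l = ["000000", "011000", "011000", "011110", "011110", "000000"]
block = ["11", "11"]

print(overlap(normalize(small_l), normalize(large_l)))
print(overlap(normalize(small_l), normalize(block)))
```

Normalizing first means size and alignment do not have to be searched over separately for every candidate font; only the shape comparison remains.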



Source: https://habr.com/ru/post/1536299/

