What the title says.
Strictly speaking, what I define as a “text” bounding box for a grayscale image is a set of 4 coordinates (x, y, x + width, y + height) describing the smallest rectangle that contains all of the non-white pixels, i.e. the rectangle that includes the maximum number of non-white pixels and, at the same time, the smallest possible number of white pixels. I put “text” in quotes because the images do not actually contain text; they contain only colored pixels.
After installing ImageMagick on my Ubuntu machine and typing the following command in the terminal: $ convert input.png -trim output.png, I get:
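To make the definition above concrete, here is a minimal sketch in plain Python. It assumes the image has already been loaded as a list of rows of gray values and that white is exactly 255; the function name nonwhite_bbox and the sample data are made up for illustration, and this is not a claim about how ImageMagick computes its trim.

```python
def nonwhite_bbox(pixels, white=255):
    """Return (x, y, x + width, y + height) of the tightest rectangle
    around all non-white pixels, or None if every pixel is white.

    `pixels` is assumed to be a list of equal-length rows of gray values.
    """
    # Rows that contain at least one non-white pixel.
    rows = [y for y, row in enumerate(pixels) if any(v != white for v in row)]
    if not rows:
        return None
    # Columns that contain at least one non-white pixel.
    cols = [
        x
        for x in range(len(pixels[0]))
        if any(row[x] != white for row in pixels)
    ]
    # The right and bottom edges are exclusive, so add 1 to the last index.
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


# Tiny 5x4 sample: non-white pixels occupy columns 1..3 and rows 1..2.
image = [
    [255, 255, 255, 255, 255],
    [255, 255,   0, 255, 255],
    [255, 128,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
print(nonwhite_bbox(image))  # (1, 1, 4, 3)
```

For this sample, x = 1 and y = 1, with a width of 3 and a height of 2; it is the x and y values that I am missing for my real image.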


Open the two images in new tabs in your web browser and you will see the difference between them, and you will also understand what I define as a bounding box. output.png actually has the width and height that I am looking for, but I do not know how to get the x and y coordinates.
The answer presented here (1) for PDF pages does not meet my criteria, since the "text" bounding box that gs gives me has large white margins (and, as far as I can understand, what gs defines as the "text" bounding box for a PDF is something different from my definition of the "text" bounding box for an image).