When this option is used, the threshold is adapted to the cropped
image, i.e., after the "crop" command instead of before it.
This avoids adjusting the threshold to the full image,
and thus potentially reduces the time needed for recognition.
Instead of adapting the threshold to the image before executing
commands, adapt the threshold just before it is needed.
This avoids threshold adaptation when -p, --process-only
is used with only the "grayscale" and/or "mirror" commands.
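
The idea of adapting the threshold only when it is needed can be
sketched as follows. This is a minimal illustration, not the
actual ssocr code: struct thresh_state, adapt_threshold(),
need_threshold(), cmd_mirror() and cmd_is_foreground() are
hypothetical names, and the adapted value is a placeholder.

```c
#include <stdbool.h>

/* Hypothetical state; field names are illustrative, not ssocr's. */
struct thresh_state {
  double threshold;  /* current threshold in percent */
  bool adapted;      /* has the threshold been adapted already? */
  int adapt_calls;   /* counts adaptations, for demonstration only */
};

/* Stand-in for the (expensive) adaptation to the image contents. */
static void adapt_threshold(struct thresh_state *s)
{
  s->threshold = 42.0;  /* placeholder; a real version scans pixels */
  s->adapt_calls++;
}

/* Adapt lazily: only when a caller actually needs the threshold. */
static double need_threshold(struct thresh_state *s)
{
  if (!s->adapted) {
    adapt_threshold(s);
    s->adapted = true;
  }
  return s->threshold;
}

/* Commands like "mirror" never ask for the threshold. */
static void cmd_mirror(struct thresh_state *s)
{
  (void)s;  /* pixel shuffling only, no threshold needed */
}

/* Commands that compare pixels against the threshold ask for it. */
static int cmd_is_foreground(struct thresh_state *s, double luminance)
{
  return luminance < need_threshold(s);
}
```

With this structure, a run that only uses threshold-independent
commands never pays for the adaptation.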
This also prepares the code to allow introduction of a new option
to avoid adapting the threshold to the original image before the
"crop" command is applied.
Commands with an optional argument had two code paths leading
to the respective function call, one of them with the
hard-coded argument "1". Instead, ensure the variable for
the optional argument is always set, and have just one
function call per command, always using this variable.
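
A minimal sketch of the single code path, assuming the caller
passes NULL when no argument was given. apply_filter() and
run_optional_arg_command() are hypothetical names used for
illustration only.

```c
#include <stdlib.h>

/* Hypothetical command function; it only records its argument. */
static int last_n = 0;
static void apply_filter(int n)
{
  last_n = n;
}

/* Run a command whose argument is optional. "arg" is NULL when no
 * argument was given (an assumption of this sketch). */
static void run_optional_arg_command(const char *arg)
{
  int n = 1;          /* default value, so the variable is always set */

  if (arg != NULL) {
    n = atoi(arg);    /* no input validation in this sketch */
  }
  apply_filter(n);    /* the one and only call site */
}
```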
Both functions are always called with the same arguments.
Only the get_threshold() function is also called with
different x, y, w, h arguments (when used during dynamic
thresholding).
Instead of reading one byte at a time, read data in chunks
of BUFSIZ bytes.
In one example, this reduces the recognition time for image
data read via standard input from 281 ms to 53 ms. YMMV.
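
Reading in chunks can be sketched with fread() as follows. This
is an illustration under assumptions, not the ssocr code: the
function name read_all_chunked() and the buffer growth strategy
are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read all data from fp into a malloc()ed buffer, BUFSIZ bytes at
 * a time. Returns NULL on error; stores the byte count in *len. */
static unsigned char *read_all_chunked(FILE *fp, size_t *len)
{
  const size_t chunk = BUFSIZ;
  unsigned char *buf = NULL;
  size_t used = 0, cap = 0, n;

  for (;;) {
    if (cap - used < chunk) {      /* make room for one more chunk */
      unsigned char *tmp = realloc(buf, cap + chunk);
      if (tmp == NULL) {
        free(buf);
        return NULL;
      }
      buf = tmp;
      cap += chunk;
    }
    n = fread(buf + used, 1, chunk, fp);
    used += n;
    if (n < chunk) {               /* short read: EOF or error */
      break;
    }
  }
  if (ferror(fp)) {
    free(buf);
    return NULL;
  }
  *len = used;
  return buf;
}
```

The caller owns the returned buffer and releases it with free().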
* The draw_pixel() function was called with an "image" parameter
of type "Imlib_Image" instead of "Imlib_Image *". This type
error did not result in a compilation error, and thus stayed
undetected in the code.
* Introduce a new function draw_color_pixel() similar to draw_pixel(),
and use it instead of repeatedly open-coding this operation.
When the widest digit found in the image is a one, it is likely
that a decimal separator is nearly as wide as this digit. Before
this commit, such a separator could not be recognized, because a
decimal separator had to be at most half as wide as the widest
digit. Thus add an additional pass over the digits for this
special case.
This pass comes after the existing recognition passes for the
digit one, decimal separator, and minus sign. In the new pass,
the width of the digit is ignored.
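
The additional pass can be sketched as follows. This is an
illustration only: struct box, enum kind and
find_decimal_separators() are hypothetical, the height ratio is
an assumption, and the sketch assumes the digit one has already
been marked by an earlier pass.

```c
#include <stddef.h>

enum kind { KIND_UNKNOWN, KIND_ONE, KIND_DECIMAL };

struct box {
  int w, h;        /* width and height of the potential digit */
  enum kind kind;  /* classification so far */
};

/* Ratios are assumptions of this sketch. */
#define DEC_W_RATIO 2  /* at most 1/2 as wide as the widest digit */
#define DEC_H_RATIO 5  /* at most 1/5 as high as the highest digit */

static void find_decimal_separators(struct box *b, size_t n)
{
  int max_w = 0, max_h = 0;
  size_t i, widest = 0;

  for (i = 0; i < n; i++) {
    if (b[i].w > max_w) { max_w = b[i].w; widest = i; }
    if (b[i].h > max_h) { max_h = b[i].h; }
  }
  /* regular pass: check both height and width */
  for (i = 0; i < n; i++) {
    if (b[i].kind == KIND_UNKNOWN
        && b[i].h <= max_h / DEC_H_RATIO
        && b[i].w <= max_w / DEC_W_RATIO) {
      b[i].kind = KIND_DECIMAL;
    }
  }
  /* special pass: the widest digit is a one, so ignore the width */
  if (n > 0 && b[widest].kind == KIND_ONE) {
    for (i = 0; i < n; i++) {
      if (b[i].kind == KIND_UNKNOWN
          && b[i].h <= max_h / DEC_H_RATIO) {
        b[i].kind = KIND_DECIMAL;
      }
    }
  }
}
```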
This addresses GitHub issue #26.
This did not work before, and changing that is not intended.
I do not know of a use case where one is expecting no digits
at all, but is using ssocr to recognize the non-existing
digits. If you do have such a use case, let me know.
When there is a bit of noise in the image, the segmentation
step might find lots of small potential digits that are not
really digits (or other characters) of the display. Given
sufficiently large display characters, it may be possible
to specify minimum character dimensions to remove spurious
potential characters (digits) based on their size.
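
Size-based filtering can be sketched as follows. struct candidate
and drop_small_candidates() are hypothetical names, and the
bounding box representation is an assumption of this sketch.

```c
#include <stddef.h>

/* Bounding box of a potential digit, coordinates inclusive. */
struct candidate {
  int x1, y1, x2, y2;
};

/* Remove candidates smaller than min_w x min_h from the array.
 * Keeps the order of the remaining entries and returns their
 * number. */
static size_t drop_small_candidates(struct candidate *c, size_t n,
                                    int min_w, int min_h)
{
  size_t kept = 0, i;

  for (i = 0; i < n; i++) {
    int w = c[i].x2 - c[i].x1 + 1;
    int h = c[i].y2 - c[i].y1 + 1;

    if (w >= min_w && h >= min_h) {
      c[kept++] = c[i];  /* large enough: keep it */
    }
  }
  return kept;
}
```

For example, with a minimum size of 5x10 pixels, a 2x2 pixel speck
between two 20x40 pixel digits is dropped.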
If a specific number of digits is expected, i.e., without
-d-1, the number of found (potential) digits is compared
with the expected number, and ssocr errors out on mismatch.
This prepares for the introduction of an option to reject
some potential images, e.g., because they are too small to
contain a digit or character from the display.
It also prepares for allowing a range of digits as the
expected segmentation result, e.g., for a clock display
with a blinking colon (:), or for a thermometer display.
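
A range check could look like the following sketch. The function
digit_count_ok() and its parameters are hypothetical. A negative
minimum stands for "any number of digits", like -d-1.

```c
/* Return 1 if the number of found digits is acceptable, else 0.
 * A negative "min" accepts any number of digits. */
static int digit_count_ok(int found, int min, int max)
{
  if (min < 0) {
    return 1;  /* no specific number of digits expected */
  }
  return found >= min && found <= max;
}
```

A clock display with a blinking colon would then use a range of
4 to 5 characters, while a fixed number is a range with min equal
to max.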
If the string length of the value of the environment variable TMP
is too big, adding the length of a slash and the temporary file
name might overflow, which would result in an insufficient memory
allocation for the absolute file name.
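
An overflow-safe length computation can be sketched as follows.
The function names tmp_path_size() and join_tmp_path() are
hypothetical and do not claim to match the ssocr code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Compute the buffer size for "dir/name" including the NUL byte.
 * Returns 0 if the sum would overflow size_t, 1 otherwise. */
static int tmp_path_size(size_t dir_len, size_t name_len, size_t *size)
{
  const size_t extra = 2;  /* slash and terminating NUL */

  if (name_len > SIZE_MAX - extra) {
    return 0;
  }
  if (dir_len > SIZE_MAX - extra - name_len) {
    return 0;
  }
  *size = dir_len + name_len + extra;
  return 1;
}

/* Build "dir/name" in a malloc()ed buffer. Returns NULL on
 * overflow or allocation failure. */
static char *join_tmp_path(const char *dir, const char *name)
{
  size_t dir_len = strlen(dir);
  size_t name_len = strlen(name);
  size_t size;
  char *path;

  if (!tmp_path_size(dir_len, name_len, &size)) {
    return NULL;
  }
  path = malloc(size);
  if (path == NULL) {
    return NULL;
  }
  memcpy(path, dir, dir_len);
  path[dir_len] = '/';
  memcpy(path + dir_len + 1, name, name_len + 1);  /* copies NUL */
  return path;
}
```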
I do not know if that can actually happen in any existing operating
system and platform combination where ssocr can be used.
This adds an option to enable white space detection, and two
further options to control the operation of white space detection.
White space detection (--print-spaces) is intended for use cases
where digit (or character) grouping is important for correct
interpretation. One use case is the recognition of superimposed
dates in photographic images.
This commit also increases the version number to 2.21.0 and tweaks
some debug output.
This introduces a function to scan part of the image for foreground
pixels.
This scanline() function may be of use for distinguishing between
the digit '1' and the symbol ":".
It may also improve segment detection reliability if the "len"
parameter is used to skip scanning image areas between segment
positions.
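
The following sketch shows how such a scan might work. It is not
the scanline() function from ssocr: the name scanline_count(), its
signature, and the image representation (one byte per pixel,
non-zero meaning foreground) are assumptions made for this
example.

```c
#include <stddef.h>

enum scan_dir { SCAN_HORIZONTAL, SCAN_VERTICAL };

/* Count foreground pixels on a line of "len" pixels starting at
 * (x, y). The image has w * h bytes in row-major order. Pixels
 * outside of the image are skipped. */
static int scanline_count(const unsigned char *img, int w, int h,
                          int x, int y, int len, enum scan_dir dir)
{
  int found = 0, i;

  for (i = 0; i < len; i++) {
    int px = x + (dir == SCAN_HORIZONTAL ? i : 0);
    int py = y + (dir == SCAN_VERTICAL ? i : 0);

    if (px < 0 || px >= w || py < 0 || py >= h) {
      continue;  /* outside of the image */
    }
    if (img[(size_t)py * (size_t)w + (size_t)px] != 0) {
      found++;
    }
  }
  return found;
}
```

A vertical scan through the middle third of a character finds
foreground pixels for a '1', but none for a ":", because the
colon has a gap between its two dots.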
This is the first step towards support for different character sets.
Different character sets are intended to be used, e.g., to select
between '6' and 'b', but also to get an error if, e.g., a digit on
a decimal display is recognized as a hexadecimal digit.
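
How a character set can influence the result is sketched below.
The segment names, enum charset and decode_segments() are
hypothetical, and only three segment patterns are shown.

```c
/* Segments of a seven segment display, one bit each. */
enum segment {
  SEG_TOP = 1, SEG_UP_RIGHT = 2, SEG_LOW_RIGHT = 4, SEG_BOTTOM = 8,
  SEG_LOW_LEFT = 16, SEG_UP_LEFT = 32, SEG_MIDDLE = 64
};

enum charset { CS_DECIMAL, CS_HEX };

/* Map a set of lit segments to a character of the selected
 * character set. Returns 0 if the pattern is not part of the
 * set, which the caller would report as an error. */
static char decode_segments(unsigned segs, enum charset cs)
{
  switch (segs) {
  case SEG_TOP | SEG_LOW_RIGHT | SEG_BOTTOM | SEG_LOW_LEFT
       | SEG_UP_LEFT | SEG_MIDDLE:
    return '6';                         /* six with top segment */
  case SEG_LOW_RIGHT | SEG_BOTTOM | SEG_LOW_LEFT
       | SEG_UP_LEFT | SEG_MIDDLE:
    return cs == CS_HEX ? 'b' : '6';    /* ambiguous pattern */
  case SEG_TOP | SEG_UP_RIGHT | SEG_LOW_RIGHT | SEG_LOW_LEFT
       | SEG_UP_LEFT | SEG_MIDDLE:
    return cs == CS_HEX ? 'a' : 0;      /* not a decimal digit */
  default:
    return 0;                           /* unknown in this sketch */
  }
}
```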