When the widest digit found in the image is a one, a decimal
separator is likely to be nearly as wide as this digit. Before
this commit, such a separator could not be recognized, because a
decimal separator had to be at most half as wide as the widest digit.
Therefore, add an additional pass over the digits for this special case.
This pass comes after the existing recognition passes for the
digit one, decimal separator, and minus sign. In the new pass,
the width of the digit is ignored.
This addresses GitHub issue #26.
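The extra pass can be illustrated with a minimal sketch. This is not
the ssocr implementation: the type candidate_t, the function
decimal_pass(), and both ratio constants are hypothetical names and
values chosen for illustration. The height criterion is an assumption
as well, since the text above only describes the width rule.

```c
#include <stddef.h>

/* Assumed ratios, chosen for illustration only. */
#define ASSUMED_DEC_W_RATIO 2  /* separator width  <= widest digit / 2  */
#define ASSUMED_DEC_H_RATIO 5  /* separator height <= tallest digit / 5 */

/* Hypothetical candidate produced by the segmentation step. */
typedef struct {
    int x1, y1, x2, y2;  /* inclusive bounding box in pixels */
    int classified;      /* non-zero once an earlier pass recognized it */
    int is_decimal;      /* non-zero if recognized as decimal separator */
} candidate_t;

static int cand_width(const candidate_t *c)  { return c->x2 - c->x1 + 1; }
static int cand_height(const candidate_t *c) { return c->y2 - c->y1 + 1; }

/*
 * Mark still unclassified candidates as decimal separators.
 * With ignore_width == 0 the width rule applies (the regular pass);
 * with ignore_width != 0 only the assumed height rule is checked
 * (the additional pass). Returns the number of new separators.
 */
size_t decimal_pass(candidate_t *cands, size_t n,
                    int max_w, int max_h, int ignore_width)
{
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        candidate_t *c = &cands[i];
        if (c->classified)
            continue;  /* keep results of earlier passes */
        if (cand_height(c) * ASSUMED_DEC_H_RATIO > max_h)
            continue;  /* too tall for a separator */
        if (!ignore_width && cand_width(c) * ASSUMED_DEC_W_RATIO > max_w)
            continue;  /* too wide for the regular pass */
        c->classified = 1;
        c->is_decimal = 1;
        found++;
    }
    return found;
}
```

With a 4 pixel wide one as the widest digit and a 4x4 pixel dot, the
regular pass (ignore_width == 0) skips the dot, while the additional
pass (ignore_width != 0) accepts it.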
Neither lower case c nor lower case n can happen with correct
image segmentation. Lower case c was already used in the code,
but lower case n appeared only in a comment.
I am not sure if I want to keep this or rather remove both lower
case variants. I do want this to be consistent, though.
The command name stems from the time when ssocr always considered
"white" as background and "black" as foreground. The intention
has always been to ensure some background border around the seven
segment displays, and the implementation always followed this. The
output of --help correctly uses "background" for the "white_border"
description. Now, the man page description is correct, too.
The latest release date is extracted from the NEWS file,
i.e., it depends only on the sources, not the build date.
This is intended to help in creating reproducible builds
by avoiding timestamps. It is also closer to the date of
the contents of the man page than using just the latest
copyright year.
I do not have a perfect solution that works for both a git
clone and a downloaded tarball. This solution works well
for released tarballs of the ssocr sources.
To help create reproducible builds of ssocr,
do not use the build date of the man page as the date
inside the man page. Instead, use the latest copyright
year of ssocr.
Using the man page build date has always been problematic,
because it is misleading. But I do not have a general
automatic way to maintain the last change date for the
man page that works for both git clones and tarballs.
This seems like an improvement to me. It provides some
idea of how old the man page is, and this date depends
only on the ssocr source code, not the build date.
This should help with one of the two problems reported
in GitHub issue #22.
This ssocr release adds new features and thus addresses
GitHub issue #21:
* new option -N, --min-segment=SIZE
* new option -M, --min-char-dims=WxH
* a range of expected digits can be specified (before, only
a single number could be specified, or the number could be
left unspecified)
This did not work before, and there was no intention to
change this.
I do not know of a use case where one expects no digits
at all, yet uses ssocr to recognize these non-existent
digits. If you do have such a use case, let me know.
When there is a bit of noise in the image, the segmentation
step might find lots of small potential digits that are not
really digits (or other characters) of the display. Given
sufficiently large display characters, it may be possible
to specify minimum character dimensions to remove spurious
potential characters (digits) based on their size.
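Size-based filtering as described above could look roughly like the
following sketch. The type char_box_t and the function
drop_small_candidates() are hypothetical and not taken from the ssocr
sources. Requiring both the minimum width and the minimum height is an
assumption made for this example.

```c
#include <stddef.h>

/* Hypothetical bounding box of one potential character. */
typedef struct {
    int x1, y1, x2, y2;  /* inclusive pixel coordinates */
} char_box_t;

/*
 * Remove all boxes smaller than min_w x min_h from the array,
 * keeping the order of the remaining boxes.
 * Returns the number of boxes kept.
 */
size_t drop_small_candidates(char_box_t *boxes, size_t n,
                             int min_w, int min_h)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        int w = boxes[i].x2 - boxes[i].x1 + 1;
        int h = boxes[i].y2 - boxes[i].y1 + 1;
        if (w >= min_w && h >= min_h)
            boxes[kept++] = boxes[i];  /* compact in place */
    }
    return kept;
}
```

A filter like this only helps when the display characters are clearly
larger than the noise, because everything below the minimum dimensions
is dropped regardless of what it is.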
If a specific number of digits is expected, i.e., without
-d-1, the number of found (potential) digits is compared
with the expected number, and ssocr errors out on mismatch.
This prepares for the introduction of an option to reject
some potential images, e.g., because they are too small to
contain a digit or character from the display.
It also prepares for allowing a range of digits as the
expected segmentation result, e.g., for a clock display
with a blinking colon (:), or for a thermometer display.
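The comparison of found and expected digits, extended to the range
this change prepares for, might be sketched as follows. The function
digit_count_ok() is a hypothetical helper, not ssocr code. Using -1
for both bounds to mean "any number of digits" is an assumption
modeled on -d-1.

```c
/*
 * Check the number of found digits against the expectation.
 * expected_min == expected_max describes a single expected number.
 * A value of -1 for both bounds is assumed to mean "unspecified".
 * Returns 1 if the count is acceptable, 0 on mismatch.
 */
int digit_count_ok(int found, int expected_min, int expected_max)
{
    if (expected_min == -1 && expected_max == -1)
        return 1;  /* no expectation, accept any count */
    return found >= expected_min && found <= expected_max;
}
```

For a clock display with a blinking colon, a caller could accept
either four or five characters by passing 4 and 5 as the bounds and
report an error when the function returns 0.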