ssocr

Author	SHA1	Message	Date
Erik Auerswald	e914669079	add option -F, --adapt-after-crop When this option is used, the threshold is adapted to the cropped image, i.e., after the "crop" command, but not directly before. This avoids adapting the threshold to the full image, and thus may reduce the time needed for recognition.	2025-03-23 19:03:48 +01:00
Erik Auerswald	94ce321060	lazily adapt threshold to image Instead of adapting the threshold to the image before executing commands, adapt the threshold just before it is needed. This avoids threshold adaptation when -p, --process-only is used with only the "grayscale" and/or "mirror" commands. This also prepares the code for a new option that avoids adapting the threshold to the original image before the "crop" command is applied.	2025-03-23 18:54:13 +01:00
Erik Auerswald	7cae796188	ssocr.c: code maintenance Fix two minor issues with crop command execution: * correct two variable names in a comment, and * remove a useless imlib_context_set_image() call.	2025-03-23 14:27:52 +01:00
Erik Auerswald	7f2a9f3f22	warn when options -a and -T are used together	2025-03-22 18:45:15 +01:00
Erik Auerswald	56d1b22118	simplify some command execution code paths Commands with optional argument had two code paths leading to the respective function application, one of those with hard-coded argument "1". Instead, ensure the variable for the optional argument is always set, and have just one function call, always using this variable, per command.	2025-03-20 21:12:00 +01:00
Erik Auerswald	fa30f473e7	simplify adapt_threshold() and iterative_threshold() Both functions are always called with the same arguments. Only the get_threshold() function is also called with different x, y, w, h arguments (when used during dynamic thresholding).	2025-03-18 21:31:19 +01:00
Erik Auerswald	67abe0fba8	combine get_minval() and get_maxval() This simplifies the code a bit, and slightly speeds up using option "-P, --debug-output".	2025-03-18 20:42:38 +01:00
Erik Auerswald	ac060de96b	speed up reading from standard input Instead of reading one byte at a time, read data in chunks of BUFSIZ bytes. For one example, this improves recognition time for image data read via standard input from 281ms to 53ms. YMMV.	2025-03-18 20:10:51 +01:00
Erik Auerswald	f012c14d93	refactoring and type consistency fixes * The draw_pixel() function was called with an "image" parameter of type "Imlib_Image" instead of "Imlib_Image *". This type error did not result in a compilation error, and thus stayed undetected in the code. Introduce a new function draw_color_pixel() similar to draw_pixel(), and use it instead of repeatedly open-coding this operation.	2025-02-02 20:32:48 +01:00
Erik Auerswald	b44a4ad72a	update copyright years	2025-02-01 19:33:29 +01:00
Erik Auerswald	88a627050a	print warning when ignoring unknown luminance formula	2024-11-18 17:55:02 +01:00
Erik Auerswald	6e8c8361fd	improve wording of unknown charset warning	2024-11-18 17:43:55 +01:00
Erik Auerswald	acdf26c6bf	print warning when ignoring unknown charset	2024-11-17 19:25:33 +01:00
Erik Auerswald	70692aa5c4	suppress debug output without -P, --debug-output	2024-11-17 19:10:34 +01:00
Erik Auerswald	9964c8ce85	fix copy & paste error in a comment	2024-11-17 19:09:00 +01:00
Erik Auerswald	894f3035fd	fix a special case for decimal point recognition When the widest digit found in the image is a one, it is likely that a decimal separator is nearly as wide as this digit. Thus it cannot be recognized, because the decimal separator needs to be at most half as wide as the widest digit (before this commit). Thus add an additional pass over the digits for this special case. This pass comes after the existing recognition passes for the digit one, decimal separator, and minus sign. In the new pass, the width of the digit is ignored. This addresses GitHub issue #26.	2024-06-22 20:04:08 +02:00
Erik Auerswald	6569ca9f6b	update copyright years	2024-02-23 19:03:58 +01:00
Erik Auerswald	81c5472b69	add program name to warning messages	2023-09-10 14:49:12 +02:00
Erik Auerswald	6fc62729e9	improve error message for wrong number of digits * Use different messages for single digit and digit ranges. * Use singular when looking for one digit, plural otherwise.	2023-09-10 14:35:49 +02:00
Erik Auerswald	19f93dcb58	fix debug output regarding flag DEBUG_OUTPUT	2023-07-26 19:50:15 +02:00
Erik Auerswald	5ca3455873	add ssocr version to debug output	2023-07-26 19:47:01 +02:00
Erik Auerswald	0c3adc9002	avoid useless digit memory copy There is no need to copy digit memory if all potential digits are kept.	2023-05-13 20:05:23 +02:00
Erik Auerswald	ad9027b6a0	guard against some integer overflows When determining memory allocation sizes, integer overflow can lead to memory safety errors. Add some guards to prevent this.	2023-05-13 19:58:16 +02:00
Erik Auerswald	cf2391f400	do not accept 0 expected digits This did not work before, and it was not intended to change this. I do not know of a use case where one is expecting no digits at all, but is using ssocr to recognize the non-existing digits. If you do have such a use case, let me know.	2023-05-01 17:00:49 +02:00
Erik Auerswald	898f5ec712	allow to specify a range for the number of digits This can be helpful when using ssocr with a display showing a variable number of digits, e.g., a clock, a scale, or a thermometer.	2023-05-01 16:19:12 +02:00
Erik Auerswald	9e2d37ddbf	add option -M, --min-char-dims=WxH When there is a bit of noise in the image, the segmentation step might find lots of small potential digits that are not really digits (or other characters) of the display. Given sufficiently large display characters, it may be possible to specify minimum character dimensions to remove spurious potential characters (digits) based on their size.	2023-04-30 19:14:19 +02:00
Erik Auerswald	8df40937c9	accept any number of digits during segmentation If a specific number of digits is expected, i.e., without -d-1, the number of found (potential) digits is compared with the expected number, and ssocr errors out on mismatch. This prepares for the introduction of an option to reject some potential images, e.g., because they are too small to contain a digit or character from the display. It also prepares for allowing a range of digits as the expected segmentation result, e.g., for a clock display with a blinking colon (:), or for a thermometer display.	2023-04-30 16:29:47 +02:00
Erik Auerswald	06c2afc85e	add option -N, --min-segment=SIZE This option is similar to -n, --number-pixels=#, but also applies the limit to ratio based detection (i.e., for recognition of "one" and "minus").	2023-04-29 11:53:32 +02:00
Erik Auerswald	2d6b019842	consistently use "scanline" in comments	2023-04-24 17:04:34 +02:00
Erik Auerswald	ea8f724846	update copyright years	2023-04-23 12:53:48 +02:00
Erik Auerswald	320487e300	tweak alignment of some debug output	2023-04-23 12:50:02 +02:00
Erik Auerswald	b83610d7d2	update copyright year	2022-01-25 18:38:52 +01:00
Erik Auerswald	0793da7bcb	replace strncat() with memcpy() GCC 10 warns about the use of strncat(), but accepts the equivalent use of memcpy() without warning. ¯\_(ツ)_/¯	2021-10-26 20:54:14 +02:00
Erik Auerswald	9b227e7d64	update reasons for string.h includes	2021-10-26 20:28:36 +02:00
Erik Auerswald	9b31b50154	guard against potential unsigned integer overflow If the string length of the value of the environment variable TMP is too big, adding the length of a slash and the temporary file name might overflow, which would result in an insufficient memory allocation for the absolute file name. I do not know if that can actually happen in any existing operating system and platform combination where ssocr can be used.	2021-10-26 20:18:11 +02:00
Erik Auerswald	113d665135	add ability to detect and print white space This adds an option to enable white space detection, and two further options to control the operation of white space detection. White space detection (--print-spaces) is intended for use cases where digit (resp. character) grouping is important for correct interpretation. One use case is the recognition of superimposed dates in photographic images. This commit also increases the version number to 2.21.0 and tweaks some debug output.	2021-04-25 14:05:51 +02:00
Erik Auerswald	dc463a5529	options to control decimal separator recognition Additionally, bump copyright dates and version number.	2021-04-19 19:27:52 +02:00
Erik Auerswald	17928b4b76	always use spaces for indentation, not tabs	2019-03-10 18:06:58 +01:00
Erik Auerswald	2031c2c08e	refactor scanning for set segments This introduces a function to scan part of the image for foreground pixels. This scanline() function may be of use for distinguishing between the digit '1' and the symbol ":". It may also help in segment detection reliability if the "len" parameter is used to skip scanning image areas between segment positions.	2019-03-10 17:58:04 +01:00
Erik Auerswald	d3fdf3b223	cosmetic changes in ssocr.c	2019-02-02 15:09:50 +01:00
Erik Auerswald	535aa89bdb	keep line length <= 80	2019-02-02 14:30:41 +01:00
Erik Auerswald	9569289c63	bump copyright year to 2019	2019-02-02 13:08:13 +01:00
Erik Auerswald	07623ef831	make debug output imply verbose operation	2019-02-02 13:03:31 +01:00
Erik Auerswald	3bdd09428f	ssocr.c: replace strcat with strncat	2018-12-29 11:40:02 +01:00
Erik Auerswald	592a4044e5	ssocr.c: simplify some string operations	2018-12-29 11:06:37 +01:00
Erik Auerswald	8c2f93d6c2	ssocr.c: replace last malloc() with calloc() The code actually needs the allocated memory to have a '\0' byte at the beginning.	2018-12-29 10:49:29 +01:00
Erik Auerswald	d25ddb4674	final step of character set support This implements selection of character set to recognize and documents it in the man page.	2018-08-05 07:01:50 +02:00
Erik Auerswald	e6f4e49ba9	second step to implement different character sets - add character sets full, digits, decimal, hex - full is used by default - character set cannot be selected for now	2018-08-05 06:26:04 +02:00
Erik Auerswald	d6a957e6d3	move character printing to a separate function This is the first step towards support of different character sets. Different character sets are intended to be used to e.g. select between '6' and 'b', but also to receive an error if e.g. a decimal display is recognized as a hexadecimal digit.	2018-08-05 05:24:38 +02:00
Erik Auerswald	1b90fbba6e	prefix all error messages with 'ssocr: '	2018-07-27 22:30:05 +02:00
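
Commit ac060de96b replaces byte-at-a-time reads from standard input with reads in chunks of BUFSIZ bytes. The sketch below illustrates that general technique only; it is not the code from ssocr.c. The helper name read_all() and its interface are assumptions made for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from ssocr.c): read everything from "in"
 * into a heap buffer, BUFSIZ bytes per fread() call.
 * Returns 0 on success and -1 on read error, overflow, or allocation
 * failure. On success *out is NULL if the stream was empty; otherwise
 * the caller must free(*out). */
static int read_all(FILE *in, unsigned char **out, size_t *len)
{
  unsigned char chunk[BUFSIZ];
  unsigned char *data = NULL;
  size_t used = 0;
  size_t n;

  while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
    unsigned char *tmp;

    if (n > SIZE_MAX - used) { /* used + n would overflow */
      free(data);
      return -1;
    }
    tmp = realloc(data, used + n);
    if (!tmp) {
      free(data);
      return -1;
    }
    data = tmp;
    memcpy(data + used, chunk, n);
    used += n;
  }
  if (ferror(in)) {
    free(data);
    return -1;
  }
  *out = data;
  *len = used;
  return 0;
}
```

A caller would pass stdin as the stream. Growing the buffer once per chunk keeps the example short; a doubling strategy would reduce the number of realloc() calls for large inputs.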

143 Commits
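
Commits ad9027b6a0 and 9b31b50154 add guards against integer overflow when computing memory allocation sizes, the latter for the length of a temporary file name built from the TMP environment variable. The following is a minimal sketch of such guards in general, not the checks used in ssocr.c. Both helper names, checked_mul() and join_path(), are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store a * b in *result.
 * Returns 0 on success, -1 if the product does not fit in size_t. */
static int checked_mul(size_t a, size_t b, size_t *result)
{
  if (b != 0 && a > SIZE_MAX / b) {
    return -1;
  }
  *result = a * b;
  return 0;
}

/* Hypothetical helper: build "dir/name" in newly allocated memory.
 * Returns NULL if the total length would overflow size_t or if the
 * allocation fails. The caller must free() the result. */
static char *join_path(const char *dir, const char *name)
{
  size_t dlen = strlen(dir);
  size_t nlen = strlen(name);
  char *path;

  /* required size: dlen + 1 (slash) + nlen + 1 (terminating NUL) */
  if (dlen > SIZE_MAX - 2 || nlen > SIZE_MAX - 2 - dlen) {
    return NULL;
  }
  path = malloc(dlen + nlen + 2);
  if (!path) {
    return NULL;
  }
  memcpy(path, dir, dlen);
  path[dlen] = '/';
  memcpy(path + dlen + 1, name, nlen + 1); /* copies the NUL as well */
  return path;
}
```

The checks are written as comparisons against SIZE_MAX before the arithmetic is done, because unsigned overflow wraps silently in C and cannot be detected reliably afterwards by looking at the result of a multiplication.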