-----------------------------------
| ==> installation info <== |
-----------------------------------
synOCR-user: synOCR
synOCR-user is admin: yes
synOCR-version: 1.4.5
Architecture: x86_64
DSM-build: 69057
Device: 224plus (2150986347)
current Profil: Scan All Steuer
monitor is running?: yes
DB-version: 9
used image (created): jbarlow83/ocrmypdf:v12.7.2 (2021-11-04T21:53:21)
document author:
used ocr-parameter (raw): -srd -l deu+eng
OCR-arg 1: -srd
OCR-arg 2: -l
OCR-arg 3: deu+eng
ocropt_array: -srd -l deu+eng
search prefix:
replace search prefix: yes
renaming syntax: §yocr-§mocr-§docr_§tag_§tit
Symbol for tag marking: #
target file handling: useCatDir
Document split pattern: SYNOCR-SEPARATOR-SHEET
split page handling: discard
delete blank pages:
threshold black/white:
threshold black pixels:
clean up spaces: true
Date search method: use Python
date found order: firstfound
source for filedate: ocr
ignored dates by search: ;
date range in past: 0 [absolute: 0]
date range in future: 0 [absolute: 0]
PATH-Variable: /sbin:/bin:/usr/sbin:/usr/bin:/usr/syno/sbin:/usr/syno/bin:/usr/local/sbin:/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/syno/bin:/usr/syno/sbin:/usr/local/bin:/opt/usr/bin:/usr/syno/synoman/webman/3rdparty/synOCR/bin
Docker test: OK
DSM notify to user: Hxxxxxxxy
apprise notify service:
apprise attachment: false
notify language: ger
Loglevel: debug
max. count of logfiles: 100
rotate backupfiles after: (purge backup deactivated)
Source directory: /volume1/Dokumente/Rezepte, Brillen, Zahnrechnungen usw. für Steuer/InBox/
Target directory: /volume1/Dokumente/Rezepte, Brillen, Zahnrechnungen usw. für Steuer/
BackUp directory: /volume1/Dokumente/synocr/backup/
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ---------------------------------- ●
● | ==> RUN THE FUNCTIONS <== | ●
● ---------------------------------- ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
-----------------------------------------------------------------------------------
| check the python3 installation and the necessary modules: |
-----------------------------------------------------------------------------------
[runtime up to now: 00:00:00]
Check Python:
module list:
Package Version
------------------ ----------
apprise 1.4.5
argcomplete 3.2.2
backports.zoneinfo 0.2.1
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
dateparser 1.2.0
DateTime 5.4
deprecation 2.1.0
idna 3.6
importlib-metadata 7.0.1
lxml 5.1.0
Markdown 3.5.2
oauthlib 3.2.2
packaging 23.2
pikepdf 7.1.2
pillow 10.2.0
pip 24.0
pypdf 3.5.1
python-dateutil 2.8.2
pytz 2024.1
PyYAML 6.0.1
regex 2023.12.25
requests 2.31.0
requests-oauthlib 1.3.1
setuptools 56.0.0
six 1.16.0
tomlkit 0.12.3
typing_extensions 4.9.0
tzlocal 5.2
urllib3 2.2.0
xmltodict 0.13.0
yq 3.2.3
zipp 3.17.0
zope.interface 6.1
prepare_python: OK
Target temp directory: /tmp/tmp.rW8RsjwPDz
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● STEP 1 - RUN OCR / SPLIT FILES, IF NEEDED: ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ 23042024_0.pdf
temp. target file: /tmp/tmp.rW8RsjwPDz/step1_tmp_1713852089/23042024_0.pdf
-----------------------------------------------------------------------------------
| processing PDF @ OCRmyPDF: |
-----------------------------------------------------------------------------------
[runtime up to now: 00:00:00]
➜ OCRmyPDF-LOG:
DEBUG ocrmypdf - ocrmypdf 12.7.2.post0+g8be9a68c.d20211104
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = List of available languages (7):
chi_sim
deu
eng
fra
osd
por
spa
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Found tesseract 4.1.1
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Found gs 9.53.3
INFO ocrmypdf._validation - reading file from standard input
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.ejtk_c0m/stdin, /tmp/ocrmypdf.io.ejtk_c0m/origin.pdf)
DEBUG ocrmypdf.builtin_plugins.tesseract_ocr - Using Tesseract OpenMP thread limit 3
DEBUG ocrmypdf.subprocess - 1 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-sDEVICE=jpeggray', '-dFirstPage=1', '-dLastPage=1', '-r299.863118x299.863118', '-o', '-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f', '/tmp/ocrmypdf.io.ejtk_c0m/origin.pdf']
DEBUG ocrmypdf._exec.ghostscript - 1 Rotating output by 0
DEBUG ocrmypdf.subprocess - 1 Running: ['tesseract', '-l', 'osd', '--psm', '0', '/tmp/ocrmypdf.io.ejtk_c0m/000001_rasterize_preview.jpg', 'stdout']
INFO ocrmypdf._pipeline - 1 page is facing ⇧, confidence 11.02 - no change
DEBUG ocrmypdf._pipeline - 1 Rasterize with png16m, rotation 0
DEBUG ocrmypdf.subprocess - 1 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-sDEVICE=png16m', '-dFirstPage=1', '-dLastPage=1', '-r299.863118x299.863118', '-o', '-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f', '/tmp/ocrmypdf.io.ejtk_c0m/origin.pdf']
DEBUG PIL.PngImagePlugin - 1 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 1 STREAM b'iCCP' 41 2354
DEBUG PIL.PngImagePlugin - 1 iCCP profile name b'default_rgb.icc'
DEBUG PIL.PngImagePlugin - 1 Compression method 0
DEBUG PIL.PngImagePlugin - 1 STREAM b'pHYs' 2407 9
DEBUG PIL.PngImagePlugin - 1 STREAM b'tEXt' 2428 31
DEBUG PIL.PngImagePlugin - 1 STREAM b'IDAT' 2471 8192
DEBUG ocrmypdf._exec.ghostscript - 1 Rotating output by 0
DEBUG PIL.PngImagePlugin - 1 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 1 STREAM b'pHYs' 41 9
DEBUG PIL.PngImagePlugin - 1 STREAM b'IDAT' 62 8192
DEBUG ocrmypdf._pipeline - 1 resolution (299.9994, 299.9994)
DEBUG PIL.PngImagePlugin - 1 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 1 STREAM b'pHYs' 41 9
DEBUG PIL.PngImagePlugin - 1 STREAM b'IDAT' 62 8192
DEBUG ocrmypdf._pipeline - 1 convert
DEBUG img2pdf - 1 PIL format = JPEG
DEBUG img2pdf - 1 imgformat = JPEG
DEBUG img2pdf - 1 input dpi = 300 x 300
DEBUG img2pdf - 1 rotation = 0°
DEBUG img2pdf - 1 input colorspace = RGB
DEBUG img2pdf - 1 width x height = 866px x 3286px
DEBUG img2pdf - 1 read_images() embeds a JPEG
DEBUG ocrmypdf._pipeline - 1 convert done
DEBUG ocrmypdf.subprocess - 1 Running: ['tesseract', '-l', 'deu+eng', '-c', 'textonly_pdf=1', '/tmp/ocrmypdf.io.ejtk_c0m/000001_ocr.png', '/tmp/ocrmypdf.io.ejtk_c0m/000001_ocr_tess', 'pdf', 'txt']
DEBUG ocrmypdf._graft - 1 Emplacement update
DEBUG ocrmypdf._graft - 1 Text rotation: (text, autorotate, content) -> text misalignment = (0, 0, 0) -> 0
DEBUG ocrmypdf._graft - 1 Grafting
DEBUG ocrmypdf._graft - 1 Page rotation: (content, auto) -> page = (0, 0) -> 0
INFO ocrmypdf._sync - Postprocessing...
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.ejtk_c0m/graft_layers.pdf, /tmp/ocrmypdf.io.ejtk_c0m/fix_docinfo.pdf)
DEBUG ocrmypdf.subprocess - Running: ['gs', '-dBATCH', '-dNOPAUSE', '-dSAFER', '-dCompatibilityLevel=1.6', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sColorConversionStrategy=LeaveColorUnchanged', '-dAutoFilterColorImages=true', '-dAutoFilterGrayImages=true', '-dJPEGQ=95', '-dPDFA=2', '-dPDFACompatibilityPolicy=1', '-o', '-', '-sstdout=%stderr', '/tmp/ocrmypdf.io.ejtk_c0m/fix_docinfo.pdf', '/tmp/ocrmypdf.io.ejtk_c0m/pdfa.ps']
DEBUG ocrmypdf.subprocess.gs - GPL Ghostscript 9.53.3 (2020-10-01)
DEBUG ocrmypdf.subprocess.gs - Copyright (C) 2020 Artifex Software, Inc. All rights reserved.
DEBUG ocrmypdf.subprocess.gs - This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
DEBUG ocrmypdf.subprocess.gs - see the file COPYING for details.
DEBUG ocrmypdf.subprocess.gs - Processing pages 1 through 1.
DEBUG ocrmypdf.subprocess.gs - Page 1
DEBUG ocrmypdf.optimize - Treating 18 as an optimization candidate
DEBUG ocrmypdf.optimize - XrefExt(xref=18, ext='.png')
DEBUG ocrmypdf.optimize - Optimizable images: JPEGs: 0 PNGs: 1
DEBUG ocrmypdf.optimize - Treating 18 as an optimization candidate
DEBUG ocrmypdf.optimize - Optimizable images: JBIG2 groups: (0,)
INFO ocrmypdf.optimize - Optimize ratio: 1.00 savings: 0.0%
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.ejtk_c0m/optimize.opt.pdf, /tmp/ocrmypdf.io.ejtk_c0m/optimize.pdf)
DEBUG ocrmypdf._pipeline - /tmp/ocrmypdf.io.ejtk_c0m/optimize.pdf -> -
INFO ocrmypdf._sync - Output sent to stdout
← OCRmyPDF-LOG-END
[runtime up to now: 00:00:24]
target file (OK): /tmp/tmp.rW8RsjwPDz/step1_tmp_1713852089/23042024_0.pdf
-----------------------------------------------------------------------------------
| document split handling: |
-----------------------------------------------------------------------------------
splitpage count: 0
no separator sheet found, or number of pages too small
-----------------------------------------------------------------------------------
| handle source file: |
-----------------------------------------------------------------------------------
➜ File name already exists! Add counter (2)
➜ backup source file to: /volume1/Dokumente/synocr/backup/23042024_0 (2).pdf
removed directory '/tmp/tmp.rW8RsjwPDz/step1_tmp_1713852089/'
Stats:
runtime last file: ➜ 00:00:24
runtime 1st step (all files): ➜ 00:00:25