commanderi
Thanks, no idea how it disappeared from there. -frd -l deu+eng
I just moved an invoice from an e-mail into the input folder and it refuses to process it. Here is the log:
-----------------------------------
| ==> installation info <== |
-----------------------------------
synOCR-user: root
synOCR-user is admin: yes
synOCR-version: 1.2.0
Architecture: x86_64
DSM-build: 23824
Device: 716plusII (4275539728)
current Profil: default
DB-version: 5
used image (created): jbarlow83/ocrmypdf:latest (2022-08-15T00:14:34)
used ocr-parameter (raw): -frd -l deu+eng
OCR-arg 1: -frd
OCR-arg 2: -l
OCR-arg 3: deu+eng
ocropt_array: -frd -l deu+eng
search prefix:
replace search prefix: yes
renaming syntax: §y-§m-§d_§tag_§tit
Symbol for tag marking: #
Document split pattern:
Date search method: use standard search via RegEx
source for filedate: now
ignored dates by search: 2021-02-29;2020-11-31
PATH-Variable: /sbin:/bin:/usr/sbin:/usr/bin:/usr/syno/sbin:/usr/syno/bin:/usr/local/sbin:/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/syno/bin:/usr/syno/sbin:/usr/local/bin:/opt/usr/bin:/usr/syno/synoman/webman/3rdparty/synOCR/bin
Docker test: OK
DSM notify to user: admin
Loglevel: debug
max. count of logfiles: 100000
Source directory: /volume1/DMS/_INPUT/
Target directory: /volume1/DMS/_OUTPUT/
Files are deleted immediately! / No valid directory [/]
rotate backupfiles after: (purge backup deactivated)
----------------------------------
| ==> Funktionsaufrufe <== |
----------------------------------
show files in INPUT with transcoded special characters
@eaDir$
Rechnung_2022_08_DE_180015429386.pdf$
(pages counted with pdfinfo)
PROCESSING: ➜ Rechnung_2022_08_DE_180015429386.pdf (Tue Sep 6 10:20:03 CEST 2022)
temp. target file: /tmp/tmp.3rJYR2fsII/Rechnung_2022_08_DE_180015429386.pdf
[runtime up to now: 00:00:00]
➜ OCRmyPDF-LOG:
DEBUG ocrmypdf - ocrmypdf 13.7.1.dev22+g23f38305.d20220815
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Found tesseract 4.1.1
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--version']
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Found gs 9.55.0
DEBUG ocrmypdf.subprocess - Running: ['gs', '--version']
DEBUG ocrmypdf.subprocess - Running: ['tesseract', '--list-langs']
DEBUG ocrmypdf.subprocess.tesseract - stdout/stderr = List of available languages (7):
chi_sim
deu
eng
fra
osd
por
spa
INFO ocrmypdf._validation - reading file from standard input
DEBUG ocrmypdf.helpers - os.symlink(/tmp/ocrmypdf.io.cllpz4id/stdin, /tmp/ocrmypdf.io.cllpz4id/origin.pdf)
DEBUG ocrmypdf.builtin_plugins.tesseract_ocr - Using Tesseract OpenMP thread limit 2
INFO ocrmypdf._sync - Start processing 2 pages concurrently
INFO ocrmypdf._pipeline - 1 page already has text! - rasterizing text and running OCR anyway
INFO ocrmypdf._pipeline - 2 page already has text! - rasterizing text and running OCR anyway
DEBUG ocrmypdf.subprocess - 1 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-dInterpolateControl=-1', '-sDEVICE=jpeggray', '-dFirstPage=1', '-dLastPage=1', '-r3178.807947x3178.807947', '-o', '-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f', '/tmp/ocrmypdf.io.cllpz4id/origin.pdf']
DEBUG ocrmypdf.subprocess - 2 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-dInterpolateControl=-1', '-sDEVICE=jpeggray', '-dFirstPage=2', '-dLastPage=2', '-r400.000000x400.000000', '-o', '-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f', '/tmp/ocrmypdf.io.cllpz4id/origin.pdf']
DEBUG ocrmypdf._exec.ghostscript - 2 Rotating output by 0
DEBUG ocrmypdf.subprocess - 2 Running: ['tesseract', '-l', 'osd', '--psm', '0', '/tmp/ocrmypdf.io.cllpz4id/000002_rasterize_preview.jpg', 'stdout']
INFO ocrmypdf._pipeline - 2 page is facing ⇧, confidence 10.93 - no change
DEBUG ocrmypdf._pipeline - 2 Rasterize with png16m, rotation 0
DEBUG ocrmypdf.subprocess - 2 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-dInterpolateControl=-1', '-sDEVICE=png16m', '-dFirstPage=2', '-dLastPage=2', '-r400.000000x400.000000', '-o', '-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f', '/tmp/ocrmypdf.io.cllpz4id/origin.pdf']
DEBUG PIL.PngImagePlugin - 2 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 2 STREAM b'iCCP' 41 2354
DEBUG PIL.PngImagePlugin - 2 iCCP profile name b'default_rgb.icc'
DEBUG PIL.PngImagePlugin - 2 Compression method 0
DEBUG PIL.PngImagePlugin - 2 STREAM b'pHYs' 2407 9
DEBUG PIL.PngImagePlugin - 2 STREAM b'tEXt' 2428 31
DEBUG PIL.PngImagePlugin - 2 STREAM b'IDAT' 2471 8192
DEBUG ocrmypdf._exec.ghostscript - 2 Rotating output by 0
DEBUG ocrmypdf.subprocess - 2 Running: ['tesseract', '-l', 'deu+eng', '--psm', '2', '/tmp/ocrmypdf.io.cllpz4id/000002_rasterize.png', 'stdout']
DEBUG PIL.PngImagePlugin - 2 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 2 STREAM b'iCCP' 41 2350
DEBUG PIL.PngImagePlugin - 2 iCCP profile name b'ICC Profile'
DEBUG PIL.PngImagePlugin - 2 Compression method 0
DEBUG PIL.PngImagePlugin - 2 STREAM b'pHYs' 2403 9
DEBUG PIL.PngImagePlugin - 2 STREAM b'IDAT' 2424 65536
DEBUG PIL.PngImagePlugin - 2 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 2 STREAM b'iCCP' 41 2350
DEBUG PIL.PngImagePlugin - 2 iCCP profile name b'ICC Profile'
DEBUG PIL.PngImagePlugin - 2 Compression method 0
DEBUG PIL.PngImagePlugin - 2 STREAM b'pHYs' 2403 9
DEBUG PIL.PngImagePlugin - 2 STREAM b'IDAT' 2424 65536
DEBUG ocrmypdf._pipeline - 2 resolution (399.9992, 399.9992)
DEBUG ocrmypdf._pipeline - 2 convert
DEBUG PIL.PngImagePlugin - 2 STREAM b'IHDR' 16 13
DEBUG PIL.PngImagePlugin - 2 STREAM b'iCCP' 41 2350
DEBUG PIL.PngImagePlugin - 2 iCCP profile name b'ICC Profile'
DEBUG PIL.PngImagePlugin - 2 Compression method 0
DEBUG PIL.PngImagePlugin - 2 STREAM b'pHYs' 2403 9
DEBUG PIL.PngImagePlugin - 2 STREAM b'IDAT' 2424 65536
DEBUG img2pdf - 2 PIL format = PNG
DEBUG img2pdf - 2 imgformat = PNG
DEBUG img2pdf - 2 input dpi = 400 x 400
DEBUG img2pdf - 2 rotation = 0°
DEBUG img2pdf - 2 input colorspace = RGB
DEBUG img2pdf - 2 width x height = 3307px x 4678px
DEBUG img2pdf - 2 read_images() embeds a PNG
DEBUG ocrmypdf._pipeline - 2 convert done
DEBUG ocrmypdf.subprocess - 2 Running: ['tesseract', '-l', 'deu+eng', '-c', 'textonly_pdf=1', '/tmp/ocrmypdf.io.cllpz4id/000002_ocr.png', '/tmp/ocrmypdf.io.cllpz4id/000002_ocr_tess', 'pdf', 'txt']
ERROR ocrmypdf._sync - A decompression bomb error was encountered while executing the pipeline. Use the argument --max-image-mpixels to raise the maximum image pixel limit.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_sync.py", line 393, in run_pipeline
optimize_messages = exec_concurrent(context, executor)
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_sync.py", line 280, in exec_concurrent
executor(
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_concurrent.py", line 87, in __call__
self._execute(
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/builtin_plugins/concurrency.py", line 141, in _execute
result = future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_sync.py", line 191, in exec_page_sync
rasterize_preview_out = rasterize_preview(page_context.origin, page_context)
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_pipeline.py", line 345, in rasterize_preview
page_context.plugin_manager.hook.rasterize_pdf_page(
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 265, in __call__
return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 80, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 60, in _multicall
return outcome.get_result()
File "/usr/local/lib/python3.10/dist-packages/pluggy/_result.py", line 60, in get_result
raise ex[1].with_traceback(ex[2])
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 39, in _multicall
res = hook_impl.function(*args)
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/builtin_plugins/ghostscript.py", line 67, in rasterize_pdf_page
ghostscript.rasterize_pdf(
File "/usr/local/lib/python3.10/dist-packages/ocrmypdf/_exec/ghostscript.py", line 121, in rasterize_pdf
with Image.open(BytesIO(p.stdout)) as im:
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3133, in open
im = _open_core(fp, filename, prefix, formats)
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3120, in _open_core
_decompression_bomb_check(im.size)
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3029, in _decompression_bomb_check
raise DecompressionBombError(
PIL.Image.DecompressionBombError: Image size (977066020 pixels) exceeds limit of 500000000 pixels, could be decompression bomb DOS attack.
← OCRmyPDF-LOG-END
[runtime up to now: 00:00:44]
┖➜ failed! (target file is empty or not available)
ERROR-Directory [/volume1/DMS/_INPUT/ERRORFILES] will be created!
Is the file protected somehow, or why can't it be processed?
Thanks
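
Going by the traceback, this looks less like file protection and more like a size limit: page 1 is rasterized at roughly 3179 dpi (page 2 at 400 dpi), and the resulting image exceeds the pixel limit. The error message itself names `--max-image-mpixels` as the argument to raise. Below is a minimal sketch that reads the two numbers out of the error line and derives a candidate value. The helper names `parse_bomb_error` and `min_mpixels` are made up for illustration, and appending the option to the synOCR OCR parameters is an assumption, not something the log confirms.

```python
import math
import re

# Error line copied verbatim from the log above.
ERROR_LINE = (
    "PIL.Image.DecompressionBombError: Image size (977066020 pixels) "
    "exceeds limit of 500000000 pixels, could be decompression bomb DOS attack."
)


def parse_bomb_error(line):
    """Return (image_pixels, limit_pixels) from a DecompressionBombError line.

    Hypothetical helper, written only for this example.
    """
    match = re.search(
        r"Image size \((\d+) pixels\) exceeds limit of (\d+) pixels", line
    )
    if match is None:
        raise ValueError("not a DecompressionBombError line")
    return int(match.group(1)), int(match.group(2))


def min_mpixels(image_pixels):
    """Smallest whole megapixel count covering the image (1 MP = 1,000,000 px)."""
    return math.ceil(image_pixels / 1_000_000)


image_pixels, limit_pixels = parse_bomb_error(ERROR_LINE)
needed = min_mpixels(image_pixels)

print(f"image: {image_pixels / 1e6:.0f} MP, limit: {limit_pixels / 1e6:.0f} MP")
# Assumption: the extra option can simply be appended to the existing parameters.
print(f"candidate OCR parameters: -frd -l deu+eng --max-image-mpixels {needed}")
```

The computed value is deliberately on the safe side. Pillow raises this error only at twice its configured pixel maximum, so a smaller value may already be enough. Raising the limit also means the 977-megapixel page really gets rendered, which may need a lot of RAM on the NAS.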