-----------------------------------
| ==> installation info <== |
-----------------------------------
synOCR-user: synOCR
synOCR-user is admin: yes
synOCR-version: 1.4.5
Architecture: x86_64
DSM-build: 72806
Device: 718plus (0199208118)
current Profil: XXX
monitor is running?: yes
DB-version: 9
used image (created): jbarlow83/ocrmypdf:latest (2025-01-28T07:59:25)
document author:
used ocr-parameter (raw): -r -d -f -l deu+eng --pdf-renderer hocr
ocropt_array: -r -d -f -l deu+eng --pdf-renderer hocr
search prefix:
replace search prefix: yes
renaming syntax: §tit
Symbol for tag marking: #
target file handling: useCatDir
Document split pattern:
split page handling: discard
delete blank pages:
threshold black/white:
threshold black pixels:
clean up spaces: false
Date search method: use Python
date found order: firstfound
source for filedate: source
ignored dates by search:
date range in past: 0 [absolute: 0]
date range in future: 0 [absolute: 0]
Docker test: OK
DSM notify to user: XXX
apprise notify service:
apprise attachment: false
notify language: ger
Loglevel: normal
max. count of logfiles: 10
rotate backupfiles after: 1 days
Source directory: /volume1/synOCR/XXX/
Target directory: /volume1/Daten/XXX/
Files are deleted immediately! / No valid directory [/]
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ---------------------------------- ●
● | ==> RUN THE FUNCTIONS <== | ●
● ---------------------------------- ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
-----------------------------------------------------------------------------------
| check the python3 installation and the necessary modules: |
-----------------------------------------------------------------------------------
prepare_python: OK
Target temp directory: /tmp/tmp.Lfwpb85NOt
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● STEP 1 - RUN OCR / SPLIT FILES, IF NEEDED: ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ scan.pdf
temp. target file: /tmp/tmp.Lfwpb85NOt/step1_tmp_1738069202/scan.pdf
-----------------------------------------------------------------------------------
| processing PDF @ OCRmyPDF: |
-----------------------------------------------------------------------------------
➜ OCRmyPDF-LOG:
reading file from standard input
1 page already has text! - rasterizing text and running OCR anyway
1 page is facing ⇧, confidence 18.19 - rotation appears correct
Postprocessing...
Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
Image optimization ratio: 1.47 savings: 32.0%
Total file size ratio: 0.74 savings: -35.5%
Output sent to stdout
← OCRmyPDF-LOG-END
target file (OK): /tmp/tmp.Lfwpb85NOt/step1_tmp_1738069202/scan.pdf
no split pattern defined or splitting not possible
-----------------------------------------------------------------------------------
| handle source file: |
-----------------------------------------------------------------------------------
➜ delete source file (scan.pdf)
removed directory '/tmp/tmp.Lfwpb85NOt/step1_tmp_1738069202/'
Stats:
runtime last file: ➜ 00:00:40
runtime 1st step (all files): ➜ 00:00:40
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● STEP 2 - SEARCH TAGS / RENAME / SORT: ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ scan.pdf
-----------------------------------------------------------------------------------
| search tags in ocr text: |
-----------------------------------------------------------------------------------
no tags defined
-----------------------------------------------------------------------------------
| search for a valid date in ocr text: |
-----------------------------------------------------------------------------------
2025-01-28 14:00:44,976 - Date scanning started
2025-01-28 14:00:44,976 - Version: 1.04
2025-01-28 14:00:44,976 - Parameter minYear = 0
2025-01-28 14:00:44,976 - Parameter maxYear = 0
2025-01-28 14:00:44,976 - Parameter searchnearest = off
2025-01-28 14:00:44,976 - set searchnearest = off
2025-01-28 14:00:44,977 - Parameter fileWithTextFindings = /tmp/tmp.Lfwpb85NOt/step2_tmp_1738069243//synOCR.txt
2025-01-28 14:00:44,977 - Start searching for alphanumerical and numerical dates......
2025-01-28 14:00:54,583 - finish searching for alphanumerical and numerical dates......
2025-01-28 14:00:54,584 - found 1 dates
2025-01-28 14:00:54,584 - found date 2025-01-14
2025-01-28 14:00:54,584 - Date scanning ended
Dates found: 1
check date ([yy]yy mm dd): 2025-01-14
➜ valid
day: 14
month:01
year: 2025
-----------------------------------------------------------------------------------
| rename and sort to target folder: |
-----------------------------------------------------------------------------------
➜ renaming:
apply renaming syntax ➜ scan
➜ insert metadata (use python pikepdf)
used metadata:
➜ '/Author': '',
➜ '/Keywords': '',
➜ '/CreationDate': 'D:20250114',
➜ '/CreatorTool': 'synOCR 1.4.5'
2025-01-28 14:00:55,231 - INFO - HandlePdf started
2025-01-28 14:00:55,232 - INFO - Version: 0.2
2025-01-28 14:00:55,232 - INFO - Task=metadata
2025-01-28 14:00:55,235 - INFO - >>>>> write meta_data started
2025-01-28 14:00:55,249 - INFO - save pdf to file (/tmp/tmp.Lfwpb85NOt/step2_tmp_1738069243/temp_scan_1738069243.pdf_meta.pdf)
empty
0
target file: scan.pdf
-----------------------------------------------------------------------------------
| adjusts the attributes of the target file: |
-----------------------------------------------------------------------------------
➜ Adapt file date (Source: Source file)
-----------------------------------------------------------------------------------
| final tasks: |
-----------------------------------------------------------------------------------
INFO: Notify for apprise not defined ...
run user defined post scripts:
Stats:
runtime last file: ➜ 00:00:12
pagecount last file: ➜ 1
file count profile : ➜ (profile XXX) - 229 PDF's / 604 Pages processed up to now
file count total: ➜ 386 PDF's / 938 Pages processed up to now since 2022-05-05
cleanup:
delete tmp-files ...
removed '/tmp/tmp.Lfwpb85NOt/scan.pdf'
removed '/tmp/tmp.Lfwpb85NOt/step2_tmp_1738069243/synOCR.txt'
removed '/tmp/tmp.Lfwpb85NOt/step2_tmp_1738069243/synOCR_filename.txt'
removed directory '/tmp/tmp.Lfwpb85NOt/step2_tmp_1738069243/'
removed directory '/tmp/tmp.Lfwpb85NOt'
purge log files ...
delete 0 log files ( > 10 files)
delete 0 search files ( > 10 files)
purge backup files ...
delete 0 backup files ( > 1 days)
runtime all files: ➜ 00:00:53
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ---------------------------------- ●
● | ==> END OF FUNCTIONS <== | ●
● ---------------------------------- ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●