-----------------------------------
| ==> installation info <== |
-----------------------------------
synOCR-user: synOCR
synOCR-user is admin: yes
synOCR-version: 1.4.5
Architecture: x86_64
DSM-build: 69057
Device: 1821plus (2072594204)
current Profil: Custom
monitor is running?: yes
DB-version: 9
used image (created): jbarlow83/ocrmypdf:v12.7.2 (2021-11-04T21:53:21)
document author:
used ocr-parameter (raw): -srd -l deu+eng
ocropt_array: -srd -l deu+eng
search prefix:
replace search prefix: yes
renaming syntax: §yocr-§mocr-§docr_§tag_§tit
Symbol for tag marking: #
target file handling: useCatDir
Document split pattern: SYNOCR-SEPARATOR-SHEET
split page handling: discard
delete blank pages:
threshold black/white:
threshold black pixels:
clean up spaces: false
Date search method: use Python
date found order: firstfound
source for filedate: ocr
ignored dates by search: ;
date range in past: 0 [absolute: 0]
date range in future: 0 [absolute: 0]
Docker test: OK
DSM notify to user: admin
apprise notify service:
apprise attachment: false
notify language: ger
Loglevel: normal
max. count of logfiles: 10
rotate backupfiles after: (purge backup deactivated)
Source directory: /volume1/HP Scans/
Target directory: /volume1/HP Scans/Output/
BackUp directory: /volume1/HP Scans/Backup/
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ---------------------------------- ●
● | ==> RUN THE FUNCTIONS <== | ●
● ---------------------------------- ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
-----------------------------------------------------------------------------------
| check the python3 installation and the necessary modules: |
-----------------------------------------------------------------------------------
prepare_python: OK
-----------------------------------------------------------------------------------
| convert images to pdf |
-----------------------------------------------------------------------------------
nothing to do ...
Target temp directory: /tmp/tmp.o5dmeixmdH
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● STEP 1 - RUN OCR / SPLIT FILES, IF NEEDED: ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ 26062024personal201605.pdf
temp. target file: /tmp/tmp.o5dmeixmdH/step1_tmp_1719483954/26062024personal201605.pdf
-----------------------------------------------------------------------------------
| processing PDF @ OCRmyPDF: |
-----------------------------------------------------------------------------------
➜ OCRmyPDF-LOG:
reading file from standard input
Start processing 2 pages concurrently
2 [tesseract] Too few characters. Skipping this page
2 [tesseract] Error during processing.
2 with existing rotation ⇩, page is facing ⇧, confidence 0.00 - no change
1 page is facing ⇧, confidence 13.16 - no change
2 Warning in pixFindSkewSweepAndSearchScorePivot: max found at sweep edge
Postprocessing...
Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
Optimize ratio: 1.00 savings: 0.0%
Output sent to stdout
← OCRmyPDF-LOG-END
target file (OK): /tmp/tmp.o5dmeixmdH/step1_tmp_1719483954/26062024personal201605.pdf
-----------------------------------------------------------------------------------
| document split handling: |
-----------------------------------------------------------------------------------
splitpage count: 0
no separator sheet found, or number of pages too small
-----------------------------------------------------------------------------------
| handle source file: |
-----------------------------------------------------------------------------------
➜ backup source file to: /volume1/HP Scans/Backup/26062024personal201605.pdf
removed directory '/tmp/tmp.o5dmeixmdH/step1_tmp_1719483954/'
Stats:
runtime last file: ➜ 00:00:16
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ 26062024personal201650.pdf
temp. target file: /tmp/tmp.o5dmeixmdH/step1_tmp_1719483970/26062024personal201650.pdf
-----------------------------------------------------------------------------------
| processing PDF @ OCRmyPDF: |
-----------------------------------------------------------------------------------
➜ OCRmyPDF-LOG:
reading file from standard input
1 page is facing ⇧, confidence 13.57 - no change
Postprocessing...
Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
Optimize ratio: 1.00 savings: 0.0%
Output sent to stdout
← OCRmyPDF-LOG-END
target file (OK): /tmp/tmp.o5dmeixmdH/step1_tmp_1719483970/26062024personal201650.pdf
-----------------------------------------------------------------------------------
| document split handling: |
-----------------------------------------------------------------------------------
splitpage count: 0
no separator sheet found, or number of pages too small
-----------------------------------------------------------------------------------
| handle source file: |
-----------------------------------------------------------------------------------
➜ backup source file to: /volume1/HP Scans/Backup/26062024personal201650.pdf
removed directory '/tmp/tmp.o5dmeixmdH/step1_tmp_1719483970/'
Stats:
runtime last file: ➜ 00:00:17
runtime 1st step (all files): ➜ 00:00:33
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● STEP 2 - SEARCH TAGS / RENAME / SORT: ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ 26062024personal201650.pdf
-----------------------------------------------------------------------------------
| search tags in ocr text: |
-----------------------------------------------------------------------------------
source for tags is yaml based tag rule file [/volume1/HP Scans/synOCRrules.txt]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/__init__.py", line 125, in safe_load
return load(stream, SafeLoader)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
return loader.get_single_data()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/constructor.py", line 49, in get_single_data
node = self.get_single_node()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 36, in get_single_node
document = self.compose_document()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 55, in compose_document
node = self.compose_node(None, None)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 82, in compose_node
node = self.compose_sequence_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 111, in compose_sequence_node
node.value.append(self.compose_node(node, index))
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 127, in compose_mapping_node
while not self.check_event(MappingEndEvent):
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/parser.py", line 98, in check_event
self.current_event = self.state()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/parser.py", line 438, in parse_block_mapping_key
raise ParserError("while parsing a block mapping", self.marks[-1],
yaml.parser.ParserError: while parsing a block mapping
in "<unicode string>", line 18, column 7:
- newname: {date}_{time}_{bank_nam ...
^
expected <block end>, but found '<scalar>'
in "<unicode string>", line 18, column 22:
- newname: {date}_{time}_{bank_name}_{subject}.pdf
^
ERROR at line 448: tag_rule_content=$( python3 -c 'import sys, yaml, json; print(json.dumps(yaml.safe_load(sys.stdin.read()), indent=2, sort_keys=False))' < "${taglisttmp}")
ERROR - YAML-check failed!ERROR at line 2456: return 1
-----------------------------------------------------------------------------------
| search for a valid date in ocr text: |
-----------------------------------------------------------------------------------
2024-06-27 12:26:29,162 - Date scanning started
2024-06-27 12:26:29,162 - Version: 1.04
2024-06-27 12:26:29,162 - Parameter minYear = 0
2024-06-27 12:26:29,162 - Parameter maxYear = 0
2024-06-27 12:26:29,162 - Parameter searchnearest = off
2024-06-27 12:26:29,162 - set searchnearest = off
2024-06-27 12:26:29,162 - Parameter fileWithTextFindings = /tmp/tmp.o5dmeixmdH/step2_tmp_1719483987//synOCR.txt
2024-06-27 12:26:29,163 - Parameter dateBlackLIst = ;
2024-06-27 12:26:29,163 - start checking blacklist
2024-06-27 12:26:29,163 - end checking blacklist
2024-06-27 12:26:29,164 - Start searching for alphanumerical and numerical dates......
2024-06-27 12:26:34,898 - finish searching for alphanumerical and numerical dates......
2024-06-27 12:26:34,898 - found 4 dates
2024-06-27 12:26:34,898 - found date 2024-06-18
2024-06-27 12:26:34,898 - Date scanning ended
Dates found: 1
check date ([yy]yy mm dd): 2024-06-18
➜ valid
day: 18
month:06
year: 2024
-----------------------------------------------------------------------------------
| rename and sort to target folder: |
-----------------------------------------------------------------------------------
➜ renaming:
apply renaming syntax ➜ 2024-06-18__26062024personal201650
➜ insert metadata (use python pikepdf)
used metadata:
➜ '/Author': '',
➜ '/Keywords': '',
➜ '/CreationDate': 'D:20240618',
➜ '/CreatorTool': 'synOCR 1.4.5'
2024-06-27 12:26:35,284 - INFO - HandlePdf started
2024-06-27 12:26:35,284 - INFO - Version: 0.2
2024-06-27 12:26:35,284 - INFO - Task=metadata
2024-06-27 12:26:35,286 - INFO - >>>>> write meta_data started
2024-06-27 12:26:35,293 - INFO - save pdf to file (/tmp/tmp.o5dmeixmdH/step2_tmp_1719483987/temp_26062024personal201650_1719483987.pdf_meta.pdf)
empty
0
target file: 2024-06-18__26062024personal201650.pdf
-----------------------------------------------------------------------------------
| adjusts the attributes of the target file: |
-----------------------------------------------------------------------------------
➜ Adapt file date (Source: OCR)
-----------------------------------------------------------------------------------
| final tasks: |
-----------------------------------------------------------------------------------
INFO: Notify for apprise not defined ...
run user defined post scripts:
Stats:
runtime last file: ➜ 00:00:08
pagecount last file: ➜ 1
file count profile : ➜ (profile Custom) - 5 PDF's / 7 Pages processed up to now
file count total: ➜ 18 PDF's / 31 Pages processed up to now since 2024-06-26
cleanup:
delete tmp-files ...
removed '/tmp/tmp.o5dmeixmdH/26062024personal201650.pdf'
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
CURRENT FILE: ➜ 26062024personal201605.pdf
-----------------------------------------------------------------------------------
| search tags in ocr text: |
-----------------------------------------------------------------------------------
source for tags is yaml based tag rule file [/volume1/HP Scans/synOCRrules.txt]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/__init__.py", line 125, in safe_load
return load(stream, SafeLoader)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
return loader.get_single_data()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/constructor.py", line 49, in get_single_data
node = self.get_single_node()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 36, in get_single_node
document = self.compose_document()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 55, in compose_document
node = self.compose_node(None, None)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 82, in compose_node
node = self.compose_sequence_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 111, in compose_sequence_node
node.value.append(self.compose_node(node, index))
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/composer.py", line 127, in compose_mapping_node
while not self.check_event(MappingEndEvent):
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/parser.py", line 98, in check_event
self.current_event = self.state()
File "/usr/syno/synoman/webman/3rdparty/synOCR/python3_env/lib/python3.8/site-packages/yaml/parser.py", line 438, in parse_block_mapping_key
raise ParserError("while parsing a block mapping", self.marks[-1],
yaml.parser.ParserError: while parsing a block mapping
in "<unicode string>", line 18, column 7:
- newname: {date}_{time}_{bank_nam ...
^
expected <block end>, but found '<scalar>'
in "<unicode string>", line 18, column 22:
- newname: {date}_{time}_{bank_name}_{subject}.pdf
^
ERROR at line 448: tag_rule_content=$( python3 -c 'import sys, yaml, json; print(json.dumps(yaml.safe_load(sys.stdin.read()), indent=2, sort_keys=False))' < "${taglisttmp}")
ERROR - YAML-check failed!ERROR at line 2456: return 1
-----------------------------------------------------------------------------------
| search for a valid date in ocr text: |
-----------------------------------------------------------------------------------
2024-06-27 12:26:37,072 - Date scanning started
2024-06-27 12:26:37,072 - Version: 1.04
2024-06-27 12:26:37,072 - Parameter minYear = 0
2024-06-27 12:26:37,072 - Parameter maxYear = 0
2024-06-27 12:26:37,072 - Parameter searchnearest = off
2024-06-27 12:26:37,073 - set searchnearest = off
2024-06-27 12:26:37,073 - Parameter fileWithTextFindings = /tmp/tmp.o5dmeixmdH/step2_tmp_1719483995//synOCR.txt
2024-06-27 12:26:37,073 - Parameter dateBlackLIst = ;
2024-06-27 12:26:37,073 - start checking blacklist
2024-06-27 12:26:37,074 - end checking blacklist
2024-06-27 12:26:37,074 - Start searching for alphanumerical and numerical dates......
2024-06-27 12:26:42,817 - finish searching for alphanumerical and numerical dates......
2024-06-27 12:26:42,817 - found 2 dates
2024-06-27 12:26:42,817 - found date 2024-06-03
2024-06-27 12:26:42,817 - Date scanning ended
Dates found: 1
check date ([yy]yy mm dd): 2024-06-03
➜ valid
day: 03
month:06
year: 2024
-----------------------------------------------------------------------------------
| rename and sort to target folder: |
-----------------------------------------------------------------------------------
➜ renaming:
apply renaming syntax ➜ 2024-06-03__26062024personal201605
➜ insert metadata (use python pikepdf)
used metadata:
➜ '/Author': '',
➜ '/Keywords': '',
➜ '/CreationDate': 'D:20240603',
➜ '/CreatorTool': 'synOCR 1.4.5'
2024-06-27 12:26:43,200 - INFO - HandlePdf started
2024-06-27 12:26:43,201 - INFO - Version: 0.2
2024-06-27 12:26:43,201 - INFO - Task=metadata
2024-06-27 12:26:43,203 - INFO - >>>>> write meta_data started
2024-06-27 12:26:43,210 - INFO - save pdf to file (/tmp/tmp.o5dmeixmdH/step2_tmp_1719483995/temp_26062024personal201605_1719483995.pdf_meta.pdf)
empty
0
target file: 2024-06-03__26062024personal201605.pdf
-----------------------------------------------------------------------------------
| adjusts the attributes of the target file: |
-----------------------------------------------------------------------------------
➜ Adapt file date (Source: OCR)
-----------------------------------------------------------------------------------
| final tasks: |
-----------------------------------------------------------------------------------
INFO: Notify for apprise not defined ...
run user defined post scripts:
Stats:
runtime last file: ➜ 00:00:08
pagecount last file: ➜ 2
file count profile : ➜ (profile Custom) - 6 PDF's / 9 Pages processed up to now
file count total: ➜ 19 PDF's / 33 Pages processed up to now since 2024-06-26
cleanup:
delete tmp-files ...
removed '/tmp/tmp.o5dmeixmdH/26062024personal201605.pdf'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483995/tmprulefile.txt'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483995/synOCR.txt'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483995/synOCR_filename.txt'
removed directory '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483995/'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483987/tmprulefile.txt'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483987/synOCR.txt'
removed '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483987/synOCR_filename.txt'
removed directory '/tmp/tmp.o5dmeixmdH/step2_tmp_1719483987'
removed directory '/tmp/tmp.o5dmeixmdH'
purge log files ...
delete 0 log files ( > 10 files)
delete 0 search files ( > 10 files)
purge backup deactivated!
runtime all files: ➜ 00:00:49
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
● ---------------------------------- ●
● | ==> END OF FUNCTIONS <== | ●
● ---------------------------------- ●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●