Formatted Text Output?
When saving the OCR result as plain text (.txt), will the formatting be preserved? This is important, for example, if amounts need to line up under a certain column name. Like this (I had to add the --- because posting the question removes the extra spaces!):
Date--------Description---------------Credit------------Debit
01/12-------text------------------------100.00
01/14-------text-----------------------------------------2,392.00
If the formatting is removed it could look like:
Date Description Credit Debit
01/12 text 100.00
01/14 text 2,392.00
Which makes it impossible to tell debit from credit.
Re: Formatted Text Output?
Hi Dwg,
In our latest minor release we have improved text formatting when extracting text after OCR and saving the results as .txt.
This feature was implemented and greatly improved a few weeks ago.
I suggest you try this. Feel free to provide any document you are having trouble with and we'll take a look at it and fix it if necessary.
Regards,
Re: Formatted Text Output?
Can you check this example PDF doc? It is important to keep the amounts under the correct columns...
I think I would like to evaluate v14 if this looks good. Thanks
Attachment: exampleB_good.pdf (19.04 KiB)
Re: Formatted Text Output?
Hi Dwg,
Currently this is implemented, but improvements can still be made. It is quite complex to implement because it needs to take into account the font style as well as the spaces.
This is currently how our OCR demo can render this. See attachments.
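To illustrate why the spacing matters, here is a rough sketch of one common way to keep columns aligned in plain text. It is not our product's actual implementation. It assumes the OCR step returns, for each line, a list of words with their left x-coordinate, and that a single average character width (the hypothetical `char_width` parameter) is good enough. Real documents with proportional fonts or right-aligned amounts need more care than this.

```python
from typing import List, Tuple

# Assumed input shape: (word text, left x-coordinate in points).
Word = Tuple[str, float]


def render_line(words: List[Word], char_width: float) -> str:
    """Place each word at the character column that matches its x position."""
    out = ""
    for text, x in sorted(words, key=lambda w: w[1]):
        col = round(x / char_width)
        # If two words would collide, keep at least one space between them.
        if out and col <= len(out):
            col = len(out) + 1
        out = out.ljust(col) + text
    return out


def render_page(lines: List[List[Word]], char_width: float = 6.0) -> str:
    """Render every OCR line with the same character grid."""
    return "\n".join(render_line(line, char_width) for line in lines)


# Made-up coordinates for the statement in the question.
rows = [
    [("Date", 0), ("Description", 72), ("Credit", 240), ("Debit", 348)],
    [("01/12", 0), ("text", 72), ("100.00", 240)],
    [("01/14", 0), ("text", 72), ("2,392.00", 348)],
]
page = render_page(rows)
print(page)
```

Because every line is mapped onto the same character grid, the 100.00 starts under Credit and the 2,392.00 starts under Debit, even though the second and third rows have an empty column. The output only lines up when viewed in a fixed-width font.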
Regards