Extract date from OCR'd PDF


Re: Extract date from OCR'd PDF Thu Oct 31, 2013 3:54 pm • by Mr_Noodle
See my question above.

Re: Extract date from OCR'd PDF Fri Feb 21, 2014 9:52 am • by Nestorito
I modified the script to file several receipts, but in one case the date format is:
Code:
dd.mm.yyyy

The problem is that I don't know how to tell the script that the separator is a "."

Here's the line causing problems:
Code:
x=`egrep -o 'entro il ..\/..\/....' "$path"/"_extract.txt" | awk '{print $3}'`
Instead of the slash (/), I need to use "." as the delimiter.

Can somebody help? :P
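
One possible way to handle the dot, offered as a sketch rather than a tested fix for the original script: in a regular expression an unescaped "." matches any character, so a literal dot has to be written as "\.". The function name extract_dotted_date and the sample text below are made up for illustration; the file argument stands in for "$path"/"_extract.txt".

```shell
# Hypothetical helper: print the first dd.mm.yyyy date that follows "entro il".
extract_dotted_date() {
  # \. matches a literal dot; [0-9]{2} matches exactly two digits.
  # To accept either separator, [./] could be used in place of \.
  grep -E -o 'entro il [0-9]{2}\.[0-9]{2}\.[0-9]{4}' "$1" | awk '{print $3}' | head -n 1
}

# Example with a throwaway file (the sample line is invented).
sample=$(mktemp)
printf 'Pagare entro il 21.02.2014 presso la banca\n' > "$sample"
x=$(extract_dotted_date "$sample")
echo "$x"
rm -f "$sample"
```

Matching digits instead of ".." also keeps the pattern from picking up stray OCR characters around the date.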

Re: Extract date from OCR'd PDF Tue Aug 05, 2014 6:16 am • by lucaberta
sussdorff wrote: Here is a Ruby script which I have so far used successfully to extract the date from the OCR'd PDF and store that date as the modification date. As you might guess from the naming of the months, I am from Germany, but US dates work as well.


Thanks for the script, and for the idea of resorting to a Ruby script to solve this issue. I very much like the flexibility you have built into it, with the possibility to capture dates in different formats. In my case, though, the script captured my birthday rather than the invoice date, so I will need to hack it a bit!

For those who wonder how to use Ruby on OS X: it turns out that an additional module called osx/cocoa is needed, otherwise errors like this will appear:

Code:
/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- osx/cocoa (LoadError)
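
A quick way to see whether the installed Ruby can load the module, before running the full script, might look like the sketch below. The function name ruby_module_status is hypothetical; it only wraps a plain "require" call and assumes the ruby found on the PATH is the one the script will use.

```shell
# Hypothetical helper: report whether the default ruby can load a library.
# Prints "ok", "missing", or "no-ruby".
ruby_module_status() {
  if ! command -v ruby >/dev/null 2>&1; then
    echo "no-ruby"
  elif ruby -e 'require ARGV[0]' "$1" >/dev/null 2>&1; then
    # require succeeded, so the library is on Ruby's load path.
    echo "ok"
  else
    # require raised an error (typically LoadError) and ruby exited non-zero.
    echo "missing"
  fi
}

ruby_module_status osx/cocoa
```

If this prints "missing", the script would be expected to fail with the LoadError shown above.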


The module is known officially as RubyCocoa:

https://en.wikipedia.org/wiki/RubyCocoa

The project is hosted on SourceForge, and a very recent release, 1.2, dated 2014-07-27, is now available. I have downloaded the Ruby 2.0 version and it runs nicely on OS X 10.9 Mavericks.

http://rubycocoa.sourceforge.net/

Ciao, Luca
