Patrons of the Sacred Art

Greg Marcus

stick-to-itiveness

That is such a nonsense word. That it's in the mainstream makes me feel like we're literally living in Idiocracy. There are much better words: resolve, perseverance, persistence, determination, etc. But I digress.

After about 5 days of work, I finally have the complete English translation of the Zohar parsed from PDF and the verses inserted into a database: 17,000+ verses. It would have been easy if the formatting had been standardized. The text is a combination of commentary and verses, each numbered individually. I had the parsing figured out in no time; it was the data that was bad. So I still pretty much had to go through each raw file fixing up the markup so my counts came out correctly. They mostly did their numbering with "numbered lists", so there are quite a few mistakes in the numbering too. Basically, while I haven't exactly read the whole thing, the whole Zohar has scrolled in front of my eyes.
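For anyone attempting something similar, here is a minimal sketch of the kind of numbering check and database insert described above. It is not the code I actually used: the function names, the `verses` table, and its columns are all hypothetical, and it assumes each section has already been parsed into (number, text) pairs.

```python
import sqlite3


def find_numbering_breaks(numbers):
    """Return (index, expected, found) wherever the sequence does not step by one."""
    breaks = []
    for i in range(1, len(numbers)):
        expected = numbers[i - 1] + 1
        if numbers[i] != expected:
            breaks.append((i, expected, numbers[i]))
    return breaks


def store_verses(conn, section, verses):
    """Insert (number, text) pairs into a hypothetical `verses` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS verses ("
        "section TEXT, number INTEGER, body TEXT, "
        "PRIMARY KEY (section, number))"
    )
    conn.executemany(
        "INSERT INTO verses (section, number, body) VALUES (?, ?, ?)",
        [(section, n, text) for n, text in verses],
    )
    conn.commit()
```

Running `find_numbering_breaks` over each raw file before inserting would flag skipped or repeated numbers, and the primary key on (section, number) makes the database reject any duplicate that slips through.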

The Idra Raba section was the worst. It runs from 1 to 134, then starts over in the middle with a section numbered 1-364, and then the first section continues from 135 to 199. But it's done now. I have some fixing up to do, and then I have to build a front end.
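The interleaved numbering could be untangled along these lines. This is only a sketch under the assumption that there is exactly one inner run that restarts at 1 and that both runs are otherwise gap-free; the function name and the "outer"/"inner" labels are made up for illustration.

```python
def split_nested_numbering(numbers):
    """Label each number 'outer' or 'inner', assuming one inner run restarting at 1."""
    labels = []
    outer_next = 1      # next number expected in the outer section
    inner_next = None   # next number expected in the inner section, once it starts
    current = "outer"
    for n in numbers:
        if current == "outer" and n == outer_next:
            outer_next += 1
        elif current == "outer" and n == 1 and inner_next is None:
            # The count restarted: the inner section begins here.
            current = "inner"
            inner_next = 2
        elif current == "inner" and n == inner_next:
            inner_next += 1
        elif current == "inner" and n == outer_next:
            # The outer section resumes where it left off.
            current = "outer"
            outer_next += 1
        else:
            raise ValueError(f"unexpected number {n}")
        labels.append(current)
    return labels
```

Feeding it the sequence 1-134, 1-364, 135-199 labels the first 134 and last 65 entries as "outer" and the 364 in between as "inner". Anything that fits neither run raises an error, which is where the hand-fixing would come in.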

As far as the Windows indexing goes, I've abandoned the open-source "DocFetcher". It failed to index completely, giving me errors I don't understand on certain files, and then the indexing got stuck. It was supposedly indexing the same file all day, so I guess it had crashed. So I've deleted it. Copernic Desktop Search seems to be working pretty well, though.

I'm going to set up a VM to install and test that dtSearch, and Systran 6 while I'm at it.
