I had a lovely experience today with Student Stores (no sarcasm here). I went in and asked for a copy of the Spring 2004 book listing (which I think is incomplete, those bastards!). They said it would cost $0.05 per printed page, and that they couldn't give it to me in a digital format.

No problem.

I got all of it printed, then headed over to the Undergrad Library to OCR the data at the Collaboratory (WHICH HAS FILM SCANNERS, WHY DID NO ONE TELL ME THIS?!). I OCRed all 63 pages into Word, then converted the Word docs to HTML.

I then came back to the dorm and wrote a quick PHP script that parsed all 8 megs of text data (MS Word generates a LOT of crap HTML) and extracted all the book data!!!!
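
For the curious, here's roughly what that kind of script does. This is not the PHP I actually ran, just a minimal Python sketch of the same idea: throw away the Word markup, then walk the remaining lines. The listing layout is an assumption made up for illustration (a course line like "COMP 110", followed by book lines shaped like "Title / Author / ISBN"), and the names strip_word_html, extract_books, and COURSE_RE are hypothetical.

```python
import html
import re

# ASSUMPTION: a course heading is a line like "COMP 110". The real listing's
# layout may differ; adjust this pattern to match it.
COURSE_RE = re.compile(r"^[A-Z]{3,4} \d{2,3}[A-Z]?$")


def strip_word_html(raw):
    """Reduce Word-exported HTML to a list of non-empty plain-text lines."""
    # Drop <style>/<script> blocks and comments, where most of the bloat lives.
    raw = re.sub(r"(?is)<(style|script)\b.*?</\1>", "", raw)
    raw = re.sub(r"(?s)<!--.*?-->", "", raw)
    # Word wraps text mid-paragraph, so flatten all whitespace first...
    raw = re.sub(r"\s+", " ", raw)
    # ...then let paragraph ends and <br> tags define the real line breaks.
    raw = re.sub(r"(?i)</p\s*>|<br\s*/?>", "\n", raw)
    # Remove every remaining tag and decode entities such as &nbsp;.
    raw = re.sub(r"<[^>]+>", "", raw)
    text = html.unescape(raw).replace("\xa0", " ")
    lines = (re.sub(r" +", " ", line).strip() for line in text.split("\n"))
    return [line for line in lines if line]


def extract_books(lines):
    """Group book lines under the most recent course heading."""
    books = {}
    course = None
    for line in lines:
        if COURSE_RE.match(line):
            course = line
            books.setdefault(course, [])
        elif course and line.count(" / ") == 2:
            # ASSUMPTION: book lines look like "Title / Author / ISBN".
            title, author, isbn = line.split(" / ")
            books[course].append(
                {"title": title, "author": author, "isbn": isbn}
            )
    return books
```

Calling extract_books(strip_word_html(page)) on the contents of each converted page gives a dictionary keyed by course. Regex tag-stripping is crude, but for a one-off scrape of OCR output it is usually good enough.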

So now I have all the textbooks that go with each class. W00t!
Currently listening to: Yellowcard's View from Heaven
Posted by roy on November 11, 2003 at 04:51 PM
