So, I’m probably 60% complete, in terms of lines of code, on a script to parse data out of a bunch of HTML pages, but probably 90% complete in terms of time spent. I’ve figured out the basic form of the regular expressions I need, so now I just have to write more of them. Easy as pie!
So, of course, I want to change my approach now to something more “elegant.” Rather than creating a regexp for each piece of data I want to extract, I want to create a more generic parser that will extract all the fields in the documents and let me easily query for the pieces I want. It will be cool to figure out how to do it, even suboptimally. It’s also going to take me probably another day or, more likely, two.
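For the curious, the "generic" version might look roughly like the sketch below. It is only an illustration under a big assumption: that each page keeps its data in table rows of the form `<th>label</th><td>value</td>`, which is a made-up layout, not necessarily what the real pages look like. The names `FieldParser` and `extract_fields` are hypothetical too. It uses Python's standard-library `html.parser` to collect every label/value pair into a dict, so querying a field becomes a plain lookup instead of one regexp per field.

```python
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collect every <th>label</th><td>value</td> pair into a dict.

    Assumes a hypothetical page layout where each field is a table row
    with the label in a <th> and the value in the following <td>.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # "th" or "td" while inside one, else None
        self._label = None  # text of the most recent <th>
        self._buf = []      # text chunks seen inside the current cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of something else (or of nested markup)
        # Join the chunks and collapse runs of whitespace.
        text = " ".join("".join(self._buf).split())
        if tag == "th":
            self._label = text
        elif self._label is not None:
            self.fields[self._label] = text
            self._label = None
        self._tag = None


def extract_fields(html):
    """Return a dict of all label/value fields found in the HTML string."""
    parser = FieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

With something like this, `extract_fields(page)["Born"]` would replace a dedicated "Born" regexp, at the cost of depending on the pages actually being that regular.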
The right thing, in this case, is to just finish the damn script the way I started it and be done with it. Sigh.
Update: I ended up splitting the difference because I was having trouble getting things to work with my original approach.