
0 points · js2 · 5y ago · 0 comments

About 18 months ago I started using Letterboxd. Also, I like Roger Ebert's reviews. I wanted both in one place.

So I wrote some code to scrape his nearly 8,000 reviews from rogerebert.com and import them into Letterboxd:

(I only put the first two paragraphs of each review on Letterboxd, then link to the full review on his site.)
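For anyone curious what that excerpting step might look like, here is a minimal sketch in Python using only the standard library. It is not the actual scraper: the `ParagraphCollector` class, the `make_excerpt` function, and the "Full review:" link format are all made up for illustration, and it assumes the review body is already isolated as HTML with ordinary `<p>` tags.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while we are outside a <p>

    def _flush(self):
        # Collapse runs of whitespace and keep non-empty paragraphs only.
        if self._buf is not None:
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
        self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # tolerate a previous <p> that was never closed
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def make_excerpt(html, full_review_url, count=2):
    """Return the first `count` paragraphs followed by a link to the full review."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector._flush()
    parts = collector.paragraphs[:count]
    parts.append(f"Full review: {full_review_url}")
    return "\n\n".join(parts)
```

Inline tags such as `<i>` inside a paragraph are ignored and their text is kept, which is usually what you want for a plain-text excerpt.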

The hard parts of this were:

- Extracting the text of his reviews correctly from his site's HTML. That wasn't too terrible though.

- Matching his reviews to the correct movies on TMDB. This just required a bunch of trial and error and about 20-30 manual corrections. I tried various matching strategies using movie title, year of review, year of movie release (if given in his review, though often off by a year or two), and director, producer, and cast if listed in his review.
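A rough sketch of how that kind of matching could be scored, in Python. This is not the code I described above, just one way to combine the same signals. The dict shapes, the field names (`title`, `year`, `director`, `directors`), the weights, and the threshold are all assumptions for illustration, and it works on candidate records you have already fetched rather than calling TMDB itself.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(title.lower().split())


def score(review, candidate, year_slack=2):
    """Score one candidate: title similarity plus bonuses for year and director."""
    s = SequenceMatcher(
        None, normalize(review["title"]), normalize(candidate["title"])
    ).ratio()
    # Years on reviews are often off by a year or two, so allow some slack.
    year, cand_year = review.get("year"), candidate.get("year")
    if year is not None and cand_year is not None:
        if abs(year - cand_year) <= year_slack:
            s += 0.5
    # The director is only sometimes present on the review.
    director = review.get("director")
    if director:
        directors = {d.lower() for d in candidate.get("directors", [])}
        if director.lower() in directors:
            s += 0.5
    return s


def best_match(review, candidates, threshold=1.2):
    """Return the best candidate, or None if it needs a manual look."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: score(review, c))
    return best if score(review, best) >= threshold else None
```

With these made-up weights, a title match alone tops out at 1.0 and never clears the threshold, so anything without a corroborating year or director falls through to manual correction:

```python
review = {"title": "Example Movie", "year": 1973, "director": "Jane Doe"}
candidates = [
    {"id": 1, "title": "Example Movie", "year": 1972, "directors": ["Jane Doe"]},
    {"id": 2, "title": "Example Movie", "year": 2002, "directors": ["John Roe"]},
]
match = best_match(review, candidates)  # picks the candidate with id 1
```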

I also built this for myself:

https://github.com/jaysoffian/eap_proxy

I should put my bin directory full of random scripts up on GitHub. I tend to build them as I need them. They're often very simple things like:

- jqpaste -- which is just "pbpaste | jq"

- jsonl [jq|gron --stream] -- which takes its input and, if it isn't valid JSON, converts it to a JSON string, so that I can paste random log output (sometimes a mix of JSON and non-JSON) into jq or gron.
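To give an idea of the jsonl trick, here is a minimal Python sketch of just the conversion step. It is a guess at the behavior described above, not the real script: it assumes one record per line, the `to_json_line` name is made up, and it leaves the piping into jq or gron to the shell.

```python
import json
import sys


def _reject_constant(name):
    # Python's json module accepts NaN/Infinity by default; treat them as non-JSON.
    raise ValueError(f"not strict JSON: {name}")


def to_json_line(line):
    """Return the line unchanged if it is valid JSON, else as a JSON string."""
    text = line.rstrip("\n")
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return json.dumps(text)
    return text


def main():
    for line in sys.stdin:
        if line.strip():  # skip blank lines
            print(to_json_line(line))


if __name__ == "__main__":
    main()
```

Saved as a hypothetical `jsonl.py`, it could be used like `pbpaste | python3 jsonl.py | jq .`, so that plain log lines show up as strings in between the real JSON objects.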

Those are just a couple off the top of my head.


regularperson25 · 5y ago

ohh so random that this was you!!, i've been following the roger user for a long time
