Behind the scenes at Bridge’s Abridged Books

So I don’t forget how I did this.

I downloaded text versions of books from Project Gutenberg. I chose my targets from https://www.gutenberg.org/browse/scores/top, plus I threw in the Declaration of Independence, as I was curious what the first Project Gutenberg text was.

Example:

~/bridges-abridged-books/originals $ wget https://www.gutenberg.org/ebooks/1.txt.utf-8

~/bridges-abridged-books/originals $ mv 1.txt.utf-8 declaration-independence.txt
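
A rough sketch of how the download & rename could be folded into one step for each book. The function names `gutenberg_url` and `fetch_book` are made up for illustration; the URL pattern is simply the one from the example above, and `wget -O` writes the download straight to the chosen filename.

```shell
# Hypothetical helpers, not part of the original workflow.

# Build the plain-text download URL for a Project Gutenberg ebook ID,
# following the pattern used in the wget example above.
gutenberg_url() {
  printf 'https://www.gutenberg.org/ebooks/%s.txt.utf-8\n' "$1"
}

# fetch_book ID NAME: save ebook ID as originals/NAME.txt
fetch_book() {
  mkdir -p originals
  wget -O "originals/$2.txt" "$(gutenberg_url "$1")"
}

# Usage (run from ~/bridges-abridged-books):
#   fetch_book 1 declaration-independence
```

This assumes every book is available at the same `.txt.utf-8` URL pattern, which I only verified for the example above.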

Next, I ran a simple grep for “bridge” & included five lines of context before & after each match:

~/bridges-abridged-books $ for BOOK in originals/*; do grep -C5 bridge "$BOOK" > "abridged/$(basename "$BOOK")"; done

I do want to go back & clean up my version of Middlemarch at some point: because grep matches substrings, it pulled in a lot of “Cambridge” and “Bambridge”, which overshadow the mentions of actual bridges. But for now, I’m calling this good enough!
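
For whenever that cleanup happens, one possible approach is whole-word matching. This is only a sketch, assuming a grep that supports `-w` (GNU and BSD grep both do); the function name `abridge_whole_word` is made up for illustration.

```shell
# Hypothetical helper: rebuild the abridged copies using whole-word matching.
#   -w  match whole words only, so "Cambridge" and "Bambridge" are skipped
#   -i  ignore case, so a capitalized "Bridge" still counts
#   -E  extended regex, so 'bridges?' covers singular and plural
abridge_whole_word() {
  src_dir=$1
  out_dir=$2
  mkdir -p "$out_dir"
  for book in "$src_dir"/*; do
    [ -f "$book" ] || continue
    # grep exits 1 when a book has no matches; that is not an error here.
    grep -wiE -C5 'bridges?' "$book" > "$out_dir/$(basename "$book")" || true
  done
}

# Usage (run from ~/bridges-abridged-books):
#   abridge_whole_word originals abridged
```

Two caveats: the context lines pulled in by `-C5` can still contain “Cambridge” if it sits near a real match, and compound words such as “drawbridge” are no longer matched.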