That's quite a lot! One of every 25, so probably at least one on the front page at any time? Are there domains that account for considerably more than 4%? Would be interesting to see a top ten list.
The chart says "Unique Bikes Used Per Day". Q1 2015 was a very cold winter, and I can imagine the system being so under-utilized that there were many bikes that sat unused through an entire day.
Sure, there are undoubtedly lots of examples of businesses that opened in desolate areas and created new taxi activity where there had been none previously. I just happened to focus on Williamsburg for the post.
Another idea that I didn't get around to was to look at concert venues and measure taxi traffic around particular concerts, to see whether it correlates with the bands' overall popularity.
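For anyone who wants to try the concert idea, here is a rough sketch of the correlation step only. Everything in it is an assumption for illustration: the `pearson` helper, the variable names, and especially the numbers, which are made-up placeholders and not taken from the taxi data. How you'd actually count drop-offs near a venue or score a band's popularity is left open.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values, invented purely to show the call (one entry per concert).
dropoffs_near_venue = [120, 340, 210, 560]  # hypothetical taxi drop-offs before showtime
popularity_score = [15, 48, 30, 80]         # hypothetical popularity measure per band

r = pearson(dropoffs_near_venue, popularity_score)
print(f"Pearson r = {r:.3f}")
```

A value near 1 would suggest taxi traffic tracks popularity, but with only a handful of concerts per venue it would be easy to over-read the result.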
Yep, all on a 2012 MacBook Air. Data size was over 400 GB with indexes.
Simple queries on indexed columns of the trips table take a minute or two. More complicated queries that require a full sequential scan can take up to a few hours.
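To make the indexed vs. full scan difference concrete, here is a small sketch. Note the swap: it uses Python's built-in SQLite rather than Postgres so it runs anywhere, and the `trips` schema, column names, and index name are hypothetical, not the real ones. In Postgres you would run `EXPLAIN` on the query and look for an index scan versus a `Seq Scan` node.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return "; ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified stand-in for the trips table.
conn.execute(
    "CREATE TABLE trips (id INTEGER PRIMARY KEY, pickup_date TEXT, fare REAL)"
)
conn.execute("CREATE INDEX idx_trips_pickup_date ON trips (pickup_date)")

# Filter on the indexed column: the planner can search the index.
indexed_plan = query_plan(
    conn, "SELECT COUNT(*) FROM trips WHERE pickup_date = ?", ("2015-01-01",)
)

# Filter on a column with no index: every row has to be read.
full_scan_plan = query_plan(
    conn, "SELECT COUNT(*) FROM trips WHERE fare > ?", (10.0,)
)

print(indexed_plan)
print(full_scan_plan)
```

The first plan should report a search using the index and the second a scan of the whole table, which is the same distinction that separates the minutes-long queries from the hours-long ones on a table this size.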
All on internal storage? What was the max internal storage then, 512GB? Sounds like that's pretty tight, barely possible if you aren't doing anything else that takes up much space.
I wonder if it would be easier to put Postgres on a server, maybe AWS or local, and just run the queries from the laptop. Or maybe run them in tmux on the server, so you can let a long query run without having to keep the laptop up.
Yes, the database is all on the machine's local 512 GB hard drive. I did store the downloaded flat text files to an external drive and loaded them into the db from there.
Mostly to keep the app simple and still deployable on the free Heroku plan. Sure, it could handle multiple locations, but then it would quickly outgrow the free tier.