My Terrible Car I Bought With Data

Data-driven decisions are vulnerable to a dangerous fallacy, a version of The Map Is Not The Territory fallacy: optimizing for maximum value given the factors you can measure will give an answer… but you risk choosing a faulty local optimum because of factors you forget or can't measure.

Anyway, I now own a terrible car that will last forever. 🤷

Years ago my first car (a trusty Honda Civic) finally perished. I spent a week of nights scouring Craigslist, clicking on every link with the words "civic," "accord," or "corolla." Mind-numbingly boring. That listing was far away. That one was dirty. Too expensive. Too scary. It was the least fun game of Goldilocks I've ever played.

While googling "How to buy a car" I learned that certain years tend to be more reliable: roughly the last two years of a body style. Every four to five years car companies give their cars a makeover. The first few years of that new style have teething problems, and the last years of that generation are statistically the more reliable ones.

I also wanted to buy one that was mechanically sound and didn't rely on expensive computer parts… those cars were usually cheaper to repair with the mechanics I was using. That (and my limited budget) ruled out some of the newer models. At the time, that meant searching for years between 2006 and 2012.

This… sounds like a heuristic! I was bored, so I rolled up nokogiri and started breaking TOS with a webscraper.

If you aren't a rubyist: I wrote a program that clicked on every link in Craigslist that matched my criteria, copied out the year, location, and mileage, and made a spreadsheet of only the vehicles that interested me.
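For the curious, the filter-and-spreadsheet half of that program can be sketched roughly like this. It is not the original script: the scraping itself is left out, and the `Listing` struct, its fields (including `price` and `url`), and the helper names are all assumptions made up for illustration. Only the model keywords and the year range come from the story above.

```ruby
require "csv"

# Criteria from the post: model keywords and the 2006-2012 year range.
MODELS = %w[civic accord corolla].freeze
YEARS = (2006..2012).freeze

# Hypothetical shape for whatever the scraper copied off each listing page.
Listing = Struct.new(:title, :year, :location, :mileage, :price, :url,
                     keyword_init: true)

# True when the title mentions a wanted model and the year is in range.
def interesting?(listing)
  title = listing.title.downcase
  MODELS.any? { |model| title.include?(model) } && YEARS.cover?(listing.year)
end

# Build the spreadsheet as CSV text: a header row, then one row per
# listing that passes the filter.
def to_spreadsheet(listings)
  CSV.generate do |csv|
    csv << %w[title year location mileage price url]
    listings.select { |listing| interesting?(listing) }.each do |listing|
      csv << [listing.title, listing.year, listing.location,
              listing.mileage, listing.price, listing.url]
    end
  end
end
```

Writing the result out is then one line, something like `File.write("cars.csv", to_spreadsheet(listings))`, and the file opens in any spreadsheet tool.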

At first, it was great! I ran the script on my lunch break, saved myself thirty minutes of scrolling, and used the list of links to shop quickly and efficiently.

I was still overwhelmed. How could I spot the diamond in the rough? I could ask someone for help… or I could use made-up napkin math! (I chose made-up math. 👨‍🔬)


I threw the CSV into Google Sheets and decided that a good way to narrow the field was to assume that all these cars had a lifespan of 250k miles. My napkin math formula was Current Price / (250k Miles - Current Mileage) = Price per Prospective Mile. It's a good enough theory, and the data seemed to confirm it: normalized for price per prospective mile, most of the cars had an average value. Newer cars had lower mileage and a high price, older cars had higher mileage and a low price, and the price per mile was about the same across all the years and all models. I started graphing, prodding, and playing with the data… and a couple of days into using this system The Outlier jumped off the graph.
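Here is the napkin math as a small Ruby sketch, reading "price per prospective mile" as dollars divided by the miles left before an assumed 250k-mile lifespan runs out. I did this in Google Sheets, not Ruby, so the method names and the hash keys (`:price`, `:mileage`, `:ppm`) are invented for illustration.

```ruby
# The napkin-math assumption: every car dies at 250k miles.
ASSUMED_LIFESPAN_MILES = 250_000

# Dollars paid per mile the car supposedly has left.
# Returns nil when the car is already at or past the assumed lifespan.
def price_per_prospective_mile(price, mileage, lifespan: ASSUMED_LIFESPAN_MILES)
  remaining = lifespan - mileage
  return nil unless remaining.positive?

  price.to_f / remaining
end

# Pick the car with the lowest price per prospective mile.
# Each car is a hash with :price and :mileage keys (hypothetical shape).
def cheapest_per_mile(cars)
  cars
    .map { |car| car.merge(ppm: price_per_prospective_mile(car[:price], car[:mileage])) }
    .reject { |car| car[:ppm].nil? }
    .min_by { |car| car[:ppm] }
end
```

With invented numbers, a $5,000 car at 150k miles works out to 5,000 / 100,000 = $0.05 per prospective mile. Note everything the formula cannot see: smell, stains, and whatever else the seller has hidden.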

It seemed too good to be true.

The Outlier was a 2006 Honda Accord: towards the end of its body style, and desirable because it has a rare chain drive. I test drove The Outlier with my wife. The car seemed fine, seemed ok… I felt victorious and validated. I can solve adult problems without help and with code!

I drove it home, and by the next day whatever they had sprayed in The Outlier to deaden the smoke odor had disappeared, and my new commuter car revealed the power to instantly grant me migraine headaches. Several large payments and three ozone treatments later… you can't smell the smoke except on very hot and humid days. I live in a very hot and humid place. 😓

Over time I learned some other quirks of The Outlier.

It's got a broken computer programmed by Fremen: the blinkers and windshield wipers are strange; they toggle without rhythm to avoid attracting the worm.

It's got a slick, weird stain that will not come out no matter how I scrub.

The cabin has humidity issues from somewhere, so the fabric is drooping away from the roof.

I kinda wonder if someone died in it.

But it is low mileage and has a chain drive, so it's likely I'll be driving my data-driven Outlier decision for far longer than I would wish. It will remain in my garage with its key in my pocket to remind me of my hubris: I optimized for two valuable factors… but in my cleverness I forsook wisdom… and forgot all the ways humans can disguise a lemon. 🍋

Be careful with your data-driven decisions: you could be stuck driving them.
