Could you complete the ultimate American road trip in less time? And could you figure it out without being a "data genius" if you had the right tools?
You may have seen an article recently about a “data genius [who] computes the ultimate American road trip.” It was one of the hottest trending articles of the week, and it had hundreds of thousands of links on every social media outlet and channel.
While I’m still not sure I understand why so much of the world was interested in such an academic concept—I guess that’s just the nature of viral content—it made me think: what if you were not a “data genius?” Could you still determine an optimized route of the ultimate American road trip? Given the tremendous number of views, links, and retweets, I am certain that I wasn’t the only one with the same question.
So I performed some research with off-the-shelf products to determine if a layperson who was not a data genius could solve this same problem with similar results.
Spoiler alert: the answer is yes.
While I am admittedly more data savvy than the average person, I will prove to you that one does not have to be a data genius to solve this problem. You can do it yourself without knowing anything about extremely complex algorithms and without access to massive supercomputers.
Disclaimer: We’re going to have to trudge through a few technical details before we can arrive at my simple solution, so please bear with me for a few paragraphs while we set the stage!
Before we get into the details of my solution, let’s first take a look at Randy Olson’s blog post on the issue so we can define the parameters of the challenge and the goal of the solution. Randy is the data genius who originally created the algorithm and wrote about the solution after being prompted by Tracy Staedter at Discovery News to take on this challenge. (Why Tracy nudged Randy to take this on is interesting, and I’d recommend that you read about it, but for now, let’s just see what Randy and Tracy have to say about this issue.)
The singular goal of the challenge is also simple: Minimize the time spent driving. In other words, determine the optimal route that hits all stops without unnecessary backtracking.
The rules that were set for the challenge are simple:
There was a lot of controversy about rule #2. Many people got caught up in the selection of stops and missed the point of the challenge. The stops along the route are arbitrary, as long as they fall within the bounds of the aforementioned rules. With that said, even rule #2 is arbitrary—it was just included for fun so the challenge wouldn’t seem so academic. The only things that really matter for this academic exercise are: 1) make at least one stop in each of the lower 48 states without leaving the country and 2) find the optimal route of driving among each stop.
This is nothing more than an application of the travelling salesman problem (TSP). What makes this such an interesting challenge is the sheer number of stops. Randy estimates that, using traditional TSP methodologies, it would take over 964,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 years to calculate the optimal route among these 48 stops. Randy found a way to approximate an answer to this TSP that was MUCH faster and “good enough.”
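To get a feel for why brute force is hopeless here, the short Python sketch below counts the distinct round trips through n stops, using the standard result that a closed tour with n stops has (n - 1)!/2 orderings once the starting point and the direction of travel are ignored. The function name is my own, and the sketch does not try to reproduce Randy's estimate in years, which also depends on how fast each route can be evaluated.

```python
import math


def count_round_trips(n_stops: int) -> int:
    """Count distinct closed tours that visit every stop exactly once.

    The starting stop and the direction of travel are ignored, which
    gives (n - 1)! / 2 tours for n >= 3.
    """
    if n_stops < 3:
        # With one or two stops there is only one possible tour.
        return 1
    return math.factorial(n_stops - 1) // 2


if __name__ == "__main__":
    # The count explodes long before we reach the 48 stops in the challenge.
    for n in (5, 10, 25, 48):
        print(f"{n} stops -> {count_round_trips(n):,} possible round trips")
```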
I’ll leave it to you to read the details about Randy’s approach in his blog post, but in short, it is rife with extremely complex genetic algorithms, which belong to a field of study within the realm of artificial intelligence. He employs hundreds of lines of cobbled-together code in various programming languages, and the inputs to his algorithms were based upon 2500 sets of heuristic data he pulled about each potential leg of the overall route.
Note: there was also a lot of controversy about the number of possible legs in the route, but you can argue with Randy about that if you disagree—it doesn’t matter for my solution!
If I haven’t lost you after reading the last paragraph or Randy’s description of his approach, congratulations! I promise there won’t be any more techno-speak for the remainder of the post!
The essence of my solution is simple: break up the route into bite-sized segments that off-the-shelf route optimizers can handle.
This part of my strategy has a lot in common with Randy’s, except mine uses common sense to define the segments instead of artificial intelligence.
So how do we break up our stops into sets of segments? We must find a few key “endpoints” that act as spoke hubs within the overall route. How we arrive at and leave from each segment endpoint does not matter; what matters is that each endpoint be a “gateway” to another set of points that need to be optimized. Yes, this is a bit of an academic leap, but I’m taking the same liberties as Randy by claiming this is “good enough.”
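For readers who do want to peek under the hood, here is a minimal Python sketch of the gateway idea. It is not the tool I used: a brute-force search stands in for the off-the-shelf optimizer, straight-line distance stands in for driving distance, and the stop names and coordinates are made up purely for illustration. Each segment is optimized on its own with its two endpoints held fixed, and the segment results are then stitched together.

```python
import math
from itertools import permutations


def distance(a, b):
    """Straight-line distance between two (x, y) points (an assumption;
    a real optimizer would use driving distance or drive time)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def best_path(start, end, middle, coords):
    """Brute-force the shortest path from start to end through all middle stops.

    Only practical for small segments, which is the point of segmenting.
    """
    best_order, best_len = None, math.inf
    for order in permutations(middle):
        path = (start, *order, end)
        length = sum(distance(coords[p], coords[q]) for p, q in zip(path, path[1:]))
        if length < best_len:
            best_order, best_len = path, length
    return list(best_order), best_len


def stitch(segments, coords):
    """Optimize each (start, end, middle) segment and join the results."""
    route, total = [], 0.0
    for start, end, middle in segments:
        path, length = best_path(start, end, middle, coords)
        # Each gateway ends one segment and starts the next, so keep it once.
        route.extend(path if not route else path[1:])
        total += length
    return route, total


if __name__ == "__main__":
    # Hypothetical stops on a grid; "D" is the gateway between two segments.
    coords = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0),
              "E": (3, 1), "F": (3, 2), "G": (3, 3)}
    segments = [("A", "D", ["C", "B"]), ("D", "G", ["F", "E"])]
    print(stitch(segments, coords))
```

The trade-off is the one described above: each segment is solved optimally, but the overall route is only as good as the choice of gateways.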
This process makes a lot more sense when seeing all of the stops on a map. There are several free online tools to plot the stops, but in this example, I am using a product called QlikMaps, which is an extension for the data discovery tools QlikView and Qlik Sense. I used QlikMaps because I am very familiar with the product, and it allows me to easily vary the colors of my stops, which is a nice-to-have feature in a future step toward my solution.
Let’s start with an easy example of a segment endpoint: Cape Canaveral in Florida. As we can see on the map, there is only one way in and one way out. There is route optimization that must occur before and after Cape Canaveral, but it is clear by looking at the map that this stop is a gateway between segments.
Now let’s find the other endpoint of this segment. At first glance, it may seem that Acadia National Park in Maine could be a gateway. However, on closer scrutiny and with the application of a little common sense, we can determine that this is not a candidate for a segment endpoint: it is not clear whether we would start in Maine and then work our way west or south, or whether Maine is just one of many stops on our journey from the west through Vermont or New Hampshire.
Instead we see that the Fox Theater in Detroit is a better candidate. We don’t yet know if we will arrive in Detroit from Ohio, Indiana, Illinois, or Wisconsin; and we don’t know if we will leave Detroit bound for West Virginia, Vermont, or someplace else on the eastern seaboard. All we know at this juncture is that Detroit is our gateway from the Midwest to/from the East.
Following similar logic, we find that our four segment endpoints are:
Coloring our segment endpoints in red and giving each segment a unique color, this is what the result of our grouping exercise looks like in QlikMaps:
Now that we have broken up our stops into bite-sized segments and can visualize the ordering of said segments, we can use an off-the-shelf route optimizer for each segment, then add the results of each step together:
So which off-the-shelf route optimizer to use? For the first time in this post, I respectfully disagree with Randy’s position. He mentions that Google accepts up to 10 addresses as waypoints, RouteXL accepts up to 20… and that’s the best available for free. He is incorrect. MapQuest can handle up to 25 addresses as waypoints, and it has many more options for optimization: you can specify fastest vs. shortest route and whether or not to avoid tolls/country borders/ferries/etc.
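The waypoint limit matters because it sets a floor on how many segments the loop must be cut into. The Python sketch below is a back-of-the-envelope calculation using the limits quoted above. It assumes the limit counts a segment's start and end addresses, and that each gateway is shared by two neighboring segments, so a segment of L addresses contributes L - 1 distinct stops to a closed loop. The function name is hypothetical.

```python
import math

# Waypoint limits as quoted in this post (not checked against current products).
WAYPOINT_LIMITS = {"Google": 10, "RouteXL": 20, "MapQuest": 25}


def min_segments(n_stops: int, limit: int) -> int:
    """Fewest segments needed to cover a closed loop of n_stops.

    Assumes the limit includes both segment endpoints and that adjacent
    segments share a gateway, so each segment adds (limit - 1) new stops.
    """
    if limit < 2:
        raise ValueError("a segment needs at least a start and an end")
    return math.ceil(n_stops / (limit - 1))


if __name__ == "__main__":
    for tool, limit in WAYPOINT_LIMITS.items():
        print(f"{tool}: at least {min_segments(48, limit)} segments for 48 stops")
```

This is only a lower bound. In practice the gateways are chosen by geography, as described above, so you may end up with more segments than the minimum.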
Now you can manually copy and paste the addresses for each segment into the MapQuest route optimizer. What you will find is that the resulting route is 251 miles shorter than the one Randy’s solution yields!
Can we optimize things even further? Make the process even more turnkey?
What if we wanted to optimize our route based upon drive time instead of driving distance? Update results based upon current driving conditions? Use different segment endpoints? Change some stops? Add additional stops? See if the route is faster traveling counterclockwise instead of clockwise? Bend the rules and take a shortcut through Canada? Decide we want to avoid tolls? Ferries?
While we could certainly play out all of these what-if scenarios by repeatedly copying and pasting addresses into the MapQuest route optimizer, it is much easier to play out these scenarios in a tool called QlikRoutes. QlikRoutes is another extension for QlikView and Qlik Sense, and it uses the same engine as the MapQuest route optimizer behind the scenes. This allows you to tweak your route in an easy-to-use interface and see the same results as before:
As I have proven, a layperson who is not a data genius can, in fact, determine the optimal route for the ultimate US road trip using off-the-shelf products! And hopefully it is clear that I mean no disrespect to Randy Olson. I’d love to chat about his interesting work over a beer sometime… so if you’re reading this, Randy, please drop me a note!
Patrick Vinton is the CTO and Product Architect for QlikMaps and QlikRoutes. QlikMaps is a mapping visualization and location analytics engine for QlikView and Qlik Sense. QlikRoutes is a route optimization engine for QlikView and Qlik Sense. Customers all over the world use QlikMaps and QlikRoutes for commercial applications.