Monday, April 20, 2015

An introduction to the traveling salesman problem

There's always a relevant XKCD comic. From https://xkcd.com/399/
What's the most efficient way of going through a set of cities once and coming back to where you started? As simple as it sounds, iterating through every possible combination would be next to impossible if the list of cities is big enough. If there's 100 cities and you pick one as the starting point, there's still 99 more cities to choose from. The number of combinations would be 100(100-1)(100-2)... until there's only one choice left. The amount of possible solutions increases at a staggering rate as the amount of cities increase. Researchers have worked on trying to find a way to solve it as fast as possible. To understand the traveling salesman problem (TSP) in terms of computer science, we must first understand the concepts of time complexity.

Given an input n, the time complexity is determined by how much of the given input, n, we have to go through. For example, if I were to look through 100 boxes one by one, the worst case scenario would be finding what I was looking for after going through all 100 boxes. The time complexity of the example would be written as O(n) in the Big O notation because given the input, n = 100, and my search method, I need to look through all of them. Although I can find whatever it is in the first box I look into, time complexity is usually determined by the worst case scenario.

These time complexities are grouped into classes and the two important classes regarding TSP are classes P and NP. Problems in class P are solvable in polynomial time with a deterministic algorithm where one input can only have one output. Polynomial time meaning time complexities such as n, nlog(n), n10, etc. Problems in class NP, non-deterministic polynomial time, cannot be solved in polynomial time with a deterministic algorithm, but solutions can be checked in polynomial time. TSP falls under the class NP because while solving it is difficult we can verify the solution by checking if the solution goes through every city just once which takes a polynomial amount of time. The goal now is to find the algorithm that can solve TSP within polynomial time.

Why is this important? TSP is just one small droplet in the overarching problem of P versus NP. If a solution can be found for the decision version of TSP, this would prove P = NP meaning all NP-complete problems can be solved within a minimal amount of time. NP-complete problems are problems with no known shortcuts. This breakthrough will change the works of medical sciences, mathematics, engineering, transportation, and computer science. A lot of fundamental problems are NP-complete such as searching through DNA sequences, optimal protein folding, and shortening mathematical proofs. But even after decades of research, P = NP still hasn't been proven and it's becoming more of a fantasy. For more information on P and NP problems or an overview on P versus NP in general, Lance Fortnow wrote a great article called "The status of the P versus NP problem" in the Communications of the ACM magazine. He also has a much better blog on dealing with computational complexity at http://blog.computationalcomplexity.org/.

References

Lecture by Erik Demaine on Computational Complexity

Fortnow, L. (2009). The status of the P versus NP problem. Association for Computing Machinery.Communications of the ACM, 52(9), 78. doi:http://dx.doi.org/10.1145/1562164.1562186

Hamburger, H. & Richards, D. (2002). Logic and language models for computer science. Upper Saddle River, N.J: Prentice Hall.

5 comments:

  1. This seems like a complicated topic to research but I think this is a clear way to say what you are writing about. I liked how at the very top you put a picture which kind of drew me in at first glance. It made me want to continue reading the post to see what it was about. The way you introduced the topic made me interested as well and also helped me in trying to understand it to a certain degree. Not having any background knowledge of this topic, I have an idea of the issue you are addressing and it sounds very complexing but interesting to find a way in solving.

    ReplyDelete
    Replies
    1. You seem to have posted twice so I removed the duplicate. To the topic at hand, I don't think I did a very good job of explaining how it relates to computer science. I agree it's a complicated topic and trying to explain all the terms at once seems to have made it more confusing. Even after reading all the articles during the semester I don't think I have a good grasp on the topic myself. Only those who truly understand can explain difficult things in simple ways.

      Delete
  2. Your topic is very complex and I believe that that is indeed the point. However, I am having a hard time determining what it is you wish to do about this problem or how it relates back to the computer science field and the issues involved. I will stay tuned to find out more!

    ReplyDelete
  3. Kyoungjin Yoon: This topic also engages business problems tourism agents face. I would like to know outcomes of how engineers apply your calculation into our real lives. I never realized that mapping the fastest route requires high level of algorithm, as we know in the Google Maps App. I think you related computer science topic to your subject because you mentioned about mathematics of traveling salesman problem. You remarked on the challenges computer engineers need to face.

    ReplyDelete
  4. As I read through your post, I found it somewhat easy to understand because I have a good background in math. The highest level math I took in college was Calculus with Analytic Geometry, so I was able to imagine P and NP as variables in an equation that must be solved with polynomials. I think the traveling salesman is a good analogy as well because anyone can picture the problem in their head. You did a great job in breaking it down for people who are not familiar with the topic. I look forward to reading more of your posts!

    ReplyDelete