# CSE 5311: Algorithm Design and Analysis

## Traveling Salesman Problem is NP-Complete

by Vaishnavi Balasubramanya, ID: 1000-58-3834

### Traveling Salesman Problem

In the traveling salesman problem, a salesman must visit n cities. The salesman wishes to make a tour that visits each city exactly once and finishes at the city where he started. There is an integer cost c(i,j) to travel from city i to city j. For example, the salesman must travel to the locations a, b, c, and d.

The travel costs are given. The salesman wishes to make the tour whose total cost is minimum, where the total cost is the sum of the individual costs along the edges of the tour. In the example, the minimum-cost tour is a-c-b-d-a, and the cost of this tour is 1 + 2 + 1 + 3 = 7.

The formal language for the corresponding decision problem is:

$$\mathrm{TSP} = \{ \langle G, c, k \rangle : G = (V, E) \text{ is a complete graph, } c \text{ is a function from } V \times V \to \mathbb{Z},\ k \in \mathbb{Z}, \text{ and } G \text{ has a traveling salesman tour with cost at most } k \}$$

Next we see that a fast algorithm for the traveling salesman problem is unlikely to exist.

### TSP is NP-complete

To show that TSP is NP-complete, we first show that TSP belongs to NP. Given an instance of the problem, the certificate is the sequence of n vertices (cities) in the tour. The certifier (verification algorithm) checks that this sequence contains each vertex exactly once, sums up the edge costs (including the edge from the last vertex back to the first), and checks whether the sum is at most k.
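The certifier described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary representation of the cost function, and the helper `symmetric` are assumptions made here, and the costs of the edges (a, b) and (c, d) are made up, since only the four costs on the tour a-c-b-d-a are stated in the example.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def verify_tsp_certificate(
    vertices: Iterable[Hashable],
    cost: Dict[Edge, int],
    k: int,
    tour: Sequence[Hashable],
) -> bool:
    """Return True if `tour` is a traveling salesman tour of cost at most k."""
    vertex_set = set(vertices)
    # The certificate must list every vertex exactly once.
    if len(tour) != len(vertex_set) or set(tour) != vertex_set:
        return False
    # Sum the edge costs, including the closing edge back to the start.
    total = 0
    for i, u in enumerate(tour):
        v = tour[(i + 1) % len(tour)]
        total += cost[(u, v)]
    return total <= k


def symmetric(costs: Dict[Edge, int]) -> Dict[Edge, int]:
    """Helper for the example: make c(i, j) = c(j, i)."""
    full = dict(costs)
    full.update({(j, i): w for (i, j), w in costs.items()})
    return full


# Costs on the tour a-c-b-d-a come from the example (1, 2, 1, 3);
# the costs of (a, b) and (c, d) are hypothetical values for illustration.
EXAMPLE_COST = symmetric({
    ("a", "c"): 1, ("c", "b"): 2, ("b", "d"): 1, ("d", "a"): 3,
    ("a", "b"): 4, ("c", "d"): 5,
})
```

With these costs, `verify_tsp_certificate("abcd", EXAMPLE_COST, 7, ["a", "c", "b", "d"])` accepts, while the same tour with k = 6 is rejected. Both checks are a single pass over the certificate, so the running time is linear in n.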

This verification can be done in polynomial time, so TSP belongs to NP.

To prove that TSP is NP-hard, we show that HAM-CYCLE $\le_P$ TSP. Let G = (V, E) be an instance of HAM-CYCLE. We construct an instance of TSP as follows. Form the complete graph G' = (V, E'), where

$$E' = \{ (i, j) : i, j \in V \text{ and } i \ne j \},$$

and define the cost function c by

$$c(i, j) = \begin{cases} 0 & \text{if } (i, j) \in E, \\ 1 & \text{if } (i, j) \notin E. \end{cases}$$

Note that because G is undirected and has no self-loops, c(v, v) = 1 for all $v \in V$. The instance of TSP is then $\langle G', c, 0 \rangle$, which is easily formed in polynomial time.

We now show that graph G has a Hamiltonian cycle if and only if graph G' has a tour of cost at most 0. Suppose that graph G has a Hamiltonian cycle h. Each edge in h belongs to E and thus has cost 0 in G'. Thus h is a tour in G' with cost 0.

Conversely, suppose that graph G' has a tour h' of cost at most 0. Since the costs of the edges in E' are 0 and 1, the cost of tour h' is exactly 0, and each edge on the tour must have cost 0. Thus h' contains only edges in E. Hence we conclude that h' is a Hamiltonian cycle in graph G.

### Applications of Traveling Salesman Problem

- Printed circuit manufacturing: planning the most efficient motion of a robotic arm that drills holes at n points on the surface of a VLSI chip.
- Serving I/O requests on a disk.

- Sequencing the execution of n software modules to minimize the context switching time.

## Zero Weight Cycle

Jonathan Cross

### The Problem

We are given a directed graph G = (V, E) with weights $w_e$ on its edges $e \in E$. The weights can be positive or negative. The Zero-Weight-Cycle problem is to decide whether there is a simple cycle in G such that the sum of the edge weights on this cycle is exactly 0.

[Figure: example graph with edge weights 1, -6, 5, -2, -3, -3]

### Is ZWC in NP?

A given solution can be verified in polynomial time. This is simple: traverse the proposed cycle, check that it is a simple cycle in G, and verify that the sum of its edge weights is zero.

[Figure: verifying a candidate cycle on the example graph]

### Reduction from a known NP-complete problem: Subset Sum

By Section 8.8, Subset Sum is NP-complete. An instance consists of a set $S = \{a_1, a_2, \ldots, a_n\}$ and a target W; the reduction uses the target 0, that is, it asks whether some non-empty subset of S sums to exactly zero.

Example: S = {1, -2, -3, 5, -3, -6}; n = 6.

Construct a weighted directed graph $G_0$ with 2n vertices:

- Each $a_i$ has two vertices $v_i$ and $u_i$, joined by an edge from $v_i$ to $u_i$ with weight $a_i$.
- Add a zero-weight edge from each $u_i$ to every other $v_j$ (so each $v_i$ receives zero-weight edges from all $u_j$ with $j \ne i$).
- Total number of edges = n(n-1) + n.

Summing the weights along a traversal is equivalent to examining a subset in the Subset Sum problem. If there exists a zero-weight cycle in $G_0$, then the weights of its edges from $v_i$ to $u_i$ must sum to zero.

[Figure: the graph $G_0$ for n = 6, with vertices $v_1, \ldots, v_6$ and $u_1, \ldots, u_6$ and the zero-weight edges marked]

Conversely, given a subset $S_0$ that sums to zero, construct a cycle by picking all edges corresponding to the elements in $S_0$ and connecting those edges by zero-weight edges. The result is a zero-weight cycle.

[Figure: $G_0$ with the weighted edges $(v_i, u_i)$ and a zero-weight cycle built from $S_0$]
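The construction above, together with the polynomial-time check of a candidate cycle, might be sketched as follows. All function names and the edge-dictionary representation are assumptions made for illustration, indices are 0-based rather than 1-based, and the sketch assumes every $a_i$ is nonzero (a zero element is a trivially positive Subset Sum instance, and the graph has no edge from $u_i$ back to its own $v_i$).

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vertex = Tuple[str, int]                  # ("v", i) or ("u", i), 0-based i
Graph = Dict[Tuple[Vertex, Vertex], int]  # directed edge -> weight


def subset_sum_to_zwc(s: Sequence[int]) -> Graph:
    """Build the weighted directed graph with 2n vertices from the set S."""
    n = len(s)
    graph: Graph = {}
    for i in range(n):
        graph[(("v", i), ("u", i))] = s[i]       # edge v_i -> u_i with weight a_i
        for j in range(n):
            if j != i:
                graph[(("u", i), ("v", j))] = 0  # zero-weight edge u_i -> v_j
    return graph


def cycle_weight(graph: Graph, cycle: Sequence[Vertex]) -> Optional[int]:
    """Certifier: total weight of `cycle`, or None if it is not a simple cycle."""
    if not cycle or len(set(cycle)) != len(cycle):
        return None                              # empty or repeats a vertex
    total = 0
    for k, a in enumerate(cycle):
        b = cycle[(k + 1) % len(cycle)]          # wrap around to close the cycle
        if (a, b) not in graph:
            return None                          # uses an edge that does not exist
        total += graph[(a, b)]
    return total


def cycle_from_subset(indices: Sequence[int]) -> List[Vertex]:
    """Cycle v_i -> u_i -> v_j -> u_j -> ... through the chosen elements."""
    cycle: List[Vertex] = []
    for i in indices:
        cycle += [("v", i), ("u", i)]
    return cycle


S = [1, -2, -3, 5, -3, -6]
G0 = subset_sum_to_zwc(S)
```

For the example set, the elements 1, -6, and 5 (indices 0, 5, 3) sum to zero, and `cycle_weight(G0, cycle_from_subset([0, 5, 3]))` is 0, while the cycle through indices 0 and 1 has weight -1.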

### Solution

First, given a simple cycle in G, we can determine whether the sum of its edge weights is zero in polynomial time. Thus Zero-Weight-Cycle $\in$ NP.

Then we reduce the Subset Sum problem to this problem. The Subset Sum problem is: given a set of integers, determine whether the sum of some non-empty subset equals exactly zero. Consider a set of integers $S = \{a_1, \ldots, a_n\}$. We construct a weighted directed graph G with 2n vertices, such that every element $a_i$ corresponds to two vertices $v_i$ and $u_i$. For each $v_i$, add an edge from $v_i$ to $u_i$ with weight $a_i$, and add edges from every other vertex $u_j$ to $v_i$ with weight 0. Equivalently, each $u_i$ has an edge of weight 0 to every other $v_j$.

If we find a zero-weight cycle in G, then the weights of the edges from $v_i$ to $u_i$ along the cycle must sum to zero, so the corresponding elements form a non-empty subset of S with sum zero. Conversely, if we have a subset $S_0$ that sums to zero, we construct a cycle by picking all edges $(v_i, u_i)$ corresponding to the elements in $S_0$ and connecting those edges by zero-weight edges, and we obtain a zero-weight cycle. Thus this problem is at least as hard as the Subset Sum problem. Since the Subset Sum problem is NP-complete, we have Zero-Weight-Cycle $\in$ NPC.

## Foreground/Background Image Segmentation

Paul Doliotis

### What is our goal?

To label each pixel in an image as belonging to either the foreground of the scene or the background.

### Solution?

This problem can be solved efficiently by a minimum cut computation.

### Likelihood and separation parameters

For each pixel i we have a likelihood $a_i$ that it belongs to the foreground and a likelihood $b_i$ that it belongs to the background. Taken in isolation, we can label a pixel i as belonging to the foreground if $a_i > b_i$, and to the background otherwise. We must also consider a pixel's neighbours: if many neighbours of i are in the background, we would be more inclined to label i as background. Thus, for each pair (i, j) of neighbouring pixels there is a separation penalty $p_{ij} \ge 0$ for placing one of i and j in the foreground and the other in the background.

### Defining our problem mathematically

We can define our segmentation problem as finding a partition of the set of pixels into sets A and B (foreground and background, respectively) so as to maximize the following sum:

$$q(A, B) = \sum_{i \in A} a_i + \sum_{j \in B} b_j - \sum_{\substack{(i, j) \in E \\ |A \cap \{i, j\}| = 1}} p_{ij} \tag{1}$$

This is a maximization problem, though, whereas minimum cut is a minimization problem.

### Converting our problem to a minimum cut problem

Equation (1) defines a maximization problem. We must modify (1) to make our problem a minimization problem. Let $Q = \sum_i (a_i + b_i)$. The sum $\sum_{i \in A} a_i + \sum_{j \in B} b_j$ equals $Q - \sum_{i \in A} b_i - \sum_{j \in B} a_j$. As a result we can rewrite (1) as:

$$q(A, B) = Q - \sum_{i \in A} b_i - \sum_{j \in B} a_j - \sum_{\substack{(i, j) \in E \\ |A \cap \{i, j\}| = 1}} p_{ij}$$

Since Q is a constant, maximizing q(A, B) is the same as minimizing q'(A, B):

$$q'(A, B) = \sum_{i \in A} b_i + \sum_{j \in B} a_j + \sum_{\substack{(i, j) \in E \\ |A \cap \{i, j\}| = 1}} p_{ij}$$

### Constructing our graph (1)

Let V be the set of pixels and let E denote the set of all pairs of neighbouring pixels. We obtain an undirected graph G = (V, E).
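The identity q(A, B) = Q - q'(A, B) can be checked numerically on a small instance. The sketch below is not the minimum cut algorithm: it simply enumerates every possible foreground set A by brute force. The function names and the three-pixel instance (likelihoods and penalties) are hypothetical values chosen for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Iterator, Tuple

Pixel = int
Penalties = Dict[Tuple[Pixel, Pixel], int]


def separation(A: FrozenSet[Pixel], p: Penalties) -> int:
    """Sum of p_ij over neighbouring pairs with exactly one endpoint in A."""
    return sum(w for (i, j), w in p.items() if (i in A) != (j in A))


def q(A, a, b, p) -> int:
    """Objective (1): foreground and background likelihoods minus penalties."""
    return (sum(a[i] for i in A)
            + sum(b[j] for j in a if j not in A)
            - separation(A, p))


def q_prime(A, a, b, p) -> int:
    """The quantity q' that is minimized instead."""
    return (sum(b[i] for i in A)
            + sum(a[j] for j in a if j not in A)
            + separation(A, p))


def foregrounds(pixels: Iterable[Pixel]) -> Iterator[FrozenSet[Pixel]]:
    """Enumerate every subset A of the pixels (B is the complement)."""
    ordered = sorted(pixels)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            yield frozenset(subset)


# Hypothetical instance: three pixels in a row, 0 - 1 - 2.
A_LIKE = {0: 5, 1: 2, 2: 1}        # a_i, foreground likelihoods
B_LIKE = {0: 1, 1: 3, 2: 4}        # b_i, background likelihoods
PENALTY = {(0, 1): 2, (1, 2): 2}   # p_ij for neighbouring pairs
Q_TOTAL = sum(A_LIKE[i] + B_LIKE[i] for i in A_LIKE)

best_for_q = max(foregrounds(A_LIKE),
                 key=lambda A: q(A, A_LIKE, B_LIKE, PENALTY))
best_for_q_prime = min(foregrounds(A_LIKE),
                       key=lambda A: q_prime(A, A_LIKE, B_LIKE, PENALTY))
```

On this instance Q = 16, and both searches select A = {0}, where q = 10 and q' = 6. The enumeration is exponential in the number of pixels, which is why the problem is converted to a minimum cut instead.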

### Constructing our graph (2)

We create a source node s to represent the foreground and a sink node t to represent the background. We attach each of s and t to every pixel, and use $a_i$ and $b_i$ as the capacities of the edges (s, i) and (i, t), respectively. For each neighbouring pair (i, j), instead of one undirected edge we create two directed edges (i, j) and (j, i), each with capacity $p_{ij}$ (the separation penalty).

### Minimum cut (A, B)

An s-t cut (A, B) is a partition of our pixels into sets A (foreground) and B (background).

- Edges (s, j) with $j \in B$ contribute $a_j$ capacity to the cut.
- Edges (i, t) with $i \in A$ contribute $b_i$ capacity to the cut.
- Edges (i, j) with $i \in A$ and $j \in B$ contribute $p_{ij}$ capacity to the cut.

If we add these contributions we get:

$$c(A, B) = \sum_{i \in A} b_i + \sum_{j \in B} a_j + \sum_{\substack{(i, j) \in E \\ |A \cap \{i, j\}| = 1}} p_{ij} = q'(A, B)$$

Hence a minimum s-t cut yields a partition that minimizes q'(A, B), and therefore maximizes q(A, B).

## An Application of Maximum Flow: The Baseball Elimination Problem

Mayur Motgi

We are given the following tournament situation, where w(i) is the number of wins of team i, g(i) is the number of games team i still has to play, and g(i, j) is the number of games still to be played between teams i and j:

| Team | Wins w(i) | To play g(i) | g(i, Y) | g(i, H) | g(i, C) | g(i, B) |
|---|---|---|---|---|---|---|
| Yale (Y) | 33 | 8 | - | 1 | 6 | 1 |
| Harvard (H) | 29 | 4 | 1 | - | 0 | 3 |
| Cornell (C) | 28 | 7 | 6 | 0 | - | 1 |
| Brown (B) | 27 | 5 | 1 | 3 | 1 | - |

Note: No ties are allowed. Each win gives one point.

Question: Is Harvard eliminated or not? (A team is eliminated if it cannot finish first or tied for first at the end of the tournament.)

### The Baseball Elimination Problem: Preliminary Analysis

The maximum number of points Harvard can get is W = 29 + 4 = 33 (by winning all its games). Suppose Harvard wins all its remaining games. It will not be eliminated if and only if the remaining games can be played so that:

- Brown has no more than u(B) = W - w(B) = 33 - 27 = 6 wins in the remaining games;
- Cornell has no more than u(C) = W - w(C) = 33 - 28 = 5 wins in the remaining games;
- Yale has no more than u(Y) = W - w(Y) = 33 - 33 = 0 wins in the remaining games.
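The arithmetic of the preliminary analysis is easy to reproduce. In the sketch below, the function name `win_budgets` and the dictionary layout are assumptions made for illustration; the numbers are the wins and games to play from the tournament situation above.

```python
from typing import Dict, Tuple


def win_budgets(team: str, wins: Dict[str, int],
                to_play: Dict[str, int]) -> Tuple[int, Dict[str, int]]:
    """Return W for `team` and u(j) = W - w(j) for every other team j."""
    W = wins[team] + to_play[team]   # best case: team wins all remaining games
    budgets = {j: W - w for j, w in wins.items() if j != team}
    return W, budgets


WINS = {"Y": 33, "H": 29, "C": 28, "B": 27}
TO_PLAY = {"Y": 8, "H": 4, "C": 7, "B": 5}

W, U = win_budgets("H", WINS, TO_PLAY)
```

For Harvard this gives W = 33 and budgets u(Y) = 0, u(C) = 5, u(B) = 6. A negative budget for some team would mean that team already has more wins than W, so the chosen team would be eliminated without any further analysis.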

Let P be the set of all the teams other than Harvard: P = {Y, C, B}. Let Q be the set of all possible pairs of P-teams: Q = {(Y, C), (Y, B), (C, B)}. The total number of games to be played between P-teams is G = 6 + 1 + 1 = 8.

### Solving the Baseball Elimination Problem via Maximum Flow

The baseball elimination problem can be solved by creating and solving a related instance of the maximum flow problem:

- Create a source node O (all the games originate here).
- Create a node for each pair from Q; for each Q-node (i, j), add an arc from O to (i, j); the arc's capacity is the number of games to be played between i and j.
- Create a node for each team from P; for each Q-node (i, j), add arcs from (i, j) to the P-nodes i and j, with cap((i, j) → i) = cap((i, j) → j) = cap(O → (i, j)).
- Create a sink node T (the wins of the teams are recorded here). Add an arc from every P-node j to T; the capacity of the arc is u(j).

[Figure: the flow network. The arcs O → (Y,C), O → (Y,B), O → (C,B) have capacities 6, 1, 1; each pair node is joined to its two team nodes with the same capacity; the arcs Y → T, C → T, B → T have capacities 0, 5, 6.]

Find the maximum flow from O to T in the resulting network. If the maximum flow value equals G (the total number of remaining games among P-teams), then Harvard still has a chance to be number one; otherwise Harvard is eliminated. (That is, if all the games can be played so that teams Y, C, B get no more than u(Y), u(C), u(B) wins, respectively, then Harvard can still be number one.)

For our example, the bold red numbers on the arcs show the optimal flow values. Since the maximum flow value is 7 < 8 = G, Harvard is eliminated.

[Figure: the same network with a maximum flow of value 7 marked on the arcs.]

### Showing the elimination of Harvard using minimum-cut-based arguments

Below is a different way to show the elimination of Harvard. The team nodes on the O-side of the minimum cut are Y and C. These two teams already have 33 + 28 wins and still play 6 games against each other, so their total number of wins will be at least (33 + 28) + 6 = 67. Then the average number of wins is at least 67 / 2 = 33.5. This means that one of Y and C will certainly get at least 34 points.

So Harvard is eliminated with its maximum possible 33 points.

Generally, suppose we have teams 0, 1, ..., n, and let W be the maximum number of points team 0 can reach. If there is a set of teams $R \subseteq \{1, \ldots, n\}$ such that

$$\frac{\sum_{i \in R} w(i) + g(R)}{|R|} > W$$

(where g(R) is the total number of games to be played among the teams in R), then team 0 is eliminated.

The O-side of the minimum cut is {O, (Y,C), Y, C} (the set of the nodes that are reachable from O via augmenting paths). The team nodes on the O-side are Y and C. The number of games to be played between Y and C is 6. But the maximum number of total wins for Y and C that allows Harvard to be number one is 0 + 5 = 5. Thus, Harvard is eliminated.

Claim: If team 0 is eliminated, then R can be taken to be the set of team nodes on the O-side of the minimum cut.

[Figure: the network with the maximum flow and the minimum cut; the O-side of the cut is {O, (Y,C), Y, C}.]
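The whole procedure (build the network, compute a maximum flow, read off the O-side of a minimum cut) can be sketched as below. The slides do not say which maximum flow algorithm is used; this sketch swaps in Edmonds-Karp (shortest augmenting paths) as one standard choice. The function names and the data layout are assumptions, and team names are assumed not to collide with the node names "O" and "T".

```python
from collections import deque
from typing import Dict, Hashable, Optional, Set, Tuple

Capacity = Dict[Hashable, Dict[Hashable, int]]


def max_flow(cap: Capacity, source: Hashable,
             sink: Hashable) -> Tuple[int, Set[Hashable]]:
    """Edmonds-Karp; returns the flow value and the source side of a min cut."""
    # Residual capacities, with a reverse arc of capacity 0 for every arc.
    res: Capacity = {}
    for u, arcs in cap.items():
        res.setdefault(u, {})
        for v, c in arcs.items():
            res.setdefault(v, {})
            res[u][v] = res[u].get(v, 0) + c
            res[v].setdefault(u, 0)
    value = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path: the reached nodes are the source side.
            return value, set(parent)
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push


def baseball_elimination(team: str, wins: Dict[str, int],
                         games: Dict[Tuple[str, str], int]
                         ) -> Tuple[bool, Optional[int], Set[str]]:
    """Return (eliminated, max flow value, team nodes on the O-side)."""
    W = wins[team] + sum(g for pair, g in games.items() if team in pair)
    others = [t for t in wins if t != team]
    ahead = {t for t in others if wins[t] > W}
    if ahead:                        # some u(j) would be negative
        return True, None, ahead
    cap: Capacity = {"O": {}}
    total_games = 0                  # G in the text
    for (i, j), g in games.items():
        if team in (i, j):
            continue
        cap["O"][(i, j)] = g         # O -> pair node
        cap[(i, j)] = {i: g, j: g}   # pair node -> its two teams
        total_games += g
    for t in others:
        cap[t] = {"T": W - wins[t]}  # team -> T with capacity u(t)
    value, source_side = max_flow(cap, "O", "T")
    return value < total_games, value, {t for t in others if t in source_side}


WINS = {"Y": 33, "H": 29, "C": 28, "B": 27}
GAMES = {("Y", "H"): 1, ("Y", "C"): 6, ("Y", "B"): 1,
         ("H", "C"): 0, ("H", "B"): 3, ("C", "B"): 1}
```

Calling `baseball_elimination("H", WINS, GAMES)` returns `(True, 7, {"Y", "C"})`, matching the flow value 7 < 8 and the O-side team nodes Y and C found above. The returned set R can then be plugged into the averaging condition: (33 + 28 + 6) / 2 = 33.5 > 33.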