||Let G be an edge-colored graph. A walk on G is a path in G containing all edges of G. The trace of a path in G is the sequence of the edge-colors seen in the path. The graph inference from a walk is t...he problem of, given a string x of colors, finding an edge-colored graph with the minimum number of edges which realizes a walk with trace x. This thesis is devoted to the study of the graph inference from a walk and its modified versions called the graph inference from partial walks and the walk realizability problem. Here, a partial walk on a graph is a path in the graph. These graph inference problems are closely related to inferring a Markov chain from its output, the identification of a finite state automaton from its input/output behavior, and the problem of constructing an edge-weighted graph from distance data. For the graph inference from a walk, all results so far known are concerned with degree-bounded graphs as follows: (i) Raghavan gave an O(n log n)-time algorithm to solve the graph inference from a walk for graphs of bounded degree two, which are linear chains and cycles, where n is the length of a given string. (ii) He also showed that the following variant of the graph inference from a walk for graphs of bounded degree k is NP-complete for any k geq 3; Given a string x and a positive integer K, to find a graph of bounded degree k with at most K nodes which realizes a walk with trace x. First, we focus our attention on the structure of trees without any degree-bound constraint. We consider the tree inference from a walk that is the problem of finding from a string x the smallest undirected edge-colored tree that realizes a walk with trace x. Our first contribution is an algorithm that solves this problem in time linear in the length of x. This linear-time algorithm is developed by devising a tree rewriting system with the Church-Rosser property.
The NP-completeness result of (ii) on degree-bounded graphs and our linear-time algorithm on trees naturally raise a question whether the problem for trees of bounded degree k is solvable in polynomial time for k geq 3. Our second contribution is a negative answer to this question. We prove that for any k geq 3, the graph inference from a walk for trees of bounded degree k is NP-complete even if the number of colors is restricted to k + 1. The number k + 1 of colors is optimal because when the number of colors is at most k, the problem for trees of bounded degree k is solvable in linear time. Our third contribution is a thorough refinement of the above NP-completeness result on trees of bounded degree k for k geq 3. We consider fairly simple trees of bounded degree three called (1,1)-caterpillars. A (1,1)-caterpillar is a tree obtained from a linear chain l by attaching at most one other appendage edge to each node of the linear chain l. The shape of a (1,1)-caterpillar is very similar to that of its backbone linear chain. By exploiting a technique of embedding a tree into a (1,1)-caterpillar, we prove that the problem is NP-complete even for this class of simple trees. This result is strong enough to convince us that the class of (1,1)-caterpillars is one of the simplest classes of graphs for which the graph inference from a walk is NP-complete. Because of the NP-completeness result on trees of bounded degree k (k geq 3), we further investigate the problem from the view point of how nearly optimal solutions can be produced in polynomial-time. We prove that there is no polynomial-time approximation scheme (PTAS) for the problem unless P=NP by showing that the problem is MAX SNP-hard. Under the same assumption that P \ eq NP, we show that the problem can not be approximated in polynomial time with ratio less than two. In similar ways, we can show that all the above results concerning polynomial-time approximation hold even for (1,1)-caterpillars. From these nonapproximability results, it seems to be still rather hard to approximate the problem on degree-bounded trees. Instead of inferring graphs from a single walk, we consider the problem of inferring graphs from a finite number of partial walks. We ask: Given a finite set S of strings of colors, what is the smallest tree T such that, for each string x in S, the tree T realizes some partial walk with trace x. We call the problem the tree inference from partial walks. In contrast with the case of a single walk, we prove that the tree inference from partial walks turns to be NP-complete. This problem remains NP-complete even if the number of colors is restricted to three. On the other hand, the problem with at most two colors is solvable in linear time. Moreover, it is shown that there is no polynomial-time approximation scheme for the problem unless P=NP. However we yield a polynomial-time approximation algorithm for the problem. The performance of the algorithm is guaranteed in terms of the compression in the trees T constructed, that is, ￥ ￥left ￥|S￥right ￥| - mid T mid where ￥ ￥left ￥|S￥right ￥| is the total length of the strings in S and mid T mid is the number of edges in T. We also consider the problem of inferring a linear chain from partial walks. While the linear chain inference from a single walk, as stated above in (i), is solvable in polynomial time, the problem from partial walks is NP-complete even if the number of colors is three (The problem with at most two colors is solvable in linear time). Although the problem is shown to be unfortunately MAX SNP-hard, we present a polynomial-time approximation algorithm where the approximation performance is analyzed in terms of the length of a linear chain. Our last contribution is a series of results on the walk realizability problem. The walk realizability problem is to decide if, given a graph G in addition to a string x of colors, G realizes a walk with trace x, i.e., if the string x can be produced by walking on all edges of G. We introduce a linear chain decomposition of a graph and show that the size of the linear chain decomposition is a key to the walk realizability. The problem is shown to be in NLOG if a graph is decomposable into a constant number of linear chains. We next show that the problem is solvable in polynomial time for trees T decomposable into O(log m) linear chains, where m is the number of edges of T. However, the problem for the class of trees, which are decomposable into O(m) linear chains, turns NP-complete even for (1,1)-caterpillars, which are simple trees of bounded degree three.続きを見る