Word equation
A word equation is a formal equality $E := u \doteq v$ between a pair of words $u$ and $v$, each over an alphabet $\Sigma \cup \Xi$ comprising both constants (cf. $\Sigma$) and unknowns (cf. $\Xi$).[1] An assignment $h$ of constant words to the unknowns of $E$ is said to solve $E$ if it maps both sides of $E$ to identical words. In other words, the solutions of $E$ are those morphisms $h : (\Sigma \cup \Xi)^* \to \Sigma^*$ whose restriction to $\Sigma$ is the identity map, and which satisfy $h(u) = h(v)$. Word equations are a central object in combinatorics on words; they play a role in this area analogous to that of Diophantine equations in number theory. One stark difference is that Diophantine equations have an undecidable solubility problem,[2] whereas the analogous problem for word equations is decidable.[3]
A classical example of a word equation is the commutation equation $xw \doteq wx$, in which $x$ is an unknown and $w$ is a constant word. It is well known[4] that the solutions of the commutation equation are exactly those morphisms $h$ mapping $x$ to some power of the primitive root of $w$. Another example is the conjugacy equation[5] $xz \doteq zy$, in which $x$, $y$ and $z$ are all unknowns. The solutions of this equation are precisely those morphisms sending $x$ and $y$ to conjugate words, with the image of $z$ being filled in as appropriate. Many subclasses of word equations have been introduced, some of which include:
History
The study of word equations was initiated by Willard Quine as early as 1946. Quine proved[8] that the first-order theory of word equations is essentially equivalent to the first-order theory of arithmetic. In 1954, Andrey Markov coined the term "word equation",[9] and introduced[10] the solubility problem for them: decide whether a given word equation admits a solution. For a long time, it was hoped that this problem was undecidable. One reason for this is that it was expected[11] (incorrectly, as it turned out) that word equations might provide an intermediary step between Hilbert's Tenth Problem and the undecidable problems relating to Turing machines. Further contributions were made in the early 1970s with the work of André Lentin and Juri Ilich Hmelevskii.[12]
In 1976, Gennady Makanin introduced[3] a method by which it could be determined whether any given word equation admitted a solution. The existence of this procedure, which has come to be known as Makanin's algorithm, is very difficult to prove, and it is one of the most celebrated results in combinatorics on words.[11] Makanin's algorithm is considered to be one of the most conceptually difficult algorithms in the literature,[6] and it is also highly intractable, requiring (in its initial formulation) triply exponential time.[13] Thus, there were many attempts to improve upon it.[13]
In 1999, Wojciech Plandowski introduced a novel algorithm, showing[14] that the solubility problem for word equations is in PSPACE. In 2006, Plandowski and Wojciech Rytter showed[15] that minimal solutions of word equations are highly (i.e., exponentially) compressible using Lempel–Ziv encoding. It is conjectured[15] that the length of a minimal solution of a word equation is (at most) singly exponential in the length of the equation.
If this conjecture is true, then Plandowski and Rytter's result yields a straightforward "guess-and-verify" NP algorithm for the solubility problem: they show that a solution can be verified whilst working only with its LZ-compressed representation, and the conjecture being true would imply that this representation has size polynomial in the length of the equation. As it stands, the last part of the complexity analysis, namely the question of whether solving word equations is NP-complete, remains open. (NP-hardness follows immediately from the fact that solving word equations generalises the NP-complete problem of pattern matching.[15])
Methods of solution
There is no "elementary" algorithm for determining whether a given word equation admits a solution.[16] The algorithms mentioned above are all of theoretical interest, but they are of little help in solving a word equation by hand, for instance. There exist, however, a few methods that can sometimes help with this:
Length arguments
Because a solution to a word equation must unify its two sides, one can use the multiset of symbols occurring on either side of the equation to deduce a linear equality in the lengths of the images of the unknowns. For instance, the form of implies that its solutions must satisfy , which narrows down the set of possible to check. Similar arguments can allow for a word equation to be "split up" into smaller ones if one can deduce positions within its two sides that must line up in all of its solutions. For instance, the midpoints of each side of can be detected via a length argument, and hence that word equation can be split into the system .
Appeal to periodicity
Another useful tool for reasoning about word equations is the Periodicity Lemma of Fine and Wilf,[1] which describes what happens if a certain word has multiple periods (i.e., distances at which its letters repeat). Consider, for instance, the word equation .[16] Suppose that is one of its solutions. Then .
By taking a suitable conjugacy in this identity, one can infer that there exists some conjugate of which is such that . Now a length argument permits the midpoint of each side to be identified here, and it follows from this observation that . Herein is the commutation equation, whence and are powers of a common word. Now the infinite words and have a common prefix of length . Since , the Periodicity Lemma can be applied. Its conclusion here is that and are powers of a common word too. Thus, every solution of maps and to powers of a common word.
Nielsen transformations
Let $E := xu \doteq yv$ be a word equation, such that $x, y \in \Xi$ are distinct unknowns, and $u, v \in (\Sigma \cup \Xi)^*$. What follows is a conceptually simple method (called the Nielsen transformations algorithm, or Levi's method[17]) to determine whether $E$ is soluble, with the caveat that the method is guaranteed to terminate only on quadratic word equations (those in which each unknown occurs at most twice).[5] The idea of the algorithm is to "guess" how the lengths of $h(x)$ and $h(y)$ compare in some solution $h$ of $E$. Either $|h(x)| > |h(y)|$, $|h(x)| < |h(y)|$, or $|h(x)| = |h(y)|$. In the first case, one can apply the string-rewriting rule $x \to yx$ to $E$, where (after the rewriting) $x$ is a new quantity whose meaning is "what's left of the old $x$ once $y$ is removed". Symmetrically, in the second case, one can apply the rule $y \to xy$, and in the third case $x \to y$. The present method actually makes all three guesses; for each of them (separately) it rewrites $E$ to account for the guess having been made. By construction, there will be some cancellation at the start of $E$ after applying each string-rewriting rule. (For instance, applying to the equation yields , which cancels down to ). The method always takes advantage of this cancellation; the hope is that it is enough to counteract the string-rewriting rule, which (in general) will have made the equation longer. The algorithm thus amounts to exhaustively applying these transformations. It is natural to view the workings of the algorithm as the construction of a graph,[5] whose nodes are the reached equations, and whose edges are the transformations between them.
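The exhaustive rewriting described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: unknowns are written as uppercase letters and constants as lowercase letters; in addition to the three length-comparison rules, the sketch adds the analogous rule for an unknown facing a constant, and a rule that erases an unknown (solutions in a free monoid may send unknowns to the empty word); the names `is_soluble` and `max_nodes` are hypothetical. The search succeeds as soon as cancellation leaves both sides empty.

```python
from collections import deque


def is_unknown(sym):
    # Assumption of this sketch: uppercase = unknown, lowercase = constant.
    return sym.isupper()


def substitute(word, var, replacement):
    """Replace every occurrence of `var` in `word` by the symbols in `replacement`."""
    out = []
    for sym in word:
        if sym == var:
            out.extend(replacement)
        else:
            out.append(sym)
    return tuple(out)


def cancel(lhs, rhs):
    """Strip the longest common prefix from both sides."""
    i = 0
    while i < len(lhs) and i < len(rhs) and lhs[i] == rhs[i]:
        i += 1
    return lhs[i:], rhs[i:]


def successors(lhs, rhs):
    """Yield every equation reachable by one rewriting rule followed by cancellation."""
    rules = []
    if lhs and rhs:
        a, b = lhs[0], rhs[0]  # a != b, because the equation is already cancelled
        if is_unknown(a):
            rules.append((a, (b, a)))  # a -> b a: the image of a starts with b
            rules.append((a, ()))      # a -> empty word (added assumption)
        if is_unknown(b):
            rules.append((b, (a, b)))  # b -> a b: the symmetric guess
            rules.append((b, ()))      # b -> empty word (added assumption)
        if is_unknown(a) and is_unknown(b):
            rules.append((a, (b,)))    # a -> b: both images have equal length
    else:
        side = lhs or rhs
        if side and is_unknown(side[0]):
            rules.append((side[0], ()))  # only erasing can help against an empty side
    for var, repl in rules:
        yield cancel(substitute(lhs, var, repl), substitute(rhs, var, repl))


def is_soluble(lhs, rhs, max_nodes=10_000):
    """Breadth-first search of the graph of reached equations.

    Returns True or False when the graph is exhausted, or None when the
    (hypothetical) node budget runs out, which can happen on equations
    that are not quadratic.
    """
    start = cancel(tuple(lhs), tuple(rhs))
    seen = {start}
    queue = deque([start])
    while queue:
        eq = queue.popleft()
        if eq == ((), ()):
            return True
        for nxt in successors(*eq):
            if nxt not in seen:
                if len(seen) >= max_nodes:
                    return None
                seen.add(nxt)
                queue.append(nxt)
    return False


print(is_soluble("Xa", "aX"))  # commutation with the constant word "a"
print(is_soluble("Xa", "bX"))  # no solution: the two sides count letters differently
print(is_soluble("XZ", "ZY"))  # conjugacy equation
```

On quadratic equations each rule adds at most two symbols and the cancellation removes at least two, so the equations never grow and the set of reachable nodes is finite; the `max_nodes` cut-off is only a safeguard for other inputs.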
If the trivial word equation $\varepsilon \doteq \varepsilon$ (where $\varepsilon$ is the empty word) is ever encountered during this construction, then the original equation is surely soluble. Conversely, if the original equation is soluble, then the trivial equation must appear in the graph. So, by this method (assuming the graph is finite) it can be determined whether the original equation admits a solution.
Systems of word equations
One can define systems of word equations in the natural way.[4] A solution of such a system is a morphism that simultaneously solves every equation in the system. A natural extension is to consider Boolean formulas of word equations,[4] in which negation and disjunction are also allowed. In fact, every system (and even every Boolean formula) of word equations is equivalent to a single word equation.[4] Thus, many results on word equations generalise immediately to such systems (resp. formulas). It must be said, however, that the transformation into a single word equation can introduce extra unknowns, and this is sometimes by necessity.
Two word equations (or systems thereof) are called equivalent if they have the same set of solutions. A system of word equations is called independent if it is not equivalent to any of its proper subsystems.[16] Put another way, an independent system of word equations is one in which every equation can be solved "independently", i.e., without solving any of the other equations. An interesting compactness theorem, usually bearing the name of Andrzej Ehrenfeucht, states that an infinite system of word equations with a finite number of unknowns is necessarily equivalent to one of its finite subsystems.[18] It follows that any independent system of word equations with a finite number of unknowns is itself finite.
Expressing formal languages and relations
Word equations can be used to characterise properties of (tuples of) words. For instance, a word ends in if and only if it is the image in some solution of the word equation . Similarly, two words commute if and only if they are the images $h(x)$ and $h(y)$ in some solution $h$ of the word equation $xy \doteq yx$.
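As a small illustration of what it means for words to arise as images in a solution, the following Python sketch checks a candidate assignment against an equation. It assumes, purely for this example, that unknowns are uppercase letters, that an assignment is a dictionary from unknowns to constant words, and that the function names are hypothetical.

```python
def apply_morphism(word, assignment):
    """Replace each unknown by its assigned word; constants map to themselves."""
    return "".join(assignment.get(sym, sym) for sym in word)


def solves(lhs, rhs, assignment):
    """True if the assignment maps both sides of the equation to the same word."""
    return apply_morphism(lhs, assignment) == apply_morphism(rhs, assignment)


def primitive_root(word):
    """Shortest word r such that `word` is a power of r (expects a nonempty word)."""
    n = len(word)
    for d in range(1, n + 1):
        if n % d == 0 and word[:d] * (n // d) == word:
            return word[:d]


# Two commuting words solve XY = YX and share a primitive root.
print(solves("XY", "YX", {"X": "abab", "Y": "ab"}))
print(primitive_root("abab"), primitive_root("ab"))

# Two non-commuting words do not solve it.
print(solves("XY", "YX", {"X": "ab", "Y": "ba"}))
```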
In this sense, word equations can be thought of as mechanisms for expressing formal languages,[19] in analogy with automata and formal grammars. It is not known exactly which properties of (tuples of) words are expressible via word equations in this way. In particular, showing that a relation is inexpressible by word equations is often quite challenging.[16] (An example of an inexpressible property is that of a word being primitive.[19]) It should also be noted that even characterising the solution set of a single word equation is complicated. Hmelevskii[9] proved that, although the solutions to three-unknown constant-free equations can be given in terms of finite expressions with word and integer parameters, this is not true (in general) for four-unknown constant-free equations. In fact, is an example of such a "non-parametrisable" word equation.
Extended theories and connections to string solving
One can augment word equations with other types of constraints on the values assigned to the unknowns. For instance, in 1968, Yuri Matiyasevich considered[20] an extension of word equations by "length constraints" as a possible tool for showing the unsolvability of Hilbert's tenth problem. These length constraints amounted to linear inequalities in the lengths of the values assigned to the unknowns. Sometimes, allowing extra constraints (alongside word equations) leads to theories with undecidable solubility problems, but it is also possible to add less powerful constraints and end up with a theory that is still decidable.
An example of the former type of constraint is requiring that the values of some unknowns should be Abelian equivalent (i.e., anagrams of one another);[21] an example of the latter type is requiring that the value of some unknown should belong to a given regular language.[22] For Matiyasevich's extension with length constraints, the decidability of the solubility problem is still open.[17]
There has been recent interest in the theory of word equations (and more general theories based on it) from the practical point of view of those developing software verification tools called string solvers.[23] These tools, which are increasingly popular,[24] seek to solve constraint satisfaction problems about strings algorithmically. Such problems take the form of a set of constraints, which an unknown set of strings must satisfy. The string solver should then determine whether strings exist which satisfy all the given constraints. A typical goal of such a tool would be to guarantee that a particular piece of software is free from some string-related vulnerability, such as cross-site scripting or code injection.[25] The building blocks of the constraints used in these tools are the standard questions one might ask of strings, such as "is one string a substring of another?", "what is the length of a given string?", and "what is the index of one string in another?".[24] Ostensibly, these constraints can be modelled by theories based on word equations, and as such, string solver tools must be capable of dealing with these theories algorithmically (at least in the subcase of those equations and formulas that actually arise in practice).
Relation to the defect effect
The defect theorem is a central result in combinatorics on words.[4][5] It says that, if a set of words satisfies a nontrivial relation, then the words of can be (simultaneously) expressed as powers of words, where . Such a set is then said to possess a "defect effect" of order .
Systems of word equations (at least "nontrivial" ones) express the fact that a certain finite set of words (namely the images of the unknowns) satisfies some nontrivial relation(s). So it can be said that systems of word equations cause a defect effect in the sets of words coming from their solutions.[16] The defect effect caused by certain systems of word equations has been studied,[26] and there exist some surprising results to this end showing that the "dimensionality properties" of sets of words are actually quite weak. For instance, it is known that there exists an independent system of equations of size , containing unknowns, which is such that causes only a defect effect of order .[16]
Role within abstract algebra
There has been much research into formulating and solving equations within different structures of abstract algebra (e.g., groups and semigroups).[27][28] Word equations, as presented here, are simply equations in free monoids. Equations in free semigroups are closely related to these; in fact, they are just word equations with the additional requirement that the solution morphism is nonerasing. One can also consider equations in free groups, although the theory of such objects differs in many ways from the discussion presented here. Another result of Makanin's[29] states that the solubility problem for equations in free groups is again decidable.