Noncontracting grammar

In formal language theory, a grammar is noncontracting (or monotonic) if for all of its production rules α → β (where α and β are strings of nonterminal and terminal symbols), it holds that |α| ≤ |β|, that is, β has at least as many symbols as α. A grammar is essentially noncontracting if there may be one exception, namely a rule S → ε, where S is the start symbol and ε the empty string, and furthermore S never occurs on the right-hand side of any rule.

A context-sensitive grammar is a noncontracting grammar in which all rules are of the form αAβ → αγβ, where A is a nonterminal, and γ is a nonempty string of nonterminal and/or terminal symbols.
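The rule shape αAβ → αγβ can be tested mechanically. The following is a minimal sketch, assuming rules are given as pairs of symbol tuples and that the set of nonterminals is known; the function name `is_context_sensitive_rule` is hypothetical and chosen only for illustration.

```python
def is_context_sensitive_rule(lhs, rhs, nonterminals):
    """Return True if lhs -> rhs can be read as αAβ -> αγβ with γ nonempty.

    lhs and rhs are tuples of symbols; nonterminals is a set of symbols.
    (Representation assumed for this sketch.)
    """
    for i, symbol in enumerate(lhs):
        if symbol not in nonterminals:
            continue  # only a nonterminal can play the role of A
        alpha, beta = lhs[:i], lhs[i + 1:]
        # γ is whatever remains of rhs between the contexts α and β.
        gamma_len = len(rhs) - len(alpha) - len(beta)
        if gamma_len < 1:
            continue  # γ must be nonempty
        if rhs[:len(alpha)] == alpha and rhs[len(rhs) - len(beta):] == beta:
            return True
    return False


N = {"S", "B"}
print(is_context_sensitive_rule(("b", "B"), ("b", "b"), N))  # True: α = b, A = B, γ = b
print(is_context_sensitive_rule(("c", "B"), ("B", "c"), N))  # False: no context is preserved
```

A rule such as cB → Bc is noncontracting but fails this test, because no choice of A leaves the surrounding context unchanged.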

However, some authors use the term context-sensitive grammar to refer to noncontracting grammars in general.[1]

A noncontracting grammar in which |α| < |β| for all rules is called a growing context-sensitive grammar.
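The length conditions above translate directly into checks over a rule list. This is a sketch under assumed conventions: rules are (lhs, rhs) pairs of symbol tuples, the empty tuple stands for ε, and the function names are hypothetical. It also assumes one reading of the definition, namely that the ban on S in right-hand sides applies when the rule S → ε is present.

```python
def is_noncontracting(rules):
    """Every rule α -> β satisfies |α| <= |β|."""
    return all(len(lhs) <= len(rhs) for lhs, rhs in rules)


def is_growing(rules):
    """Every rule α -> β satisfies |α| < |β|."""
    return all(len(lhs) < len(rhs) for lhs, rhs in rules)


def is_essentially_noncontracting(rules, start):
    """Noncontracting, except possibly for a rule S -> ε."""
    epsilon_rule = ((start,), ())
    others = [rule for rule in rules if rule != epsilon_rule]
    if not is_noncontracting(others):
        return False
    if epsilon_rule in rules:
        # Assumed reading: with S -> ε present, S must not occur
        # on the right-hand side of any rule.
        return all(start not in rhs for _, rhs in rules)
    return True


# Hypothetical grammar: S -> ε | aT, T -> aT | a
rules = [(("S",), ()), (("S",), ("a", "T")),
         (("T",), ("a", "T")), (("T",), ("a",))]
print(is_noncontracting(rules))                   # False, because of S -> ε
print(is_essentially_noncontracting(rules, "S"))  # True
```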

History

Chomsky (1959) introduced the Chomsky hierarchy, in which context-sensitive grammars occur as "type 1" grammars; general noncontracting grammars do not occur.[2]

Chomsky (1963) calls a noncontracting grammar a "type 1 grammar", and a context-sensitive grammar a "type 2 grammar", and by presenting a conversion from the former into the latter, proves the two weakly equivalent.[3]

Kuroda (1964) introduced Kuroda normal form, into which all noncontracting grammars can be converted.[4]

Example

S  → abc
S  → aSBc
cB → Bc
bB → bb

This grammar, with the start symbol S, generates the language { aⁿbⁿcⁿ : n ≥ 1 },[5] which is not context-free, as can be shown with the pumping lemma.
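Because no rule of a noncontracting grammar shortens a sentential form, every form longer than a target length can be discarded, so all words up to that length can be found by a finite search. The sketch below illustrates this for the example grammar; it assumes single-character symbols with uppercase nonterminals and lowercase terminals, and the names `RULES` and `generate` are hypothetical.

```python
from collections import deque

# The four rules of the example grammar, as (left-hand side, right-hand side).
RULES = [("S", "abc"), ("S", "aSBc"), ("cB", "Bc"), ("bB", "bb")]


def generate(rules, start, max_len):
    """Return all terminal words of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # no nonterminal left: a word of the language
            words.add(form)
        for lhs, rhs in rules:
            # Try the rule at every position where its left-hand side occurs.
            pos = form.find(lhs)
            while pos != -1:
                new_form = form[:pos] + rhs + form[pos + len(lhs):]
                # Forms never shrink, so overly long ones can be dropped.
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
                pos = form.find(lhs, pos + 1)
    return words


print(sorted(generate(RULES, "S", 9), key=len))  # ['abc', 'aabbcc', 'aaabbbccc']
```

For instance, aabbcc arises from the derivation S ⇒ aSBc ⇒ aabcBc ⇒ aabBcc ⇒ aabbcc.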

A context-sensitive grammar for the same language is shown below.

Expressive power

Every context-sensitive grammar is a noncontracting grammar.

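There are easy procedures for

  • bringing any noncontracting grammar into Kuroda normal form,[6] and
  • converting any noncontracting grammar in Kuroda normal form into a context-sensitive grammar.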

Hence, these three types of grammar are equal in expressive power, all describing exactly the context-sensitive languages that do not include the empty string; the essentially noncontracting grammars describe exactly the set of context-sensitive languages.

A direct conversion

A noncontracting grammar can also be converted directly into a context-sensitive grammar, avoiding Kuroda normal form:

For an arbitrary noncontracting grammar (N, Σ, P, S), construct the context-sensitive grammar (N’, Σ, P’, S) as follows:

  1. For every terminal symbol a ∈ Σ, introduce a new nonterminal symbol [a] ∈ N’, and a new rule ([a] → a) ∈ P’.
  2. In the rules of P, replace every terminal symbol a by its corresponding nonterminal symbol [a]. As a result, all these rules are of the form X1...Xm → Y1...Yn for nonterminals Xi, Yj and m ≤ n.
  3. Replace each rule X1...Xm → Y1...Yn with m > 1 by 2m rules:[note 1]

X1 X2 ... Xm-1 Xm                  →  Z1 X2 ... Xm-1 Xm
Z1 X2 ... Xm-1 Xm                  →  Z1 Z2 ... Xm-1 Xm
:
Z1 Z2 ... Xm-1 Xm                  →  Z1 Z2 ... Zm-1 Xm
Z1 Z2 ... Zm-1 Xm                  →  Z1 Z2 ... Zm-1 Zm Ym+1 ... Yn
Z1 Z2 ... Zm-1 Zm Ym+1 ... Yn      →  Y1 Z2 ... Zm-1 Zm Ym+1 ... Yn
Y1 Z2 ... Zm-1 Zm Ym+1 ... Yn      →  Y1 Y2 ... Zm-1 Zm Ym+1 ... Yn
:
Y1 Y2 ... Zm-1 Zm Ym+1 ... Yn      →  Y1 Y2 ... Ym-1 Zm Ym+1 ... Yn
Y1 Y2 ... Ym-1 Zm Ym+1 ... Yn      →  Y1 Y2 ... Ym-1 Ym Ym+1 ... Yn

where each Zi ∈ N’ is a new nonterminal not occurring elsewhere.[7][8]
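The three steps can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not a reference one: rules are (lhs, rhs) pairs of symbol tuples with |lhs| ≤ |rhs|, the bracketed names [a] and the fresh names Z1, Z2, ... are assumed not to clash with existing symbols, and the function name `to_context_sensitive` is hypothetical.

```python
from itertools import count


def to_context_sensitive(rules, terminals):
    """Convert noncontracting rules into context-sensitive rules."""
    def bracket(symbol):
        # Step 2 helper: replace a terminal a by the nonterminal [a].
        return f"[{symbol}]" if symbol in terminals else symbol

    fresh = count(1)  # numbering for the new nonterminals Z1, Z2, ...
    result = []
    # Step 1: one rule [a] -> a per terminal symbol.
    for a in sorted(terminals):
        result.append(((f"[{a}]",), (a,)))
    for lhs, rhs in rules:
        xs = tuple(bracket(s) for s in lhs)
        ys = tuple(bracket(s) for s in rhs)
        m = len(xs)
        if m == 1:
            result.append((xs, ys))  # already of context-sensitive form
            continue
        # Step 3: replace the rule by 2m rules using fresh Z1..Zm.
        zs = tuple(f"Z{next(fresh)}" for _ in range(m))
        tail = ys[m:]  # Ym+1 ... Yn
        # Rules 1 .. m-1: turn Xi into Zi, left to right.
        for i in range(m - 1):
            result.append((zs[:i] + xs[i:], zs[:i + 1] + xs[i + 1:]))
        # Rule m: turn Xm into Zm Ym+1 ... Yn.
        result.append((zs[:m - 1] + xs[m - 1:], zs + tail))
        # Rules m+1 .. 2m: turn Zi into Yi, left to right.
        for i in range(m):
            result.append((ys[:i] + zs[i:] + tail,
                           ys[:i + 1] + zs[i + 1:] + tail))
    return result


example = [(("S",), ("a", "b", "c")), (("S",), ("a", "S", "B", "c")),
           (("c", "B"), ("B", "c")), (("b", "B"), ("b", "b"))]
for lhs, rhs in to_context_sensitive(example, {"a", "b", "c"}):
    print(" ".join(lhs), "→", " ".join(rhs))
```

Each generated rule rewrites exactly one nonterminal while leaving its neighbours untouched, which is what makes the result context-sensitive; the fresh Zi ensure the intermediate rules cannot be mixed with rules from other conversions.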

For example, the above noncontracting grammar for { aⁿbⁿcⁿ | n ≥ 1 } leads to the following context-sensitive grammar (with start symbol S) for the same language:

[a]     →  a             from step 1
[b]     →  b             from step 1
[c]     →  c             from step 1
S       →  [a] [b] [c]   from step 2, unchanged
S       →  [a] S B [c]   from step 2, unchanged
[c] B   →  B [c]         from step 2, further modified below
[c] B   →  Z1 B          modified from above in step 3
Z1 B    →  Z1 Z2         modified from above in step 3
Z1 Z2   →  B Z2          modified from above in step 3
B Z2    →  B [c]         modified from above in step 3
[b] B   →  [b] [b]       from step 2, further modified below
[b] B   →  Z3 B          modified from above in step 3
Z3 B    →  Z3 Z4         modified from above in step 3
Z3 Z4   →  [b] Z4        modified from above in step 3
[b] Z4  →  [b] [b]       modified from above in step 3

Notes

  1. ^ For convenience, the non-context part of left and right hand side is shown in boldface.

References

  1. ^ Willem J. M. Levelt (2008). An Introduction to the Theory of Formal Languages and Automata. John Benjamins Publishing. pp. 125–126. ISBN 978-90-272-3250-2.
  2. ^ Noam Chomsky (1959). "On certain formal properties of grammars". Information and Control. 2: 137–167. Here: pp. 141–142 for the definitions.
  3. ^ Noam Chomsky (1963). "Formal properties of grammar". In R.D. Luce and R.R. Bush and E. Galanter (ed.). Handbook of Mathematical Psychology. Vol. II. New York: Wiley. pp. 323–418. Here: pp. 360–363 and 367
  4. ^ a b Sige-Yuki Kuroda (June 1964). "Classes of languages and linear-bounded automata". Information and Control. 7 (2): 207–223. doi:10.1016/s0019-9958(64)90120-2.
  5. ^ Mateescu & Salomaa (1997), Example 2.1, p. 188
  6. ^ Mateescu & Salomaa (1997), Theorem 2.2, p. 190
  7. ^ Mateescu & Salomaa (1997), Theorem 2.1, p. 187
  8. ^ John E. Hopcroft, Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley. ISBN 0-201-02988-X. Exercise 9.9, p.230. In the 2003 edition, the chapter on noncontracting / context-sensitive languages has been omitted.
  • Book, R. V. (1973). "On the structure of context-sensitive grammars". International Journal of Computer & Information Sciences. 2 (2): 129–139. doi:10.1007/BF00976059. hdl:2060/19710024701. S2CID 31699138.
  • Mateescu, Alexandru; Salomaa, Arto (1997). "Chapter 4: Aspects of Classical Language Theory". In Rozenberg, Grzegorz; Salomaa, Arto (eds.). Handbook of Formal Languages. Volume I: Word, language, grammar. Springer-Verlag. pp. 175–252. ISBN 3-540-61486-9.