Production (computer science)

A production or production rule in computer science is a rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences. A finite set of productions P is the main component in the specification of a formal grammar (specifically a generative grammar). The other components are a finite set N of nonterminal symbols, a finite set (known as an alphabet) Σ of terminal symbols that is disjoint from N, and a distinguished symbol S ∈ N that is the start symbol.
In an unrestricted grammar, a production is of the form u → v, where u and v are arbitrary strings of terminals and nonterminals; however, u may not be the empty string. If v is the empty string, this is denoted by the symbol ε or λ (rather than leaving the right-hand side blank). So productions are members of the cartesian product
V∗NV∗ × V∗ = (V∗ ∖ Σ∗) × V∗,
where V := N ∪ Σ is the vocabulary, ∗ is the Kleene star operator, V∗NV∗ indicates concatenation, and ∪ denotes set union. If we do not allow the start symbol to occur in v (the word on the right side), we have to replace V∗ by (V ∖ {S})∗ on the right side of the cartesian product symbol.[1]
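The membership condition above can be sketched in Python. This is only an illustration under stated assumptions: symbols are single characters, a string of symbols is an ordinary Python string, and the function name is_valid_production is hypothetical rather than part of any library.

```python
def is_valid_production(u, v, nonterminals, terminals):
    """Check that (u, v) lies in V*NV* x V*, where V is the union of N and Sigma.

    Assumption for this sketch: every symbol is a single character,
    so u and v are plain strings over the vocabulary.
    """
    vocabulary = nonterminals | terminals
    # Both sides may only use symbols from the vocabulary V.
    uses_only_vocabulary = all(s in vocabulary for s in u + v)
    # The left side must contain at least one nonterminal, which also
    # rules out the empty string on the left.
    left_has_nonterminal = any(s in nonterminals for s in u)
    return uses_only_vocabulary and left_has_nonterminal


N = {"S"}
SIGMA = {"a", "b"}

print(is_valid_production("aSb", "ab", N, SIGMA))  # True
print(is_valid_production("ab", "S", N, SIGMA))    # False: no nonterminal on the left
print(is_valid_production("S", "", N, SIGMA))      # True: empty right-hand side
```

The check mirrors the identity V∗NV∗ = V∗ ∖ Σ∗: a string over V contains a nonterminal exactly when it is not made of terminals alone.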
The other types of formal grammar in the Chomsky hierarchy impose additional restrictions on what constitutes a production. Notably in a context-free grammar, the left-hand side of a production must be a single nonterminal symbol. So productions are of the form:
- N → (N ∪ Σ)∗
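The context-free restriction can be tested mechanically. The following Python sketch assumes productions are stored as (left, right) pairs of strings with single-character symbols; the name is_context_free is chosen here for illustration only.

```python
def is_context_free(productions, nonterminals):
    """Return True if every left-hand side is exactly one nonterminal.

    Assumption for this sketch: productions is an iterable of
    (left, right) string pairs whose symbols are single characters.
    """
    return all(
        len(left) == 1 and left[0] in nonterminals
        for left, _right in productions
    )


N = {"S"}

# Both left-hand sides are the single nonterminal S.
print(is_context_free([("S", "aSb"), ("S", "ba")], N))   # True
# The left-hand side "aS" has two symbols, so this rule is not context-free.
print(is_context_free([("aS", "Sa"), ("S", "b")], N))    # False
```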
Grammar generation
To generate a string in the language, one begins with a string consisting of only a single start symbol, and then successively applies the rules (any number of times, in any order) to rewrite this string. This stops when we obtain a string containing only terminals. The language consists of all the strings that can be generated in this manner. Any particular sequence of legal choices taken during this rewriting process yields one particular string in the language. If there are multiple different ways of generating this single string, then the grammar is said to be ambiguous.
For example, assume the alphabet consists of a and b, with the start symbol S, and we have the following rules:
- 1. S → aSb
- 2. S → ba
then we start with S, and can choose a rule to apply to it. If we choose rule 1, we replace S with aSb and obtain the string aSb. If we choose rule 1 again, we replace S with aSb and obtain the string aaSbb. This process is repeated until we only have symbols from the alphabet (i.e., a and b). If we now choose rule 2, we replace S with ba and obtain the string aababb, and are done. We can write this series of choices more briefly, using symbols: S ⇒ aSb ⇒ aaSbb ⇒ aababb. The language of the grammar is the set of all the strings that can be generated using this process: {ba, abab, aababb, aaababbb, …}.
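The derivation above can be replayed with a short Python sketch. It assumes the two example rules, single-character symbols, and rewriting of the leftmost occurrence of the left-hand side; the names RULES, apply_rule, derive, and first_strings are illustrative and not taken from the source.

```python
# The two example rules, keyed by rule number: (left-hand side, right-hand side).
RULES = {1: ("S", "aSb"), 2: ("S", "ba")}


def apply_rule(string, rule_number):
    """Rewrite the leftmost occurrence of the rule's left-hand side."""
    left, right = RULES[rule_number]
    index = string.find(left)
    if index == -1:
        raise ValueError(f"rule {rule_number} does not apply to {string!r}")
    return string[:index] + right + string[index + len(left):]


def derive(choices, start="S"):
    """Return every intermediate string of the derivation, starting from start."""
    steps = [start]
    for rule_number in choices:
        steps.append(apply_rule(steps[-1], rule_number))
    return steps


def first_strings(count):
    """Apply rule 1 n times and then rule 2, for n = 0 .. count - 1."""
    return [derive([1] * n + [2])[-1] for n in range(count)]


print(" ⇒ ".join(derive([1, 1, 2])))  # S ⇒ aSb ⇒ aaSbb ⇒ aababb
print(first_strings(4))               # ['ba', 'abab', 'aababb', 'aaababbb']
```

Because each string in this grammar contains at most one S, the choice of which occurrence to rewrite never arises here; a grammar with several nonterminals per string would also need a position choice at each step.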
See also
- Formal grammar
- Finite automata
- Generative grammar
- L-system
- Rewrite rule
- Backus–Naur form (a compact form for writing the productions of a context-free grammar)
- Phrase structure rule
- Post canonical system (Emil Post's production systems, a model of computation)
References
[1] See Klaus Reinhardt: Prioritätszählerautomaten und die Synchronisation von Halbspursprachen; Fakultät Informatik der Universität Stuttgart; 1994 (in German)