from tock import *
Read Section 2.1, but skip "Chomsky Normal Form."
We're beginning the second unit of the course, in which we look at context-free grammars and pushdown automata. Recall that in the first class we briefly introduced Turing machines, and then introduced finite automata as being like Turing machines but restricted to have only a one-way, read-only tape. Pushdown automata are a little bit less restricted: they have a one-way, read-only tape, but they also have a stack. But the book starts with context-free grammars, probably because they're a little more familiar and/or easier to understand.
Context-free grammars were invented in the late 1950s by Noam Chomsky as a way of describing the syntax of human languages (he called them phrase-structure grammars). Later, they were appropriate by the inventors of the programming language ALGOL-60 as a way to describe that language's syntax (which came to be called Backus-Naur form).
There is some variation in terminology that the book doesn't mention. Most importantly, variables are also called nonterminal symbols or nonterminals. You'll hear me call them nonterminals almost exclusively -- I can't get myself to call them variables.
We can start with a simpler example than the book's. \begin{align*} S &\rightarrow \mathtt{a}~S~\mathtt{b} \\ S &\rightarrow \varepsilon \end{align*} Question. What language does this CFG generate?
And here's a version of Example 2.3 that is rewritten using parentheses: \begin{align*} S &\rightarrow \mathtt{(}~S~\mathtt{)} \\ S &\rightarrow S~S \\ S &\rightarrow \varepsilon \end{align*}
Read Section 2.2, up to but not including "Equivalence with Context-Free Grammars".
Pushdown automata are equipped with, in addition to the input tape, a pushdown store, said to have been inspired by the tray dispenser in a university dining hall:
Nowadays, a pushdown store is more commonly known as a stack, which you should be quite familiar with. Here's an example (2.14):
m1 = read_csv("pda-m1.csv")
to_graph(m1)
to_table(m1)
ε,ε | ε,$ | 0,ε | 1,0 | |
---|---|---|---|---|
>q1 | q2,$ | |||
q2 | q2,0 | q3,ε | ||
q3 | q4,ε | q3,ε | ||
@q4 |
run(m1, "0 0 0 1 1 1")
The run graph shows both the state and the stack at each time step. The square brackets just indicate the top of the stack. Note that when the stack gets deeper, ellipses (…) are used to keep the node labels compact.
This happens to be a deterministic PDA: at every time step, there's only one possibility for the next step.
More examples from the book:
m2 = read_csv("pda-m2.csv")
m2
Question. Does the above PDA accept the strings $\mathtt{aabb}$? $\mathtt{aabc}$? $\mathtt{aacc}$?
m3 = read_csv("pda-m3.csv")
m3
run(m3, "0 1 0 0 1 0")
Question. Design a PDA that recognizes the language of matching left and right parentheses (like Example 2.3).
Question. Design a PDA that recognizes the language over $\Sigma = \{\mathtt{a}, \mathtt{u}, \mathtt{c}, \mathtt{g}\}$ such that every symbol is paired with exactly one other symbol -- $\mathtt{a}$ with $\mathtt{u}$ and $\mathtt{c}$ with $\mathtt{g}$, and the pairings are nested like parentheses in the previous question.
Question. Do you think a queue automaton would be more or less powerful than a pushdown (stack) automaton?