Notebook

19-23¶

In [1]:

from tock import *

Monday reading¶

Read Section 3.1 and 3.3.

Tuesday class¶

Turing machines, finally¶

By now, the definition of Turing machines (TMs) should be easy to understand. But pay special attention to the distinction between Turing-recognizable (a.k.a. "recursively enumerable" or "computably enumerable") and (Turing-)decidable (a.k.a. "recursive" or "computable"). When a TM runs on an input string, there are three possible outcomes: halt and accept, halt and reject, and loop (which is treated as a reject).

We distinguish between two kinds of TMs: TMs and decider TMs. A TM can accept, reject, or loop on a string, whereas a decider TM must either accept or reject, but never loop. Then, a language $L$ is Turing-recognizable iff there is a TM that accepts all and only the strings in $L$. And $L$ is (Turing-)decidable iff there is a decider TM that accepts all and only the strings in $L$.

This is a distinction we haven't worried about in the past. Why? Deterministic FAs and PDAs always halt, so there's no distinction between accepting and deciding. Nondeterministic FAs and PDAs can have $\varepsilon$ loops, but it's easy to change such machines into an equivalent ones that do not loop. So the distinction is unimportant. With Turing machines (which are deterministic unless stated otherwise), the distinction is extremely important.

There are many variations out there in the definition of TM. Be careful if you refer to other sources. We're going to allow one small extension to Sipser's definition. In Sipser's definition, $\delta(q, a) = (r, b, d)$, where $d \in \{\text{L}, \text{R}\}$. Following Turing's original definition, we'll also allow $d = \text{N}$, which means "no move".

Examples: formal descriptions¶

Here are the book's two examples of formal descriptions. Tock's notation is slightly different from Sipser's: it uses a caret (^) to specify the movement of the head. So $a \rightarrow \hat{} b$ means "read $a$, write $b$, and move left," while $a \rightarrow b \hat{}$ means "read $a$, write $b$, and move right." And $a \rightarrow b$ means "read $a$, write $b$, and don't move."

In [2]:

m2 = read_csv("tm-m2.csv")
m2

In [3]:

run(m2, "0 0 0 0")

Since the machine is deterministic, the run is easier to view as a list of configurations:

In [4]:

def print_run(*args): print "\n".join(map(str, run(*args).only_path()))

print_run(m2, "0 0 0 0")

q1,[0] 0 0 0
q2,_ [0] 0 0
q3,_ x [0] 0
q4,_ x 0 [0]
q3,_ x 0 x ^
q5,_ x 0 [x]
q5,_ x [0] x
q5,_ [x] 0 x
q5,[_] x 0 x
q2,_ [x] 0 x
q2,_ x [0] x
q3,_ x x [x]
q3,_ x x x ^
q5,_ x x [x]
q5,_ x [x] x
q5,_ [x] x x
q5,[_] x x x
q2,_ [x] x x
q2,_ x [x] x
q2,_ x x [x]
q2,_ x x x ^
accept,_ x x x _ ^

In [5]:

print_run(m2, "0 0 0")

q1,[0] 0 0
q2,_ [0] 0
q3,_ x [0]
q4,_ x 0 ^
reject,_ x 0 _ ^

In [6]:

m1 = read_csv("tm-m1.csv")
m1

Note that in this example, the reject state has been omitted. If at any point in the run, there is no transition defined for the current state and symbol, the machine rejects. You can do this when you write formal descriptions of TMs also.

In [7]:

print_run(m1, "0 1 0 # 0 1 0")

q1,[0] 1 0 # 0 1 0
q2,x [1] 0 # 0 1 0
q2,x 1 [0] # 0 1 0
q2,x 1 0 [#] 0 1 0
q4,x 1 0 # [0] 1 0
q6,x 1 0 [#] x 1 0
q7,x 1 [0] # x 1 0
q7,x [1] 0 # x 1 0
q7,[x] 1 0 # x 1 0
q1,x [1] 0 # x 1 0
q3,x x [0] # x 1 0
q3,x x 0 [#] x 1 0
q5,x x 0 # [x] 1 0
q5,x x 0 # x [1] 0
q6,x x 0 # [x] x 0
q6,x x 0 [#] x x 0
q7,x x [0] # x x 0
q7,x [x] 0 # x x 0
q1,x x [0] # x x 0
q2,x x x [#] x x 0
q4,x x x # [x] x 0
q4,x x x # x [x] 0
q4,x x x # x x [0]
q6,x x x # x [x] x
q6,x x x # [x] x x
q6,x x x [#] x x x
q7,x x [x] # x x x
q1,x x x [#] x x x
q8,x x x # [x] x x
q8,x x x # x [x] x
q8,x x x # x x [x]
q8,x x x # x x x ^
accept,x x x # x x x _ ^

In [8]:

print_run(m1, "0 1 0 # 0 1 1")

q1,[0] 1 0 # 0 1 1
q2,x [1] 0 # 0 1 1
q2,x 1 [0] # 0 1 1
q2,x 1 0 [#] 0 1 1
q4,x 1 0 # [0] 1 1
q6,x 1 0 [#] x 1 1
q7,x 1 [0] # x 1 1
q7,x [1] 0 # x 1 1
q7,[x] 1 0 # x 1 1
q1,x [1] 0 # x 1 1
q3,x x [0] # x 1 1
q3,x x 0 [#] x 1 1
q5,x x 0 # [x] 1 1
q5,x x 0 # x [1] 1
q6,x x 0 # [x] x 1
q6,x x 0 [#] x x 1
q7,x x [0] # x x 1
q7,x [x] 0 # x x 1
q1,x x [0] # x x 1
q2,x x x [#] x x 1
q4,x x x # [x] x 1
q4,x x x # x [x] 1
q4,x x x # x x [1]

Writing Turing machines¶

Writing Turing machines is extremely tedious -- much more so than assembly language. Sipser quickly graduates you from formal descriptions of machines to implementation-level descriptions. These are like pseudocode, maybe even a little higher-level than pseudocode. Implementation-level descriptions still speak in terms of the head moving on the tape and reading and writing symbols on the tape. They shouldn't, for example, make use of variables or arithmetic. See the examples in the book to get an idea of what is allowed and what isn't.

The Church-Turing thesis¶

Recall that Turing's original paper (1936) proposed Turing machines as a model of what humans do when they compute. He imagined a computing person in an idealized scenario:

He has an infinitely long paper tape, divided into squares.
He can write one symbol in each square, and the number of possible symbols is finite (e.g., 0 to 9).
He can only look at a finite number of squares at a time.
He can only move a finite distance at a time.
He has only a finite number of “states of mind.”

Basically this is an appeal to intuition that when people compute, this is what they do. Furthermore, he had to argue (quite briefly) that Turing machines can perform any such computation. This came to be known as the Church-Turing thesis (CTT).

One brief statement of the thesis: "[Turing] machines can do anything that could be described as 'rule of thumb' or 'purely mechanical'." Elsewhere, he characterized these problems as "those problems which can be solved by human clerical labour, working to fixed rules, and without understanding."

In addition to Turing's original justification, another justification for the CTT, the one more commonly cited today, is that many proposals have been made for models of computability, and for the most part they have all turned out to be equivalent to Turing machines. (If time permits, we'll discuss some possible exceptions later.)

Postscript¶

Some people have built actual Turing machines which are really fun to watch.

Wednesday reading¶

Read Section 3.2.

Thursday class¶

Question. Name every possible thing you could imagine adding to a Turing machine to make it more powerful.

First of all, there are some extensions in Turing's image of a human computer:

He is allowed to look at multiple (but finitely many) squares at a time.
He is allowed to move multiple (but finitely many) squares to the left or right.
Furthermore, Turing also mentions the possibility of a two-dimensional grid.

It is easy to convert a TM which can look at multiple squares or move multiple squares into a standard TM by adding more states. A two dimensional grid can be simulated in a manner similar to multiple tapes (see below).

The book discusses two bigger extensions which are used later: multiple tapes, and nondeterminism. Multiple tapes are described clearly enough in the book; a few comments here on nondeterminism.

Nondeterminism¶

We can make Turing machines nondeterministic, similarly to how we made finite automata nondeterministic. Remarkably, NTMs are no more powerful than TMs. But if it were possible to build a real NTM, it would be useful for all kinds of very very practical things, because it would be (we think) so much faster than a deterministic TM. That speed difference is the focus of Unit Four.

When defining NTMs, we must pay attention to the distinction between recognizing and deciding. A NTM accepts a string iff at least one branch of the computation accepts it. The other branches can reject or loop. A NTM is a decider iff all of its branches halt (not loop).

So a NTM recognizes a language $L$ iff for each string $w \in L$, there is at least one branch that accepts $w$, and the other branches either reject or loop on $w$; and for each string $w \notin L$, all branches either reject or loop on $w$. And a decider NTM decides a language $L$ iff for each string $w \in L$, there is at least one branch that accepts $w$, and the other branches reject $w$; and for each string $w \notin L$, all branches reject $w$.

The proof of Theorem 3.16 treats the computation as a directed tree. Each node is a configuration, and each edge is a step of the computation. The simulating DTM does a breadth-first search (BFS) on this tree. Each node in the tree is given an address (known as a Gorn address), which is stored on tape 3. The proof is a little bit unclear on one point: Stage 4 just says "Replace the string on tape 3 with the next string in the string ordering," but doesn't explain what the ordering is -- it's the order defined on page 14, which causes the addresses to be listed in breadth-first order.

Alternatives¶

The end of Section 3.2 alludes to other models of computation that are equivalent to Turing machines. The most important ones are probably:

Accidentally Turing Complete is a collection of other weird things that are equivalent to TMs.