# Week 1: 2016/01/11-15

# Monday reading

Read 0.1, which is an overview of the whole course.

# Tuesday class

The central question of this course is: **What is a computer?**

It seems obvious at first, but it's really not. Is a calculator a computer? Is a toaster oven a computer? Am I a computer?

## What does it mean to compute?

Historically, the answer to this question came from a totally different direction. Although people had built calculating machines, like Leibniz's [stepped reckoner] or Babbage's [difference engine], the question was not "what is a computer," but "what does it mean to compute?" For in 1936, everyone knew what a computer was -- a computer was a person who computed things, just as a dishwasher is a person who washes dishes.

[stepped reckoner]: https://en.wikipedia.org/wiki/Stepped_Reckoner
[difference engine]: https://en.wikipedia.org/wiki/Difference_engine

For example, when you do arithmetic or solve a system of linear equations or find a derivative, you're computing -- you're following an algorithm that you learned in school. But some math requires more ingenuity, like writing proofs, and in the early 20th century, mathematicians were starting to wonder whether there was an "effective procedure" (we would say, an algorithm) for writing proofs too. The theory of computing developed as the answer to these questions; computing _machines_ were only somewhat incidental.

## Turing machines

In 1936, Turing published the paper that many consider to be the founding document of computer science. His answer to the question of what is meant by an "effective procedure" was not the first, but it was the most convincing. He imagined a computing person in an idealized scenario:

- He has an infinitely long paper tape, divided into squares.
- He can write one symbol in each square, and the number of possible symbols is finite (e.g., `0` to `9`).
- He can only look at a finite number of squares at a time.
- He can only move a finite distance at a time.
- He has only a finite number of “states of mind.”

He then imagined building machines (which he called a-machines, but which we now call Turing machines) that would do the same thing automatically.
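To make this concrete, here is a minimal sketch of such a machine in Python. It is only an illustration, not the book's formalism: the function name `run_tm` and the rule-table format (state, symbol read) → (new state, symbol to write, head move) are invented for this example.

```python
def run_tm(transitions, accept, tape, state="q0", blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine; True = accept, False = reject."""
    cells = dict(enumerate(tape))      # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == accept:
            return True
        key = (state, cells.get(head, blank))
        if key not in transitions:     # no applicable rule: halt and reject
            return False
        state, write, move = transitions[key]
        cells[head] = write
        head += 1 if move == "R" else -1
    raise RuntimeError("no decision after max_steps; the machine may loop forever")

# Example: accept exactly the strings of 0s whose length is even.
even_zeros = {
    ("q0", "0"): ("q1", "0", "R"),
    ("q1", "0"): ("q0", "0", "R"),
    ("q0", "_"): ("acc", "_", "R"),
}
print(run_tm(even_zeros, "acc", "0000"))  # True
print(run_tm(even_zeros, "acc", "000"))   # False
```

The `max_steps` cap is a concession to a problem we will meet below: in general there is no way to know in advance whether a machine will ever halt.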
Turing machines were just a mathematical construct, a way of formalizing what it means to compute. Nevertheless, the obvious next step was to actually build a computing machine. The University of Pennsylvania's [EDVAC] and Turing's own [Automatic Computing Engine] (ACE) were among the earliest such attempts.

[EDVAC]: https://en.wikipedia.org/wiki/EDVAC
[Automatic Computing Engine]: https://en.wikipedia.org/wiki/Automatic_Computing_Engine

## Three big questions

For the rest of the semester, we will ask and answer three questions:

_What happens if we decrease or increase the machine’s power?_ In Units I and II, we will look at two kinds of machines, finite automata and pushdown automata, that are less powerful than Turing machines but have interesting properties in their own right. In Unit III, we will look at extensions of Turing machines and show that they are no more powerful than Turing machines, leading to the conclusion (called the Church-Turing thesis) that Turing machines should be our definition of “computable.” Then Unit III continues with the question...

_What can and can’t it do?_ One thing you can do with Turing machines is build a Turing machine that simulates any other Turing machine. That means that you only have to physically build that one machine (the hardware); any other machine can then be given to it as a description (the software). What can’t it do? For example, you can’t build a Turing machine that can tell you whether another Turing machine runs in an endless loop.

_What can and can’t it do tractably?_ Our investigation of Turing machines and models equivalent to them will lead to a reasonable definition of what “tractable” means, namely, solvable in polynomial time. In Unit IV, we’ll see that there is a large class of problems, called NP-complete, that are interesting and useful but don’t seem to be tractably solvable. One of the best known is the Traveling Salesman Problem: if you have to visit $n$ cities, what is the shortest way to visit all of them exactly once? I say that these problems don't _seem_ to be tractably solvable, because no one knows whether they are. The amazing thing is that if you were to discover a tractable solution for one of them, then you would have a solution for all of them!

# Wednesday reading

Skim Sections 0.2-0.4, which cover mathematical preliminaries that you should have gotten in Discrete Math. If anything seems unfamiliar to you, study it a little more carefully. The subsection "Strings and Languages" is especially important (and surprisingly short); we'll be focusing on it in class.

# Thursday class

## Everything’s a string, every problem a language

In this course we want to study all the things that a computer can do. That seems very diverse, and it would seem that we have to study many different kinds of computations separately. But maybe we can reduce them all to one kind of problem and focus our study on that one kind. In fact, we claim that any computational problem can be reduced to a problem of the form: does string $w$ belong to language $L$?

People are fairly comfortable these days with the idea that any kind of object we might want to compute about can be represented as a string. We deal every day with messages, music, pictures, movies, books, etc., in digital form, with the awareness that these are just strings of 0’s and 1’s.

Less obvious, perhaps, is that it makes sense to reduce all kinds of computations down to questions about whether strings belong to languages. First of all, our experience with computers today is highly interactive, but user-computer interaction can be thought of as a sequence of inputs and outputs. Second, the output from a computer is much more than just a "yes" or "no" answer. But if we have any function $f$ that takes an input string $x$ and outputs another string, we can think of it as the related function $f'(x,y)$ that takes an input $x$ and a possible output $y$, and returns "yes" iff $f(x) = y$.

Caveat: The function $f'$ might be *much slower* than the original $f$. This will matter when we get to Unit IV, and we'll deal with this issue when the time comes.
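Here is the same reduction as a tiny Python sketch; the names `decision_version` and `double` are invented for illustration.

```python
# Turn a string-to-string function f into the yes/no function f'
# described above: f'(x, y) answers "does f map x to y?"

def decision_version(f):
    def f_prime(x, y):
        return f(x) == y               # "yes" iff f(x) = y
    return f_prime

def double(x):                         # an arbitrary example: f(x) = xx
    return x + x

double_prime = decision_version(double)
print(double_prime("ab", "abab"))      # True, since double("ab") == "abab"
print(double_prime("ab", "ba"))        # False
```

Deciding membership in the language $\{(x, y) \mid f(x) = y\}$ is then the same problem as computing `double_prime`.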
## Strings

An _alphabet_ is a nonempty finite set of _symbols_. We use $\Sigma$ (uppercase sigma) or sometimes $\Gamma$ (uppercase gamma) to stand for an alphabet. We write actual symbols as $\mathtt{a}, \mathtt{b},$ etc., but write variables standing for symbols as $a, b$, etc.

A _string_ is a finite sequence of symbols. Unlike with sets, order matters and duplicates matter. We usually use $w$ for a variable that stands for a string, and $u,v$ or $x,y,z$ when we need more variables. Here's a formal definition of strings:

- If $a$ is a symbol, it is also a string.
- If $v$ and $w$ are strings, then so is their concatenation, written $vw$ or $v \circ w$. (Some authors write $v \cdot w$.) Concatenation is associative: $(uv)w = u(vw) = uvw$.
- We write $\varepsilon$ for the _empty string_: $\varepsilon \circ w = w \circ \varepsilon = w$.

Note that $\varepsilon$ is not a symbol! It is a "meta-symbol" that stands for the empty string. (Also, it is more commonly written $\epsilon$, but we'll try to stick to the book's style.)

**Question.** (This question and all questions below are not for credit; they will be discussed in class.) True or false? Assume $\Sigma = \{\mathtt{a},\mathtt{b}\}$.

\begin{align}
\mathtt{ab} &= \mathtt{ba} \\
\mathtt{aa} &= \mathtt{a} \\
\varepsilon &\in \Sigma \\
\varepsilon \varepsilon &= \varepsilon
\end{align}

**Question.** Write a formal definition of the length of a string. It should be defined inductively, following the structure of the definition of a string.

## Languages

A set of strings is called a _language_. We write $\Sigma^\ast$ for the language of all strings over $\Sigma$. As usual, $\emptyset$ stands for the empty language.

**Question.** True or false?

\begin{align}
\{\mathtt{a}, \mathtt{a}\} &= \{\mathtt{a}\} \\
\{\mathtt{a}, \mathtt{b}\} &= \{\mathtt{b}, \mathtt{a}\} \\
\{\mathtt{a}\} \cup \emptyset &= \{\mathtt{a}\} \\
\{\mathtt{a}\} \cup \{\varepsilon\} &= \{\mathtt{a}\}
\end{align}

**Question.** Is there such a thing (according to the definitions above) as

- an infinite alphabet?
- an infinite string?
- an infinite language of finite strings?
- an infinite language of infinite strings?

Operations on strings often induce analogous operations on languages. For example, if $L$ and $L'$ are languages (writing $w^R$ for the reverse of $w$),

\begin{align}
L^R &= \{ w^R \mid w \in L \} \\
L \circ L' = L L' &= \{ w w' \mid w \in L, w' \in L'\}.
\end{align}

The $\ast$ operator that we saw above, known as the Kleene star, can be applied to any language. Thus $L^\ast$ is defined as the smallest language such that:

- $L \subseteq L^\ast$.
- If $v \in L^\ast$ and $w \in L^\ast$, then $vw \in L^\ast$.
- $\varepsilon \in L^\ast$.

## Language classes

Since every language corresponds to a computational problem (given a string $w$, does $w$ belong to $L$?), and since some problems are harder than others, it follows that some languages are harder to recognize than others. Some languages can be recognized by a Turing machine and some can't. Some can be recognized by a Turing machine in polynomial time ($O(n^k)$ time for some $k$) and some can't. So we can talk about sets of languages, commonly called _language classes_.

There are only a handful of language classes we will learn about this semester. They are:

![Overview of language classes](overview.pdf)

Each of the six boxes is a language class, with the name of the class written above; inside each box are models of computation that accept that language class. Each class is a strict superset of the previous one, _except_ that NP is not known to be a strict superset of P.
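To tie together the string and language operations defined above, here is a small Python sketch that represents a language as a set of strings. All function names are invented for illustration, and since $L^\ast$ is infinite, `star` only enumerates its strings of length at most `max_len`.

```python
from itertools import product

def reverse(L):
    """L^R = { w^R : w in L }"""
    return {w[::-1] for w in L}

def concat(L1, L2):
    """L1 ∘ L2 = { w w' : w in L1, w' in L2 }"""
    return {w1 + w2 for w1, w2 in product(L1, L2)}

def star(L, max_len):
    """The strings of L* having length at most max_len."""
    result = {""}                      # epsilon is always in L*
    frontier = {""}
    while frontier:                    # keep appending strings of L
        frontier = {v + w for v in frontier for w in L
                    if len(v + w) <= max_len} - result
        result |= frontier
    return result

L = {"ab", "c"}
print(reverse(L))                      # {'ba', 'c'}
print(concat(L, {"d"}))                # {'abd', 'cd'}
print(sorted(star(L, 3)))              # ['', 'ab', 'abc', 'c', 'cab', 'cc', 'ccc']
```

In this representation, a language really is a computational problem: testing `w in L` is exactly the membership question "does string $w$ belong to language $L$?"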