Wednesday, May 30, 2012

Limitations

Computers can be frustrating sometimes. Staring at a spinning hourglass (or a spinning beachball if you are a Mac user) gets annoying very quickly. The problem is, after waiting patiently for a few seconds, you do not know what you are supposed to do. Is that spinning hourglass ever going to stop or will it keep spinning forever? Sometimes it starts spinning when you ask your computer to do a simple task, like open a file, and you start wondering why such a simple task is taking so long. Eventually you give up and start canceling everything you asked your computer to do. Sometimes canceling the tasks stops the spinning hourglass. But sometimes it does not. Then you start randomly hitting keys on your keyboard, and yelling at your computer. And finally, you give up and reboot the computer by turning its power off. Not a very pleasant experience.

You know what would be great? If there were a way of guaranteeing that when you run a program on your computer, the program eventually terminates. Then you would know that, when the spinning hourglass appears, it will eventually disappear and the program will finish the task you asked it to do. But how can you guarantee such a thing? If you rely on software developers manually checking the programs, then you are likely to have human errors. The best option would be to develop a program-checking-program. This program-checking-program would take a program as input and automatically determine if the input program eventually terminates or not. There would be no room for human error.

It turns out it is not possible to write such a program-checking-program. And, in fact, there is a proof that shows that this is a problem that can never be solved using a computer. The speed of the computer's microprocessor or the size of its memory does not change this result. It is a fundamental limitation of computers!

How can one prove such a result? The basic idea is the following: assuming that there is a program-checking-program leads to a contradiction. Since the existence of a program-checking-program leads to a contradiction, we conclude that it cannot exist. It is not too difficult to explain how the contradiction arises. Let's assume that there is a program-checking-program. Now, let's write a new program using the program-checking-program. The new program first asks the program-checking-program the following question about itself: "Do I terminate?". After the new program gets the answer, it does the following: if the answer was yes, then it does not terminate (just keeps spinning the hourglass forever); if the answer was no, then it immediately terminates. The point is, this new program we construct just does the opposite of whatever the program-checking-program says it is supposed to do. This is a contradiction because we were assuming that the program-checking-program can figure out whether any given program terminates or not. Hence, no such program can exist.
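
If you would like to see the trick spelled out, here is a small sketch in Python. The function terminates below stands in for the assumed program-checking-program; it cannot actually be written (that is the whole point of the proof), so it is only a placeholder for the assumption.

```python
def terminates(program):
    """Hypothetical program-checking-program: returns True if program(program)
    eventually halts, and False otherwise.  The proof shows it cannot be
    written, so this stub is only a placeholder for the assumption."""
    raise NotImplementedError("no such checker can exist")

def contrary(program):
    # Do the opposite of whatever the checker predicts about us.
    if terminates(program):
        while True:      # predicted to halt -> keep spinning the hourglass forever
            pass
    else:
        return           # predicted to run forever -> halt immediately

# Feeding contrary to itself exposes the contradiction:
# if terminates(contrary) says True, then contrary(contrary) loops forever;
# if it says False, then contrary(contrary) halts right away.
# Either answer is wrong, so the assumed checker cannot exist.
```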

Based on this result we know that there are problems that we cannot solve with computers. We call such problems "uncomputable" problems. Computers do many wonderful things but they have their limits; they cannot solve uncomputable problems. The problem I described above is called the "halting problem," the problem of figuring out if a program terminates (halts) or not. Once computer scientists figured out that the halting problem is uncomputable, they went on to identify many other problems that are uncomputable. Assume that we are working on a problem and we show that, given a program that solves that problem, one can solve the halting problem. Then we can conclude that this new problem is also uncomputable, since we know that the halting problem is uncomputable. So there is no point in trying to solve that problem using a computer. For example, in general it is not possible to write a program that determines if a mathematical theorem is provable or not, because it is possible to show that if there was such a program then we could use that program to solve the halting problem. So, we know that we cannot replace mathematicians with computers.

In addition to uncomputable problems that can never be solved with computers, there are also some problems that are solvable using computers but we do not know how to solve them efficiently. The programs we write to solve these problems are able to find the solution eventually but they can take a very long time. One of the most famous problems that we do not know how to solve efficiently is called the traveling salesman problem. The problem is simple to state: There is a traveling salesman who has to visit a set of cities and he wants to minimize the length of his trip. The input to the problem is a list of cities and the distances between them. And as output we want a route that covers all the cities and has the minimum length. The most efficient algorithm we have for this problem is not very efficient. Every time we add a city to the list, the time it takes to compute the minimal path can double. For example, if this algorithm computes the shortest path for 100 cities in 1 second, for 101 cities it could take 2 seconds, for 102 cities it could take 4 seconds, for 105 cities it could take half a minute, for 110 cities it could take about seventeen minutes, for 120 cities it could take about twelve days, for 130 cities it could take about 34 years, and for 160 cities it could take more time than the time that has passed from the big bang to now. We call this type of increase exponential: the time it takes to solve the problem doubles when we increase the size of the problem just by one. For some problems, like the traveling salesman problem, the best solution we have is an exponential-time solution. These types of problems are called intractable problems, since it can be infeasible to solve them for large problem instances.
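
To make the problem concrete, here is a small Python sketch of the most naive approach: try every possible route and keep the shortest. The four cities and their distances are made up, and this brute-force approach grows even faster than the doubling described above (roughly like the factorial of the number of cities), so it only shows what the problem looks like, not how the best known algorithms work.

```python
from itertools import permutations

def shortest_tour(cities, dist):
    # Try every ordering of the cities and keep the shortest round trip.
    start, *rest = cities
    best_route, best_length = None, float("inf")
    for order in permutations(rest):
        route = (start, *order, start)                         # a round trip
        length = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length

# Made-up example: four cities with symmetric distances.
dist = {
    "A": {"A": 0,  "B": 10, "C": 15, "D": 20},
    "B": {"A": 10, "B": 0,  "C": 35, "D": 25},
    "C": {"A": 15, "B": 35, "C": 0,  "D": 30},
    "D": {"A": 20, "B": 25, "C": 30, "D": 0},
}
print(shortest_tour(["A", "B", "C", "D"], dist))   # (('A', 'B', 'D', 'C', 'A'), 80)
```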

One thing to note here is that when computer scientists analyze algorithms for solving a problem, they typically focus on the worst case scenario. The question they want to answer is: What would be the longest possible time an algorithm can take given an input of a certain size? For example, we may have an algorithm that solves the traveling salesman problem for the cities in California very fast. But the question is, can it solve all possible instances of the traveling salesman problem fast? And, as I stated above, the best algorithms we have for this problem will take an exponential amount of time in the worst case.

Interestingly, while finding the best route is hard, checking a claimed route is easy. To be precise, the claim that is easy to check is of the form: "here is a route that visits all the cities and its total length is within a given budget" (say, at most a thousand miles). If someone hands you such a route, you can verify the claim efficiently: just add up the distances along the route and make sure every city is visited. So, if we guess a route, we can check efficiently whether it is good enough. The problem is, since there are an exponential number of possible routes, guessing and checking can take an exponential amount of time before we stumble on a good route. And the best algorithm we know for this problem, in the worst case, is not much better than guessing and checking.
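
Here is how cheap the checking part is, reusing the made-up dist table from the sketch above; the budget of 80 is just an arbitrary number for the example.

```python
def verify_tour(route, cities, dist, budget):
    # One pass over the claimed route: does it visit every city, return home,
    # and stay within the budget?
    visits_all = set(route) == set(cities)
    returns_home = route[0] == route[-1]
    length = sum(dist[a][b] for a, b in zip(route, route[1:]))
    return visits_all and returns_home and length <= budget

print(verify_tour(("A", "B", "D", "C", "A"), ["A", "B", "C", "D"], dist, 80))  # True
```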

The traveling salesman problem is among a class of problems for which we do not have an efficient solution, but solving one of them efficiently would imply that we can solve all of them efficiently. In some sense these problems are equivalent. Using a solution to one of them, one can solve all the others. One of the biggest open questions in computer science is figuring out if there is an efficient solution for this class of problems. This is not known. Although nobody has been able to figure out an efficient solution to these problems yet, no one has been able to prove that there is no efficient solution either.

Wednesday, May 23, 2012

Biological Computers


There are some interesting similarities between computers and humans.

All computation done by computers can be thought of as manipulating numbers. Each computer program just reads and writes zeros and ones based on a set of instructions. The set of instructions (also known as the code, or the program, or the software) tells the computer when to read some digits of the input number and what to do after reading those digits (which is either to read some more digits or to write some digits, and then go to the next instruction). Although manipulating numbers sounds like something that would be useful only for math, it turns out to be a very powerful concept, since we can map almost everything to numbers (this mapping is sometimes called a digital representation). One of the key observations in computing is the fact that the programs themselves can be mapped to numbers. Any computation can be represented as a sequence of zeros and ones. The nice thing about this is that once you map a program to a number, then you can easily make copies of it just by copying the number. So it is very easy to mass produce copies of a program.

It seems like evolution also figured out that the best way to copy something is to first map that thing to a number and then copy the number. I am saying this because we can think of the DNA molecule (which is the key to the replication mechanism in all living organisms) as a number written in base 4 (each digit of the DNA number has 4 possible values; numbers in base 4 are also called quaternary numbers, as opposed to binary or decimal numbers). The DNA number for a living organism somehow encodes all the information necessary to build that organism. Not surprisingly, for humans this is a very very large number. It is a quaternary number that contains 3 billion digits! If we were to write that number in binary, like our computers prefer to do, it would be a number that contains 6 billion digits. If you think that is very large, consider this: the number that represents Microsoft's Vista operating system is a binary number that contains 24 billion binary digits! Well, I guess at this point in our civilization we are producing programs that are more complicated than ourselves.
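
Just for fun, here is a tiny Python sketch that reads a short DNA string as a base-4 number. The mapping of letters to digits (A=0, C=1, G=2, T=3) is an arbitrary choice I made for this sketch.

```python
# Read a DNA string as a base-4 number, using an arbitrary letter-to-digit mapping.
DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}

def dna_to_number(dna):
    value = 0
    for letter in dna:
        value = value * 4 + DIGIT[letter]   # shift left one base-4 digit, then add the new one
    return value

print(dna_to_number("GATTACA"))   # a short made-up sequence -> 9156
```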

OK, so the first point I am trying to make is that the genetic code for living organisms is written and copied in a digital format and encodes the instructions of how to build that organism. Kind of like a program written and copied in digital format, encoding the instructions about how to execute the program.

Here is the second connection between humans and computers: The way humans are built has some similarity to the way computers are built. It is clear that the human brain is a very special organ. The loss of no other organ would make ending someone's life acceptable, but when the brain of a person stops functioning it is somehow acceptable to turn off the life support. If you think about it, this means that we are very strongly convinced that the brain is what makes us us. The brain is the center of our consciousness, it is the place where our thoughts originate and it is the organ that controls the movement of our body. Very much like the microprocessor, aka the central processing unit (CPU), of a computer. Actually, a human brain also contains that individual's memory, so it would be more like the CPU and the memory of the computer altogether.

How does the human brain communicate with the rest of the organs, how does it tell them what to do? It turns out it does this by sending electrical signals using the nervous system, which consists of wires (made of cells called neurons) that establish connections between different parts of the body and the brain. This is exactly how a computer's CPU communicates with the rest of the computer. For example, the inputs from the computer keyboard and mouse are sent to the CPU as electrical signals through some wires, and the CPU sends messages to the screen by sending electrical signals through some wires. Interestingly, the first computer ever designed did not use electrical signals. It was proposed by Charles Babbage and it was a mechanical device. However, he was never able to build it, because it is very difficult to build a computing device using mechanical components. Once we discovered how to store information and send signals using electricity, it did not take too long for computing technology to flourish. Is it a coincidence that both humans and computers process and communicate information using electrical signals? Maybe, based on the physical laws of this universe, that is the best way to build a computing device.

Let me mention yet another connection between humans and computers. Noam Chomsky is a famous linguist who in 1956 developed a mathematical model for grammars that define the syntax of natural languages. It turns out the structure of sentences in natural languages can be described using recursive rules such as: "A sentence is a noun-phrase followed by a verb-phrase," "a verb-phrase is either a verb or a verb followed by a sentence." (Notice that a sentence is defined by referencing a sentence again; this is why these types of rules are called recursive.) The formalism Noam Chomsky developed for explaining the grammatical structure of natural languages has been tremendously influential in computer science. It has been especially influential in the development and specification of programming languages, as one might expect. Interestingly, it has also been very influential as a model of computation. It turns out that the recursive rules that our brains use in constructing sentences can be a very effective way to describe computation. And the most general model for grammars can actually describe any computation that can be executed in the most general computing model. So, while our brains are constructing sentences, they are doing a type of computation.
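
Here is a toy Python sketch of such recursive rules in action. The word lists and rules are made up and far simpler than any real grammar, but they show how a rule that refers back to "sentence" can generate sentences inside sentences.

```python
import random

# Toy recursive grammar: a sentence is a noun-phrase followed by a verb-phrase,
# and a verb-phrase may contain a whole sentence again.
NOUNS = ["the dog", "my daughter", "the computer"]
VERBS = ["sleeps", "computes", "sings"]
SAYS = ["thinks", "says", "knows"]

def sentence(depth=0):
    return noun_phrase() + " " + verb_phrase(depth)

def noun_phrase():
    return random.choice(NOUNS)

def verb_phrase(depth):
    # The recursive case: a verb like "thinks" followed by another sentence.
    if depth < 2 and random.random() < 0.5:
        return random.choice(SAYS) + " " + sentence(depth + 1)
    return random.choice(VERBS)

print(sentence())   # e.g. "my daughter thinks the dog sleeps"
```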

I can keep going, listing more connections between humans and computers. But here is my basic point: I see humans as biological computers. In an earlier post I said that the most essential quality of a computer is programmability. I even said that anything that is programmable should be considered to be a computer. Then, in order to assert that humans are biological computers, I need to argue that humans are programmable. And they definitely are. However, the mechanism of programming humans is different than programming a computer.

Since we build the computers ourselves, we know exactly how to write a set of instructions that the CPU of a computer can execute, because we know the exact encoding that the CPU uses (we know how it maps instructions to numbers). However, we did not build the human brain, evolution did. And we do not yet know how the human brain encodes instructions and we do not know how to send it instructions. But we are still able to program humans. Humans are programmed by a process called "learning". I observe this process every day watching my kids. Most of the time they learn by example, they see something and they imitate it (this is why people with kids start to use spell-speak for curse words). After enough examples and imitation they become good at whatever they are learning.

If we had a way of copying the instructions of how to do something, like playing the piano, from the brain of a pianist and then copying them to other people's brains, we would save a lot of practice time. That is what we do with computers, we copy programs from one computer to another. And immediately the computer that gets the copy of the program becomes as good as the previous computer in executing that task. But, someone has to write the program first. This is where humans have superiority. You see, the human brain writes the program of how to do something itself during the process of learning. It somehow figures out the instructions for how to play piano, how to speak, how to write, how to play tennis, etc. by just observing and imitating. Interestingly, learning as a way of programming has become a very influential topic in computing in recent years. The general area is called "machine learning". The basic idea is to make computers learn how to do something by observing lots of examples. This is how Google Translate works, for example; it uses a machine learning algorithm that figures out how to translate a sentence from one language to another language by looking at existing translation examples on the web.
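
To give a tiny taste of what learning from examples can look like in code, here is a minimal Python sketch of a perceptron, one of the oldest learning algorithms. It is nowhere near what Google Translate does, and the training points and labels are made up; the only point is that the program adjusts itself based on examples instead of being handed the rule.

```python
# Made-up training examples: points in the plane, each labeled 1 or 0.
examples = [((2.0, 1.0), 1), ((3.0, 2.5), 1),
            ((-1.0, 0.5), 0), ((-2.0, -1.0), 0)]

w = [0.0, 0.0]   # the weights the program "learns"
b = 0.0          # and a bias term
for _ in range(20):                          # a few passes over the examples
    for (x1, x2), label in examples:
        guess = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        error = label - guess                # zero when the guess is right
        w[0] += error * x1                   # nudge the weights toward the right answer
        w[1] += error * x2
        b += error

# After training, classify a point it has never seen.
print(1 if w[0] * 2.5 + w[1] * 1.5 + b > 0 else 0)   # -> 1
```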

The basic idea that computers in some ways can imitate humans has a long history. There is an area of study in computer science called "artificial intelligence" which has been investigating how to make computers "intelligent" like humans since the 1950s. However, artificial intelligence does not sound very futuristic anymore, since computers were able to beat the best human players in chess and Jeopardy! The problem with the term "artificial intelligence" is that it seems very hard to define what intelligence is. We tend to call people who are good at analytical reasoning and math intelligent. Computers are much better than humans at some analytical and mathematical tasks, so one can claim that computers are very very intelligent. But they do lack something. They do not have consciousness. Although consciousness is yet another thing that is hard to define clearly, I personally consider it to be the most significant difference between humans and computers. Humans are aware of their existence whereas computers are not. Is this something humans are born with? Or is it something they learn? Is it something that can be programmed into a computer? Or is it something computers can learn in some way, maybe using machine learning? Time will tell. But, if a computer gains consciousness someday somehow, I bet turning its power off will become a tough decision.

Wednesday, May 2, 2012

Recipes for Computing


I love Turkish food (try it and you will see why). Unfortunately, there are very few Turkish restaurants in the US and there aren't any in Santa Barbara. So, when I want to eat a Turkish dish, I have to make it myself. Luckily, I have several Turkish cookbooks that contain recipes for most of my favorite dishes. Without those recipes I would need to fly back to Turkey to eat the food I like (which is not a short flight).

But why am I talking about cooking and recipes? Because in this post I want to talk about a concept computer scientists call "algorithm" and the best analogy I can find is that an algorithm is like a recipe.

Let's first think about what a recipe is. A recipe is a description of how to take a bunch of ingredients and process them to generate the output you want: something to eat. So, the first thing a recipe describes is the list of ingredients, which are the input to the cooking process. Then, the recipe gives you step by step instructions on how to process the ingredients. At some point the recipe tells you that the food is ready to eat, and that is when the cooking process ends.

An algorithm is a recipe for computing. An algorithm describes how to take an input, say a set of numbers, and transform them into an output, say another set of numbers. Like a recipe, the algorithm gives step by step instructions on how the input (the ingredients) are processed to produce the output.

Using algorithms you can write programs, but an algorithm is not a program. You cannot take an algorithm and run it on a computer. You see, the algorithms are not written for computers, they are written for people. A programmer has to take the algorithm and translate it to a language that a computer can understand (we call those programming languages). In fact, there are books on algorithms and, like cookbooks contain recipes for cooking, algorithm books contain recipes for computing that programmers use to build their programs. Depending on the type of computing you want to do, you can open one of these algorithm books and pick a recipe. Then, you can translate that recipe to the computer's language to do the computing.

It is time for an example. Assume that I want to do the following computation: Given a set of numbers, I would like to produce a list that lists the given numbers in ascending order. In other words, I want to sort a set of numbers using a computer. How can I do it? Well, many computer scientists already thought about this and they came up with many sorting algorithms. If you pick one of them and write it in a programming language, you will end up with an implementation of the sorting computation. (The task of translating the algorithm to a language that computers understand is called implementing the algorithm).

It turns out that the best way to sort a list of numbers using computers is the following: First divide the given list into a bunch of lists that are as small as possible. Well, the smallest possible list is a list of size one, so we will first divide the input list into a bunch of lists of size one. Here is the nice thing, a list of size one is always sorted! Now start merging the lists. The key is, you can assume that you are always merging sorted lists, since we start with lists of size one which are always sorted. And it is not difficult to merge two sorted lists to obtain a merged sorted list. You can just go over the two lists together, always removing the next smallest element (which is going to be at the front of one of the two lists) and adding it to the merged list. If you keep merging the lists, eventually you will obtain one sorted list, which is what we wanted. The algorithm that describes this process is called the merge sort algorithm.
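
Here is a short Python sketch of the merge sort algorithm I just described, written recursively: split the list in half, sort each half the same way (all the way down to lists of size one), and then merge the sorted halves.

```python
def merge_sort(numbers):
    if len(numbers) <= 1:              # a list of size one is already sorted
        return numbers
    middle = len(numbers) // 2
    left = merge_sort(numbers[:middle])
    right = merge_sort(numbers[middle:])
    return merge(left, right)

def merge(left, right):
    merged = []
    while left and right:
        # The next smallest element is at the front of one of the two lists.
        if left[0] <= right[0]:
            merged.append(left.pop(0))
        else:
            merged.append(right.pop(0))
    return merged + left + right       # one of the two lists is empty by now

print(merge_sort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]
```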

But why do I think that this is the best way to sort a list of numbers? How can one know if an algorithm is better than another algorithm? This is a question that computer scientists study a lot. Let me make it more clear. Assume that there are two algorithms that produce identical outputs for the same input. So, in terms of getting the computation done, they are both good. How can I say one is better than the other if they can both get the job done? It is like having two recipes for cooking a pie where both recipes use identical ingredients and produce identical pies. So, then, which recipe would you prefer? I don't know about you but I would prefer the recipe that gets me the pie as fast as possible. In comparing algorithms, we think the same way, we prefer the algorithms that get the computation done as fast as possible.

There is a problem with this. As I said before, algorithms do not run on computers, so we cannot measure the time it takes to get the computation done on a computer without implementing the algorithm. Can we measure how fast an algorithm is without implementing it and running it on a computer? Well, I can figure out how long it will take me to cook with a recipe without actually cooking with the recipe. I can count the number of steps required to cook with a recipe and add the length of each step to get the total time. If I am comparing two recipes, I can count how many steps each recipe requires and pick the recipe with the smallest number of steps. But the number of steps required for cooking can change based on how many people are invited to dinner. A recipe may require me to repeat a certain number of steps for each person I am cooking for (like seasoning each steak separately) and a recipe can also contain a certain number of steps that do not depend on how many people I am cooking for (like heating the grill). So, depending on the number of people we are cooking for, different recipes could be faster. But, if you want to cook for a large number of people, you are likely to pick a recipe which requires the least amount of work for each person.

We use the same basic ideas in comparing algorithms. For each algorithm we count the number of steps required and then we pick the algorithm with the smallest number of steps. When we are counting the steps in an algorithm, we divide the algorithm into small enough steps so that the length of each individual step is the same. Like with recipes, the number of steps required can depend on the size of the task. For example, for the sorting algorithm, the number of steps required will depend on the size of the input list. In studying an algorithm, computer scientists first figure out the relation between the input size and the number of steps required for the algorithm. When choosing an algorithm we typically pick the algorithm that requires the least amount of work per input item. Such an algorithm will produce the fastest program when a programmer translates that algorithm to a program.
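
Here is a small Python sketch of this kind of step counting, on a different task (finding a number in a sorted list) just to keep it short. The two functions get the same job done; counting their steps for growing inputs shows how differently they scale.

```python
def linear_search_steps(sorted_list, target):
    # Look at the elements one by one, counting each look as a step.
    steps = 0
    for value in sorted_list:
        steps += 1
        if value == target:
            break
    return steps

def binary_search_steps(sorted_list, target):
    # Repeatedly cut the remaining range in half, counting each cut as a step.
    steps, low, high = 0, 0, len(sorted_list) - 1
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_list[middle] == target:
            break
        if sorted_list[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps

for size in [10, 100, 1000, 10000]:
    numbers = list(range(size))
    # Search for the last element, which is the worst case for the first method.
    print(size, linear_search_steps(numbers, size - 1),
          binary_search_steps(numbers, size - 1))
```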

There is another important criterion one can use in choosing an algorithm. Assume that you have a small kitchen and what you really care about when choosing a recipe is the number of pots and pans you will need to get the cooking done. Since you do not have much space in your kitchen, you want the recipe that will let you cook using the smallest amount of space. And this may be more important for you than the amount of time it takes to cook. The analogous concept in computing is the amount of intermediate results one needs to store while computing. The space required for storing the intermediate results could be very large and the computer we plan to use may not have a lot of memory. If that is the case, one might prefer an algorithm that requires more steps as long as it requires less memory.

Algorithms are a very important body of knowledge for computer scientists. They save us from reinventing the wheel every time we write a program. When we are faced with a computing task, we first ask ourselves if we can use some existing algorithm to solve that task. Actually, a single program is likely to use many algorithms, not just one. Typically there are many subtasks a program has to do and each one may require a different algorithm. But there will be times when you will be faced with a task without a known algorithm. Then you have to make up your own. It is like creating your own recipe for your own dish. The knowledge of existing algorithms is very helpful when you are making up your own algorithm. There are some basic approaches to developing an algorithm, like different cuisines for cooking, and based on the problem you are facing you can use one of the existing approaches to develop your own algorithm. The more styles of cooking you know the more likely you will be able to develop the best recipe for the occasion. And, while developing your recipe you may want to pay attention to how long it is going to take to cook the dish and how much space you are going to need to cook it.

Like creating your own recipe, creating your own algorithm is a very creative activity and, in  my view, it is one of the most fun activities for a computer scientist. You may base your algorithm on some existing styles, but in the end the algorithm you come up with is your own recipe. And if it is really good it may end up in a cook/algorithms book one day and other people can cook/program with it!

Wednesday, April 18, 2012

Everything is a number.

Computers do many amazing things. They are clearly very good with manipulating numbers. They can do calculations that would take us a lifetime to do with paper and pencil just in seconds. They can search ridiculously large amounts of information and get back an answer to a query almost instantaneously. They can let you play and edit songs and videos. They can play video games with you. They can steer trains, cars and airplanes. They can also beat the best human chess player in chess and the best human Jeopardy! player in Jeopardy!

How come they are so versatile? As a computer science professor if I had to give a one-short-sentence explanation for this amazing diversity of computers' capabilities, it would be: Everything is a number.

You may be thinking that I have some kind of fancy numbers in my mind when I say this. But I don't. I am not talking about irrational, or complex, or imaginary, or transcendental, or some other type of fancy numbers. I am not even thinking about real numbers. I am thinking about good old natural numbers which start with 0 and can be obtained by just adding one to the previous number in the sequence like so: 0, 1, 2, 3, . . .

I said "Everything is a number." and everything is a lot to cover, so let's start. The sentence "Everything is a number." is itself a number. The sentence consists of a sequence of alphabet symbols and blank spaces followed by a period at the end. There is something called the American Standard Code for Information Interchange (ASCII code for short) which assigns a number to each letter of the alphabet and each punctuation symbol. In the ASCII encoding, the lower case letter "i" corresponds to the number 105 and the lower case letter "s" corresponds to the number 115. So the word "is", which is the concatenation of "i" and "s", corresponds to the number 105115 (which I obtained by concatenating 105 and 115). In the ASCII encoding, the code for " " (the space we leave between words) is the number 32 and the code for the period is 46. Using the ASCII encoding, the number for the sentence "Everything is a number." happens to be 69118101114121116104105110103032105115032097032110117109098101114046. Seems like a pretty large number, but, then, I did not claim that everything was a small number.
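
If you want to check my arithmetic, here is a small Python sketch that reproduces the encoding above: pad each character's ASCII code to three digits and glue them all together into one number.

```python
def text_to_number(text):
    # Concatenate the three-digit ASCII code of every character.
    return int("".join("%03d" % ord(character) for character in text))

print(text_to_number("is"))                      # 105115
print(text_to_number("Everything is a number."))
# 69118101114121116104105110103032105115032097032110117109098101114046
```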

Following this approach we can map any text to a single number, we just keep concatenating the numbers that correspond to the characters and the punctuation symbols. Keep in mind that we will never run out of numbers because there is an infinite number of numbers out there! There is a number for each text. There is a number for Cervantes' Don Quixote and there is a number for Shakespeare's Macbeth. There is a number for any text that anyone has ever written. I can even give you a finite range of numbers and claim that any novel with a reasonable length that anyone has ever written or will ever write is a number within that finite range. So, if you are an author who is having writer's block, there is nothing to worry about, you can just pick a number from that range and map it back to text based on the ASCII encoding, and there, you have a novel. If you use this strategy to write novels, your novels are very likely to be unreadable. But the point of this post is not to discuss which method to use to write novels. The point is, if you do ever write a novel, it will be a number within that finite range.

There is one issue I need to clarify. ASCII code is not the only way one can map characters to numbers. So, although I showed you that there is a way to map any text to a number using the ASCII code, that is not the only way. There are many ways one can do it. So, mapping between the texts and the numbers depends on the encoding you use. Once you decide on the encoding, you can map any text to a number and you can map any number back to the text it represents.

OK, so far we covered written text. What next? How about speech? Or, let's go even further and cover every sound. Yes, there is a number for every sound. Here is one way of mapping a sound to a number. First record the sound using a recording device, and then convert the recording to the MP3 format (the format used by music players like the iPod). If you open the MP3 file and look at its contents, you will see a very long sequence of 0s and 1s. The MP3 file is a very very very long binary number. So whatever sound you recorded, it is converted to a number when you store it in the MP3 format.

Converting sounds to numbers is a little more complicated than converting text. I cannot tell you how it is done for the MP3 format in particular but I can give you a general idea. Any sound we hear corresponds to a wave that travels in air. It is kind of like a wave in the ocean, it has peaks (the highest points of a wave) and troughs (the lowest points of a wave). To convert a sound into a number, you need to measure the height of the sound wave frequently and store those height measurements as numbers one after the other.

The question is how frequently should we measure the height of the wave? If you do not measure frequently enough, you may only measure the wave at its peaks and then you may get the wrong impression that there is no wave at all and the ocean is still, or in our case we might think that it is silent. Luckily, there is a result from the area of signal processing that tells us how frequently we need to measure the height of a wave in order to capture it perfectly. It depends on how rapidly the wave oscillates, that is, how many times per second it goes from a peak to a trough and back. You just have to measure the height of the wave at least twice as fast as that. It turns out humans cannot hear sound waves that oscillate more than about twenty thousand times a second or fewer than about twenty times a second, so we do not need to bother with sound waves outside that range. This means that it is enough to measure the height of the sound wave forty thousand times a second if we only care about recording sound for humans. That is a lot of measurements.

There is also the question of how to convert the height of the wave to a number. How precise should we be in measuring the height? It has been observed that a binary number that contains 16 binary digits would give us pretty good precision for storing the height of the wave. Let's recap. One second of sound can be represented with a number that is the concatenation of forty thousand sixteen digit binary numbers. That is one large number, but again, do not worry, we will never run out of numbers. Using this approach, we can convert any sound, no matter how long, to a (very large) number by writing the height measurements one after another. In principle, this is how your MP3 files represent a recording. They also do a lot of clever things to make the resulting numbers smaller but I will not be able to get into those here.
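
Here is a small Python sketch of this sampling idea. The 440 Hz tone (a pure "A" note) is a made-up example, and real formats like MP3 do much more, but the sketch shows one second of sound turning into forty thousand 16-bit numbers.

```python
import math

SAMPLES_PER_SECOND = 40_000
BITS = 16
LOUDEST = 2 ** (BITS - 1) - 1          # the largest value a 16-bit sample can hold

samples = []
for i in range(SAMPLES_PER_SECOND):    # one second of sound
    t = i / SAMPLES_PER_SECOND
    height = math.sin(2 * math.pi * 440 * t)     # the wave's height, between -1 and 1
    samples.append(int(height * LOUDEST))        # the height stored as a 16-bit number

print(len(samples), samples[:5])       # 40000 measurements; the first few heights
```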

What about images and videos? How do we map them to numbers? First, a video is a concatenation of images like a text is a concatenation of characters. For example each second of a DVD video consists of 25 images. So if we can map an image to a number then we can map a video to a number.

How do we map an image to a number? Did you ever try to manually copy a picture by dividing it into little squares? When you make the squares small enough, the image in each square becomes simple enough that you can copy it easily. If you keep making the squares smaller, at the end you will get a square which is just a single color. This is what computers do, they divide an image into small picture elements (called pixels) that can be represented as a single color. It turns out that many colors can be represented as a combination of red, green and blue. To convert an image to a number, you first divide it into small squares called pixels, and then for each pixel you measure how much red, how much green and how much blue you need to get the color of that pixel. The amounts of red, green and blue needed can all be recorded as numbers and together they represent the number for that pixel. Then, the number for the whole image is the concatenation of all the numbers for all its pixels.
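
Here is a tiny Python sketch of this idea with a made-up two-by-two image; a real photo has millions of pixels, but the principle is the same.

```python
# A tiny 2-by-2 "image": the amount of red, green and blue in each pixel (0 to 255).
image = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, a white pixel
]

# Concatenate the three numbers of every pixel (padded to three digits each)
# into one big number for the whole image.
digits = ""
for row in image:
    for red, green, blue in row:
        digits += "%03d%03d%03d" % (red, green, blue)

print(int(digits))   # 255000000000255000000000255255255255
```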

The things I discussed so far are all called data. Computers store data as numbers. It does not matter what type of data, computers convert all data to numbers. In fact, they convert them to numbers stored in binary format as sequences of 0s and 1s. But what about the processing of data? The programs that process the data, how are they stored? Interestingly enough, computers also store programs as numbers! Actually this is not very difficult to see. Programs are typically written very much like regular text, using characters of the alphabet and punctuation symbols. The languages they are written in are different than natural languages like English, but, still, the programs are written like any other text. Since we know that any text can be mapped to a number, then clearly every program can also be mapped to a number, and that is exactly what computers do, they store programs as numbers just like they store data.

So far I discussed how computers can represent information about the real world as numbers and represent the programs that do things with that data also as numbers. If you think about it, this is a very general concept and in addition to explaining how computers store and play music and video files it also explains how computers can be used for many other things, like physical simulations. For example, gathering as much information about the weather as we can by measuring anything related to it, and then using this data and the rules of physics for weather forecasting is another application of this concept. But does this cover everything? Well, now let's speculate a little. There is this idea that physicists have been working on for a long time: Finding the theory of everything. This would be a set of rules that explain everything that happens in our universe. If they do find these rules, we can implement a program that simulates these rules and map that program to a number. Then, if we can measure everything we can about the universe, and map those to a number, we would end up having a number for everything and a computer that can run the universe for us! Well, that is a lot of "if"s, but it is plausible!

Wednesday, April 11, 2012

Universality


1936 was an important year for computing. Before I explain why, let's imagine that we travel in time to 1936, find a person on the street, and ask "Where can I buy a computer?" What would be the response of the person from 1936? Maybe the person would say "What is a computer?" This would be a reasonable response since the electronic devices that we call computers nowadays were not invented yet. But, the person is likely to say something like "What do you mean buy a computer? You mean hire a computer, right?"

Hire a computer? Maybe he means rent a computer? It seems like we are having a communication breakdown. Actually, the problem is we are using the word "computer" to mean different things. For the person from 1936, a computer is not a device. Instead, it means a person who performs mathematical calculations. If you had some difficult calculations to do in 1936, you could hire a person called a computer and that person would compute them for you, using paper and pencil. Human computers were used for many purposes including scientific computations such as calculating orbits of comets in astronomy, and military computations such as calculating ballistics trajectories during the First and Second World Wars. In 1936 being a computer was simply a profession.

What is special about 1936, why is it an important year for computing? It is the year a mathematician named Alan Turing published an article that became the foundation of computing. And the ideas in Alan Turing's article are as important today as they were then. They explain what computing is all about.

In his article, Alan Turing was trying to answer the question that is the title of this blog: What is computing? And he was trying to do this at a time when the devices that we call computers today were yet to be invented. Turing was imagining a device that did not exist yet. He was tremendously successful in this endeavor. He managed to come up with an imaginary device that captures the essence of all computation.

Turing was a mathematician and he was thinking about automating computation. He was thinking of a machine that can automatically do the same things that human computers of his time were doing. To design his machine he first thought about how human computers work. Well, how did human computers work? They worked like we would if we wanted to compute something without the assistance of an electronic device. They used a notebook to store the numbers that result from their calculations, a pencil to write the numbers, and an eraser to erase the numbers. They probably started their computation by writing the input numbers for the computation in their notebook. At any moment during the computation, they would look at the intermediate results in their notebook and compute a new result based on the intermediate results and either write it to a new page on the notebook or maybe erase some of the previous intermediate results and write the new result.

Let me give an example to make this a little more concrete. Assume that we hired a human computer to compute the cubes of a set of numbers. So, we will give this person a set of numbers, like 2, 5, 10, and we want the human computer to give us back the following results: 8, 125, 1000. Well, maybe that does not seem that difficult. How about if we gave 61, 173, 467 as input? It is not that easy to compute the answer (which is 226981, 5177717, 101847563) without using an electronic device, right? But there were no electronic devices in 1936. If you did not have much time yourself, but had some money, you could hire a human computer to do the computation for you. The human computer would probably first write the input numbers in his notebook. Then he would start with the first number. Multiply the first number by itself (which is done, if you remember long multiplication, digit by digit while producing some intermediate results which are then added together). Then he would multiply this intermediate result again by the input number to get the cube. He would continue like this until finishing the list.

OK, we can see how a human computer can compute some stuff. But Turing was interested in figuring out how a machine can do the computation. Being a mathematician, he wanted to come up with a very precise and simple description. The description he came up with was pretty similar to how a human computer works. Here it goes: Turing assumed that the computing machine would need something to write the input and the intermediate results on. Let me call this the notebook of the machine, but it does not have to be a real notebook of course. It is something that Turing's imaginary machine uses as a scratch pad to record its input and intermediate results. To keep things simple, Turing assumed that each page of the notebook could either be blank or could contain a single digit (actually you can even assume that it contains a single binary digit, 0 or 1). He also assumed that at any point during the computation, a single page of the notebook would be open, and the machine could read the digit on that page, and then choose to erase that digit and write another digit or leave it as it is, and then move to the next page or the previous page or stay on the same page.

So far so good. It is very similar to how human computers work. But there is something missing, right? A human computer knows the steps of the computation (like how to do long multiplication) and follows those steps to do the computation. How are we going to tell the computing machine the steps of the computation? Turing again had a simple solution to this. He imagined that his machine would contain a list of steps where each step had a single action describing what to do at that step. An action for a step would be in the following form: Read the digit on the current page and, based on that, do the following: 1) decide what to write to the current page (or whether to leave it as it is), 2) decide whether to go to the next page or the previous page or stay on the current page, and 3) decide which step to go to next.

That's it. That is the model Turing came up with. Let's recap. Turing's computing machine consisted of a set of steps (with associated actions) and a notebook. Each step identified what to write to the notebook's current page and which page to go to and what the next step would be. Seems pretty simple.
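
Here is a small Python sketch of the model, just to show how little machinery it needs. The notebook is a dictionary of pages, each holding one symbol, and the particular machine in the steps table is made up: all it does is flip every 0 to 1 and every 1 to 0 on the notebook and then stop.

```python
def run(steps, notebook, start_step="flip"):
    page, step = 0, start_step
    while step != "stop":
        symbol = notebook.get(page, " ")            # read the open page (blank if unwritten)
        write, move, step = steps[step][symbol]     # look up the action for this step
        notebook[page] = write                      # possibly rewrite the page
        page += {"next": 1, "previous": -1, "stay": 0}[move]
    return notebook

# A made-up machine with a single step: flip each digit, stop at the first blank page.
steps = {
    "flip": {
        "0": ("1", "next", "flip"),   # read 0 -> write 1, move to the next page
        "1": ("0", "next", "flip"),   # read 1 -> write 0, move to the next page
        " ": (" ", "stay", "stop"),   # blank page -> the input is finished
    },
}

notebook = {0: "1", 1: "0", 2: "1", 3: "1"}
print(run(steps, notebook))   # {0: '0', 1: '1', 2: '0', 3: '0', 4: ' '}
```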

Some of the details of Turing's model can sound arbitrary. For example, why just write a single digit in each page? Why are we wasting all that paper? First, remember that this is an imaginary machine that Turing proposed for modelling computation, so we are not really going to implement it using a real notebook. But the key point is, Turing wanted to keep things as simple as possible while capturing the essence of computation. So, why keep a single digit in each page? Because it is just simpler to define it that way and it is enough to capture the essence of computation. Same goes for moving one page at a time. It is simpler to just assume that at any step of the computation you can go one page forward or backward. Any computation that can be done by moving multiple pages at a time can also be done by moving one page at a time.

Here is something you might find surprising: Any computation that our modern day computers can do can be modeled this way. Any computation can be modeled as a Turing machine (by the way, Turing did not use the term "Turing machine" in his paper, which could have sounded pretentious; the name was given later by others). Not just multiplying numbers or computing cubes or other mathematical computations, but searching the web, playing an mp3 file, editing a photo, playing Angry Birds, all of them. Any computation you do on your computer or on anybody else's computer can be modeled this way. Any computation anyone does on any modern day computer can be modeled this way!

I said that Turing's computing machine consisted of a set of steps and a notebook. So what do these correspond to in a modern day computer? The notebook is what we call memory nowadays. And what about the set of steps that describes a computation? We call that a program.

A key assumption in Turing's model was that the notebook used by his machines never runs out of pages. What does this mean in modern terms? He assumed that his machines can use an arbitrary amount of memory. This is a key feature that gives Turing's machines the power to model any computation that can be done on any computer.

Here is something I skipped earlier. As you know, a computer can run multiple programs, not just a single program, and programmability is an essential quality of a computer. So, can a Turing machine run different programs? Well, it seems like we would need to change the set of steps that represents the computation, since that is the part of a Turing machine that corresponds to a program. How do we do that? In his 1936 article Turing showed that there is a special type of Turing machine that is programmable. This programmable Turing machine takes the set of steps describing the computation as an input written on its notebook and then executes that computation based on the given set of steps. This is exactly how modern day computers work, they take a program written to their memory as input and then execute the steps of the program. In fact, any modern computer, including any smart phone, any laptop or desktop or any supercomputer, corresponds to this special type of programmable Turing machine. So, in this sense, your smart phone and the most powerful supercomputer are equivalent. They are capable of doing the same things; the difference is that one does it much faster than the other.

The title of this post was universality. This is the universality I was referring to. Alan Turing was able to come up with a universal model of computation that is powerful enough to capture all computation even before the electronic computers were invented. His universal model of computation gives computer scientists a very useful tool for studying computers. For example, even the most powerful super-duper computer in the world cannot solve any problem that cannot be solved by a Turing machine. So by studying Turing machines we can figure out the limits of computation. And this is exactly what Turing did at the end of his 1936 paper, but that is a topic for another time.

There is a universal essence to computation and Alan Turing was able to capture that essence in his 1936 article. This year computer scientists are celebrating the centenary of Alan Turing's birth. Now you know why.

Thursday, March 22, 2012

Transformers

You know the Transformers, the alien robots called Autobots and Decepticons from the planet Cybertron that fight each other in an epic battle that will determine the fate of the Earth? My son is a fan. He loves watching a giant robot transform into a truck and then into an airplane. The idea that the things he loves---trucks, robots, planes---share a common essence and can morph from one to another must be appealing to him.

My daughter, on the other hand, is not a big Transformers fan. She will play with them sometimes if she is really bored, but not very much. Her favorite game is something else. It is not a particular game actually, but a style of play. She likes "make believe", she likes to pretend to be somebody else. Recently, she has been really into pretending to be a teacher and teaching my son how to read (my son will look like a genius in kindergarten thanks to her). Her other pretend game favorites are being a pastry chef and a pop singer.

Sometimes while watching my kids, my brain inadvertently switches from the father mode to the computer scientist/academician mode. I look at them and think, there is something common to a lot of their play. It seems like they are fascinated by transformation! Maybe this is a genetic condition, because I am also fascinated by transformation. In fact, that is what I work on---I study the most successful transformers of all time. But I do not call them transformers, I call them computers.

Yes, that is what computers are, transformers. Computers can transform into many things: a music player, a phone, a typewriter, a chess game, a TV---too many to name. In fact, computers have been so successful in transforming into things, in many cases they completely eliminated the devices that they transform into. I am sure a hundred years from now a well-educated adult will ask the following question while reading an early 20th-century novel: "What exactly is a typewriter?"

Their ability to transform is the reason I find computers exciting. Using a computer is like driving a transformer truck. Actually, if you are a computer person (also known as a programmer) it is even more exciting than that. It is like driving a transformer truck that you also know how to transform to new things.

The "ability to transform to something" is the most essential quality of computers. In fact one could argue that we should call all the things that have "the ability to transform to something" computers. So, by this reasoning the Transformers and my daughter are also computers! And, yes, I believe in a way they are. I have a rather inclusive view of computing. You might say "Well everything changes in one way or another and transforms, so do you mean everything is a computer?" No, that is not what I mean. The "ability to transform to something" is not the same thing as changing in time.

If that sounds cryptic, here is what I mean. The "ability to transform to something" is interesting because it means that the computer is able to transform to many things, not just one thing. Actually, it is more than that. The "something" in the "ability to transform to something" is the input of the transformation process. The computer takes this input and transforms based on that input. I tell the computer, "Here is 'something', go ahead and become that," and it does it.

How does it do it? Well, the "something" that is the input to the transformation is actually a set of instructions explaining to the computer how to transform itself. This set of instructions is called "a program" (or sometimes "code"). It is like an instruction manual for a transformation. Each transformation has its own instruction manual.

Writing these instruction manuals could be a lot of work. In order to transform a computer to a typewriter, you need to tell it everything about being a typewriter in the transformation instruction manual. Here is the good news. If someone writes an instruction manual for transforming a computer to a typewriter, you do not need to write it again. You can ask that person if you can get a copy of the transformation instruction manual, and if you get it, then you can use it to transform your computer to a typewriter. The transformation instruction manual just has to be written once by somebody and then it can be reused by anybody who has a copy.

In computer science, we call the process of writing the instruction manual for a transformation "programming". When you need to transform the computer to something new, you tell the computer to read the instruction manual for that transformation (also known as executing or running a program) and the computer becomes the thing you want it to be.

Programming is telling the computer what you want it to be. It is a pretty powerful skill and it is definitely fun. It requires creativity, imagination and attention to detail. You think about what exactly you want the computer to become. You think about every little detail, how it should look, how it should behave, how it should interact with its environment. Then you write the instructions: the program. And then when you execute the program, the computer becomes the thing you told it to become. You create something by pure thought. You tell the computer to become a typewriter and it becomes one.
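
Just to show how short such a set of instructions can be for a very modest transformation, here is a toy Python sketch that turns a computer into a bare-bones typewriter: everything you type is added to a file (which I arbitrarily called page.txt) until you type an empty line.

```python
# A toy "transformation instruction manual": the computer becomes a bare-bones typewriter.
with open("page.txt", "a") as page:
    while True:
        line = input()           # whatever you type...
        if line == "":
            break                # ...until an empty line ends the session
        page.write(line + "\n")  # ...gets added to the page
print("Your page is ready in page.txt")
```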

At this point you might be getting a little suspicious about my reasoning. You might say "That is not exactly right because to be a typewriter the computer needs a keyboard, and it could not have become a typewriter if it did not have a keyboard." Not true. When you look at an iPad you do not see a keyboard, but when you tell it to transform itself to become a typewriter, all of a sudden a keyboard appears on the screen. Then, you might say "Well, that works because Apple figured out how to make a touch screen that actually works, if the screen was not touch sensitive then it would not have worked." Yes, that is true. There is a little detail that I avoided so far. When a computer transforms from one thing to another its physical shape does not change like the Transformer robots. The computer's transformation is more like my daughter's transformation when she transforms from being a cute little girl to a pretend teacher, the physical shape does not change but the behavior changes.

Let's call the "ability to transform to something" that I have been discussing programmability. Programmability is the most essential quality of computers (which is why I decided to discuss it first). It is the thing that makes a computer a computer. It is the thing that gives a computer its power. It enables a computer to become anything we tell it to become, as long as we provide the computer with instructions for how to become that thing.

But not every part of a computer is transformable/programmable. For example, its mechanical features (such as a physical keyboard) are typically not programmable. I say typically, because it is conceivable that one day we will develop the technology to build programmable mechanical devices like the Transformer robots. Maybe the computer keyboard will suddenly transform into a piano keyboard when I ask the computer to become a piano. Alas, typical computers that are built today with our current technology do not have mechanical features that are programmable. The programmable parts of modern day computers are all related to movements of electrons actually, which is a good topic for a future discussion.

Since programmability is so important for computing, it is no surprise that computer scientists spend a lot of time thinking about it. For example, computer scientists thought a lot about how to make programming easier. They have developed languages (many of them!) to write programs in. Mastering one or more of these languages makes you a programmer and gives you the power to transform the computer into anything you can specify as a program.

I hope that after you read this and go back and google some of the things I mentioned, or post something on your Facebook page about this essay, or send a text message from your smartphone about it, you can look at those devices you use with a fresh pair of eyes knowing that they are transformers. They are transformers that can become infinitely many things, not just what you have been using them for. They just need you to give them the instructions on how to transform.

So, enjoy your transformers---not the ones from Cybertron but the ones from Earth!

Tuesday, March 20, 2012

A Freshman Seminar on Computing

As a professor of computer science I get to teach many courses on computing. I love teaching them because they cover very interesting topics. But they tend to be specialized and technical. Sometimes I feel like I could motivate my students better and get them more excited about computer science if I were to discuss some "big picture" questions about computing. Questions such as: What is the essence of computing? What is its future potential? Unfortunately, one has limited opportunities for integrating such discussions into specialized technical courses.


So, I decided to offer a freshman seminar! 


The title of the seminar is "What is Computing?" and here is how I describe it:
"Computer technology has produced many impressive gadgets that surround us every day. However, its foundations are still a mystery to many. The purpose of this seminar is to discuss what computing is, with the hope of getting some insights about its essence and its future potential."
The seminar is not exclusively targeting students who plan to get a computer science degree. It is open to everyone who is interested in computing.

I will prepare a post for each topic I plan to discuss in class. I hope to make my posts as readable and as entertaining as I can. No formulas, no graphs, just plain English.

I hope it will be a fun seminar.