Wednesday, May 30, 2012

Limitations

Computers can be frustrating sometimes. Staring at a spinning hourglass (or a spinning beachball if you are a Mac user) gets annoying very quickly. The problem is, after waiting patiently for a few seconds, you do not know what you are supposed to do. Is that spinning hourglass ever going to stop, or will it keep spinning forever? Sometimes it starts spinning when you ask your computer to do a simple task, like opening a file, and you start wondering why such a simple task is taking so long. Eventually you give up and start canceling everything you asked your computer to do. Sometimes canceling the tasks stops the spinning hourglass. But sometimes it does not. Then you start randomly hitting keys on your keyboard and yelling at your computer. And finally you give up and reboot the computer by turning its power off. Not a very pleasant experience.

You know what would be great? If there were a way of guaranteeing that when you run a program on your computer, the program eventually terminates. Then you would know that, when the spinning hourglass appears, it will eventually disappear and the program will finish the task you asked it to do. But how can you guarantee such a thing? If you rely on software developers manually checking the programs, then you are likely to have human errors. The best option would be to develop a program-checking-program. This program-checking-program would take a program as input and automatically determine whether the input program eventually terminates or not. There would be no room for human errors.

It turns out it is not possible to write such a program-checking-program. And, in fact, there is a proof that shows that this is a problem that can never be solved using a computer. The speed of the computer's microprocessor or the size of its memory does not change this result. It is a fundamental limitation of computers!

How can one prove such a result? The basic idea is the following: assuming that a program-checking-program exists leads to a contradiction, and since its existence leads to a contradiction, we conclude that it cannot exist. It is not too difficult to explain how the contradiction arises. Let's assume that there is a program-checking-program. Now let's write a new program using the program-checking-program. The new program first asks the program-checking-program the following question: "Do I terminate?". After the new program gets the answer, it does the following: if the answer was yes, then it does not terminate (it just keeps spinning the hourglass forever); if the answer was no, then it immediately terminates. The point is, this new program we constructed does exactly the opposite of what the program-checking-program says it is supposed to do. This is a contradiction, because we were assuming that the program-checking-program can figure out whether any given program terminates or not. Therefore, no such program exists.
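The argument above can be sketched in a few lines of Python. Here `halts` stands in for the hypothetical program-checking-program; no correct implementation of it can exist, so the stub below just returns an arbitrary answer to make the sketch runnable:

```python
def halts(program, arg):
    # Hypothetical program-checking-program: "does program(arg) terminate?"
    # No correct implementation can exist; this stub is a placeholder.
    return True

def contrarian(program):
    # Does the opposite of whatever `halts` predicts about running
    # `program` on itself.
    if halts(program, program):
        while True:   # predicted to halt -> spin the hourglass forever
            pass
    else:
        return        # predicted to loop forever -> halt immediately

# Feeding `contrarian` to itself exposes the contradiction:
# if halts(contrarian, contrarian) answers True, contrarian loops forever;
# if it answers False, contrarian halts. Either answer is wrong.
```

(We never actually call `contrarian` here, since with this stub it would spin forever; the point is that no possible `halts` can answer correctly about it.)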

Based on this result, we know that there are problems we cannot solve with computers. We call such problems "uncomputable" problems. Computers do many wonderful things, but they have their limits: they cannot solve uncomputable problems. The problem I described above is called the "halting problem," the problem of figuring out whether a program terminates (halts) or not. Once computer scientists figured out that the halting problem is uncomputable, they went on to identify many other problems that are uncomputable. Assume that we are working on a new problem, and we show that, given a program that solves it, one could solve the halting problem. Then we can conclude that this new problem is also uncomputable, since we know that the halting problem is uncomputable. So there is no point in trying to solve that problem using a computer. For example, in general it is not possible to write a program that determines whether a mathematical theorem is provable or not, because it is possible to show that if there were such a program, then we could use it to solve the halting problem. So we know that we cannot replace mathematicians with computers.

In addition to uncomputable problems that can never be solved with computers, there are also some problems that are solvable using computers, but we do not know how to solve them efficiently. The programs we write to solve these problems are able to find the solution eventually, but they can take a very long time. One of the most famous problems that we do not know how to solve efficiently is called the traveling salesman problem. The problem is simple to state: there is a traveling salesman who has to visit a set of cities, and he wants to minimize the length of his trip. The input to the problem is a list of cities and the distances between them. As output, we want a route that covers all the cities and has the minimum length. The most efficient algorithm we have for this problem is not very efficient. Every time we add a city to the list, the time it takes to compute the shortest route can double. For example, if this algorithm computes the shortest route for 100 cities in 1 second, for 101 cities it could take 2 seconds, for 102 cities it could take 4 seconds, for 105 cities it could take about half a minute, for 110 cities it could take about 17 minutes, for 120 cities it could take about 12 days, for 130 cities it could take about 34 years, and for 160 cities it could take more time than the time it took from the big bang to now. We call this type of increase exponential: the time it takes to solve the problem doubles when we increase the size of the problem by just one. For some problems, like the traveling salesman problem, the best solution we have is an exponential-time solution. These types of problems are called intractable problems, since it can be infeasible to solve them for large problem instances.
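To see how quickly the doubling adds up, here is a small Python sketch using the made-up baseline from the paragraph above (1 second for 100 cities):

```python
def seconds_for(cities, base_cities=100, base_seconds=1.0):
    # Time doubles for every city added beyond the baseline.
    return base_seconds * 2 ** (cities - base_cities)

for n in [100, 101, 102, 105, 110, 120, 130]:
    print(n, "cities:", seconds_for(n), "seconds")
# 110 cities -> 1024 seconds (about 17 minutes)
# 130 cities -> 1073741824 seconds (about 34 years)
```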

One thing to note here is that when computer scientists analyze algorithms for solving a problem, they typically focus on the worst-case scenario. The question they want to answer is: what is the longest possible time an algorithm can take, given an input of a certain size? For example, we may have an algorithm that solves the traveling salesman problem for the cities in California very fast. But the question is, can it solve all possible instances of the traveling salesman problem fast? And, as I stated above, the best algorithms we have for this problem take an exponential amount of time in the worst case.

Interestingly, if someone claims to have a good solution to a given traveling salesman problem instance, it is not difficult to check the claim. To be precise, what we can check efficiently is a claim of the form: "here is a route that visits all the cities and whose length is at most this bound." We simply follow the proposed route, make sure it covers every city, and add up its length. So, if we guess a route, we can check efficiently whether it meets a given bound. The problem is, since there is an exponential number of possible routes, if we start guessing and checking, it takes an exponential amount of time to find the best one. And the best algorithm we know for this problem is, in the worst case, not much better than guessing and checking.
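Checking a claimed route really is a one-pass job, which a minimal Python sketch makes concrete. The three-city distance table below is made up for illustration:

```python
def tour_ok(tour, distances, bound):
    # The route must visit every city exactly once...
    if sorted(tour) != sorted(distances.keys()):
        return False
    # ...and its total length must not exceed the claimed bound.
    total = sum(distances[a][b] for a, b in zip(tour, tour[1:]))
    return total <= bound

distances = {
    "A": {"B": 5, "C": 9},
    "B": {"A": 5, "C": 4},
    "C": {"A": 9, "B": 4},
}
print(tour_ok(["A", "B", "C"], distances, 10))  # 5 + 4 = 9 <= 10 -> True
```

Checking one route like this is fast; the hard part is that there are exponentially many routes one could propose.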

The traveling salesman problem is among a class of problems for which we do not have an efficient solution, but solving one of them efficiently would imply that we can solve all of them efficiently. In some sense, these problems are equivalent: using a solution to one of them, one can solve all the others. One of the biggest open questions in computer science is figuring out whether there is an efficient solution for this class of problems. This is not known. Although nobody has been able to figure out an efficient solution to these problems yet, no one has been able to prove that there is no efficient solution either.

Wednesday, May 23, 2012

Biological Computers


There are some interesting similarities between computers and humans.

All computation done by computers can be thought of as manipulating numbers. Each computer program just reads and writes zeros and ones based on a set of instructions. The set of instructions (also known as the code, or the program, or the software) tells the computer when to read some digits of the input number and what to do after reading those digits (which is either read some more digits or write some digits, and go to the next instruction). Although manipulating numbers sounds like something that would be useful only for math, it turns out to be a very powerful concept since we can map almost everything to numbers (this mapping is sometimes called a digital representation). One of the key observations in computing is the fact that the programs themselves can be mapped to numbers. Any computation can be represented as a sequence of zeros and ones. The nice thing about this is that once you map a program to a number, then you can easily make copies of it just by copying the number. So it is very easy to mass produce copies of a program.
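As a toy illustration of mapping a program to a number, here is a Python sketch that turns a (made-up) one-line program's text into a single integer and back:

```python
# Treat the program's text as one big base-256 number, one digit per byte.
source = "print(1 + 1)"                      # a tiny program, as text
number = int.from_bytes(source.encode(), "big")
print(number)                                 # the whole program, as one integer

# "Copying the program" is just copying the number back into text.
copy = number.to_bytes((number.bit_length() + 7) // 8, "big").decode()
print(copy == source)                         # True: a perfect copy
```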

It seems like evolution also figured out that the best way to copy something is to first map that thing to a number and then copy the number. I am saying this because we can think of the DNA molecule (which is the key to the replication mechanism in all living organisms) as a number written in base 4 (each digit of the DNA number has 4 possible values; numbers in base 4 are also called quaternary numbers, as opposed to binary or decimal numbers). The DNA number for a living organism somehow encodes all the information necessary to build that organism. Not surprisingly, for humans this is a very, very large number. It is a quaternary number that contains 3 billion digits! If we were to write that number in binary, like our computers prefer to do, it would be a number that contains 6 billion digits. If you think that is very large, consider this: the number that represents Microsoft's Vista operating system is a binary number that contains 24 billion binary digits! Well, I guess at this point in our civilization we are producing programs that are more complicated than ourselves.
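To make the base-4 picture concrete, here is a toy Python sketch. The particular assignment of the bases A, C, G, T to the digits 0 through 3 is an arbitrary choice of mine; the point is only that each base-4 digit becomes exactly two binary digits:

```python
DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}   # arbitrary base-to-digit mapping

def dna_to_bits(sequence):
    # Each quaternary digit expands to exactly two binary digits.
    return "".join(format(DIGIT[base], "02b") for base in sequence)

print(dna_to_bits("GATTACA"))  # 7 bases -> 14 bits: "10001111000100"
```

This is why 3 billion quaternary digits correspond to 6 billion binary digits.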

OK, so the first point I am trying to make is that the genetic code for living organisms is written and copied in a digital format and encodes the instructions of how to build that organism. Kind of like a program written and copied in digital format, encoding the instructions about how to execute the program.

Here is the second connection between humans and computers: the way humans are built has some similarity to the way computers are built. It is clear that the human brain is a very special organ. The loss of no other organ would make ending someone's life acceptable, but when the brain of a person stops functioning, it is somehow acceptable to turn off life support. If you think about it, this means that we are strongly convinced that the brain is what makes us us. The brain is the center of our consciousness; it is the place where our thoughts originate, and it is the organ that controls the movement of our body. Very much like the microprocessor, aka the central processing unit (CPU), of a computer. Actually, a human brain also contains that individual's memory, so it would be more like the CPU and the memory of the computer combined.

How does the human brain communicate with the rest of the organs; how does it tell them what to do? It turns out it does this by sending electrical signals through the nervous system, which consists of wires (made of cells called neurons) that establish connections between different parts of the body and the brain. This is exactly how a computer's CPU communicates with the rest of the computer. For example, the inputs from the computer keyboard and mouse are sent to the CPU as electrical signals through some wires, and the CPU sends messages to the screen by sending electrical signals through some wires. Interestingly, the first computer ever designed did not use electrical signals. It was proposed by Charles Babbage, and it was a mechanical device. However, he could not build it, because it is very difficult to build a computing device using mechanical components. Once we discovered how to store information and send signals using electricity, it did not take too long for computing technology to flourish. Is it a coincidence that both humans and computers process and communicate information using electrical signals? Maybe, based on the physical laws of this universe, that is the best way to build a computing device.

Let me mention yet another connection between humans and computers. Noam Chomsky is a famous linguist who in 1956 developed a mathematical model for grammars that define the syntax of natural languages. It turns out the structure of sentences in natural languages can be described using recursive rules such as: "a sentence is a noun-phrase followed by a verb-phrase," and "a verb-phrase is either a verb or a verb followed by a sentence." (Notice that a sentence is defined by referencing a sentence again; this is why these types of rules are called recursive.) The formalism Noam Chomsky developed for explaining the grammatical structures of natural languages has been tremendously influential in computer science. It has been especially influential in the development and specification of programming languages, as one might expect. Interestingly, it has also been very influential as a model of computation. It turns out that the recursive rules our brains use in constructing sentences can be a very effective way to describe computation. And the most general model for grammars can actually describe any computation that can be executed in the most general computing model. So, while our brains are constructing sentences, they are doing a type of computation.
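To show what such recursive rules look like in practice, here is a tiny Python sketch. The grammar and the words in it are invented for illustration; notice how "S" (sentence) appears inside the expansion of "VP" (verb-phrase), just like in the rules above:

```python
import random

# Each symbol maps to a list of possible expansions.
GRAMMAR = {
    "S":  [["NP", "VP"]],               # a sentence = noun-phrase + verb-phrase
    "VP": [["V"], ["V", "S"]],          # a verb-phrase may contain a sentence
    "NP": [["dogs"], ["cats"], ["people"]],
    "V":  [["think"], ["say"], ["sleep"]],
}

def generate(symbol="S", depth=0):
    if symbol not in GRAMMAR:
        return [symbol]                 # a plain word, nothing to expand
    options = GRAMMAR[symbol]
    if depth > 3:                       # cut off recursion so we always stop
        options = options[:1]
    words = []
    for part in random.choice(options):
        words.extend(generate(part, depth + 1))
    return words

print(" ".join(generate()))             # e.g. "cats say people sleep"
```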

I can keep going, listing more connections between humans and computers. But here is my basic point: I see humans as biological computers. In an earlier post I said that the most essential quality of a computer is programmability. I even said that anything that is programmable should be considered to be a computer. Then, in order to assert that humans are biological computers, I need to argue that humans are programmable. And they definitely are. However, the mechanism for programming humans is different from that for programming a computer.

Since we build the computers ourselves, we know exactly how to write a set of instructions that the CPU of a computer can execute, because we know the exact encoding that the CPU uses (we know how it maps instructions to numbers). However, we did not build the human brain; evolution did. And we do not yet know how the human brain encodes instructions, and we do not know how to send it instructions. But we are still able to program humans. Humans are programmed by a process called "learning". I observe this process every day watching my kids. Most of the time they learn by example: they see something and they imitate it (this is why people with kids start to spell out curse words). After enough examples and imitation, they become good at whatever they are learning.

If we had a way of copying the instructions for how to do something, like playing the piano, from the brain of a pianist and then copying them to other people's brains, we would save a lot of practice time. That is what we do with computers: we copy programs from one computer to another, and the computer that gets the copy of the program immediately becomes as good as the previous computer at executing that task. But someone has to write the program first. This is where humans have superiority. You see, the human brain writes the program of how to do something itself, during the process of learning. It somehow figures out the instructions for how to play the piano, how to speak, how to write, how to play tennis, etc., just by observing and imitating. Interestingly, learning as a way of programming has become a very influential topic in computing in recent years. The general area is called "machine learning". The basic idea is to make computers learn how to do something by observing lots of examples. This is how Google Translate works, for example: it uses a machine learning algorithm that figures out how to translate a sentence from one language to another by looking at existing translation examples on the web.

The basic idea that computers can in some ways imitate humans has a long history. There is an area of study in computer science called "artificial intelligence" which has been investigating how to make computers "intelligent" like humans since the 1950s. However, artificial intelligence does not sound very futuristic anymore, since computers have been able to beat the best human players in chess and Jeopardy! The problem with the term "artificial intelligence" is that it seems very hard to define what intelligence is. We tend to call people who are good at analytical reasoning and math intelligent. Computers are much better than humans at some analytical and mathematical tasks, so one could claim that computers are very, very intelligent. But they do lack something. They do not have consciousness. Although consciousness is yet another thing that is hard to define clearly, I personally consider it to be the most significant difference between humans and computers. Humans are aware of their existence, whereas computers are not. Is this something humans are born with? Or is it something they learn? Is it something that can be programmed into a computer? Or is it something computers can learn in some way, maybe using machine learning? Time will tell. But if a computer gains consciousness someday somehow, I bet turning its power off will become a tough decision.

Wednesday, May 2, 2012

Recipes for Computing


I love Turkish food (try it and you will see why). Unfortunately, there are very few Turkish restaurants in the US, and there isn't any in Santa Barbara. So, when I want to eat a Turkish dish, I have to make it myself. Luckily, I have several Turkish cookbooks that contain recipes for most of my favorite dishes. Without those recipes I would need to fly back to Turkey to eat the food I like (and that is not a short flight).

But why am I talking about cooking and recipes? Because in this post I want to talk about a concept computer scientists call "algorithm" and the best analogy I can find is that an algorithm is like a recipe.

Let's first think about what a recipe is. A recipe is a description of how to take a bunch of ingredients and process them to generate the output you want: something to eat. So, the first thing a recipe describes is the list of ingredients, which are the input to the cooking process. Then, the recipe gives you step by step instructions on how to process the ingredients. At some point the recipe tells you that the food is ready to eat, and that is when the cooking process ends.

An algorithm is a recipe for computing. An algorithm describes how to take an input, say a set of numbers, and transform them into an output, say another set of numbers. Like a recipe, the algorithm gives step by step instructions on how the input (the ingredients) are processed to produce the output.

Using algorithms you can write programs, but an algorithm is not a program. You cannot take an algorithm and run it on a computer. You see, the algorithms are not written for computers, they are written for people. A programmer has to take the algorithm and translate it to a language that a computer can understand (we call those programming languages). In fact, there are books on algorithms and, like cookbooks contain recipes for cooking, algorithm books contain recipes for computing that programmers use to build their programs. Depending on the type of computing you want to do, you can open one of these algorithm books and pick a recipe. Then, you can translate that recipe to the computer's language to do the computing.

It is time for an example. Assume that I want to do the following computation: Given a set of numbers, I would like to produce a list that lists the given numbers in ascending order. In other words, I want to sort a set of numbers using a computer. How can I do it? Well, many computer scientists already thought about this and they came up with many sorting algorithms. If you pick one of them and write it in a programming language, you will end up with an implementation of the sorting computation. (The task of translating the algorithm to a language that computers understand is called implementing the algorithm).

It turns out that the best way to sort a list of numbers using computers is the following: first, divide the given list into a bunch of lists that are as small as possible. Well, the smallest possible list is a list of size one, so we will first divide the input list into a bunch of lists of size one. Here is the nice thing: a list of size one is always sorted! Now start merging the lists. The key is, you can assume that you are always merging sorted lists, since we start with lists of size one, which are always sorted. And it is not difficult to merge two sorted lists to obtain a merged sorted list. You can just go over the two lists together, always removing the next smallest element (which is going to be at the front of one of the lists) and adding it to the merged list. If you keep merging the lists, eventually you will obtain one sorted list, which is what we wanted. The algorithm that describes this process is called the merge sort algorithm.
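The description above translates almost line by line into code. Here is a minimal sketch of merge sort in Python:

```python
def merge(left, right):
    # Merge two already-sorted lists by repeatedly taking the
    # smaller front element.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # one of these two is already empty
    merged.extend(right[j:])
    return merged

def merge_sort(numbers):
    # A list of size one (or zero) is already sorted.
    if len(numbers) <= 1:
        return list(numbers)
    mid = len(numbers) // 2
    # Split, sort each half, then merge the two sorted halves.
    return merge(merge_sort(numbers[:mid]), merge_sort(numbers[mid:]))

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```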

But why do I think that this is the best way to sort a list of numbers? How can one know if an algorithm is better than another algorithm? This is a question that computer scientists study a lot. Let me make it more clear. Assume that there are two algorithms that produce identical outputs for the same input. So, in terms of getting the computation done, they are both good. How can I say one is better than the other if they can both get the job done? It is like having two recipes for cooking a pie where both recipes use identical ingredients and produce identical pies. So, then, which recipe would you prefer? I don't know about you but I would prefer the recipe that gets me the pie as fast as possible. In comparing algorithms, we think the same way, we prefer the algorithms that get the computation done as fast as possible.

There is a problem with this. As I said before, algorithms do not run on computers, so we cannot measure the time it takes to get the computation done on a computer without implementing the algorithm. Can we measure how fast an algorithm is without implementing it and running it on a computer? Well, I can figure out how long it will take me to cook with a recipe without actually cooking with it. I can count the number of steps required and add up the length of each step to get the total time. If I am comparing two recipes, I can count how many steps each recipe requires and pick the recipe with the smallest number of steps. But the number of steps required for cooking can change based on how many people are invited to dinner. A recipe may require me to repeat certain steps for each person I am cooking for (like seasoning each steak separately), and a recipe can also contain certain steps that do not depend on how many people I am cooking for (like heating the grill). So, depending on the number of people we are cooking for, different recipes could be faster. But if you want to cook for a large number of people, you are likely to pick the recipe that requires the least amount of work per person.

We use the same basic ideas in comparing algorithms. For each algorithm we count the number of steps required, and then we pick the algorithm with the smallest number of steps. When we are counting the steps in an algorithm, we divide the algorithm into small enough steps that the length of each individual step is the same. As with recipes, the number of steps required can depend on the size of the task. For example, for the sorting algorithm, the number of steps required will depend on the size of the input list. In studying an algorithm, computer scientists first figure out the relation between the input size and the number of steps the algorithm requires. When choosing an algorithm, we typically pick the one that requires the least amount of work for each input. Such an algorithm will produce the fastest program when a programmer translates it into a program.
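The idea of counting steps instead of using a stopwatch can be tried out directly. Here is a Python sketch that counts the comparisons made by a simple sorting method (selection sort, chosen just for illustration); for a list of n numbers it always makes n*(n-1)/2 comparisons, so the count depends only on the input size, not on which computer runs it:

```python
def selection_sort_steps(numbers):
    # Sorts a copy of the list and returns how many comparisons it made.
    data, steps = list(numbers), 0
    for i in range(len(data)):
        smallest = i
        for j in range(i + 1, len(data)):
            steps += 1                      # one comparison = one "step"
            if data[j] < data[smallest]:
                smallest = j
        data[i], data[smallest] = data[smallest], data[i]
    return steps

print(selection_sort_steps(list(range(10))))   # 10 * 9 / 2 = 45
```

Doubling the input size roughly quadruples this count, which is exactly the kind of input-size-versus-steps relation computer scientists study.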

There is another important criterion one can use in choosing an algorithm. Assume that you have a small kitchen, and what you really care about when choosing a recipe is the number of pots and pans you will need to get the cooking done. Since you do not have much space in your kitchen, you want the recipe that will let you cook using the smallest amount of space. And this may be more important to you than the amount of time it takes to cook. The analogous concept in computing is the amount of intermediate results one needs to store while computing. The space required for storing the intermediate results could be very large, and the computer we plan to use may not have a lot of memory. If that is the case, one may prefer an algorithm that requires more steps as long as it requires less memory.

Algorithms are a very important body of knowledge for computer scientists. They save us from reinventing the wheel every time we write a program. When we are faced with a computing task, we first ask ourselves if we can use some existing algorithm to solve that task. Actually, a single program is likely to use many algorithms, not just one. Typically there are many subtasks a program has to do and each one may require a different algorithm. But there will be times when you will be faced with a task without a known algorithm. Then you have to make up your own. It is like creating your own recipe for your own dish. The knowledge of existing algorithms is very helpful when you are making up your own algorithm. There are some basic approaches to developing an algorithm, like different cuisines for cooking, and based on the problem you are facing you can use one of the existing approaches to develop your own algorithm. The more styles of cooking you know the more likely you will be able to develop the best recipe for the occasion. And, while developing your recipe you may want to pay attention to how long it is going to take to cook the dish and how much space you are going to need to cook it.

Like creating your own recipe, creating your own algorithm is a very creative activity and, in my view, it is one of the most fun activities for a computer scientist. You may base your algorithm on some existing styles, but in the end the algorithm you come up with is your own recipe. And if it is really good it may end up in a cook/algorithms book one day and other people can cook/program with it!