
An Introduction to Computer Science


Reader, this is Computer Science. Computer Science, this is the Reader. Well, now that you've met, I'm sure that you will both be good friends. First, however, I think you'll want to know something about each other.

Theoretical Computer Science has its roots in mathematics, particularly in the study of logic. The practical side began with Pascal in the 1600s and Babbage in the 1800s, who tried to build computing machines that would help in calculating arithmetic. Some of them actually worked, but they were mechanical machines built on physics, without a real theoretical foundation. Also in the 1800s, a man named George Boole tried to formulate a mathematical form of logic. It was eventually called Boolean logic in his honor, and we still use it today at the heart of all computer hardware: the transistors and gates you see on a circuit board are really just physical representations of what George Boole came up with.

Computer Science, however, hit its golden age with John von Neumann and Alan Turing in the 1900s. Von Neumann formulated the organization of computers that is still used today as the heart of nearly all computer design: the separation of the CPU, the memory, the bus, and so on. This is all known collectively as the von Neumann architecture. Alan Turing, however, is famous for the theoretical side of Computer Science. He invented the Universal Turing Machine, a model that tells us exactly what can and cannot be computed by machines like the computers of today. This formed the basis of Theoretical Computer Science.

Ever since Turing formulated this extraordinary concept, Computer Science has been dedicated to answering one question: "Can we compute this?" This question is known as computability, and it is one of the core disciplines of Computer Science. Another form of the question is "Can we compute this better?" This leads to more complications, because what does "better" mean? So Computer Science is partly about finding efficient algorithms to do what you need. Still other branches of Computer Science answer related questions such as "Can we compute thought?", which leads to fields like Artificial Intelligence. Computer Science is all about getting things done, finding progressive solutions to our problems, filling gaps in our knowledge. Sure, Computer Science may involve some math, but it is different from math.

In the end, Computer Science is about exploring the limitations of humans and expanding our horizons.
Fields of Computer Science

Computer science is often said to be neither a science nor about computers. There is certainly some truth to this claim: computers are merely the device upon which the complex and beautiful ideas in computer science are tested and implemented. And it is hardly a science of discovery, as physics or biology might be, so much as it is a discipline of mathematics or engineering. But this all depends on which branch of computer science you are involved in, and there are many: theory, hardware, networking, graphics, programming languages, software engineering, systems, and of course, AI.

Theory

Computer science theory is often highly mathematical, concerning itself with questions about the limits of computation. Some of the major results in CS theory describe what can be computed and how fast certain problems can be solved. Some things are simply impossible to figure out! Other things are merely difficult, meaning they take a long time. The long-standing question of whether "P = NP" lies in the realm of theory. A subsection of theory is algorithm development. For instance, theorists might work to develop better algorithms for graph coloring, and theorists have been involved in improving the algorithms used by the Human Genome Project to produce faster predictions of DNA similarity. Cryptography is another booming area of the theory side of computer science, with applications from e-commerce to privacy and data security. This work usually involves higher-level mathematics, including number theory. Even given all of the work in the field, algorithms such as RSA encryption have yet to be proven totally secure. Work in theory even includes some aspects of machine learning, such as developing new and better learning algorithms and coming up with bounds on what can be learned and under what conditions.

Hardware

Computer hardware deals with building circuits and chips. Hardware design lies in the realm of engineering and covers topics such as chip architecture, but also more general electrical-engineering-style circuit design.

Networking

Networking covers topics dealing with device interconnection and is closely related to systems. Network design deals with anything from laying out a home network to figuring out the best way to link together military installations. Networking also covers a variety of practical topics such as resource sharing and creating better protocols for transmitting data in order to guarantee delivery times or reduce network traffic. Other work in networking includes algorithms for peer-to-peer networks to allow resource detection, scalable searching of data, and load balancing to prevent nodes from exploiting or damaging the network. Networking often relies on results from theory for encryption and routing algorithms, and on results from systems for building efficient, low-power network nodes.

Graphics

The field of graphics has become well known for its work in making amazing animated movies, but it also covers topics such as data visualization, which makes it easier to understand and analyze complex data. You may be most familiar with the work in computer graphics because of the incredible strides that have been made in creating graphical 3D worlds!

Programming Languages

Programming languages are at the heart of much work in computer science; most non-theory areas depend on good programming languages to get the job done. Programming language work focuses on several topics. One area is optimization: it's often said that it's better to let the compiler figure out how to speed up your program instead of hand-coding assembly, and these days that's probably true, because compiler optimizations can do amazing things. Proving program correctness is another aspect of programming language study, which has led to a class of "functional" programming languages. Much recent work has focused on optimizing functional languages, which turn out to be easier to analyze mathematically and prove correct, and also sometimes more elegant for expressing complex ideas in a compact way. Other work in programming languages deals with programmer productivity, such as designing new language paradigms, building better implementations of current paradigms (one could see Java as an example of a cleaner object-oriented implementation than C++), or simply adding new features to languages, such as garbage collection or the ability to create new functions dynamically, and studying how this improves the programmer's productivity. Recently, language-based security has become more interesting, raising questions of how to design "safer" languages that make it easier to write secure code.

Software Engineering

Software engineering relies on some of the work from the programming languages community and deals with the design and implementation of software. Often, software engineering covers topics like defensive programming, in which the code includes apparently extraneous work to ensure that it is used correctly by others. Software engineering is generally a practical discipline, with a focus on designing and working on large-scale projects. As a result, appreciating software engineering practices often requires a fair amount of actual work on software projects. It turns out that as programs grow larger, the difficulty of managing them increases dramatically, and sometimes in unexpected ways.

Systems

Systems work deals, in a nutshell, with building programs that use a lot of resources and profiling that resource usage. Systems work includes building operating systems, databases, and distributed computing, and can be closely related to networking. For instance, some might say that the structure of the Internet falls into the category of systems work. The design, implementation, and profiling of databases is a major part of systems programming, with a focus on building tools that are fast enough to manage large amounts of data while still being stable enough not to lose it. Sometimes work in databases and operating systems intersects in the design of file systems to store data on disk for the operating system. For example, Microsoft has spent years working on a file system based on the relational database model. Systems work is highly practical, focused on implementation and on understanding what kinds of usage a system will be able to handle. As such, systems work can involve trade-offs that require tuning for the common usage scenarios rather than creating systems that are extremely efficient in every possible case. Some recent work in systems has focused on solving the problems associated with large-scale computation (distributed computing) and making it easier to harness the power of many relatively slow computers to solve problems that are easy to parallelize.

Artificial Intelligence

Last, but not least, is artificial intelligence, which covers a wide range of topics. AI work includes everything from planning and searching for solutions (for instance, solving problems with many constraints) to machine learning. There are areas of AI that focus on building game-playing programs for chess and go. Other planning problems are of more practical significance; for instance, designing programs to diagnose and solve problems in spacecraft or medicine. AI also includes work on neural networks and machine learning, which is designed to solve difficult problems by allowing computers to discover patterns in a large set of input data. Learning can be either supervised, in which case there are training examples that have been classified into different categories (for instance, written numerals classified as being the numbers 1 through 9), or unsupervised, in which case the goal is often to cluster the data into groups that appear to have similar features (suggesting that they all belong to the same category). AI also includes work in the field of robotics (along with hardware and systems) and multiagent systems, focused largely on improving the ability of robotic agents to plan courses of action or strategize about how to interact with other robots or with people. Work in this area has often focused on multiagent negotiation and on applying the principles of game theory (for interacting with other robots) or behavioral economics (for interacting with people). Although AI holds out some hope of creating a truly conscious machine, much of the recent work focuses on solving problems of more obvious importance. Thus, the applications of AI to research, in the form of data mining and pattern recognition, are at present more important than the more philosophical topic of what it means to be conscious. Nevertheless, the ability of computers to learn using complex algorithms provides clues about the tractability of the problems we face.

Related articles

Starting out in Computer Science is an article about things to focus on when learning computer science.

How to Study Computer Science

If you're an aspiring computer science student or someone who wants to switch fields into CS, you're in luck: there's a lot of information available on the Internet. CS is a large and rapidly expanding field; once you've become confident in your ability to program moderate-sized projects, a lot of topics open up to you. But what do you really need to learn about, and what don't you? A lot of it depends on what you want to get out of your study. If you want to become a researcher, you'll most likely need to know more of the theory of computer science than if you want to become a programmer. There are, however, some basic skills that will help nearly everyone in the field.

Learn Multiple Programming Languages

No matter what you want to do in computer science, you'll likely do some of it by writing computer programs. Not all languages are created equal, but most of them have some strengths. You'll want to learn a systems language like C or C++. This will give you several advantages: first, you'll understand memory allocation; second, you'll understand more about how the system is designed; and finally, you'll be able to communicate with other programmers more easily. (You can see this article for more details on the advantages of learning C.) You'll also want to learn a more flexible language for daily chores; for instance, a scripting language like Perl or Ruby will help you quickly create interesting programs and test ideas. Finally, once you've mastered a language or two, expand your horizons with a functional language like Scheme, ML, or Haskell. This will improve your understanding of programming languages and broaden your sense of the possibilities. A key thing to remember when learning new languages is that all general-purpose languages offer the same power -- you can do anything in one language that you can in another -- but some languages make it easier to do certain things. For instance, if you want to read data from a text file, Perl is a great language. If you want to write an AI engine, then you might be better off with Scheme.

Learn to Design

Whether you want to work as a software engineer or as an academic, you're going to have to design programs in some form or another. Learning good design principles early will make your life easier. The key thing to remember about design is that the goal is to catch problems before you've committed to a solution that won't let you fix them. You don't have to do all of your design up front, but if you don't, then you'll want to leave more flexibility for later. Of course, some amount of design is absolutely crucial, or you'll simply have no idea what should be flexible and what can be hard-coded. Overly modular designs can be as deadly and difficult to maintain as extremely inflexible designs. One way of looking at the issue is that modularity is powerful because it makes it easier to replace a bad idea with a good one. But if you already know what the good idea is going to be, then modularity doesn't help you, and because it takes more effort, it can hurt you. A good way to learn design is to practice on well-known systems projects, like writing an interpreter or web server. These kinds of projects have the advantage of having well-known implementations that you can look at once you realize that there are problems with your original design. Whatever you design, you definitely want to implement at least parts of your designs, or you'll never really come to understand the drawbacks of your ideas. It's running into these drawbacks that will teach you the most.

Learn Basic Algorithms and Data Structures

There are a variety of important algorithms and data structures that are the lingua franca of computer science. Everyone needs to know what a linked list or a binary tree is, because these structures get used all the time. Perhaps just as important are fundamental algorithms like binary search, graph searching algorithms, sorting algorithms, and tree-based searches such as minimax. These algorithms and their variants show up a lot, and people will generally expect that any computer scientist understands how they work.

Learn Basic Theory

There are a couple of things you should definitely be aware of: you should understand how to represent numbers in different bases and how to manipulate boolean expressions using boolean logic. Both of these tools will come in handy in a variety of circumstances, especially when reading other people's code or trying to clarify your own. (Boolean logic is especially useful for formulating clear conditional statements.) Even more important, you should have a good sense of the limits of current computers. In particular, it really helps to understand the ideas behind algorithmic efficiency and Big-O notation. Knowing these topics makes it clearer why certain programs and algorithms will take a very long time to run, and how to spot them. It also makes it significantly easier to optimize your program when you know which algorithms to choose. Finally, it's useful to have a sense of the absolute limits of what computers can do; it turns out that there are just some things that are not possible for a computer to do. Sometimes it can be tempting to try to write a program that does just that!
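As a small, concrete illustration of both points above -- a fundamental algorithm and the Big-O reasoning behind it -- here is a minimal binary search sketch in C++ (the language this site's tutorials use). The function name and the little test in main are my own illustrative choices, not anything prescribed in the text. Binary search runs in O(log n) time because each step halves the remaining range, versus O(n) for a straight linear scan.

    #include <iostream>
    #include <vector>

    // Returns the index of `target` in the sorted vector `data`, or -1 if absent.
    // Each step halves the search range, so the running time is O(log n).
    int binarySearch(const std::vector<int>& data, int target) {
        int low = 0;
        int high = static_cast<int>(data.size()) - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // avoids overflow of (low + high)
            if (data[mid] == target) {
                return mid;
            } else if (data[mid] < target) {
                low = mid + 1;                  // discard the lower half
            } else {
                high = mid - 1;                 // discard the upper half
            }
        }
        return -1;                              // not found
    }

    int main() {
        std::vector<int> sorted = {2, 3, 5, 7, 11, 13, 17};
        std::cout << binarySearch(sorted, 11) << "\n";  // prints 4
        std::cout << binarySearch(sorted, 4)  << "\n";  // prints -1
        return 0;
    }

Note that the algorithm only works on data that is already sorted -- which is exactly the kind of precondition you learn to watch for when studying these fundamentals.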

What we Cannot Know: the Halting Problem
(By Alexander Allain of Cprogramming.com: Your Resource for C/C++ Programming)

One of the most interesting non-programming parts of computer science is the study of what can (and cannot) be computed. For instance, take the question, "Does this program complete?" That is, will it eventually stop, or will it go into an infinite loop? How would you answer this question, given an arbitrary piece of code? You could try running it. But what if it takes a long time? How long are you willing to wait? Think about whether there is a general solution to this problem: a method that you could apply to any piece of C code in order to demonstrate that it will eventually come to a stop. Let's say that you discovered such a solution. It's a program that takes two arguments: the code for a program, and its input. This solution returns true if the program halts on the given input, and false if the program runs forever. Let's call this program DOES-HALT. We can use DOES-HALT as follows:

    DOES-HALT(program, input)

Now, given DOES-HALT, let's see what kinds of cool things we can do. Well, we can pass in the code of any program that's been running for a long time and make sure that it's still working. This could certainly prove useful for complex programs implementing a great deal of recursion or with complicated loop conditions.

We could also use DOES-HALT to quickly compute whether or not a program passes its test cases. This is a little bit trickier. How could we use DOES-HALT to determine if a program produces the correct output for a specific input? Keep in mind that DOES-HALT tells us just one thing -- whether a program halts. So what if we constructed a second program, let's call it COMPARE-OUTPUT, that would take three arguments: a program, the input to the program, and the expected output. Here's the key: COMPARE-OUTPUT will halt when the output of the program is the same as the expected output, and it will go into an infinite loop otherwise. Now, all we need to do is run DOES-HALT(COMPARE-OUTPUT, [program, input, expected output]) to know whether or not the program passes the test case -- that is, whether it produces the expected output on that input. If it halts, it does; otherwise, it doesn't! Think of the time such a program might save us when testing complex algorithms!

Let's look at some of the other consequences. In particular, what if we decided to write a program that tests what happens when we run a program on itself as input? For instance, the *nix program cat, when run on itself, would output its own binary executable file. Some programs might never halt when run on themselves, though -- so let's use DOES-HALT to write pseudo-code for a program that checks what happens when a program is given itself as input. Let's call it SELF-HALT, and SELF-HALT will halt if the input program would *not* halt on itself.

    SELF-HALT(program)
    {
        if (DOES-HALT(program, program))
            infinite loop
        else
            halt
    }

This code is pretty straightforward: if the program would halt on itself, then SELF-HALT goes into an infinite loop. Otherwise, it halts. This is pretty nifty, because we can use it to tell whether a program that is designed to analyze other programs (for instance, DOES-HALT) will actually halt when given itself as input.

In fact, what if we use SELF-HALT to analyze itself? Well, let's see. SELF-HALT(SELF-HALT) should loop forever if DOES-HALT(SELF-HALT, SELF-HALT) is true. So if SELF-HALT halts on itself, it should loop forever. That doesn't make sense, so DOES-HALT(SELF-HALT, SELF-HALT) must be false, and SELF-HALT(SELF-HALT) must not halt. But if DOES-HALT(SELF-HALT, SELF-HALT) is false, SELF-HALT(SELF-HALT) must halt! A contradiction.

So where does this leave us? There's nothing inherently wrong with our SELF-HALT program; its structure is just fine. Everything we pass as arguments is perfectly reasonable as well. In fact, the only thing that looks at all questionable is this DOES-HALT program we've been using. And indeed, the above argument is essentially a proof that the halting problem, as it is termed, cannot be solved in the general case. No DOES-HALT program exists. If it did, we would be able to generate contradictions such as the one above: a program that halts when it should loop forever, and that loops forever when it should halt.
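To connect the pseudo-code above to the C/C++ world this site usually lives in, here is a sketch of the same construction. The function does_halt below is hypothetical -- the whole point of the argument is that no such function can ever be written -- so this is illustration of the proof's structure, not working analysis code.

    #include <string>

    // Hypothetical oracle: returns true if `program`, run on `input`, eventually halts.
    // The halting-problem argument shows that no such function can exist in general.
    bool does_halt(const std::string& program, const std::string& input);

    // SELF-HALT from the article: halts exactly when `program` would NOT halt on itself.
    void self_halt(const std::string& program) {
        if (does_halt(program, program)) {
            for (;;) { }   // loop forever if the program would halt on itself
        }
        // otherwise simply return, i.e. halt
    }

    // Feeding SELF-HALT its own source produces the contradiction described above:
    // if does_halt(self_halt, self_halt) is true, self_halt(self_halt) loops forever,
    // so it does not halt on itself; if it is false, self_halt(self_halt) halts.
    // Either way does_halt gives the wrong answer, so it cannot exist.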

Understanding Different Base Systems

This essay is targeted at new students of computer programming or computer science who want to understand how base two (binary), base eight (octal), and base sixteen (hexadecimal) work. First of all, it's important to realize that each of these base systems is just another way of writing down the same number. When you convert a number between different bases, it should still have the same value. In this essay, when I want to refer to the actual value of a number (regardless of its base), I'll do it in base 10, because that's what most people are used to.

It's generally easiest to understand the concept of different bases by looking at base 10. When we have a number in base 10, each digit can be referred to as the ones digit, the tens digit, the hundreds digit, the thousands digit, and so forth. For instance, in the number 432, 4 is the hundreds digit, 3 is the tens digit, and 2 is the ones digit. Another way to think about this is to rewrite 432 as

    4 x 10^2 + 3 x 10^1 + 2 x 10^0

Each digit is multiplied by the next power of ten. Numbers in other bases, such as base 16, are merely numbers where the base is not ten! For instance, we could interpret 432 as though it were in base 16 by evaluating it as

    4 x 16^2 + 3 x 16^1 + 2 x 16^0

This would be the same as the number 1074 in base 10. So to convert a number from a given base into base 10, all we need to do is treat each place as a power of the given base times the value of the digit in that place.

Note that customarily for a given base, only digits from 0 to the base minus one are used. For instance, in decimal, we only use the digits 0 through 9. That's because we don't need any more digits to express every possible number. (But we do need at least that many; if we only had 8 digits, how would we ever express the value 9?) Bases greater than 10, on the other hand, require more than 10 possible digits. For instance, the number 11 in base ten can be expressed in base 16 with only a single digit, because the ones place in base 16 can range from 0 to 15. Since we only have 10 digits, the letters A through F are used to stand for the "digits" 10 through 15. So, for instance, the hexadecimal number B stands for the decimal number 11.

Bases less than ten require fewer digits; for instance, binary, which works using powers of two, only needs two digits: one and zero. The binary number 1001, for instance, is the same as writing

    1 x 2^3 + 0 x 2^2 + 0 x 2^1 + 1 x 2^0

which comes out to the decimal value 9. Numbers written in octal use a base of 8 instead of 2 or 16. See if you can figure out what the number 20 written in octal would be in base ten.

Because octal, hexadecimal, and decimal numbers can often share the same digits, there needs to be some way of distinguishing between them. Traditionally, octal numbers are written with a leading 0; for instance, 020 is the same thing as the number 20 in base 8. Hexadecimal numbers are written with the prefix "0x". So 0x20 would be the number 20 in base 16; we'd interpret it the same as the decimal number 32.

Converting from binary to octal or hexadecimal

It turns out that when you wish to convert from binary to octal or hexadecimal, there is a very easy trick that you can use. I'll give you the one for octal, and let you puzzle out the hexadecimal version (which may come quite naturally to some of you). To convert from binary to octal, all you need to do is group the binary digits into sets of three and convert each group into the corresponding octal digit. For instance, given the binary number 010011110, you would group 010, 011, and 110 together. 010 is 2, 011 is 3, and 110 is 6. So the octal number is 0236.

So why exactly does this work? Well, let's take a look at what 010011110 looks like:

    0 x 2^8 + 1 x 2^7 + 0 x 2^6 + 0 x 2^5 + 1 x 2^4 + 1 x 2^3 + 1 x 2^2 + 1 x 2^1 + 0 x 2^0

That's actually the same as

    (0 x 2^2 + 1 x 2^1 + 0 x 2^0) x 2^6 + (0 x 2^2 + 1 x 2^1 + 1 x 2^0) x 2^3 + (1 x 2^2 + 1 x 2^1 + 0 x 2^0) x 2^0

Whoa! First, notice that each group of three digits is multiplied by a power of 8: 2^3 is 8, and 2^6 is 64. So for each group of three digits, the base increases by a factor of 8. Moreover, look inside each group: it can sum to at most 7 (since 2^0 + 2^1 + 2^2 = 1 + 2 + 4 = 7, and each binary digit just decides whether its power of two is included in the sum or not). That's exactly the same as having eight digits, 0 through 7, where each digit is multiplied by a power of eight -- in other words, each group of three binary digits is an octal digit! Knowing this, can you come up with the way to do the same thing for hexadecimal numbers?
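To make this concrete, here is a small sketch in C++ (the language used elsewhere on this site); the function name is my own illustrative choice, not part of the essay. It converts a decimal value into its digit string in any base from 2 to 16 by repeatedly dividing and keeping the remainders, and the test output shows the group-of-three relationship between binary and octal described above.

    #include <iostream>
    #include <string>

    // Converts a non-negative value to its digit string in the given base (2..16)
    // by repeatedly taking the remainder modulo the base.
    std::string toBase(unsigned int value, unsigned int base) {
        const char digits[] = "0123456789ABCDEF";
        if (value == 0) return "0";
        std::string result;
        while (value > 0) {
            result = digits[value % base] + result;  // prepend the next digit
            value /= base;
        }
        return result;
    }

    int main() {
        std::cout << toBase(158, 2)   << "\n";  // 10011110 (binary)
        std::cout << toBase(158, 8)   << "\n";  // 236      (octal: groups 010 011 110)
        std::cout << toBase(158, 16)  << "\n";  // 9E       (hexadecimal: groups 1001 1110)
        std::cout << toBase(1074, 16) << "\n";  // 432, matching the example above
        return 0;
    }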

Efficiency and the Space-Time Continuum

A lot of computer science is about efficiency. For instance, one frequently used mechanism for measuring the theoretical speed of algorithms is Big-O notation. What most people don't realize, however, is that there is often a trade-off between speed and memory, or, as I like to call it, time and space. Think of space efficiency and time efficiency as two opposite ends of a band (a continuum). Every point in between the two ends trades some of one for some of the other: the more time efficiency you have, the less space efficiency you have, and vice versa.

Sorting algorithms illustrate the trade-off nicely. Algorithms like Quicksort and Mergesort are exceedingly fast, but need extra space beyond the array itself: Mergesort uses a scratch buffer proportional to the input, and Quicksort uses room for its recursion. On the other side of the spectrum, Bubble Sort is exceedingly slow, but takes up the minimum of space. Heap Sort strikes a very good balance between space and speed: the heap itself takes up about the same space as an array, and yet the speed of the sort is in the same order of magnitude as Quicksort and Mergesort (although it is slower on average than the other two). Heap Sort has the additional benefit of being quite consistent in its speed, so it is useful in programs where timing is crucial (for instance, networking).

For data trees, 2-3 trees and 2-3-4 trees are faster and more balanced than ordinary binary search trees, but they take up considerably more space because their nodes usually carry unused keys and child pointers. The Red-Black tree is a compromise between space and time: it is basically a binary tree representation of a 2-3-4 tree, so it takes up less space than the 2-3-4 tree (it doesn't have all of those empty fields), but it still retains the search efficiency of the 2-3-4 tree!

Thus, there has to be a balance in the space and time aspects of computing. Most of the research in Computer Science these days is devoted to time efficiency, particularly the theoretical time barrier of NP-complete problems (like the Traveling Salesman Problem). These days memory is cheap, and storage space almost seems to be given away. With networking and robotics, however, the necessity of a balance becomes apparent. Often, memory on these machines is scarce, as is processing power. Memory has to be used conservatively, or network servers could stall until an operation is finished. Robots often have to function with the limited resources installed on their own structures, and thus many times they do not have memory to spare for the sake of computing speed. In these situations, a compromise must be made.

With networking, this issue shows up mainly in topology. Network topology is basically a description of how the physical connections of a network are set up. Maybe you know the term "daisy chain": that is a kind of network topology. A daisy chain (in which each computer is connected to the next in one long chain) uses the minimum of cables, but the connections are relatively slow, because if the computer on one end tries to send a signal to a computer at the other end, it must first go through every computer in between. On the other hand, if every computer were connected to every other computer (called a "fully connected mesh network"), signals would be fast, but you would use a lot more cable, and the network may become hard to maintain. So, in this case, space corresponds to the number of connections in a network, while time refers to the speed at which signals travel across the network.

Thus, although it may seem a trivial issue, it is really quite important, even now, to have efficiency in both space and time. Of course, the type of compromise made depends on the situation: for most programmers, time is of the essence, while in environments where memory is scarce, space is the issue. Maybe someday we'll be able to find algorithms that are extremely efficient in both speed and memory -- bridges in the Space-Time continuum.
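As an illustration of the in-place end of this trade-off, here is a minimal heap sort sketch in C++ using the standard library's heap operations (std::make_heap and std::sort_heap); the wrapper function and the test data are my own choices. It rearranges the elements inside the vector it is given, with no auxiliary buffer, unlike a typical merge sort, which needs a second array of the same size.

    #include <algorithm>
    #include <iostream>
    #include <vector>

    // Sorts the vector in place: build a max-heap over the elements, then
    // repeatedly move the largest remaining element to the end of the range.
    // No extra array is allocated, which is heap sort's space advantage.
    void heapSort(std::vector<int>& data) {
        std::make_heap(data.begin(), data.end());  // O(n) heapify
        std::sort_heap(data.begin(), data.end());  // n pops, O(n log n) total
    }

    int main() {
        std::vector<int> numbers = {9, 4, 7, 1, 8, 2};
        heapSort(numbers);
        for (int n : numbers) {
            std::cout << n << ' ';  // prints: 1 2 4 7 8 9
        }
        std::cout << '\n';
        return 0;
    }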

Recursion and Iteration

Recursion and iteration are two very commonly used, powerful methods of solving complex problems, directly harnessing the power of the computer to calculate things very quickly. Both methods rely on breaking up the complex problem into smaller, simpler steps that can be solved easily, but the two methods are subtly different.

Iteration, perhaps, is the simpler of the two. In iteration, a problem is converted into a train of steps that are finished one at a time, one after another. For instance, if you want to add up all the whole numbers less than 5, you would start with 1 (in the first step), then (in step 2) add 2, then (step 3) add 3, and so on. In each step, you add another number (which is the same as the number of the step you are on). This is called "iterating through the problem." The only part that really changes from step to step is the number of the step, since you can figure out all the other information (like the number you need to add) from that step number. This is the key to iteration: using the step number to find all of your other information. The classic example of iteration in languages like BASIC or C++, of course, is the for loop.

If iteration is a bunch of steps leading to a solution, recursion is like piling all of those steps on top of each other and then squashing them all into the solution. Recursion is like holding a mirror up to another mirror: in each image, there is another, smaller image that's basically the same.

This is best explained with an example. How would you tell a computer to see if someone is a descendant of Genghis Khan? Perhaps the algorithm to define who is and isn't a descendant of Genghis Khan would look like this: a person is an heir to Genghis Khan if his or her father is named Genghis Khan, or if his or her father or mother is an heir to Genghis Khan.

Wait, haven't we violated a basic rule that we learned in elementary school: not to use a word (or in this case, a phrase) in its own definition? Well, we aren't exactly using circular logic here, since we aren't saying "an heir to Genghis Khan is an heir to Genghis Khan." Let's trace this definition out a bit. Take Jim. He doesn't know it, but he's the direct great-grandson of Genghis Khan, through a series of male heirs. Since Jim's father isn't named Genghis Khan, we have to see if his father is an heir... so we just try the definition again on his father, just like we did with Jim. Jim's father's father (his grandfather) isn't named Genghis Khan, so we have to look at Jim's father's father's father (his great-grandfather). But wait! His great-grandfather is named Genghis Khan. So this means that Jim's grandfather is an heir, which means that his father is an heir, which means that Jim is an heir.

If you notice, we went deeper and deeper into the problem, using the same method over and over until we reached something new, an "endpoint." After this, we sort of "unwound" the problem until we arrived where we began, except with the problem solved. This kind of "self-cloning" technique of recursion leads to a famous programmer's joke: Recursion /ree-ker'-zhon/: See Recursion.

So, the difference between iteration and recursion is that with iteration, each step clearly leads on to the next, like stepping stones across a river, while in recursion, each step replicates itself at a smaller scale, so that all of them combined together eventually solve the problem. These two basic methods are very important to understand fully, since they appear in almost every computer algorithm ever made.
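As a concrete sketch of the two approaches (in C++, with function names of my own choosing), here is the "add up the whole numbers less than 5" example done both ways. The iterative version walks through the steps with a for loop; the recursive version keeps calling a smaller copy of itself until it reaches an endpoint (the base case), then unwinds.

    #include <iostream>

    // Iteration: step through 1, 2, ..., limit-1, adding each one in turn.
    int sumBelowIterative(int limit) {
        int total = 0;
        for (int step = 1; step < limit; ++step) {
            total += step;   // the step number is also the number we add
        }
        return total;
    }

    // Recursion: the sum of numbers below `limit` is (limit - 1) plus the
    // sum of numbers below (limit - 1); stop at the endpoint (base case).
    int sumBelowRecursive(int limit) {
        if (limit <= 1) {
            return 0;        // nothing left to add: this is where we "unwind" from
        }
        return (limit - 1) + sumBelowRecursive(limit - 1);
    }

    int main() {
        std::cout << sumBelowIterative(5) << "\n";  // prints 10 (1 + 2 + 3 + 4)
        std::cout << sumBelowRecursive(5) << "\n";  // prints 10 as well
        return 0;
    }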


General Computer Science
Introduction to Computer Science
The Fields of Computer Science
Starting out in Computer Science
Understanding Binary, Hexadecimal, Octal and Other Base Systems
Efficiency and the Space-Time Continuum
Introduction to Recursion and Iteration
What Computers Cannot Do: The Halting Problem

Lists and Arrays
Lists and Arrays: Introduction
Sorting Algorithms: Introduction
Basic Sorting Algorithms: Bubble Sort, Selection Sort, Radix Sort, Insertion Sort
Advanced Sorting Algorithms: Heap Sort
Binary Search Algorithm
Stacks
Queues

Data Trees
Data Trees: Introduction
Binary Search Trees
Heap Trees and Heap Sort
Minimax Game Trees
2-3 Trees

Code Journal

Code Journal is a free, biweekly newsletter on programming and computer science provided jointly by Cprogramming.com and AI Horizon. Each issue features an article by Alexander Allain, webmaster of Cprogramming.com, and an article by Eric Suh, webmaster of AI Horizon. Read the latest issue of the Code Journal online, or browse through the web archive of past issues. To get pure text versions of the Code Journal, go to the text archive. All articles from the column "Algorithms and Programming" by Eric Suh are also organized by topic in the Essays section of AI Horizon. (Please note that neither Cprogramming.com nor AI Horizon supports unsolicited emailing. Your email address will not be sold or released to advertisers.)

Computer Science and AI Tutorials and Essays

Here you will find essays on a variety of subjects ranging from red-black trees to linked lists. Some of them even include source code that you can download to explore the concepts presented in the essay. You can find a complete list of these source code items at the Source Code Repository. (Please see this note about our source code.)

Basic Computer Science: It is essential to have a good understanding of the basics before learning something such as Artificial Intelligence. These essays provide a smooth transition from simply knowing how to use a programming language to knowing how to program. They make a very good next step up from the C++ tutorials at Cprogramming.com.

General Artificial Intelligence: These essays cover the basics of widely used artificial intelligence topics, such as neural networks and the genetic algorithm. Also covered here is Artificial Life, an interesting sub-field.

Chess Artificial Intelligence: Although some people may say that Chess AI is dead, the field is far from over, and new developments are still arising from research. Still, it is a very well-covered topic and makes for a wonderful way to learn how to apply AI concepts in a specific field.

Go Artificial Intelligence: Those who know of it call go the new frontier of computer strategy. Most of the traditional methods used in chess work poorly in go, making it the subject of exciting new research in vision, human decision-making, pattern matching, neural networks, and the genetic algorithm.

Other Resources

Here you will find various extra resources that will expand your knowledge of Artificial Intelligence. These are here to supplement the essays and help you find out even more about Computer Science, AI, and the Mind.

Source Code Repository: We have many pieces of code that you can study to further your understanding of Artificial Intelligence concepts. The code is fully functional, and we welcome you to use it in your programs. (If you are unfamiliar with C++, please read this note about our site.)

Recommended Books: A listing of some recommended books to read, along with a review of each book and a link to it in an online store. They are great for learning more about Artificial Intelligence.

Internet Links: Finally, an extensive listing of links to other internet resources that have information pertaining to Artificial Intelligence.

Artificial Intelligence

Welcome to the world of Artificial Intelligence programming, one of the newest and most exciting fields of computer science! Artificial Intelligence (AI) is research aimed at creating computers that think and act more independently. Because of AI, computers today are assisting experts in fields ranging from medicine and pharmacology to weather, economics, and Internet research. Even though great leaps have been made in AI since its beginning, the complexity and sophistication of today's AI computer systems aren't anywhere near that of the human mind. Still, computer scientists are always looking to the goal, that AI Horizon!

Artificial Intelligence and Games
People have always been fascinated by games, and computer scientists are no exception. In recent years, computers have become extraordinarily good at playing many different types of games, often dominating human players, but there is one area of gaming in which humans still reign supreme: strategy games. The human mind has qualities exercised in strategy board games that computers have yet to match quite so well: using intuition, developing strategies, matching patterns, visualizing goals, and so on. Even with the great speed and power of today's computers, computer systems are still far from matching humans in those respects.

Thus, strategic gaming is fertile ground for the growth of Artificial Intelligence. It contains many challenges and obstacles in a nice, controlled environment in which to test computer systems. The solutions to these obstacles and problems are priceless and can be used in AI systems for many other purposes. This is why AI Horizon dedicates two sections to two of the most difficult classic strategy games in the history of the world: chess and go. Together, these two board games represent crowning achievements of the human mind, demonstrations of the awesome dexterity of human thought. And they are goals to reach. In their solutions lie many concepts that can lead to great advances in Artificial Intelligence.