
Week 01 Lecture 01 01/10/2012

Introduction to computation


How can we get a recipe into a mechanical process? By building machines that do a specific job. These are called fixed-program computers; they were very popular when algorithmic calculation began, and they are useful for machines built for a single purpose, like calculators. Machines that can both store information and manipulate sequences of instructions are called stored-program computers.
This is how a program interacts with the machine: it stores results in memory, which are run through the ALU over and over again until a control point is reached, where the control unit can make the algorithm jump forward, jump backwards, or end, sending out the output.

Alan Turing showed that it is possible to compute anything with just 6 primitives, and languages that provide these six primitives have a property called Turing completeness. Conveniently, there are usually more than those 6 primitives, but any algorithm written in one Turing-complete language can be run in another one.
A computer does only two things:
- Simple (primitive) calculations, combined into algorithms
- Storage of the results
Two types of knowledge:
- Declarative knowledge: refers to a statement or a fact
- Imperative knowledge: methods or recipes for finding things out
Creating recipes: we should consider that every language includes primitives, rules for how they interact in legal expressions, and ways in which information is declared and stored. Syntax is how a valid expression is built; it is the form of an acceptable sentence (legal expression). Semantics is the meaning associated with each syntactically valid string. There are two types of semantics: static semantics refers to which syntactically valid sentences have a meaning at all; formal (or full) semantics covers what that meaning is. A sentence can be correct according to static semantics but still allow different meanings, which can lead to errors or bugs.
Types of errors:
- Syntactic errors
- Static semantic errors (for example: int + string)
- Not doing what was intended
o Crashes
o Infinite loops
o Wrong answers

Week 01 Lecture 02 01/10/2012
Core Elements of programs

You can also use high-level source code with an interpreted language, like Python. There are some tradeoffs when choosing between a compiled language and an interpreted language: the first is much more efficient, but when you reach an error or a bug it is much more difficult to find it. An interpreted language is a bit slower, because the interpretation is done on the fly, but it is much easier to find mistakes.
- A program or script is a sequence of definitions and commands, typed into the
shell.
- Command or statement is an instruction for the interpreter.
- Objects: programs manipulate data objects, which can be:
o Scalar, which cannot be subdivided; there are three types in Python:
Int
Float
Boolean
o Non-scalar, which have a structure that can be accessed.
- Expression is a combination of operators and objects
- Operators for python are:
o i+j ; sum
o i-j ; subtraction
o i*j ; multiplication
o i/j ; if both are ints it returns the quotient without remainder
o i%j ; remainder
o i**j ; power
Strings are declared between either ' ' or " ", and there are some useful commands that apply to this kind of object (assuming a = 'a' and b = 'b'):
- type(a) returns str
- str(123) returns '123'
- 3*a returns 'aaa'
- a+b returns 'ab'
- 'abcde'[0] returns 'a'
- 'abcde'[4] returns 'e'; a bigger index would turn out to be an error
- 'abcde'[-1] returns 'e'
- len('abc') returns the length, in this case 3
- 'abcde'[0:3] returns 'abc'; it stops before the 3rd character

Assigning the same operator to different tasks is called operator overloading, and that is what is done with + in this case.
Programming a script (in the Python interpreter, not the shell):
- print() is used to specify the output
- name = raw_input('introduce your input') binds a string to name
- int(raw_input('introduce your age')) converts the input string to an int
- # is used for commenting
A straight-line program reads each line sequentially.
For branching programs we use different kinds of branching constructs. The easiest to use is if, which is stated with a Boolean followed by a colon; indentation then gives the set of instructions within that branch. Then comes an else: with its own set of instructions marked by indentation, and after that we return to the original indentation to continue the straight-line program. Else-if is written as elif Boolean:
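A minimal sketch of this branching structure (Python 3 syntax; the function and its name are just an example, not from the lecture):

```python
# if / elif / else: the indented block under each clause is what runs.
def sign(x):
    if x > 0:
        return "positive"
    elif x == 0:
        return "zero"
    else:
        return "negative"

print(sign(3))   # positive
print(sign(0))   # zero
print(sign(-2))  # negative
```

After the indented blocks, execution returns to the surrounding indentation level, continuing the straight-line program.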

Week 02 Lecture 03 04/10/2012
Simple algorithms
Iteration is reusing a piece of code a desired number of times, controlled by a logical test.
The while syntax is while Boolean:, followed by indented instructions; we have to make sure that the variable being tested changes within the loop body.
The for syntax is slightly different: it iterates over a set of possibilities. To build this set it is useful to use range(m, n), which gives an array from m to n-1; if it has only one parameter it assumes m is zero, and if we add a third number it is the step taken between m and n-1. Another simple way to use it is by iterating over the characters of a string. The syntax is for Variable in Array:, and the body runs once for each value in the array. We can put a test within the loop body that leads to a break, in which case the for cycle is over.
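The while, for/range and break constructs above can be sketched like this (variable names are made up):

```python
# while: the tested variable (n) must change inside the loop body
n = 0
total = 0
while n < 5:
    total += n
    n += 1

# for over range(m, n): covers m .. n-1
squares = []
for i in range(1, 4):
    squares.append(i * i)

# break ends the for cycle early
first_even = None
for c in [7, 3, 8, 5]:
    if c % 2 == 0:
        first_even = c
        break

print(total, squares, first_even)  # 10 [1, 4, 9] 8
```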
We represent numbers in decimal form because we have ten fingers, and computers use binary because it's easy to have a switch either on or off.
A decimal number, for example 302, can be read as 3*10^2 + 0*10^1 + 2*10^0, and a binary number, say for example 10011, can be read as 1*2^4 + 0*2^3 + 0*2^2 + 1*2^1 + 1*2^0, which is 19 in decimal. To convert a decimal into binary form we take the remainder modulo 2, then divide the number by two keeping only the whole part, and repeat. Let's take for example 19: 19%2=1, 19//2=9; 9%2=1, 9//2=4; 4%2=0, 4//2=2; 2%2=0, 2//2=1; 1%2=1, 1//2=0, and we have reached the ending condition. The remainders read backwards give 10011, which is 19 in decimal form.
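The remainder-and-halve procedure above can be written as a short function (a sketch; the name to_binary is made up, and // is Python 3's whole-part division):

```python
# Repeated %2 and //2, reading the remainders backwards.
def to_binary(n):
    if n == 0:
        return "0"
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits  # prepend, so the result reads backwards
        n = n // 2                    # whole part of n/2
    return digits

print(to_binary(19))  # 10011
```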
To get the binary form of a fraction we have to find a power of 2 big enough that, when multiplied by the fraction, it returns a whole number, and then we just shift the binary point across the whole binary number.
An algorithm to find the binary form of a fraction will not always reach an exact solution, because there isn't always an exact solution; what Python does is give us an approximate solution. So in order to accept approximate solutions we have to write a Boolean like abs(x-y) < epsilon instead of x == y; that way the code won't fail to reach a solution when there isn't an exact one.
When using approximation methods we have to be very careful with the step size: if it's too big we might skip over the solution, or not get close enough, and if it's too small we might take way too long to figure out a solution.

Bisection search
This method consists in trying the middle point between the possible solutions: if it's too big, try the middle point between the lower bound and this point; if it's too small, try the middle point between this point and the upper bound, and so on, until we reach an acceptable solution. This method only works for nicely ordered solution sets, i.e. for monotonic functions.
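A sketch of bisection search applied to square roots (the function name, the epsilon default, and the use of max(1.0, x) for the upper bound are assumptions, not from the lecture):

```python
# Bisection search for the square root of x, stopping when the guess
# squared is within epsilon of x. Works because g*g is monotonic in g.
def sqrt_bisection(x, epsilon=0.01):
    low, high = 0.0, max(1.0, x)   # sqrt(x) always lies in [0, max(1, x)]
    guess = (low + high) / 2.0
    while abs(guess * guess - x) >= epsilon:
        if guess * guess < x:      # too small: search the upper half
            low = guess
        else:                      # too big: search the lower half
            high = guess
        guess = (low + high) / 2.0
    return guess

print(round(sqrt_bisection(25.0), 2))
```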
Newton-Raphson
It works for a polynomial p(x), and creates a better guess starting from a guess g: the next guess is g - p(g)/p'(g).
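A sketch of that update rule applied to square roots, taking p(x) = x^2 - k so that p'(x) = 2x (the function name, starting guess, and epsilon stopping test are assumptions):

```python
# Newton-Raphson for sqrt(k): find a root of p(x) = x**2 - k.
# Each step replaces the guess g by g - p(g)/p'(g).
def sqrt_newton(k, epsilon=0.01):
    g = k / 2.0
    while abs(g * g - k) >= epsilon:
        g = g - (g * g - k) / (2 * g)
    return g

print(round(sqrt_newton(24.0), 3))
```

In practice this converges in far fewer steps than bisection for the same epsilon.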


Week 02 Lecture 04 04/10/2012
Functions
To define functions, or methods, we use the syntax def <function name>(<input>):, and then indent the set of instructions, using return at the return points. If there is no returned value, it simply returns None.
The environment is the set of defined parameters and procedure objects (functions); Python has a default, or global, environment. Each procedure object takes its input from its parent environment, then works in its own environment, and then returns the result back to the parent environment, without interfering with the variables of either environment. This lets us build encapsulated procedures, or black boxes.
Working with completely separated environments built by procedure objects is called static or lexical scoping.
If a procedure object builds a new environment and doesn't find an object that it needs, it turns to its parent environment.
We can work with modules, importing user-predefined environments from .py files. To do so we use the code import <name>, and then to use something we use the syntax <name>.<object>. We can also use from <name> import *, which imports everything directly into the current namespace (as long as names don't conflict with my environment), so the <name>. prefix is no longer needed.

Week 03 Lecture 05 15/10/2012
Recursion
Iterative algorithm review, with a simple example: multiplication by repeated addition, say a*b. The state variables are i, the iteration count, which starts at b, and result, which starts at 0. The update rules are i → i-1, which ends the algorithm at i=0, and result → result+a.
Recursion is done by reducing the problem to a simpler version, which is the recursive step, plus an ending condition, a problem that can be solved directly, the base case.
Recursion logic is the same as induction logic, which is a pretty good explanation of why it works.
Examples: multiplication by repeated addition done recursively, and factorial by successive multiplications. Also the Towers of Hanoi, Fibonacci, and palindrome strings using slices. This last example is a case of divide and conquer, because it has the two main characteristics: it takes a problem and turns it into a set of smaller, easier problems, and the solutions of the subproblems can be combined to build a solution for the original problem.
assert Boolean is an instruction that stops the code and returns an error if the Boolean is false.
Global variables can be seen and modified at any level of the code; the variable just has to be declared within the method as global <variable name>, and then it's good to go and be modified.
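The multiplication-by-adding and factorial examples mentioned above, sketched recursively (function names are made up):

```python
# Recursive multiplication by repeated addition: reduce b until the base case.
def rec_mult(a, b):
    if b == 1:                       # base case: a*1 = a
        return a
    return a + rec_mult(a, b - 1)    # recursive step: a*b = a + a*(b-1)

# Factorial by successive multiplications.
def factorial(n):
    if n <= 1:                       # base case
        return 1
    return n * factorial(n - 1)      # recursive step

print(rec_mult(4, 3), factorial(5))  # 12 120
```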

Week 03 Lecture 06 20/10/2012
Objects
Compound data: Tuples
Lists
Dictionaries
Tuples work like strings; they are defined as name = (element1, element2, ..., elementn). A tuple can be inside another one, as an element. Tuples can also be concatenated, using +. To pull an element the syntax is tuple[i], and we can also use slices with [i:n]. A singleton is a tuple with just one element, and is defined name = (element,); the comma is important, and the interesting thing is that it allows us to treat that singleton as a tuple.
A list is defined pretty similarly to a tuple, but instead of round parentheses () we use square brackets [], and when defining a singleton we don't need the comma. The big difference with tuples is that in a list we can directly change a specific value inside the list: lists are mutable, the opposite of immutable.
Mutability gives us flexibility, but it can lead to bugs.
Aliasing is the ability to mutate information from different paths, affecting where it is stored; it is useful but treacherous.
The for iteration can be defined directly over a list; we used range(n, m, d) before.
When concatenating we define a whole new flat list, which isn't affected by aliasing.
Higher-order programming is treating functions as data, for example inserting them into a list.
map is a built-in function that applies a function to every element of a list.
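A small sketch of both ideas, functions stored in a list and map (note that in Python 3 map returns an iterator, so it is wrapped in list(); in the Python 2 of the lecture it returned a list directly):

```python
# Higher-order programming: functions are ordinary objects,
# so they can be stored in a list and applied to data.
def double(x):
    return 2 * x

def square(x):
    return x * x

ops = [double, square]
results = [f(3) for f in ops]        # apply each stored function to 3

# map applies one function to every element of a list
doubled = list(map(double, [1, 2, 3]))

print(results, doubled)  # [6, 9] [2, 4, 6]
```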

Week 04 Lecture 07 01/11/2012
Debugging
Testing covers first the syntax and static semantic errors, but many environments, like Python's, check these and don't let you run the code with those errors. The second kind, and more difficult to understand, are the errors that let your code run but give bad outputs or mishandle inputs; for those we need to check the code methodically, either with a glass-box or a black-box approach.
To perform black-box testing you don't need to be the one who programmed the code: it consists in testing the code against its specification, partitioning the input space and including extreme (boundary) values.
A glass-box test suite is path-complete when the test inputs go through every possible path, including boundary cases. This means that both branches of every if are tested, each except clause is executed, and every loop is exercised zero times, once, and more than once, also catching every way the loop can end.
There are different ways to apply test suites. First we need to start with unit testing, which detects bugs within individual pieces of the algorithm. Integration testing detects interaction bugs. To do a proper test we have to iterate between the two kinds of tests.
Test drivers are pieces of code that set up the environment needed to run the code, run it, save the results and report them. Stubs simulate parts of the code being tested instead of invoking them; they are useful because they let you test code that invokes parts not yet written.
After having run the cycle of unit and integration testing properly, it's necessary to run a regression test, rerunning many of the tests already passed, because with the different tests and corrections the code may have been altered in a way that introduces new bugs.
There are two types of bugs: overt bugs, which prevent the code from running, and covert bugs, which let the code run but return a value that isn't correct.
Another classification is persistent vs. intermittent errors.
Defensive programming is checking every time that a value is correct, printing an error when an input is wrong.
Using debugging as a search means looking for the explanation of an incorrect behavior: first study all available data, both correct and incorrect cases; form a hypothesis consistent with the data; then design and run a repeatable experiment with the potential to refute the hypothesis. We must remember to add some prints so we can treat the code as a black box.
Doing this we narrow down where the error can be; a good tool for this is binary search.

Week 04 Lecture 08 02/11/2012
Efficiency and orders of growth
We must be able to build an algorithm that is as efficient as possible, and to do so we need to measure. Measuring isn't as easy as just timing how long a run takes, because that depends on the computer's speed, the interpreter version, and the input value. To eliminate these non-comparable variables we need a standard unit of measure, and one is counting the basic steps on a Random Access Machine (RAM) model of computation, where steps are assignments, comparisons, arithmetic operations, and accessing an object from memory.
If we get a function with different orders of growth, for example a polynomial function, we consider only the biggest order of growth, because the rest will fade away as the input grows bigger. The notation used is big O: O(1), O(log n), O(n), O(n log n), O(n^c), O(c^n). This is the asymptotic behavior of the running time; the classes are listed in order of growth, and big O represents an upper bound.


Week 05 Lecture 09 02/11/2012
Memory and search
To do a proper search we need a key to reach the information; the problem lies in finding that key. One way is a linear search, which takes a long time, but on sorted data we can do a binary search with bisection, which is divide and conquer.
For sorting, the most common method is selection sort, which is just a comparison between every element. The problem is that this is expensive; it takes longer than a linear search, so it is only worth it if we can amortize the cost over k searches. Another sorting algorithm is merge sort, which uses a divide-and-conquer approach.
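A sketch of merge sort's divide-and-conquer structure (helper names are made up):

```python
# Merge two already-sorted lists into one sorted list.
def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]   # append whatever remains

# Divide the list in halves, sort each recursively, then merge.
def merge_sort(lst):
    if len(lst) <= 1:                   # base case: already sorted
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))

print(merge_sort([5, 2, 9, 1, 5]))  # [1, 2, 5, 5, 9]
```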

Week 06 Lecture 10 18/12/2012
Exceptions
Sometimes programs won't return the expected result, for a number of different reasons, and we have to be prepared to face these different scenarios. In order to handle them we use exceptions to treat these values as different. There are mainly 3 ways of dealing with errors.
Do nothing: there will be an answer, though not what we wanted. The problem with this is that the user will never know that the answer is wrong. This should never be done.
Return an error value, like None instead of a number; when callers try to use this value it will show up as a cascade of errors.
Stop execution with a signal. In Python this is called raising an exception, and is used:
o raise Exception('descriptive string')

Types of errors are:
SyntaxError: Python can't parse the program
NameError: local or global name not found
AttributeError: attribute reference fails
TypeError: operand doesn't have the correct type
ValueError: operand type okay, but value is illegal
IOError: IO system reports malfunction (e.g., file not found)
After a try body there are some useful clauses. Besides the except we have already seen, there is else, whose body is called only if there were no exceptions, and there is finally, whose body is always called when the code finishes, even if there was a break, continue or return in the middle. finally clauses are usually used for clean-up, like closing files.
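A sketch of try/except/else/finally together (safe_div and its log list are made-up examples):

```python
# else runs only if no exception was raised; finally runs no matter what.
def safe_div(a, b):
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("error")
        result = None
    else:
        log.append("ok")        # reached only when try raised nothing
    finally:
        log.append("done")      # reached in every case
    return result, log

print(safe_div(6, 3))   # (2.0, ['ok', 'done'])
print(safe_div(1, 0))   # (None, ['error', 'done'])
```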

Object oriented programming (OOP)
The first programming languages didn't include things like objects; everything had to be stated explicitly in order to be treated differently. Around the '60s and '70s programming languages started including objects, which bundle their own data and methods. Objects are a data abstraction that encapsulates an internal representation and the interaction with other objects: defining behavior, hiding implementation, and exposing attributes such as data or methods (interaction with other objects). Every object has a type, or is a compound of these, like lists or tuples. Objects can be created and destroyed, either by explicitly deleting them, with del for instance, or by forgetting them, i.e. rebinding the name so the value becomes inaccessible and is reclaimed by garbage collection.
The main advantages of OOP are that it allows divide-and-conquer development (easy to implement, and it increases modularity, which reduces complexity) and that classes are easy to reuse (there is no collision between different classes, and inheritance allows redefining or extending a subset of a superclass's behavior).
In Python a class definition is used to define a new type of object, just as def was for methods. In the example class Coordinate(object):, Coordinate is a subclass of object, and object is a superclass of Coordinate.
The code self.<attribute> is used to access an attribute of the object itself (only within a method declared in the object's class; elsewhere it has to be <object>.<attribute>).
There are some built-in methods that are commonly overridden, like __init__, which is used to initialize the object with some default values, so it prevents errors, and __str__, which replaces the default string showing the object's allocation in memory (which tells us virtually nothing) with another string, like its values.
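A minimal Coordinate class in the spirit of the example, overriding __init__ (with assumed defaults of 0) and __str__:

```python
class Coordinate(object):
    def __init__(self, x=0, y=0):
        # default values prevent errors when no arguments are given
        self.x = x
        self.y = y

    def __str__(self):
        # replace the default memory-address string with the values
        return "<" + str(self.x) + "," + str(self.y) + ">"

c = Coordinate(3, 4)
print(c)              # <3,4>
print(Coordinate())   # <0,0> thanks to the defaults
```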
Inheritance
There are many methods that come with Python already well optimized, like sort, but sort uses the less-than operator __lt__, which by default compares numbers. It can be overridden to compare and sort strings, or any object we like, by returning a Boolean that is either true or false. To inherit the properties of a specific class we declare it in the class declaration, like: class NewObject(OldObject).
When overriding methods for subclasses we have to be careful that the subclasses still support the behavior expected of the superclass. This is called the substitution principle.

Week 06 Lecture 11 19/12/2012
Yield
yield is a very useful piece of code; it works as a side stop: each time next() is executed, the function runs until the next yield, until the method has no more yields. It can be useful to add a loop to make the sequence infinite in some cases.
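A sketch of yield with such an infinite loop (the evens generator is a made-up example):

```python
# A generator: each next() resumes execution until the following yield.
# The while True loop makes it an infinite sequence of even numbers.
def evens():
    n = 0
    while True:
        yield n      # side stop: execution pauses here
        n += 2

g = evens()
print(next(g), next(g), next(g))  # 0 2 4
```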

Week 07 Lecture 12 20/12/2012
Plotting
The main programming part of the course is over; now we should focus on the problem-solving part. A big part of problem solving is presenting the data, and one way is plotting it; we can use the PyLab library to do so.
To use the library first we have to import it (import pylab), then start a figure with the code line pylab.figure(1), then use the plot line, pylab.plot([list of x coords], [list of y coords]), e.g. pylab.plot([1,2,3,4], [7,4,2,8]), and a line will connect all the pairs; finally a show-plot command, which is pretty obvious: pylab.show(). If no x coords are given it takes range(len(ycoords)), so 0, 1, 2, ... For a plot to be useful we need a title and axis titles, set with the code lines pylab.title(str), pylab.xlabel(str), and pylab.ylabel(str). After the lists of coords we can add some parameters like color, linewidth, or linestyle. To change the default parameters you can edit the .rc file which contains them.
Default and keyword assignment
When building complex functions it's useful to predefine a default value, used whenever the argument isn't specified. It can be done by setting the keyword with an equals sign inside the parentheses. And to change one particular argument without reassigning values to the other variables, just use the keyword to do so.
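A sketch of default values and keyword assignment (make_plot_style and its parameters are made-up names echoing the plot parameters above):

```python
# Defaults are set with = inside the parentheses; a caller can then
# change just one argument by naming it, leaving the rest at their defaults.
def make_plot_style(color="blue", linewidth=1, linestyle="-"):
    return (color, linewidth, linestyle)

print(make_plot_style())             # ('blue', 1, '-')
print(make_plot_style(linewidth=4))  # ('blue', 4, '-')
```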

Week 07 Lecture 13 20/12/2012
Random walks and simulation
"Essentially, all models are wrong, but some are useful"
George E. P. Box
For most of the history of science people have used analytic models, but as science developed it became increasingly difficult for analytic models to keep up. This is why during the XX century simulation models became more and more common. Simulation attempts to build an experimental device, called a model, whose intent is to provide as much information as possible about the actual system being modeled. Simulations are descriptive, not prescriptive, of the process. Simulation models can be classified along three dimensions:
Deterministic vs. Stochastic
Static vs. Dynamic
Discrete vs. Continuous
To choose a random value from a list we use the code random.choice(list), after importing the library. random.seed(int) will set the seed so we get the same result in different runs, and we can debug without problems. random.random() returns a random number between 0 and 1.

Week 08 Lecture 14 22/12/2012
Monte Carlo methods
For most of our history the world of physics was understood to be deterministic; everything could be calculated using Newton's formulas, and this is called the Newtonian doctrine. But at the beginning of the past century the so-called Copenhagen Doctrine was developed, which included stochastic calculation for everything in the physical world. The nondeterminism seems to be one of two kinds:
Causally nondeterministic
Predictively nondeterministic

Hashing
When storing information it can sometimes be a bit hard to find them in a dictionary. To save
time we can hash the information and save (insert) it in different buckets from an arrange. This
way we would be using more space but saving time, so we just do one small linear searches
instead of a really big one (this would be the n/buckets if the hashing is equally divided).

Monte Carlo simulation
Monte Carlo simulation is the method of estimating the value of an unknown quantity using
the principles of inferential statistics


Week 09 Lecture 15 22/12/2012
Statistical thinking
Law of large numbers (AKA Bernoulli's law):
In repeated independent tests with the same actual probability p of a particular outcome in each test, the chance that the fraction of times that outcome occurs differs from p converges to zero as the number of trials goes to infinity. This does not imply the Gambler's fallacy, which states: if deviations from expected behavior occur, these deviations are likely to be evened out by opposite deviations in the future. In other words, the curves aren't necessarily symmetric.
It is never possible to be assured of perfect accuracy through sampling, unless you sample the entire population.
This raises a question: how many iterations are necessary? The answer is determined by the confidence needed to show that the value is significant. This is given by the standard deviation, but a good measure to get a pretty good idea is the coefficient of variation (the standard deviation divided by the mean).

Week 09 Lecture 16 23/12/2012
Using randomness to solve non-random problems
Normal distribution
We saw before that we needed the confidence level of a simulation result; to get it, it's really common to fit a normal distribution to the results (if they fit) and use the empirical rule: ±1 sigma covers ~68% of the data, ±2 sigma ~95%, and ±3 sigma ~99.7%. To get a random number from a normal distribution the code is random.gauss(mu, sigma).
When polling, the correct way would be to run a lot of polls, get their standard deviation and mean, and then get the confidence interval, but that's really expensive; instead pollsters run a single poll with a representative sample of size n, get the percentage p, and from that compute the standard error (SE), which is calculated by:
SE = sqrt(p * (100 - p) / n)
Normal distributions are rather common in nature. In games devised by humans we frequently find the uniform distribution, which needs only its range as a parameter. Another common distribution is the exponential distribution.

Exponential distribution
This appears in nature quite frequently, for example in inter-arrival times: cars entering a highway, or people entering a web page. This is the only distribution that has the memoryless property.

Week 10 Lecture 17 23/12/2012
Curve fitting
A common pattern in science and engineering
It is rather common to try to understand our environment, and to do so we usually develop a hypothesis, then design an experiment and take measurements in order to test the hypothesis. Our experiments might be very expensive in both resources and time, sometimes impossible to carry out, and we might want to use computation to evaluate the hypothesis, determine the values of the unknowns, and, more importantly, predict consequences.
We will get the results of our model as observations, and then a predicted value; but we want the likelihood of this prediction, so we would like a method to measure it, and one is the log likelihood.
We would like to maximize the probability of the observed errors e_1, ..., e_n, in other words maximize the product:
P(e_1) * P(e_2) * ... * P(e_n)
Then taking the logarithm turns the product into a sum:
log P(e_1) + log P(e_2) + ... + log P(e_n)
But we know that errors behave normally, so each log P(e_i) is, up to constants, proportional to:
-(e_i^2) / (2*sigma^2)
Constants won't affect the minimization result (as long as they are positive, and in this case they are), so maximizing the likelihood is the same as minimizing:
e_1^2 + e_2^2 + ... + e_n^2
And this is called the sum of squares of errors (SSE).
By minimizing the SSE we get the most likely choice of the parameters.
There will always be some kind of variability, and it divides into two parts: the fraction of variability explained by the model, and the fraction of variability not explained by the model.
The fraction not explained is the SSE divided by the total variability of the observations, sum((observed_i - mean)^2).
The fraction explained is then:
R^2 = 1 - SSE / sum((observed_i - mean)^2)
This is called the coefficient of determination (R^2). It is used to give us a sense of how well the model is explaining the observations.
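The two fractions above can be computed directly (r_squared is a made-up name; the observed/predicted lists are example data):

```python
# R^2 = 1 - SSE / total variability, where SSE sums the squared
# (observed - predicted) errors, and the total sums the squared
# deviations of the observations from their mean.
def r_squared(observed, predicted):
    mean = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean) ** 2 for o in observed)
    return 1 - sse / total

print(r_squared([1, 2, 3], [1, 2, 3]))  # 1.0 for a perfect fit
```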

Week 11 Lecture 18 24/12/2012
Optimization problems
We face optimization problems every day, whenever we have to make a choice between two or more options, or combinations of options; these are the variables, and they are usually constrained by some restrictions, like a budget or a weight capacity. There are a lot of optimization problems already solved, and when we face a new one it's likely that we can reduce it to one that has already been solved.
One of the most common problems is the 0/1 knapsack problem, where we have some items that can each be either carried or not [0, 1]; we also have a weight restriction and a price or benefit for every item we put into the knapsack. There are a lot of algorithms that try to solve this problem. One is the greedy algorithm, which takes the items in some order (most benefit per item, least weight per item, or best benefit/weight ratio per item) and fills up until the knapsack is full; none of these orders is better than the rest, and none will give us the globally optimal solution in general, but the method is really cheap in terms of operations. Another solution is to do an exhaustive search, which considers all the legal combinations and chooses the one with most benefit. An exhaustive search will return the globally optimal solution, but it's really expensive. Let's take the orders of growth for each method: for the greedy algorithm we first have to sort the items, O(n log n), then go through them, O(n); the total order is the sum of those, O(n log n). On the other hand, an exhaustive search first generates all possible solutions, O(2^n), and checks whether they are legal, then goes through them to select the best, O(n); in this case we multiply them, resulting in an order of O(n * 2^n). To get a feel for these numbers, take for example n=50 and a cost of 1 microsecond per operation: the first would take 0.3 milliseconds, and the second would take 187 years.
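A sketch of the greedy algorithm using the benefit/weight-ratio ordering (the item list and function name are made up):

```python
# Greedy 0/1 knapsack: sort items by value/weight ratio, then take each
# item that still fits. Cheap (O(n log n)), but not guaranteed optimal.
def greedy_knapsack(items, capacity):
    # items: list of (name, value, weight)
    taken, total_value = [], 0
    by_ratio = sorted(items, key=lambda it: it[1] / it[2], reverse=True)
    for name, value, weight in by_ratio:
        if weight <= capacity:
            taken.append(name)
            total_value += value
            capacity -= weight
    return taken, total_value

items = [("a", 6, 3), ("b", 7, 7), ("c", 8, 2)]
print(greedy_knapsack(items, 5))  # (['c', 'a'], 14)
```

Swapping the sort key for value alone, or for 1/weight, gives the other greedy orderings mentioned above.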


Week 11 Lecture 19 24/12/2012
Graphs

We often encounter problems that are represented as graphs, like the one above: problems like the distance travelled, or the cost of travelling between different nodes. To work with them in Python it is good to build objects for nodes and arcs, with nodes having different arcs. A graph with costs or weights is called a weighted graph.

We can first build a representation of the different possibilities to explore; these are collapsed into a tree of solutions. To traverse it there are a couple of useful methods, like depth-first search and breadth-first search.
To implement depth-first search we first choose a start node and a set of possible paths; if we are not at the goal node, we extend the current path by adding each child of the current node to the path (unless the child is already in the path). We then add these new paths to the potential set of paths, at the front of the set, selecting the next path recursively. If the current node has no children, we just return None. The search ends when there are no more paths to explore or when it reaches the goal node.
This only gives us a path that fulfills all the conditions but isn't necessarily the best one; to find the best, we can add a piece of code that saves the best path found so far, and only continue exploring a branch if it is shorter than the best we have found so far.
This method uses a data structure called a stack, which has LIFO (last in, first out) behavior.
The second kind of search is breadth-first search, where we start from the root node and then explore all the possibilities that are only one step longer than the current ones (and fulfill all the constraints), stopping when all possibilities are explored or when the goal node is reached.
This method uses a data structure called a queue, which has FIFO (first in, first out) behavior.
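A sketch of breadth-first search using a FIFO queue of paths (the adjacency dictionary g is a made-up example graph):

```python
from collections import deque

# BFS over a graph given as an adjacency dict. The queue holds the paths
# still to be extended; because it is FIFO, shorter paths are tried first,
# so the first path to reach the goal is a shortest one.
def bfs(graph, start, goal):
    paths = deque([[start]])
    while paths:
        path = paths.popleft()          # FIFO: oldest (shortest) path first
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in path:       # don't revisit nodes already in path
                paths.append(path + [child])
    return None                         # no more paths to explore

g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs(g, "A", "D"))  # ['A', 'B', 'D']
```

Replacing popleft() with pop() would turn the queue into a stack, giving depth-first search instead.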

Week 12 Lecture 20 10/01/2013
Implicit graph search
In some problems it might be difficult to identify every node or state in advance, because there might be too many, and in those cases memory might run out. For this problem the solution is not to build every possible state up front, but to build each possibility as we reach it; we can then use a breadth-first search to build states up until we reach the solution or we hit redundancy. We shouldn't use a depth-first search here, because it can take really long before trying a new route, and memory might run out.
A clique is a sub-graph that is complete, which means that every node within that sub-graph is connected to every other, and it is sometimes very useful to find the maximum clique. The power set is the set of all possible subsets. This way we should be able to find the best route for every clique, saving the results for later use.

Week 12 Lecture 21 10/01/2013
Dynamic programming
Some optimization problems have two properties which are:
- Optimal substructure: the problem can be solved by solving two (or more) sub-problems
- Overlapping sub-problems: some of the sub-problems' answers are the same.
These kinds of problems are usually solved by doing a ton of computations and repeating processes, like Fibonacci, which costs on the order of 1.6^n calls to be solved naively. This can be fixed by caching the overlapping sub-problems' answers; this is done by memoizing the answers, which is saving them in a table, so when we reach the same sub-problem again it just returns the answer it has in storage.
Memoizing is applied by wrapping the function, like FunctionName = memoize(FunctionName), using a memoize helper such as the one defined in the lecture.
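A sketch of such a memoize wrapper applied to Fibonacci (this particular implementation is an assumption, not the lecture's exact code; it handles single-argument functions only):

```python
# Cache answers to overlapping sub-problems so each is computed once.
def memoize(f):
    cache = {}
    def wrapper(n):
        if n not in cache:
            cache[n] = f(n)     # compute once, store the answer
        return cache[n]
    return wrapper

def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

fib = memoize(fib)   # rebinding the name makes recursive calls hit the cache
print(fib(30))       # 832040, computed with ~30 calls instead of ~1.6**30
```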

Week 13 Lecture 22 12/01/2013
Statistical fallacies
In some