You are on page 1of 5

10/14/2020 Recitation 1: Introduction to OCaml Syntax

Recitation 1: Introduction to OCaml Syntax


We will use the Objective Caml (OCaml) programming language this semester.  OCaml is
a functional language rather than a procedural language; the key difference between these
two classes of languages is the execution model---the way in which programs are executed.
Procedural (or imperative) languages, such as C and Java, are based on commands that
change the state of the machine on which they execute.  For example, the assignment
statement (in both C and Java) explicitly changes the value stored in some memory
location.  Functional languages, in contrast, are based on evaluating expressions to produce
values.  OCaml provides good ways to build correct, understandable, and large
expressions.  The focus of this lecture is understanding expressions.
For real applications, programming with expressions may seem like a stretch. After all,
applications often do not "compute" anything; instead, they allow you to "do" something.
Later in the course, we will introduce the notion of side-effects: evaluate this, and do that
while you're at it---e.g., evaluate an expression, and (as a side-effect) display a window on
the screen. But for the moment, we will live in a pure world, one without side-effects.
To start learning OCaml, you should begin playing with the OCaml language by writing toy
expressions.  Like most programming languages, Ocaml has a compiler that can be run on
source files to produce object files.  But to really understand expressions, there is a better
way to interact with the OCaml compiler, called the OCaml toplevel system.  This system is
something like a very fancy calculator---you interact with it by typing in an expression, and it
responds by evaluating that expression and telling you the resulting value. 

IMPORTANT NOTE: In previous semesters, this course has been taught in SML, a
language very similar to OCaml. They share both a common origin and almost all of their
syntax, but OCaml is currently better supported than SML, and looks to have better support
in the future as well. OCaml also has better secondary features, such as debugging. We
have tried to update recitation and lecture notes to use the correct OCaml syntax, but we
may have missed some things.
If we do miss things, don't panic. For aid in understanding notes from previous years or
notes that may not have been updated, you may wish to consult the SML vs. OCaml page,
which provides a detailed list of the differences between the two. Again, the two languages
are very similar: for most of the standard syntax, almost everything is the same except the
keywords. With a good understanding of OCaml, it is easy to understand SML.
If you notice any SML/OCaml discrepancies in the notes, please let us know.

Starting the OCaml toplevel system.  You can find current releases of OCaml at the
OCaml home page.  Once you have OCaml properly installed, you will be able to start an
interactive session by typing ocaml at the command line (in a Unix shell or Windows
console), or by clicking on Start/Programs/Objective Caml/Objective Caml (in Windows). 
You will get a top level prompt "#", which indicates that the compiler is ready to accept
expressions and evaluate them.  To exit the toplevel system, enter the end-of-file character
for your platform (Ctrl-D on Unix, Ctrl-Z + Return on Windows).
Using the OCaml toplevel system.  Type in an expression (possibly on multiple lines,
which after the first line will have a secondary prompt "="), then type ;; and press Return.

https://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/1/rec01.html 1/5
10/14/2020 Recitation 1: Introduction to OCaml Syntax

Note that the ;; is not part of the expression to be evaluated. It is simply an indication to the
compiler that you have finished typing in the expression and that you are ready to evaluate.
Before evaluating the expression, the compiler will type check it.
TIP: The toplevel system is very picky about some things when reading input, and will
accept a request to exit only if it is sitting exactly at the top level prompt—i.e. if nothing else
has been already typed in. When in doubt (especially if input seems to be behaving
strangely), press Ctrl-C to interrupt and reset the prompt to a sane state.
TIP: It can be useful to put expressions into a file so that you're not stuck entering them
over and over again at the prompt. Just use an editor to edit a file, and you can load it into
OCaml with the operation #use "file";; which behaves as though the expressions have
been entered at the top level prompt (don't forget the semicolons!). The big question is
where should the file be stored. Operation #use by default looks for files in the current
working directory.  To change the current working directory, use the operation #cd
"path";; where path is the path where you want to go to. To add a directory to the list of
directories that OCaml looks for files in, use the operation #directory "path";;. Use the
Unix convention (even on Windows): the path separator is "/". 
TIP: Even more details on the toplevel system can be found at the OCaml manual page
on it.

Basic Expressions and Types


Expressions evaluate to values. Values can be classified according to their type:

Type Example values


int 0,1,2,-1,-2
bool true, false
float 3.141592, 1.618034
string "Hello world!","xyzzy"
char 'A','Z'

Let us start looking at simple expressions, with the following syntax: 

e ::= c  |  unop e  |  e1 binop e2  |  if e1 then e2 else e3  |  (e)

where c are constants (the values described above), unop are unary operations, binop are
binary operations.  We expressed this syntax using Backus-Naur Form (BNF), a common
notation for the grammar of computer languages.  NOTE TO INSTRUCTOR: The BNF for
expressions, and later for declarations, should be put on one side of the board, and kept
there as we will be adding expression types and declaration types as the class goes.
Unary operators include:

Operator Meaning
- take an int (or a float) and return its negation
not take a bool and return its negation
Binary operators include:

Operator Meaning

https://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/1/rec01.html 2/5
10/14/2020 Recitation 1: Introduction to OCaml Syntax
+,-,*, / take two ints and return their sum, difference, product, or quotient
+.,-.,*., /. take two floats and return their sum, difference, product, or quotient
mod take two ints and return their integer remainder
>,>=,<,<= take two ints (or two floats) and return their comparison
= take two values and return whether they are equal
^ take two strings and return their concatenation into a new string
Expressions are transformed into values by the application of evaluation rules.  The
evaluation rules for simple expressions are:

Constants are already values. 


Unary and binary operators first evaluate their arguments to values, and then perform
the operation. 
A conditional if e1 then e2 else e3 evaluates e1 to a vallue (of type bool): if it is true,
e2 is evaluated, if it is false, e3 is evaluated.

Evaluation of operators only makes sense if the types of the operands agree.  For
example, the + operator is defined if both operands are integers, but adding an integer to a
boolean is meaningless.  So type checking is performed to ensure that expressions are
meaningful, and this requires giving a type to every expression.  How do we determine the
type of an expression?
Every constant has a type (42 has type int, true has type bool, etc).  Operators also
have types (which we gave informally above); in OCaml syntax, these types are effectively:

- : int -> int float -> float


not : bool -> bool
+ : (int * int) -> int
+. : (float * float) -> float
> : (int * int) -> bool (float * float) -> bool
^ : (string * string) -> string

Every operator specifies the types of values it takes and the type of the value it returns.
Notice the * when more than one argument is expected. 
Now we can give rules for determining the types of expressions.  The principle embodied
in these rules is:

The type of an expression is always the type of its result.

For expression unop e, if e has type t1 and unop has type t1 -> t2, then unop e has
type t2. 
For expression e1 binop e2, if e1 has type t1, and e2 has type t2, and binop has type
(t1 * t2) -> t3, then e1 binop e2 has type t3.
For conditional if e1 then e2 else e3, if e1 has type bool, and then if e2 and e3 both
have the same type t, then the conditional itself has type t. 
Why must e2 and e3 have the same type? From the compiler's point of view,
either branch could be taken. If the then branch is taken, the resulting value
will have the same type as e2.  Similarly, the else branch produces a value of
the same type as e3.  So the type of the conditional must somehow generalize

https://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/1/rec01.html 3/5
10/14/2020 Recitation 1: Introduction to OCaml Syntax
the types of e2 and e3.  A safe way to guarantee that this is possible is to
require that both branches return the same type.

If the conditions in these rules are not satisfied by an expression, then it is not possible to
give a type to that expression.  It does not type check, and the compiler rejects it as a
program. This prevents the evaluation of meaningless expressions.  It is important to do so,
because such programming errors could otherwise cause run-time crashes.

Declarations
Value can be given names. This is not a form of expression, but rather an indication to the
compiler that you are declaring a new name.

d ::= let id = e

For example, let pi = 3.141592 is a declaration.  A declaration indicates to the compiler


to evaluate expression e, produce a value, and bind that value to the name id. In subsequent
expressions, id can be used to refer to that value. We therefore extend our syntax for
expressions with:

e ::= ...  |  id

Evaluating id means to lookup what value it is bound to. The type of id is the type of the
value which it is bound to.
Declarations last until another declaration of the same name is encountered (which
shadows---i.e., replaces---the previous declaration). We can also introduce local declarations
that last only for the evaluation of an expression.

e ::= ...  |  let d in e

To evaluate a let expression, declaration d is evaluated, then e is evaluated, taking the


binding from the declaration into account.

QUESTION: what is the type of let d in e ?  Answer:  The type of e, since the type
of an expression is the type of the result.

At this point, we can write large expressions, but we can't easily reuse expressions. Thus,
we introduce functions. A function declaration is a new form of declaration:

d ::= ...  |  let id (x1:t1, ..., xn:tn):t = e

In this new form of declaration, e is called the body of function id.  For example, let
square (x:int) : int = x * x is a function definition, and its body is x * x. 
Functions have types that are similar to the types of operators.  For example, the type of
square is int -> int.
For recursive functions, it is necessary to use let rec:

d ::= ...  |  let rec id (x1:t1, ..., xn:tn):t = e

This is the same as normal function declarations, but any uses of id inside the function
body will be bound to the value of the function itself.
How do we use functions? We introduce a new expression called function application:
https://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/1/rec01.html 4/5
10/14/2020 Recitation 1: Introduction to OCaml Syntax

e ::= ...  |  id (e1,...,em)

How do we type check a function application f (e1,...,em)?  As follows:

1. If f has type (t1 * ... * tn) -> t, and


2. If the number of arguments supplied to the function is the same as the number of
arguments the function expects, meaning m equals n, and
3. If e1 has type t1, and e2 has type t2, and ... en has type tn,
4. Then f (e1,...,em) has type t.

So square (10) type checks, but square (true) does not.


How is function application f (e1,...,em) evaluated? Evaluate e1,...,em to values v1,...,vm,
then evaluate the body of f with x1 bound to v1, ..., xm bound to vm.  So square (10+10)
--> square (20) --> 20*20 --> 400.  We'll see many more details on evaluation in a
week. For now, we just want to give you an intuition about the simple cases.
Function bindings can be used within a let expression to obtain a local function (like
Pascal, unlike C).  For example:

let fourth (y:int) : int =


let
square (x:int):int = x * x
in
square (square (y))

TIP: It is not strictly necessary to annotate functions with types. Most of the time, OCaml
can infer the type of a function from its definition. For example, given function fun inc (x)
= x + 1, the compiler can infer type inc : int -> int.  The reasoning it uses,
intuitively, is: since 1 is an integer, and + takes two integers, x must be an integer, and thus
x+1 must be an integer as well.
PITFALL: As you will quickly notice, it can be extremely hard to debug a program that
does not typecheck.  It is very difficult to figure out why the compiler inferred this type for
that expression.  Solution:  use type annotations to catch type errors!
�BER-TIP: It is very good practice always to annotate functions with types.  We will
require you to do this in programming assignments.

https://www.cs.cornell.edu/courses/cs3110/2014sp/recitations/1/rec01.html 5/5

You might also like