Untitled

[REEL-TO-REEL PLAYER STARTING]
[MUSIC PLAYING]
DAVID J. MALAN: All right, this is CS50.

And this is week 1, wherein we continue programming,
but we do it in a different language because recall last time, we
focused on this graphical language called Scratch.
But we use Scratch, not only because it's sort of fun and accessible,
but because it allows us to explore a lot of these concepts here,
namely functions, and conditionals, Boolean expressions, loops, variables,
and more.
And so, indeed, even if today's syntax, as we transition to this new language
called C, feels a little bit cryptic, maybe a little intimidating at first,
and you don't quite see all of the meaning of the symbols
beyond the syntax itself, realize that the ideas are ultimately
going to be the same.
In fact, as we transition from what was last week--
a Hello World program that looked a little something like this-- this week,
of course, it's going to now look a little more cryptic.
It's going to look a little something like this.
And now even if you can't quite distinguish
what all of the various symbols mean in this code,
it turns out that at the end of the day, it's
indeed going to do what you expect.
It's just going to say, hello, world on the screen,
just like we did in Scratch.
So let's start to apply some terminology to these tokens first.
So what we're about to see, what we're about to write henceforth,
we're going to start calling source code.
Code that you the human programmer write is just henceforth called source code.
Doesn't matter if it's Scratch.
Doesn't matter if it's C. Doesn't matter if it's Python before long.
Source code is the general term for really
what you and I as human programmers will ultimately write.
Of course, computers don't understand source code, it turns out.
Computers don't understand Scratch and Puzzle Pieces, per se, or C code
like we're about to see.
They only understand this, which we called what last week?
AUDIENCE: Zeros and ones.
DAVID J. MALAN: Yeah.
So this is binary, zeros and ones.
But really, it's just information represented in binary.
And in fact, the technical term now for patterns of zeros and ones
that a computer not only understands how to interpret as letters, or numbers,
or colors, or images, or more, but knows how to execute as well henceforth
is going to be called machine code to contrast it with source code.
So whereas you and I, the humans, write source code,
it's the computer that ultimately only understands machine code.
And even though we won't get into the details of exactly what pattern
of symbols means what, you'll see that in this kind of pattern of zeros
and ones, there's going to be numbers.
There's going to be letters.
But there's also going to be instructions because, indeed, computers
are really good at doing things-- addition, subtraction, moving things
in and out of memory.
And suffice it to say that the Macs, the PCs, the other computers of the world
have just decided as a society what certain patterns of zeros and ones
mean when it comes to operations as well-- so not just data,
but instructions.
But those patterns are not something we're
going to focus on in a class like this.
We're going to focus on the higher level software side of things,
simply assuming that we need to somehow output machine code.
So it turns out, then, that this problem we
have to solve getting from source code to machine code
actually fits into the same paradigm as last time.
But the input in this case is going to be source code on the one hand.
That's what you and I ideally will write so that we
don't have to write zeros and ones.
But we need to somehow output machine code
because that's what your Macs, PCs, phones
are actually going to understand.
Well, it turns out there are special programs in life whose purpose is
to do exactly this conversion-- convert the source code you
and I write to the machine code that our phones and computers understand.
And that type of program is going to be called a compiler.
So indeed today, we'll introduce you to another piece of software.
And these come in many forms.
We'll use a popular one here that allows you to convert source code in C
to machine code in zeros and ones.
Now, you didn't have to do this with Scratch.
In the world of Scratch, it was as simple as clicking the green flag
because essentially MIT did all of the heavy lifting there figuring out
how to convert these graphical puzzle pieces to the underlying machine code.
But now starting today, as we begin to study programming and computer science
proper, now that power moves to you.
And it's up to you now to do that kind of conversion.
But thankfully, the fact that these compilers exist
means that you and I don't have to program
in machine code like our ancestors once upon a time did,
be it virtually or with physical punch cards,
like pieces of paper with holes in them.
You and I get to focus on our keyboard, as such.
But it's not just going to be a matter today of writing code.
It's going to be a matter ultimately today
onward of writing good code as well.
And this is the kind of thing that you don't just learn overnight.
It takes time.
It takes practice.
Just like writing an essay in any subject
might take time, and practice, and iteration over time.
But in a programming class like CS50, we're
going to aspire to evaluate the quality of code along these three axes,
generally.
Is it correct, first and foremost?
Does the code do what it's supposed to do?
After all, if it doesn't, well, what was the point
of writing it in the first place?
So it goes without saying that you want code you write to be correct.
And it's obviously not always.
Again, anytime your Mac, or PC, or phone has crashed,
some human somewhere wrote buggy--
that is code with mistakes.
But code correctness is going to be the first and foremost goal.
But then there's a more subjective goal see in time, a matter of design.
And we saw a little bit of this last week
when I proposed that we could design even scratch programs better,
maybe by using loops instead of just by copying and pasting
the same blocks again and again.
So design is more subjective.
It's more of a learned art whereby two people might ultimately
disagree as to which version of a program is better designed.
But we'll give you building blocks and principles over the coming weeks
so that you can have a better sense for yourself
if your own code is well designed.
And why is that valuable?
Well, the better designed your code is, often the faster it's going to run,
the more maintainable it's going to be by you or colleagues
if you're working with others in the real world.
So good design is a good thing.
It helps you communicate your ideas, just like in a typical English essay.
And then lastly, we'll talk this week onward about style.
And this is really just the aesthetics of your code.
It turns out that computers often don't care how sloppy your actual code is,
where in the world of code, it turns out that you don't really need
to indent things in a beautiful way.
You don't need to paginate things like might in an essay.
The computer generally does not care, but the human does.
The teaching assistant does.
You will care the next day when you're just trying
to understand what your code does.
So we'll focus lastly on style, the aesthetics of the code
that you're writing.
So where are we going to write code?
Where are we going to compile code?
So for this class, not only with C, but the other languages we use later
in the term, we're going to use a free text
editor that is program called Visual Studio Code, AKA VS Code.
It's super popular nowadays, not just for C, but for C++, and Python,
and Java, and any number of other languages.
It's a text editor in the sense that it lets you edit text.
And that's all code is going to be.
Now, strictly speaking, you could write code on paper/pencil.
In fact, in high school if you took a class,
you might have done that one or more times as an in-class exercise.
You can't run it on paper, of course, but you could write it, certainly.
You could use something like Microsoft Word,
or Notepad.exe, or Text Edit on the Mac.
But none of those programs are really designed
to format the code in the best way for you,
nor are they designed to let you compile and run the code.
So VS Code is going to be a tool via which you can do all that
and more-- write the code, compile the code, run the code.
So that you all don't have to wrestle with stupid technical support
headaches at the beginning of the course by installing this software and that
on your Macs or PCs, we'll use a cloud-based version of VS Code
at code.cs50.io.
And that's going to be the exact same tool.
And the goal, then, is by the end of the semester
to migrate you off of that cloud-based environment to your own Mac and PC
so that even if CS50 is the only CS class you ever take,
you're 100% equipped to continue writing code after the class using
not something that's even CS50-specific, but a de facto industry
standard, at least for some time.
So what's this program VS Code going to look like, be it on your Mac,
PC, or initially in your browser?
It's going to look a little something like this.
And there's going to be several different regions to the screen.
And pictured here is that very same code I
keep proposing as the simplest program you can write in C.
And what are these different regions of the screen?
Well, there's essentially these four here.
So first, highlighted up top is going to be one or more tabs where
you're going to actually write code.
So much like in Google Docs or Microsoft Word,
you can have tabs open with files.
Similarly in VS Code-- or, really, any programming environment--
you generally nowadays have tabs of some sort.
And this is going to be a tab containing a file, it seems, called hello.c.
And that's going to be the very first file we write in just a moment.
Down here, though, is going to be an interface that many of you
might not know.
This is what's called a terminal window.
And a terminal window provides what's generally called a Command Line
Interface, or CLI.
And this is in contrast with a Graphical User Interface, or GUI.
Now, you and I, every day, are using GUIs on our phones, on our PCs.
And a GUI is literally graphical-- so menus, and buttons, and icons.
And you generally use your finger or a trackpad
or a mouse or something like that to interact with it.
But it turns out that many programmers-- they're
saying most programmers, at least over time, come to prefer,
not a GUI, but a CLI, a Command Line Interface
where you actually do everything somewhat arcanely via keyboard alone.
Why?
Well, it turns out, there's just more features built in to most computers
if you can access them with a keyboard.
It turns out, most of us can type faster than you can point and click.
And so that ends up being an efficiency gain over time.
So in time, will you get comfortable using this terminal
window to do things like compile your code or make your program
as well as run it.
So you won't be in the habit initially of just double clicking icons
like we do in our typical real world.
You'll do it the programmer's way.
But it's not to the exclusion of adding icons, and clickability, and more.
On the left-hand side of VS Code, there's
going to be a somewhat familiar File Explorer,
some kind of hierarchical tree, like on your Mac or PC where you can
see all of the files in your account.
Pictured here, for instance, is just hello.c,
which I'll create myself in a moment.
And then far away on the left is the so-called Activity Bar,
and this is where you just get a lot of traditional menus and buttons.
So VS Code itself gives you both a GUI and a CLI.
But it's within the CLI, the terminal window, the bottom region of the screen
that we're actually going to type most of our commands.
And in general in class, I'm going to hide all of the graphical stuff that's
just not of all that much interest.
So with that said, let me actually change over
to a live version of VS Code.
And I've indeed hidden in the Activity Bar.
I've indeed hidden the File Explorer.
So what I have here for visibility sake is a really big area
for writing code and a really big terminal window at the bottom.
You'll see in the terminal window, there's dollar sign.
And this doesn't mean any form of currency.
This is just the standard symbol that represents type commands here.
So the fact that there's just dollar sign and a cursor
means, eventually, that's where I'm going to type commands.
But first, I'm going to actually create some code.
So how might I program using VS Code-- be it on my Mac, PC,
or in this cloud-based environment that you'll get set up for problem set 1--
go about writing my first file?
Well, perhaps the easiest way is this.
Literally run the command code and then the name of the file
you want to create.
Notice that I deliberately end the file with .c in lowercase.
Notice that I've deliberately lowercased the whole file name.
And these are just conventions.
You could use a capital H.
You kind of could use a capital C. But just don't do that.
Follow best practices so that it's consistent with what most everyone
else would do.
When I hit Enter, I just get an empty tab,
just like the screenshot a moment ago.
And it's in this tab where I can now write my very first program in C.
Unfortunately, it's not quite as user-friendly as Scratch
where you drag and drop a couple of puzzle pieces and, boom, it's done.
So I'm going to do this for memory.
But this, too, will become familiar to you over time.
I'm going to include something called stdio.h.
I'm going to type int main, void in parentheses.
On a new line, I'm going to insert some curly braces, as we'll call them.
And then I'm going to type printf, and then some
parentheses, and then in quotes, hello, comma, world, then a backslash, then
a lowercase n, then a close quote, and then a semicolon
at the very end of the line.
So all I've done is recreate, just from memory, that very first program.
In a little bit, we'll make clear what most of this does.
But for now, let's just actually run this thing.
And just like I clicked the green flag last week for the first time,
let's actually compile and run this program.
If it were your Mac or PC and Google, or Microsoft, or someone else
had made the software, at this point in the story,
we'd be double clicking an icon.
But we can't do that yet.
This is still source code.
So I'm going to click back down in my terminal window.
Notice I have a second dollar sign below the first, which just
means it's ready for a second command.
And now the command via which to make this an actual program, to compile it
from source code to machine code is, going to be quite simply make
and then the name of the program I want to make.
Slight subtlety-- I'm omitting deliberately .c because the program I
want to make, I just want to call hello.
Don't write make hello.c.
Just write make hello.
And this program make is essentially our compiler.
Technically speaking, it's a program that automates
the compilation of my program for me.
But it is going to see that I've typed the word hello.
It's going to automatically look now for a file on the hard drive called hello.c
and convert it from source code in C to machine code in zeros and ones.
So if I didn't make any typos, Enter, nothing seems to happen.
And that's a good thing.
Almost always, if nothing gets outputted on the screen, you did good.
You didn't make any mistakes.
You didn't get yelled at.
There's no error messages.
So this is actually a good thing.
How do I now run this program?
Well, notice I've got a third dollar sign, which just
means I'm ready for a third command.
And now I'm going to go ahead and run ./hello.
And this is admittedly a little weird that you have to do dot slash.
But for now just take on faith that this is
how you run a program called hello In your current folder,
in your current directory in this cloud-based environment.
All right, crossing my fingers again, hitting Enter.
And voila.
My very first program in C, hello, world.
And now let me go ahead and reveal the File Explorer that I proposed
exists earlier.
I'm just going to use the keyboard shortcut to reveal that.
And generally, I keep it closed because I don't really need to constantly what
files are in my account.
But you'll see now in the File Explorer, similar in spirit to a Mac or PC
but graphically a little different, here's my file, hello.c.
It's highlighted because I have that tab open.
But now there's a second file here called just hello.
That's the name of my program.
So if you were on a Mac or PC, you would ideally double click that thing.
You can't do that in a command line environment.
You have to run it down here.
But that's all we've done.
We've created a file called hello.c, and then my compiler
made the program from that.
Let me pause here and see if there's any questions because that's
a lot of magical phrases.
Yeah?
Yeah.
So if you're currently following along, playing along at home
and you're getting some kind of error message, part of today
will be for me to deliberately induce some of those error messages.
For now let me just propose that if you literally did what I did,
you must have made a typo somewhere.
And notice that it's indeed standard io-- stdio.h.
Maybe you typed studio.h?
OK, super common mistake, I could call you out.
It is not studio.h.
It is stdio.h-- so common.
But this is exactly representative of the kind of stupid headaches
you're going to run into this week, probably for a few weeks,
probably, honestly, for a few years.
But you start to see past these stupid mistakes over time,
and it just gets easier and easier because the computer is
going to be so regimented.
It will only do what you tell it to do.
And if you say because it's verbally sounds like studio.h,
it's not going to know what the file is.
So actually, thank you for tripping over that so early.
That's super common to happen.
Yeah?
AUDIENCE: Why do you have two hello files?
DAVID J. MALAN: Why do I have to hello files?
AUDIENCE: Yes.
DAVID J. MALAN: So why do I have to hello files?
One is the one I created as the human called hello.c,
and it's pictured right here.
But then when I ran make hello, that process compiled my source code
into machine code.
So this second file just called hello is the file that
contains all of those zeros and ones that the server actually understands.
All right, so yeah, question?
AUDIENCE: The access to the hello [INAUDIBLE]
DAVID J. MALAN: If you try clicking on the hello file,
you'll see in this environment the VS Code,
quote/unquote, The file is not displayed in the editor
because it is either binary--
AKA zeros and ones-- or uses an unsupported text encoding.
In this case, it's binary.
It's zeros and ones.
Now, you could use software to see those zeros and ones.
It won't be intellectually enlightening to most any human.
So VS Code just takes the choice of not showing it to you at all.
So that would be a common mistake too, clicking on a file you don't intend.
But the source code is indeed going to be editable by us.
All right, so I've written this program.
It seems to magically work, at least with some effort
if you get every single keystroke right.
Well, what is it that's going on?
And how is this working?
Well, first of all, notice that even without my highlighting things
or choosing buttons for menus, notice that it's already color coded.
And yet, I wasn't highlighting along the way in Google Docs style,
changing the color, certainly.
Well, it turns out, what VS Code and most programming environments
nowadays do for you automatically is syntax highlighting.
So syntax highlighting is just this feature of typical text editors
nowadays that analyzes the code that you've typed.
And when it notices certain types of keystrokes,
things that represent functions, or conditionals, or loops, or variables--
a lot of the vocab from last week-- it just highlights it ever so differently
for you.
So main, for instance, which we'll soon see, is in purple here.
Int, and void, and include are in red.
Hello, world is in blue.
My parentheses are in green.
This will totally vary by programmer too.
In fact, if you do want to change these colors for problem set 1
for your own environment, you can poke around
VS Code Settings via the gear icon.
You can change to a different color theme.
Syntax highlighting isn't some specific color scheme like it is in Scratch.
It just generally is to each human their own preference.
But that's all that's happening here is this notion
of syntax highlighting at the moment.
Well, what more is going on in this code before I run it, but rather write it?
Well, it looks a little something like this if I take away all of the colors.
And then just for discussion's sake, let me go ahead
and color it a little more like Scratch.
Recall that our very first Scratch program
that just said hello, world on the screen had a green flag clicked icon--
puzzle piece, roughly in orange, and then a purple say block beneath it.
So whereas this is the C version, if we run rewind to last week,
this was the same program in Scratch.
But what's happening now is exactly the same.
So if you think back to last week and you've
got some function, like the say function in purple,
that might take one or more arguments, like inputs that
influences what it says on the screen.
And then functions, recall, can sometimes
have side effects, like the speech bubble appears on the screen.
So last week when we used the say block and we
passed in an argument of hello, world at left,
we got this visual side effect on the screen that says now hello, world
in the speech bubble.
And that's exactly what just happened in VS Code but much, much more textually.
And let's look a little closer now at the code itself.
Let me wave my hand at the equivalent of when
green flag clicked part of my code, and let's focus only
on the say block in Scratch and the corresponding function in C.
So if I step through this and I wanted to convert what
we did last week with the say block to C,
I would first use the print function-- although that's actually
a bit of a white lie.
It's actually the printf function.
Printf means formatted.
And it's just a function that allows you to format text on the screen.
There is no say function in C. There's a printf function.
What MIT did down the road years ago was they took what existed historically
as printf, and they simplified it for a broader audience
by just calling it essentially say instead.
But notice that now if I want to convert the Scratch code at left to C code
it right, it's the same shape.
So MIT deliberately used this white oval,
if only because it conjures this idea of having parentheses too.
So on the right, if I want to pass an argument or an input
to the printf function, I use an open parentheses and a close parentheses.
In those parentheses, I then type whatever
it is I want to print on the screen--
in this case, hello, comma, world.
But notice I've deliberately left some room
because you need some extra keystrokes in the world of C.
Any time you type out some text--
otherwise known as a string of text, to use computer science jargon--
you need to quote it, in this case with double quotes.
Double quote at the left, double quote at the right.
And notice too I'm going to include some slightly cryptic symbol
here too-- backslash n, which I also typed and said verbally earlier,
and then one last nuisance at the end of this, which is a semicolon.
So suffice it to say, this is why we start with Scratch.
This, drag and drop, you're good to go.
In a language like C, printf, parentheses, double quotes, the text
you want, backslash n, semicolon at the end.
There's just so much syntactic overhead.
But at the end of the day, it's just a function.
And you'll get used to these nuisances like the parentheses, the quotes,
the semicolon, and the like.
But things can very easily you go wrong, and it's very easy to make mistakes,
even with lines of code like this.
So let me do this.
Let me go back to VS Code where I have the exact same code.
Notice that on line 5 is exactly that line of code.
So this is the equivalent of the say block.
And let's consider what mistakes I may make early on or even now
20 years later after learning this that are quite common in general.
Suppose I forget the semicolon there.
So easy to do.
You will do this eventually.
Let's see what happens now when I go back to my terminal window
and try to compile my code again.
Just to keep things tidy, I'm going to clear my screen.
But that's just for lecture's sake so that we can focus only
on the most recent command.
But I'm going to go ahead now and rerun make hello.
This will ensure that my program is recompiled.
And this is a manual process.
I changed my code.
The zeros and ones on the hard drive have not changed.
I need to recompile it to output the latest machine code.
So here we go.
I'm going to hit Enter, crossing my fingers as before.
But again, I remove the semicolon by accident.
Oh, my god.
There's more lines of errors now than there are of actual code.
And this, too, takes some getting used to.
The programs we're using were not necessarily written
with the least comfortable audience in mind
but, really, professional programmers back in the day.
But through practice, and through experience, and through mistakes,
you'll start to notice patterns here too.
So here's what I typed.
Make hello after the sign prompt.
Now I get yelled at as follows, hello.c, colon, 5, colon, 29.
Well, what's that referring to?
I've screwed up somewhere-- on line 5, on the 29th character on that line.
Generally, the specific character is not that useful
unless you actually want to count it out.
But line 5 is a good clue.
Why?
It means I screwed up somewhere on line 5 here.
All right.
Well, what is the error?
Expected a semicolon after expression.
And this error is actually pretty obvious
now that I see it and I realize, oh, wait a minute.
All right, here's my line of code.
Here in sort of ASCII art, so to speak-- textual text representing graphics--
it wants me to put in green here the semicolon at the end of that line.
1 error generated, builtin--
so some esoteric stuff there.
But my program did not compile.
When you see an error like this, it means it did not work.
So what's the fix?
Well, obviously, the fix is to go back up here, put the semicolon there.
And now if I recompile my code with make hello--
I won't clear my screen just yet just to show you the difference-- now
it just worked.
So we're back in business as before.
All right, let me pause here, though, and ask if there's
any questions about what I just did.
These error messages will become frequent initially.
Yeah?
AUDIENCE: So do you need a semicolon after line or just some of them?
DAVID J. MALAN: Really good question.
Do you need a semicolon after every line or just some?
It turns out, just some.
This is something you'll learn through practice, through demonstrations
and examples today.
Generally, you put a semicolon after a statement, so to speak.
And this is the technical term for this line of code.
It's a statement.
And think of it as it's the code equivalent of an English sentence.
So the semicolon in code is like a period in English
when you're done with that particular thought.
You don't need semicolons for now anywhere else.
And we'll see examples of where else you put them.
But it usually is at the end of a line of code
that isn't purely syntactic like curly braces instead.
Other questions on the mistake I just fixed and created for myself?
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Correct.
So line 5 is where the error is most likely.
Character 29 means it's 29 characters that way.
And then it's actually, in this case, giving me a suggestion.
The compiler won't always know how to advise me,
especially if I've made a real mess of my code.
But often, it will do its best to give you the answer like this.
Yeah?
AUDIENCE: How come you first put code hello.c?
DAVID J. MALAN: Ah, so how come I first typed code, space, hello.c,
and now I'm typing make hello?
Two different processes.
So when I typed code, space, hello.c, that
was because I wanted to open VS Code and create a new file called hello.c.
It's like going to File, New in a Mac or PC.
Thereafter, though, once the file exists and is actually open here--
and it does autosave, you don't need to hit Command-S or Control-S all
the time--
I can now compile it with make hello again and again.
So theoretically, I should never need to type code, space, hello.c
again unless I want to create a brand-new file called the same thing.
All right, so what about this other piece of syntax here?
Let me clear my terminal window here.
You can also hit Control-L just to throw everything
away just to clean it up aesthetically.
Suppose that I omit whatever this sequence of symbols
is, backslash n, since I'm not really sure at first glance
why that's even there.
Does anyone want to conjecture, especially
if you've never programmed before, what might happen now if I recompile
and rerun this version of the program?
I left the semicolon, but I took away the backslash n.
Any instincts?
All right, well-- yeah?
AUDIENCE: Will the next dollar sign appear
straight after your hello, world?
DAVID J. MALAN: It will.
The next dollar sign will appear right after my hello, world.
But what makes you think that?
AUDIENCE: Because the backslash n creates a new line?
DAVID J. MALAN: Exactly.
Backslash n is actually a special sequence
of symbols that creates a new line.
And so, to your point, if I recompile this program, make hello, Enter--
no syntax error, so it did compile this time.
So you don't need the backslash n.
You do need the semicolon.
But if you don't have the backslash n, watch what happens when I do ./hello
this time.
Now, indeed, I see hello, comma, world and then a weird dollar sign.
And this is still a prompt.
I can still type commands at it, like clear, and everything gets cleaned up.
But it just looks kind of stupid.
If I run it again here with ./hello, it's just not very user friendly.
It is convention that when you're done running your program,
you should ideally clean things up, move the cursor
to the next line for the user.
And so the backslash n is simply the special symbol,
otherwise known as an escape sequence that C knows
means move the cursor to the next line.
In other languages, Python among them, uses this same symbology as well.
Now, if I go back to the code here and, for instance,
I try to do this differently.
Suppose I don't put the backslash n.
I just hit Enter like a normal person would in Google Docs or Microsoft Word.
Let me go ahead and try compiling this program.
And this, you would hope, would work, right?
You would hope this would print out hello, world and then a blank line
because I move the cursor to the next line.
But no.
If I run make hello now and try to compile that, C does not like this.
Now I get a different error, still on line 5,
this time starting at character 12--
error, missing terminating double quote character and then
some other esoteric stuff.
And then this does not sound good-- fatal error this time,
too many errors emitted, stopping now.
So I really screwed up here.
So why can't I do this?
Just because.
The humans who designed C decided that if you have a string of text,
it must stay on the same line.
It can get really long.
It can soft wrap-- that is, without you hitting Enter.
But you can't hit Enter to create a new line.
If you deliberately want a new line, you have to indeed use this backslash n
escape character.
So let me go ahead and do this.
Let me put it back.
Let me go back to my terminal window.
I'll clear the screen again.
Let me go ahead now and do make hello to recompile to that version-- ./hello.
And voila.
We're back in business with hello.
All right, so now let's tease apart some other aspects
of this code because there's a lot going on just to get us to say hello,
world on the screen.
For today, we're largely going to ignore this-- int
main(void) and these curly braces here.
We'll come back to that before long as to why it's there.
But for now just think of int main(void) and these curly braces
here as really being the C equivalent of when green flag clicked.
Why?
You just need it there.
That's how you get your program going.
And main is indeed going to be some special function, but more on that
another time.
But why do I have this line of code here?
The correct spelling is indeed stdio.h, S-T-D-I-O dot H.
And they're angled brackets this time, so that's a little new.
There's a hash and then an include keyword.
If you don't know what something is, there's
not really that much harm in just getting rid of it and see what happens.
So let me delete that line.
Let me go back to my terminal window, clear the screen,
and then run make hello again.
And let's try compiling this program now without that first line.
Why?
I don't understand it, so let's see what happens.
All right, here's yet another error, but let's see--
hello.c, line 5, character 5-- so it's pretty early on--
error, implicitly declaring library function printf with type int
and then dot, dot, dot.
So implicitly declaring library function printf--
so this is very cryptic sounding.
You'll get better at understanding phrases like these.
But apparently, I do need the include line for stdio.h.
But why?
Based on this symptom, what might your instinct
be for what that first line of code is doing for us in the first place?
Why intuitively must it be there?
AUDIENCE: It's how the [INAUDIBLE] functions.
It's like importing a library so that you can do things like print things
out on the screen.
Now, in Scratch, you didn't have to do this for most of the puzzle pieces.
But you might recall that partway through week 0,
I went to the Extensions button at the bottom left of the Scratch screen,
and I imported some extra puzzle pieces for text
to speech that gave us the creepy humanized voice that actually came out
of the cat's mouth.
Well, that was like adding a library--
code that someone else wrote.
In that case, it was a third party.
But I gave myself access to it.
Same here.
Turns out that you don't really get printf automatically in C.
You have to include a so-called header file that
declares that function to exist.
Now, the reason for this historically is just efficiency.
Back in the day when computers were really slower and resource constrained,
you don't want to just give yourself access to the entire kitchen
sink of functionality.
You only want to include only the functions you actually care about.
Nowadays, it's a copy/paste step because you almost always
want to print something out on the screen,
at least when writing programs like these.
But these so-called header files contain enough information
about all of the functions in what's called the Standard I/O Library.
And standard I/O just means standard Input and Output.
And that's appropriate, right?
Because printing is pretty basic output.
Turns out, there's other functions for getting input from the human's
keyboard-- more on that in a bit.
But any time you want to print something on the screen in C,
you indeed need to include this header file at the top of your code.
And that's going to essentially inform the compiler, hey, compiler,
I want to use functionality from the Standard I/O Library,
including printf in this case.
And if you omit the header file by accident,
it's just not going to work because it doesn't know what printf is.
It's some unrecognized symbol in that case.
All right, questions, then, about this line of code, this line of code here,
or what these header files are?
All right, you might wonder, well, how do you know what functions exist?
How do you know what files you might indeed want to include?
Well, it turns out that C is a many-year-old language,
and it has ample documentation.
A caveat is that its documentation isn't necessarily all that user friendly.
But what we have for the course is a simplified version
of the official documentation for C at this URL here, manual.cs50.io.
So in the world of C, and other languages too,
there are what are called manual pages.
And these are just text-based documentation
that, honestly, is typically written in a voice
that you have to be an experienced programmer to understand some of it.
So what we've done in this version of the same documentation
is we've imported all of the original official documentation,
but we've added less comfortable translations in English
for a lot of the functionality that you might use in class just
to help onboard you.
So at the end of the day, you don't need this documentation long term.
But just to get started, we'll translate it into terminology
that you might appreciate from a teaching assistant,
for instance, as opposed to the original author of these documents.
And so, for instance, if you were interested in reading up
on what functions exist in the stdio.h file, well,
you could go to a URL like this, or you could search for it at manual.cs50.io.
That would show you a list of all of the available functions in that library,
and print if indeed would be one of them.
And then you could click further on that, reaching a URL
like this that's going to give you all of the documentation for how
to use printf.
It turns out, you can do even more than just printing out hello, world.
And we'll scratch the surface of that today.
But it turns out that the documentation will always
be your authoritative source ultimately for questions like, what can I do,
and how can I do it?
Meanwhile, it turns out that CS50 has its own library
and accessible via header file called cs50.h.
It turns out in C that output is actually
pretty easy, relatively speaking, once you
get used to all the curly braces, parentheses, quote marks, and the like.
But input is a little more difficult.
And if you have programmed before, input's not that hard to do in Python.
It's not that hard to do in Java.
It's more difficult to do in C. And we'll see why in a couple of weeks.
But for the first couple of weeks of the class,
we actually provide you with some training wheels,
of sorts, whereby we have a number of functions
that are declared in this file, cs50.h.
It lives its documentation at a URL like this.
And in a moment, we'll use a few of these.
You'll see that CS50 provides you with some functions like get_char
for get a single character from the user's keyboard,
get_int to get an integer from the user's keyboard,
get_string to get a sequence of text from the user's keyboard, and a bunch
of others as well.
So let's actually use some of these functions, how about,
by revisiting, really, the second program we
wrote in Scratch last time, which adds some input to the output.
So first version of Scratch was just hello, world.
Said the same thing every time you click the green flag.
Version 2, recall, though, did this.
It asked the user, what's your name?
And then that somehow gave it back a return value, we called it.
And we then joined hello and that name to say something a little more
interesting on the screen.
So what did that model look like?
Same thing as before.
We've got a function in the middle where function is like the code
implementation of our algorithm.
That takes in one or more arguments, like what is it you
want to say on the screen ultimately?
And return value, in this case, is going to be actually a value that comes back.
So in the case of getting input, we can consider this ask block again,
like last week.
The input to it is whatever words of English you want to ask the user.
And then it returns a value.
And this was called by default in MIT'S world answer.
But we'll see in C, you can call these return values anything
you want ultimately in variables.
But this is different from a side effect.
A side effect is just something visual often that happens on the screen,
like the speech bubble or hello, world.
A return value is actually a value you get back from a function
that you can use or reuse.
So how do we convert this Scratch block from last week to C code this week?
Well, if you want to ask the user for something like their name,
you can do this.
You use a CS50 function called get_string.
And you use the parentheses to represent here comes the inputs there too.
You can then put the sentence you want to ask the user--
quote/unquote, what's your name?
But you do indeed need the quotes literally in C.
So I'll go ahead and add those as well.
Subtle, but I've deliberately included a space after the question mark,
but before the double quote, just so that the cursor
moves one step over because, in this case,
we're not going to get a special speech box like we did in Scratch.
It's just going to leave the cursor where it is, so we'll see that,
aesthetically, that just moves the blinking cursor one
space after the sentence on the screen.
All right, but the catch is with Scratch,
we just automatically got back the answer from the user
in a special variable called answer.
In C, you're going to have to be a little more specific.
In C, If you want to get back a return value from a function like get_string,
you have to use an equal sign and then the name of a variable on the left.
The choice of variables is up to you.
I could have called this anything-- x, y, z.
I'm going to more descriptively call it answer for parity with what
MIT did with Scratch.
But notice that this doesn't represent equality, per se.
This is assignment in this case.
So in C, when you use a single equal sign,
that means copy the value on the right over to the value
on the left-- from right to left.
So what does this do for us?
Well, if get_string is a function that prompts the user with,
quote/unquote, what's your name, and it has I claim a return value,
that means it hands me back some value.
But it's up to me in C to do something with that value.
So if I want to copy that value into a variable that I can use and reuse,
I use an equal sign, and I invent on the left-hand side of that equal sign
any variable name I want.
There are certain rules.
There are certain conventions.
But generally if you use a single word with all lowercase,
you're in good shape.
But C's a little more pedantic than that.
And those of you who have programmed before
might not be used to this, for instance, in Python, which
is a world we'll get to in a few weeks.
You also have to tell C what type of value you're storing.
So if I do want a string of text from the user-- so not an integer, not
a single character.
I want a whole string of text, like a phrase, a sentence, a name,
in this case--
I have to tell C that this variable is of type string.
So it's a little wordy, but you get used to it.
And you just have to be precise.
You're informing the computer what type of value is going in this variable.
All right, it's so close to being correct,
but I have omitted something that's annoyingly important still.
What's missing still?
Yeah?
AUDIENCE: Semicolon?
DAVID J. MALAN: So semicolon.
This is a statement.
This is like a full thought, if you will.
In Code, I do need to end it ultimately with the semicolon at the end there.
All right, so this was more of a mouthful,
but let's try using this in now my code.
Let me go back to VS Code where I have version 0 of my code here.
Let me go ahead and include one other file at the top of hello.c,
namely include cs50.h so that I have access now
to get_string and anything else I might want.
Now let me go ahead and add a line of code here inside of these curly braces.
And let me go ahead and do this--
string answer equals get_string, quote/unquote, what's your name,
question mark.
I'm going to add an extra space before the double quote.
I'm going to indeed end my thought with a semicolon.
And now let me deliberately make a mistake, just to make a point here.
Let me now try changing hello, world to hello, comma, answer.
Now, perhaps, even though this is some new lines of code,
you can see where I've errored already.
But let me try making this program now.
So far, so good.
So no error messages.
So that's a good thing.
Let me go ahead and run ./hello, and you'll see the prompt.
What's your name, question mark.
And notice, the cursor is indeed one space to the right
just because I thought it would look prettier
to put a little blank space there as opposed to leaving
it right after the question mark.
Let me type my name.
But even if you've never programmed before, I have screwed up here.
What are we going to see on the screen when I hit Enter?
AUDIENCE: Hello, answer.
Hello, answer, most likely.
Why?
Because the computer is going to take me literally.
And if I say, quote/unquote, hello, answer.
That is the string of text followed by a new line that's
going to be outputted to the screen.
So we need some way of actually plugging answer into this line of code.
It's not quite as simple as scratch where
you could just grab a second say block and drag and drop the variable there.
We actually need a new syntax.
And it's going to look weird at first, but it is everywhere
in software nowadays, especially in the world of C and certain other languages.
So let me go ahead and propose that I solve it as follows.
Well, back when we did this in Scratch, remember that the most elegant solution
was this here.
We used the say block still, which is going to be analogous to printf today.
But I use the join puzzle piece and Scratch to combine hello, comma, space,
and then the name of the human.
So how do we translate this code to C?
Well, it's going to look a little different now.
I'm going to start with printf with some parentheses and a semicolon
representing the say block.
But how do I now do this joining?
This is where the puzzle pieces don't quite translate perfectly.
This would be the way to do this.
You put hello, comma, and then a placeholder.
So this is what's known as a format code in C, specifically for printf.
And it just means this is a placeholder for a string.
Again, a string is just text.
So this means, hey, computer, print out literally, hello, comma, space,
and then not literally %s. %s is treated specially to mean plug in some value
here.
All right, so what else do I still need?
Well, this is still some text, so I'm still
going to surround the whole thing with double quotes.
I'm still going to include my backslash n just
to keep things tidy and move the cursor to the next line.
So the last step here in C is to somehow join the answer with that word hello.
And the way you do this is with printf, passing it not one argument, which
is what I keep doing.
I keep passing it one string of text, quote/unquote.
I'm going to now add a comma and then the name of the value that I want
printf to go back and plug into that %s.
And printf is just smart about this.
If you have one %s and one additional argument after a comma, it just does--
from right to left, it plugs it in.
If you have two %s's and two variables after the comma, that's OK too.
If you separate them with commas, it'll plug the first into the first %s
and the second variable into the second %s.
So it's just left to right, order of operations.
It's not as pretty or as simple as this, but this is how it's done in C.
All right, let me pause because this is a lot of symbology.
Any questions on this technique here?
Yeah?
AUDIENCE: Why did you exclude the backslash n in the previous section?
DAVID J. MALAN: Yeah, a really good question.
Why did I exclude the backslash n a moment ago?
Really, just my sense of aesthetics, if you will.
No good reason beyond that.
So if I look back at my code, you quite rightly
notice that I didn't have a backslash n there.
That's just because, for whatever sense of style that I have,
I wanted the name to be typed right after the question.
I totally could have added a backslash n there instead of a space.
That would have just allowed me to type down here.
Totally fine.
Just wanted to show you something different.
Good catch.
Yeah?
AUDIENCE: Can you show an example with two %s's?
DAVID J. MALAN: Can I show an example with two %s's?
Surely.
So let me in VS Code do this.
Let me clear my terminal window to clean things up.
And let me do this.
Instead of calling the variable answer all over the place,
let me call it first.
And I'll ask two questions.
What's your first name?
And now let me do string last equals get_string--
whoops, capitalization matters, so let me fix my capital S
there-- quote/unquote, What's your last name, question mark, semicolon.
And now we'll plug in one %s and a second %s.
And now I'm going to plug in first first and last last, coincidentally.
And now I'm going to go back to the terminal window.
Make hello-- crossing my fingers, all good-- ./hello.
Here's my first question, David.
Here's my second question, Malan.
And again?
Hello, David Malan.
So it just inserts them left to right.
All I was doing for parity with Scratch though--
and let me go ahead and undo this again.
I'll go back to answer, like this.
I'll go back to just asking for the person's name.
I'm going to delete mention of last.
I'm going to delete mention of the second %s.
And now if I recompile this simpler version,
oh, I did screw up-- didn't intend it.
What did I do wrong?
AUDIENCE: You forgot to change first in line 7.
DAVID J. MALAN: Yeah, so just newbie mistakes.
So I changed my variable back to answer just to be consistent with week 0,
but I didn't change it here.
So I have an use of undeclared identifier first.
It's undeclared in the sense that I declared answer a line prior.
I didn't declare first.
So indeed, intuitively, I want to just change that to that.
Let me now do make hello again, ./hello, type in just my first name this time.
And there it is-- hello, David.
Questions on this then syntax with printf?
Yeah?
DAVID J. MALAN: Ah, the placeholder--
I'll zoom in-- is just a single percent then an s.
So inside of my string here is %s, and then I have a comma outside the quotes,
and then the name of the variable whose value I want to plug in for that %s.
And now notice there's technically two commas inside
of these parentheses on line 7.
And yet, I claim that printf, at the moment,
is only taking in two arguments.
Why is there then two commas but only two arguments?
If there were two commas, you would think
there would be three arguments, right?
AUDIENCE: The comma is between the quotes,
so it counts as a comma [INAUDIBLE]
The comma in between the quotes is just an English thing.
It's separating the hello from the name.
So that's why indeed it's not only in quotes,
that's also why programs like VS Code tend to syntax highlight it
a little differently just so that it jumps out as different to you,
even though, in this case, it's a little subtle-- a light blue versus white--
but indeed, it's trying its best.
Other questions now on this placeholder?
Yeah?
AUDIENCE: If you wanted to put an exclamation point at the end,
would you put a comma after your answer variable,
and would that put it [INAUDIBLE],, or would you have to add a new line?
DAVID J. MALAN: Ah, good question.
If I wanted to add an exclamation point after the name,
would I have to add another placeholder and so forth?
I could actually do that much more simply.
I can just put the exclamation point right after the percent sign.
I don't need an additional placeholder, per se.
If I zoom out now and run make hello again, ./hello,
and type in just my name-- no exclamation point--
now you'll see more excitedly, hello, comma, David.
So printf is smart.
It will figure out where the %s is and then go and replace it.
Now, let me propose that a common thing in programming
is that as soon as we make a decision as how to design something,
we often paint ourselves into a corner and regret a decision.
Can anyone think of a problem that arises from using %s as a placeholder
in this string to printf?
What could go wrong if we're using percent in this special way?
If you literally want to say, for whatever weird reason,
%s on the screen-- or honestly, even just a single %.
It turns out that a percent sign is treated
specially inside of printf strings.
So what's the solution here?
There's different patterns of solutions to problems like these.
But suppose you wanted to say, I got 100%, for instance.
Let me go ahead and change this completely.
So I got 100% on your test or whatever.
All right, let me go ahead and run make hello, Enter.
All right, so invalid conversions specifier.
I mean, I have no idea what this means, but it's underlining
the percent sign as problematic.
Well, it turns out that humans years ago decided, ugh, all right, damn it.
We already used %.
Well, two percent signs will mean one %, literally.
So now if I rerun make hello, aha, ./hello, I got 100%.
So there's going to be things like that, honestly, that you have to ask someone,
you have to Google, you have to look it up in the documentation.
But there's always a solution to those kinds of problems.
And thankfully, they don't come up all that often.
Yeah?
Oh, just pointing.
Other questions?
Yeah?
AUDIENCE: So if you have multiple [INAUDIBLE]
DAVID J. MALAN: If you have multiple variables,
it is in the left-right order.
So printf will analyze the first string of text
that you pass in between quotes.
And whatever the first % is, the first variable that's passed in after a comma
gets plugged in there.
And then the second gets plugged into the second, third, and to the third,
and so forth.
So it's just based on left to right.
Yeah?
AUDIENCE: This more of a clarification question.
What exactly does the %s mean?
DAVID J. MALAN: It's just a placeholder.
It's called a format code, and it just means colloquially, plug in some value
here.
And printf-- the humans who wrote printf decades ago decided to treat %s
special.
Why?
Just because.
They needed some placeholder.
They decided that, eh, no one's ever going to really want to type %s.
And if they do, they can just do %%s.
So they decided to implement printf in such a way that they have code that
analyzes whatever text comes in, looks for %s,
and then somehow plugs in the subsequent values into that placeholder.
And just the-- ah, question?
Sorry?
AUDIENCE: What if we wanted to do our initials or something?
DAVID J. MALAN: Ah, so what if you wanted to do a single characters,
like initials, like D M or D J M for first, middle, last, absolutely.
And that, too, is a perfect segue from the two of you to what, in general,
are going to be called data types in C.
So it turns out, in C, there's not only strings as text.
And we'll see in more detail over the next couple of weeks what
a string really is underneath the hood.
But strings of text are not the only thing that programs can output.
They can indeed output single characters, as for initials.
They can output integers as well.
Turns out that printf has different format codes
for all sorts of different data types.
And just some of the data types we'll see in the coming weeks
will be this list here, which you'll notice it
almost perfectly lines up with the CS50 functions
that I rattled off earlier, like get_char, get_int, get_string.
The reason we called those functions that is because each of them
is designed to return to you a different type of value.
We've used get_string already in this example here.
We'll soon see get_int, and we'll see opportunities to use others.
But these indeed are the menu of available data
types plus others-- dot, dot, dot-- that you
can use when writing a program in C.
The onus, therefore, is on you to decide in advance,
do I want to store an int in this variable, or a string,
or, heck, when writing fancier code, an image, or a sound, or a video, even.
Those can all be different data types, dot dot dot.
But for now we'll focus really on just these primitives.
That was a lot.
Let's go ahead and take a five-minute break here.
No cookies yet.
But in five minutes, we'll come back, dive into more detail.
On our second break today, we'll have cookies.
All right, we are back.

And so if you have been playing along at home
but hitting some bumps in the road, that's totally normal.
And indeed, the goals of lecture generally
will be to give you a sense, conceptually, of where we'll
be going during the course of the week.
But it's indeed through the hands-on labs and problem sets
that you'll really have an opportunity at your own pace
to work through some of those same bumps in the road.
But for today, let me give you a few more building blocks.
And these two will translate from Scratch initially.
Namely, like conditionals, like how now in C,
after knowing now how we can use functions--
at least get_string and printf--
and we can use variables like the string I created earlier,
how can I now add to the mix things like decision making and conditionals
at that?
Well, with conditionals in Scratch, we had this kind of syntax on the left.
Here in Scratch is how you might express if two variables, x and y,
have this relationship.
If x is less than y, then say on the screen, x is less than y.
Well, let me translate that to the right now in C code.
So in C, the corresponding code is going to look like this,
assuming x and y already exist--
more on that later.
And notice a pattern we're going to see again and again.
There is going to be parentheses around the x and less than y-- so parentheses
around the Boolean expression, recall.
The Boolean expression is the true/false, the yes/no answer,
a question that you're trying to ask in order to decide
whether or not to do something.
So you use parentheses there.
So similar in functions where we use parentheses for printf and parentheses
for get_string, and this is just a weird inconsistency stylistically.
When using the keyword if, you should, as a matter of best practice,
put a space after the word if.
When using a function like printf or get_string, you shouldn't.
Both will work, but you'll find that these are conventions stylistically
that most people adhere to-- so space when using an if here.
All right, now inside of the curly braces
is where the actual code goes that you want to execute conditionally.
So if you want to print out x is less than y
only if x is actually less than y in C, you
use this open curly brace-- which, up until now,
you've probably rarely used on your keyboard--
and the closed curly brace down here.
And those are hugging, if you will, the one or more lines
of code underneath the if-- very similar in spirit
to how the orange block here hugs the purple puzzle piece here.
So there's no graphics in C. It's all text.
So you can think of those curly braces as really representing the same idea.
As a side note, if you only have one line of code inside of the if
condition, if you will, you strictly, speaking, don't need the curly braces.
But as a matter of good style, do include them.
It will make more obvious what your intent is.
How about in Scratch if you wanted to express this--
two ways in the road that you might go, left or right, so to speak?
Well, if x is less than y, I want to say, x is less than y.
Else, I want to say the opposite, x is not less than y in this case.
So I'm making a decision based on that Boolean expression.
In C, It's almost the same, but you're adding to the mix the key word else--
so MIT borrowed for Scratch the same keyword there--
and a second pair of curly braces, open and close respectively.
And you might guess now what goes inside of those.
Well, you print out x is less than y, or you print out x is not less than y.
All right, what if there is a three-way fork in the road?
In Scratch, this actually gets a little unwieldy graphically, if you will.
But notice that in Scratch, this is how we could express if x is less than y,
say x is less than y.
Else if x is greater than y, say x is greater than y.
Else if x equals y, then say x is equal to y.
Now, minor inconsistency here.
Just a little bit ago, I claimed, in C, that an equal sign
represents what operation?
AUDIENCE: Assignment.
DAVID J. MALAN: Assignment from right to left.
Insofar as Scratch is really meant for kids,
and they didn't really want to get into the weeds of this kind of semantic,
equal sign in Scratch means equality.
However, we're going to need to fix this in C in just a moment.
In C, equal sign means assignment right to left.
In Scratch, it literally means what you would expect.
All right, let's translate this code then to C. On the right,
this code would correspond really to this.
And you can perhaps see, somewhat goofily, what the solution was,
not unlike the %% solution earlier when humans painted themselves into one
other corner.
You say if, you say else if, and you say else if,
and how did we resolve the use of a single equal sign already?
In C, when you want to express equality--
is the thing on the left equal to the thing on the right--
you literally use two equal signs right next to each other, no space
in between them.
But now this code would be correct on both the left and the right,
whether you're doing this in Scratch or C respectively.
But now we can nitpick our code, specifically the design thereof.
Logically, can anyone critique the design of this code,
either in Scratch or C?
I feel like we could do better.
How about in back?
AUDIENCE: The only option after it getting greater than or less
than is [INAUDIBLE].
DAVID J. MALAN: Perfect.
Logically, it's got to be the case that x is less than y,
or x is greater than y, or by conclusion, it's got to be equal to y.
So why are you wasting my time or the computer's time
asking a third question?
You don't need to ask this final else if because logically, as you note,
it should go without saying.
So it's a minor tweak.
You're doing extra work potentially in the cases where x equals y.
So we can just refine that.
And just like in Scratch, you could just use an else block, similarly in C,
could we simplify this code to just an else, a sort of catch-all logically
that just handles the reality that, of course, that's going
to be the final situation instead.
All right, so we have this ability now to express conditionals
with Boolean expressions.
Let's actually do something with this next here.
So let me go back to VS Code.
I've closed hello.c, and I want to create a second file
for the sake of some demos now.
Recall that you can create a new files by typing code, space, and then
the name of the file you want to create.
For instance, I might do compare.c.
I want to write a program that's going to start comparing
some values for demonstration's sake.
But before I do that, let me just show you
by opening the File Explorer at right, this is similar in spirit
to a Mac or PC.
You can go up here and click on an icon, and you can click on the plus icon,
and you'll get a blue box.
And I can type in compare.c, and I can just manually create it that way.
Notice that opens the tab even without my having typed code.
So again, on the left, you have a GUI, a Graphical User Interface,
albeit a simplistic one.
On the right and at the bottom here, you have a command line interface,
but they're one in the same.
What's nice, though, is that if I close this file accidentally, intentionally,
whatnot, I can reopen it without creating
a new one by just running that same command-- code, space, compare.c.
So code is a VS Code thing.
It's just a user-friendly shortcut.
But it's just creating a file or opening an existing file like that.
I'm going to hide the File Explorer just to make more room for code here.
And let's go ahead and do this.
Let's write a program that compares two values that the human inputs,
but not strings this time.
Let's use some actual integers.
All right, I'm going to go ahead and include the CS50 library's header
file at top--
cs50.h.
I'm going to also include stdio.h.
Why?
One gives me user-friendly input via get_string, get_int, and so forth.
One gives me user-friendly output via printf in the case of stdio.h.
Now I'm just going to blindly type this line of code, which we'll come back to
in future weeks.
But for now, that's analogous to the when
green flag clicked code in Scratch.
And now let's go ahead and do this.
Let me go ahead and get_int from the user
and ask the user, What's x, question mark.
I'm not going to bother with a new line.
I want to keep it all in one line, just for aesthetics' sake.
But when I get back and int, just like I get back a string,
I get back a return value.
So if I want to store the result of get_int somewhere,
I had better put it in a variable.
And I can call the variable anything I want.
Previously, I used answer, or first, or last.
Now I'm going to use x.
But there's still two things left to do here logically, even though we
haven't technically done this yet.
What I still need to do?
AUDIENCE: A semicolon.
DAVID J. MALAN: So I need the semicolon at the end.
AUDIENCE: And the int first.
DAVID J. MALAN: And the int at the beginning.
You the programmer, starting today, need to decide what you're
going to be storing in your variables.
And you just need to tell the computer that so that it knows.
Now, as a teaser for languages like Python, more modern languages,
turns out, humans realized, well, gee, this is stupid.
Why can't the computer just figure out that I'm putting an int there?
Why do I have to tell it proactively?
So in some languages nowadays, like Python
will get rid of some of this syntax, will get rid of the semicolons.
But for now we're looking at, really, the origins of how this all worked.
All right, so I've done this one line ending with semicolon.
Let me do one other.
And let me get a second int asking the user, What's y, question mark.
So almost identical but different responses from the user, hopefully.
And let me just ask simply if x is less than y,
in parentheses, then some curly braces, let me go ahead and print out,
quote/unquote, x is less than y backslash n.
And now just as a side note--
I seem to be typing fast.
Some of that is because VS Code is helping me.
Let me go back to this first line with the if, hit Enter.
And now I'm only on my keyboard going to type the open curly brace.
This is a feature of many text editors nowadays.
It finishes part of your thought.
Why?
Just to save yourself a keystroke to make sure
you don't accidentally forget the closing one.
So you'll notice sometimes that things are happening that you didn't type.
It's just VS Code or future programs you use trying to be helpful for you.
I'll go ahead and manually type out now printf
x is less than y backslash n close quote semicolon.
So let me go ahead now and try to run this, and we'll see--
let's see.
So make-- not hello-- but make compare because this file is
called compare.c, hitting Enter.
No output is good because it means I haven't messed up.
Let me ./compare instead of ./hello, Enter.
What's x?
How about 1?
What's y?
How about 2?
X is less than y.
Well, let's try it again.
And here, I'll save you some keystrokes too.
Let me clear my screen.
Instead of constantly typing ./this and ./that,
you can also use your keyboard's arrow keys in VS Code to scroll back through
time.
So if I hit Up once, there's the last command I wrote.
If I do it Up twice, there's the second to last command I wrote.
So sometimes if you see me doing things fast,
it's just because I'm cheating and going through my history like that.
All right, let me go ahead, though, and rerun ./compare, Enter.
Let's reverse it this time--
2 for x, 1 for y.
And now, of course, there's no output.
All right, well, that's logically to be expected
because we didn't have an else here.
So let's add that.
Else-- now let's open my curly braces, letting VS Code do one of them
for me-- printf, quote/unquote, x is not less than y backslash n semicolon.
Let me go ahead and try this again-- ./compare, Enter.
Again, 2 for x, 1 for y.
And we should see--
huh.
What did I do wrong?
Why am I not seeing any else output?
Yeah?
AUDIENCE: You changed your code when you rebuild.
You need to compile it.
You got to get into the habit after you change your code of recompiling it.
Or otherwise, the zeros and ones in the server
are the old ones until you manually compile.
So let's fix this-- make compare, Enter.
No error messages.
That's good. ./compare, 2, 1.
And now I get back the output.
So x is not less than y.
How about if I go and add in the third condition?
Well, we can do this either efficiently or inefficiently.
Let me go ahead and refine this.
So else if x is greater than y, let's literally say, x is greater than y.
And now I could do x else if x equals equals y.
But I think we already claimed that that's unnecessarily inefficient.
So let's just have our catchall.
And here I'm going to say, quote/unquote,
x is equal to y backslash n, close quote there.
So I think now with this code, we've handled all three scenarios.
Let me go ahead and recompile it properly-- make compare, ./compare.
And now 1 and 2--
is less than y.
Let me run it again.
2 and 1-- x is greater than y.
And lastly, 1 and 1, and x is equal to y.
So for the most part, our code is getting longer.
We're up to 21 lines of code, though some of them
are just single characters on the screen.
Almost everything else is the same.
I'm using the CS50 library's header file for my get_int function, stdio.h
for my printf function, and the rest of this
is just now new syntax for conditionals as well.
Questions, then, on this C implementation
of just some basic comparisons like this?
Any questions?
Yeah?
AUDIENCE: Just a syntax question-- do the opening
brackets need to be on a separate line?
DAVID J. MALAN: Good question.
Do the opening brackets need to be on a separate line?
In CS50, yes.
What you'll see is that as part of the submission process,
we compare your code against a style guide, which is the norm in industry.
A company would have its own sense of style and how its code should look.
And there's generally automated tools within a company
that help give feedback on the code or stylize it as such.
There are alternative styles than what we use in the class.
We deliberately keep and ask that you keep
the curly braces on their own line, if only
because it rather resembles like the hugging nature of Scratch's blocks
and just makes clear that they're balanced, opened and closed.
However, another common paradigm in some languages and with some programmers
is to do something like this on each of them.
So you have the opening curly brace on the same line as here.
We do not recommend this.
This is en vogue in the JavaScript world and some others.
But ultimately in the real world, it's up to each individual programmer
and/or the company they're working for, if applicable,
to decide on those things.
All right, so beyond, then, these conditionals,
what if we want to do something that's maybe pretty common?
So almost every piece of software or website nowadays that you use
has you agree to some terms and conditions by typing Yes or No or just
Y for Yes and N for No.
So how could we implement some kind of agreement system?
Well, let me do this.
Let me create a new program, a third one called agree.c.
So I'm going to write code agree.c just to give myself a new tab.
I'm going to start, as always now, include cs50.h.
Let's include stdio.h.
And then let me do my int main(void)-- which, again, for today's purposes,
we'll take at face value is just copy/paste.
And if I just want to get Y or N, for instance, instead of Yes or No,
we can just use a simpler variable here.
How about just a char, a character, a single character?
So I can use get_char to ask the user, for instance,
do you agree, question mark.
But as before, I need to store this somewhere.
So I don't want a string because that's a single char.
I don't want an int.
I just want a char.
And it's literally C-H-A-R. And then I can call this thing anything I want.
It's conventional if you have a simple program with just a single variable
and it's of type char, call it c.
If it's an int, call it i.
If it's a string, call it s.
For now I'm just going to keep it simple and call it c.
And now I'm going to ask a question.
So if c equals equals, how about, quote/unquote, y, then
let me go ahead and print out Agreed backslash n,
as though they agreed to my terms and conditions.
Otherwise, let's see.
Else if the character equals equals, quote/unquote, n, then let me go ahead
and print out, say, Not agreed, as though they didn't, quote/unquote.
And let's leave it at that, I think, here initially.
Now, you'll notice one curiosity, one inconsistency perhaps.
Does anyone want to call it out, though it's somewhat subtle?
I've done something ever so slightly differently without explaining it yet.
Do you see it?
AUDIENCE: The single quotation mark.
So I've suddenly used single quotation marks for my single characters
and double quotes for my actual strings of text.
This is a necessity in C. When you're dealing with strings, like strings
of text, like someone's name, a sentence, a paragraph, anything
really more than one character, you typically use double quotes.
And indeed, you must.
When dealing with deliberately single characters, like I am here for y or n,
you must use single quotes instead.
Why?
Because that makes sure that the computer
knows that it's indeed a char and not a string.
So double quotes are for strings.
Single quotes are for chars.
So with that said, let me go ahead and zoom out.
Let me go ahead in my terminal window run make agree, Enter.
Seems to work OK so let me go ahead and do ./agree.
Let me go ahead now and type in y.
Here we go.
Enter.
Huh.
Let me try that again.
Rerun ./agree.
How about no?
Enter.
Why is it not behaving as I would have expected?
AUDIENCE: Because you entered the capital Y and capital N.
DAVID J. MALAN: Yeah, I kind of cheated there,
and I hit the Caps Lock key just as I started typing in input.
Why?
Because I deliberately wanted to type in uppercase instead of lowercase,
which is kind of reasonable.
It's a little obnoxious if you force the user to toggle their caps lock
key on or off when you just need a simple answer.
That's not the best User Experience, or UX.
But it would work if I cooperated.
Let me run this again without caps lock--
y lowercase for yes.
Ah, that worked. n lowercase for no.
That worked.
But how could I get it to work for both?
Well, how about this?
Let me go ahead and just add two possibilities.
So else if c equals equals quote/unquote capital Y,
then also do printf agreed backslash n.
And down here, else if c equals equals single quote capital N,
then go ahead and print out, again, Not agreed.
This, I will claim now, is correct.
And I'll do make agree real fast, ./agree.
And I'll use capital.
It now works.
I'll use capital.
It again works.
But this is perhaps not the best design.
Let me hide the terminal window and pull this up on the screen all at once.
Why might this arguably not be the best design, even though it's correct?
There's another term of art we can toss here, like [SNIFFS] something
smells kind of funky about this code.
This is an actual term of art.
There's code smell here.
Something smells a little off.
Why?
What do you think?

There's the same output again and again.
I mean, I manually typed it.
But honestly, I might as well have just copied and pasted
most of my original code to do it again and again for the two capital letters.
So if line 10 and 14 are the same AND line 18 and 22 are the same, AND then
the rest of these if and else ifs are almost the same,
[SNIFFS] there's some code smell there.
It's not well designed.
Why?
Because if I want to change things now, just like last week in Scratch,
I might have to change my code in multiple places or copy/paste
is never a good thing.
And god forbid I want to add support for Yes and No as full words,
it's really going to get long.
So how can we solve this?
Well, it turns out, we can combine some of these thoughts.
So let me try to improve the Yeses first.
It turns out, if I delete that clause, I can actually or things together.
In Scratch, there's a couple of puzzle pieces, if you didn't discover them,
that literally have the word or and the word
and on them, which allow you to combine Boolean expressions.
So that either this or this is true, or this and this is true.
In C, you can't just say the word or.
You instead use two vertical bars.
And vertical bars together mean or, logically.
And so I can say, c equals equals quote/unquote capital Y, Agreed.
And now I can get rid of this code down here.
And let me go ahead and say, vertical bar twice c
equals quote/unquote N in all caps.
And now my program's roughly a third smaller, which is good.
There's less redundancy.
And if I reopen my terminal window, rerun make of agree, ./agree,
now I can type little y or big Y and same thing for lowercase and uppercase
N. Any questions then on this syntax, whereby now you can combine thoughts
and just tighten things up?
And there'll be other such tricks too.
Yeah?
AUDIENCE: Is there not a function to just ignore the case?
DAVID J. MALAN: A really good question.
Is there not a function to just ignore the case?
Short answer, there is.
And we'll see how to do that in, actually, just about a week's time.
And in other languages, there's even more ways
to just canonicalize the user's input, throwing away any space characters
they might have accidentally hit, forcing everything to lowercase.
In C, It's going to be a little more work on our part to do that.
But in fact, as early as next week, we'll see how we can do that.
But for now we're comparing indeed just these literal values.
Other questions?
AUDIENCE: So we're assuming the user's putting in what they're suggesting.
How do you handle if they were to put in a number?
DAVID J. MALAN: Really good question.
So we are assuming, with this program and all of my last ones,
that the human's cooperating and when I ask for their name, they typed in David
and not 123, or, in this case, they typed in a single character and not
a full word.
So this is one of the features often of using a library.
So for instance, if I run agree again, and I say something like sure, Enter,
it rejects it altogether.
Why?
Because s, u, r, e is a string of characters.
It's not a single character.
Now, I could just say something like x, which is neither y nor n, of course.
But it tolerates that because it's a single character.
But built in to CS50's library is some built-in rejections
of inputs that's not expected.
So if you use get_int and the user types in not the number 1 or 2
but cat, C-A-T, it will just prompt them again, prompt them again.
And this is where, too, if you were to do this manually in C,
you end up writing this much code just to check for all of these errors.
That's why we use these training wheels for a few weeks
just to make the code more robust.
But in a few weeks' time, we'll take the liberty away.
And you'll see and understand how it's indeed doing all that.
All right, so how about this.
Let's now transition to something a little more Scratch-like, literally,
by creating how about another program here called meow--
so meow.c.
We won't have any audio capabilities for this one.
We'll just rely on print.
And suppose that I wanted to write a program
and see that just simulates a cat meowing.
So I don't need any user input just yet.
So I'm just going to use stdio.h.
I'm going to do my usual int main(void) up here.
And then I'm just going to go ahead and do printf meow backslash n.
And let's have this cat meow three times, like last week.
So I'm going to do meow, meow, meow.
Notice as an aside whenever you highlight the lines,
you'll see little dots appear.
This is just a visual cue to you to let you figure out
how many spaces you've indented.
VS Code, like a lot of editors, will automatically indent your code for you.
I've not been hitting the space bar four times every time.
I've not even been hitting Tab.
However, in C, the convention is indeed to indent lines
where appropriate by four spaces--
so not three, not five.
And these dots help you see things so that they just
line up as a matter of good style.
All right, so this program, I'm just going to stipulate right now,
is indeed going to work.
Make meow-- which is kind of cute-- and now meow.
There, three times.
Correct.
It's meowing three times.
But of course, this is not well designed.
It wasn't well designed in Scratch last week.
Why?
What should I be doing differently?
Yeah?
AUDIENCE: A loop?
AUDIENCE: This could be a loop.
It's a perfect opportunity for a loop.
Why?
Because if you wanted to change maybe the capitalization of these words,
or you wanted to change the sound to like woof for a dog or something,
you'd have to change it one, two, three places.
And that's just kind of stupid, right?
In code, you should ideally change things in one place.
So how might I do that?
Well, we could introduce a loop, yes.
But we're going to need another building block as well that we had in Scratch,
namely those things called variables.
So recall that a variable, like an algebra-- x,
y, z, whatever-- can store a value for you.
And a variable in Scratch might have looked like this.
You use this orange puzzle piece to set a variable of any name,
not just x, y, or z.
But you could call it something more descriptive, like counter,
and you can set it equal to some value.
In C, the way to do this is similar to spirit to some of the syntax
we've seen thus far.
You start by saying the name of the variable you want,
a single equal sign, and then the value.
You want to initialize it too, copying therefore from right to left.
Why?
Because the equal sign denotes, again, assignment from right to left.
This isn't enough though.
You might have the intuition already.
What's missing probably from this line of code just to create a variable?
AUDIENCE: Int.
DAVID J. MALAN: So we need int to make sure the computer knows
that this is indeed an int.
And then lastly, semicolon as well.
And that now completes the thought.
So a little more annoying than Scratch, but we're
starting to see patterns here.
So not every piece of syntax will be new.
All right, if you want to increment the counter by one,
Scratch uses the verb change, and they mean add the value to counter.
So if I want to increment an existing variable called counter,
this syntax is a little more interesting.
It turns out the code looks like this, which almost seems like a paradox.
How can counter equal counter plus 1?
That's not how math works.
But again, a single equal sign is assignment from right to left.
So this is saying, take whatever the value of counter is, add 1 to it,
and copy that value from right to left into counter itself.
You still need the semicolon, but I claim
you do not need to mention the keyword int when updating an existing variable.
So only when you create a variable in C do you use the word string, or the word
int, or any of the others we'll eventually
see-- only when creating it or initializing it for the first time.
Thereafter if you want to change it, it just exists.
It's the word you gave it.
The computer is smart enough to at least remember what type it is.
So this line is now complete
Turns out, in code, as we'll see, it's pretty common to want to add things
together, increment things by one.
So there's actually different syntax for the same idea.
The term of art here is syntactic sugar.
There's often in code many ways to do the same thing,
even though, at the end of the day, they do exactly the same functionality.
So for instance, if, after a few days of CS50, you find this a little tedious
to keep typing in some program, you can simplify it to just this.
This is the syntactic sugar.
You can use plus equals and only mention the variable name once on the left,
and it just knows that means the previous thing.
It's just slightly more succinct.
This, too, is such a common thing to add 1 to a value.
And it doesn't have to be 1.
But in this case, it is.
But if it is indeed 1, you can further tighten the code up to just do this,
counter++.
So any time in C you see ++, it means literally adding 1 to that particular
variable.
There's other ways to do this in the other direction.
If you want to subtract 1 from a variable,
you can use any of the previous syntax using a minus sign instead of plus,
or you can more succinctly do counter--.
This is the way a typical C programmer would do this.
All right, so if we have no variables, let's
go and solve the meowing with loop.
So in Scratch, we saw loops like this.
This, of course, had the cat meow three times.
How do we do this in C?
Now, this is where things get a little more involved code-wise.
but if you understand each and every line,
we'll follow logically what's going on.
So here, I claim, is one way to implement
a loop that iterates three times in C. And this is kind of ridiculous, right?
We went from two super simple puzzle pieces like this to, my god,
it's 1, 2, 3, 4, 5, 6 lines of code, all of which are pretty involved.
So that escalated quickly.
But what's each line doing?
And we'll see other ways to do this more simply.
So we're initializing a variable called counter to 3, just like before.
Why?
Well, what does it mean to loop or to repeat something three times?
Well, it's like doing something three times,
and then do it, and then count down, and then
do it, and then count down, and then do it, until you're all out of counts.
So this is declaring a variable called counter, setting it equal to 3.
Then I'm inducing a loop in C, which is similar in spirit to repeat 3,
but you have to do more of the math yourself.
So I'm asking the question in parentheses,
while count is greater than 0, what do I want to do?
Well, per the indentation inside the curly braces, I want to meow one time.
And then, to be clear, what's this last line of code doing?
If counter starts off at three, this makes it 2 by subtracting 1 from it.
Then what happens?
By nature of the loop, just like in Scratch, it knows to go back and forth.
even though there's a nice, pretty arrow in Scratch, and there isn't here,
C knows to do this again, and again, and again, constantly asking this question
and then updating this value at the end.
So if I highlight just a few of these steps, the variable starts off at 3.
And actually, let me simplify 2.
I claimed earlier that when using single variables,
people very often just call it i for int, or c for char,
or s for string unless you have multiple variables.
So let me tighten the code up.
And this already makes it look a little more tolerable.
Let me actually tighten it up further, add one more step.
So now this is about as tight, as succinct
as you can make this code at the moment.
So what's actually going to happen here?
Well, the first line of code executes, and that initializes i to 3.
Then we check the condition.
While i is greater than 0, is i greater than 0?
Well, per my three fingers, obviously.
So we print out meow on the screen.
Then we subtract 1 from i, at which point now we have 2 as the value of i.
Then the code goes back to the condition.
And notice, the condition there is in parentheses.
That's another Boolean expression.
So loops can use Boolean expressions, just like conditionals use
Boolean expressions to make decisions.
The loop, though, is deciding not whether to do this thing or that
but whether to do the same thing again, and again, and again.
And as it ticks through the code one line after the other,
it's ultimately going to get down to 1, and then 0, and then stop.
So put another way-- came with some props here-- so suppose this ball here
is your variable, and you initialize it to 3 with three stress balls,
you can do something three times, right?
If I want to give out three stress balls--
here's your chance for free stress balls without having to answer any questions.
OK, there we go.
So here we go, subtracting 1 from my variable.
I'm left with two.
Oh my god.
All right, don't tell Sanders.
[GRUNTS] Oh, I'm sorry.
Oh.
[LAUGHTER]
OK, that ended poorly.
Apologies.
All right.
But now the educational point, though, is
that my variable has been decrement did further to just have--
I'm not throwing that far again.
I can't do this.
Here we go.
All right, here we go.
And one final subtraction.
And now our variable is left empty.
So we had three stress balls there, and that's all a variable is.
It's some kind of storage.
It's actually, of course, implemented in the computer's memory.
But metaphorically, it's really just a bowl with some values.
And every time you or, in this case, subtract,
you're just changing the value of that variable.
And then the code, meanwhile, of course, in parentheses, is just checking,
is the bowl empty?
Is the bowl empty?
Is the bowl empty?
AKA, is i greater than 0 or not?
Any questions on how we've implemented loops in this way?
And I owe you a stress ball after class.
Questions on loops?
All right, so it turns out, this is kind of ugly.
And this really starts to take the fun out
of programming when you have to write out this sequence of steps.
So it turns out, there's other ways to do this.
But first, let's see, logically, how else
you might express this because it's a little weird that we keep using zero.
So the one other way to do this would be to invert the logic.
You could absolutely start with your variable, call it i equal to 1.
And then you could ask the question, is i less than or equal to 3?
And notice a bit of new syntax here.
On your typical keyboard, there is number less than
or equal sign or greater than or equal sign
like you would write in math class with 1 over the other.
And so in C, you use two characters, less than followed by an equal sign
or, if appropriate, greater than followed by in equal sign.
And that logically captures that idea.
So notice that I'm changing my questions.
I'm initializing i to 1, and then I'm going to increment it ultimately
to 2 and then 3.
But because I'm doing less than or equal to,
it's still going to go from 1, 2, 3.
So that works too.
We could similarly do this yet another way.
We could initialize i to 0, and then we could say, well, i is less than 3
and keep incrementing it.
And I showed this last form is actually the most canonical.
It might be the most human-like to think in terms of 1 to 3.
It might be the most stressball-like to think in terms of 3 to 0,
counting down.
But typically, the go-to syntax for most programmers
once you get comfortable counting from 0 is to always start counting from 0
and count up to less than the value you're counting up to.
So it would be incorrect, why, to change this to less than or equal to 3 here?
What would happen if I changed the less than to less than or equal to?
AUDIENCE: It'll only meow twice.
DAVID J. MALAN: Yeah, it'll meow an extra-- a fourth time, in fact, total,
right?
Because you'll start at 0, then 1, then 2, then 3.
And less than or equal to 3-- sorry--
3 will give you the fourth time.
So we do want indeed to be just a single less than.
All right, so now that we have those options,
let me just give you one other.
And this one takes a little more, getting used to as well,
but it's probably the more common way to write this.
Let me go ahead and propose that we implement this as follows.
Let me go back to my code here.
Let me go into my several printfs, getting rid of all but one of them
ultimately.
And let's implement this in code.
So let's do int i get 0, how about then while i is less than 3,
then let's go ahead and say printf quote/unquote meow--
melow-- meow backslash n.
And then we have to do i minus minus or plus plus?
AUDIENCE: Plus plus.
DAVID J. MALAN: So plus plus because we're starting at 0
and going up to but not through 3.
So let me go ahead now and make meow after clearing my terminal, ./meow,
and it's still just as correct.
But it's a little more--
it's a little better designed.
Why?
Because now if I want to change it from 3 to 30 times, for instance,
I can change it there.
I can recompile my code.
I can do ./meow, and done.
I don't have to copy and paste it 27 more times to get that effect.
And I can even change what the word is by changing it in just one location.
But it turns out, there's other ways to do this too.
And let me propose that we introduce you to what's called a for loop as well.
So if you want to repeat something three times,
you can absolutely take the while loop approach that we just saw,
or you can do this.
And this one takes a little more, getting used to,
but it kind of consolidates into one line all of the same logic.
So notice, we have the keyword for here.
And for is just a preposition in this case that generally implies,
here comes a loop.
Inside of parentheses here is not just a Boolean expression.
And this is where things get a little weird.
There's three things-- to the left of the semicolon,
in the middle of the two semicolons, and to the right of the semicolon.
This is really the only other context we'll see semicolons, and it's weird.
Normally, it's been at the end of the line.
Now it's two of them in the middle of the line,
but this is the way humans decided years ago to do it.
So what is this doing?
Almost the same thing.
It is going to initialize a variable called i to 0.
It's going to then check.
If it's less than 3, it's then going to do whatever's in the curly braces,
and it's lastly going to increment i and repeat.
So just highlighting those in turn, at first, i
is initialized to 0, just like before.
Then this condition is checked.
This is a Boolean expression.
Yes or no, true or false will be its answer.
And if i is less than 3, which it should be once it starts at 0,
well, then we're going to go ahead and print out meow.
Then i is going to get incremented.
So it starts at 0.
It goes now to 1.
At that point, the Boolean expression is checked again.
So you don't keep changing i back to 0.
That first step happens only once.
But now you repeat through those three other highlights.
I check if i is less than 3.
It is.
So I print out meow.
It then increments i.
I check if i, now 2, is less than 3.
It is.
I print out meow.
i gets incremented.
I now check.
Is i less than 3?
No, it's not, because 3 is not less than 3.
And so the whole thing stops.
And whatever code is below this curly brace, if any,
starts executing instead.
Just like in Scratch, you break out of the loop and the puzzle pieces
being hugged.
Questions, then, about this alternative syntax for loops, AKA, a for loop?
AUDIENCE: Can you explain again why it doesn't go back to 0?
DAVID J. MALAN: Sorry, say again?
AUDIENCE: Can you explain again why it doesn't reset to 0?
Can I explain again why it doesn't reset to 0?
Honestly, just because.
This was the syntax they chose.
This first part before the first semicolon
is only executed once just because.
That's how it's designed.
Everything else cycles again and again.
And this is just an alternative syntax to using
the slightly more lines of code.
It was, like, six lines of code using the while loop.
Logically, it's the same thing.
Programmers, once they get more comfortable,
tend to prefer this because it just expresses all your same thoughts
more succinctly.
That's all.
Yeah?
AUDIENCE: That was my question.
DAVID J. MALAN: OK.
So let's just work this into my meow example.
Let me go back to the code here.
And notice, indeed, if I highlight all these lines,
I think we can tighten this up.
Let me get rid of all of those and instead do for int i equals 0.
And I'm saying equals.
Most programmers would say gets.
So int i gets 0 means assignment-- the word get.
Now I'm going to do i is less than 3 i++.
Now in here I'm going to do my printf quote/unquote meow backslash n.
And so it's indeed a little tighter.
I mean, two of the lines are just curly braces.
There's really only two juicy lines of code now.
Let me go ahead and do make meow, ./meow.
And again, we're back in business with three of them printing only.
All right, there's one last structure we should explore
just because it's sometimes useful.
This was a forever block.
And this would be a little weird in Scratch
to just say meow forever, or at least without waiting.
But there is indeed a forever block in Scratch, which means do the following,
forever.
And I proposed, I think, verbally last week at least one example
where this is useful.
Meowing forever, a little annoying.
But can you think of common cases where you
might want to write code or use a program that loops forever?
Yeah?
AUDIENCE: Playing music throughout an entire game.
DAVID J. MALAN: Yeah, playing music.
Like Spotify playlists, just repeating again and again
would be some kind of loop.
AUDIENCE: Checking for collisions.
DAVID J. MALAN: Checking for collisions in Scratch,
so seeing if something's bouncing off the wall or another sprite.
Yeah?
AUDIENCE: Oh, checking for input.
DAVID J. MALAN: Checking for input.
So yeah, get_string is essentially just waiting there forever
for me to type in some input until I do.
AUDIENCE: Checking the time.
DAVID J. MALAN: Checking the time and actually
maintaining human time, like a wall clock.
Behind you?
Is that the same?
AUDIENCE: I was going to say checking the time.
DAVID J. MALAN: OK, checking the time.
And one more?
Detecting a key press too.
Like in Scratch, just waiting for some kind of event to happen,
just like on a phone or a browser.
And so there's so many examples where you
might want to do something forever--
just so you've seen the corresponding C building block.
It's a little weird, but this is probably the most canonical way
to do it in C. If you want to print meow forever--
which would be a little crazy because it would literally print and take over
your computer printing forever meow--
you would generally do it like this.
Why?
Well, a while loop expects in parentheses a Boolean expression,
and a Boolean expression is, again, a yes/no, a true/false question.
But if you want the answer to that question always to be yes--
or really, always to be true, turns out in C and a lot of languages
will then just say true because true--
T-R-U-E-- is never going to change magically to false.
I mean, it's just a special word in the programming language.
So by saying while true, it just means do the following forever.
Another common paradigm before true and false
became commonplace would be to do this instead--
change while 1.
You might see in online examples and texts and the like,
while 1 is really the same thing.
Any value that is 0 is generally interpreted as false by a computer.
Any value that is 1 or any other non-zero value
is generally interpreted as true.
And so this, too, would have the same effect, saying while true or while 1.
Generally speaking, while true is perhaps a little clearer these days.
Now, meowing forever is not a good thing.
But suppose I did that by intent or by accident.
Well, let's try this.
So here I'll go into my code.
I'm going to get rid of for loop and change my while loop to, how about,
true.
And in this case here, well, we'll keep it-- let's do this.
Make meow, Enter.
And you'll see this, use of undeclared identifier true.
This is actually hinting at my mention that the old way was 0 and 1.
Nowadays, you could say true or false.
But true and false are themselves special words that you have to include.
And it turns out, if you want to use special Boolean values like this,
there's another header file we haven't seen called stdbool that essentially
creates true and false as keywords.
Alternatively, CS50 includes that same file.
So it's more common in CS50 is to see it like this.
Now if I clear my terminal window and do make meow and then ./meow and hit
Enter, well, unfortunately, this isn't the best thing to do infinitely when
you're in the cloud using a browser.
This is indeed a browser, just full-screened here.
This means I'm sending millions of meows over the internet to my computer here.
So this will happen to you at some point, probably not with meow.
But you'll lose control over your terminal window.
Why?
Because you screwed up.
And you have an infinite loop.
You didn't really intend it.
Or maybe you did.
You were curious to see what happens.
What do you do?
When does the meowing stop?
What recourse do we have here?
Well, Control-C will be your friend.
Sometimes you have to hit it a bunch in a cloud environment.
But Control-C for cancel will interrupt a program that's running.
And I promise that almost all of you will at some point
accidentally introduce an infinite loop because your math is slightly off.
When in doubt, click in the terminal window and hit Control-C--
sometimes multiple times--
and that will indeed cancel whatever is happening there.
In this case, I might have intended it.
But sometimes it's not, in fact, intended.
All right, so we've been taking for granted this whole graphical user
interface for some time and, indeed, the commands that I'm typing
and the buttons on I'm clicking.
And let me just give you a better sense of what
it is we are using underneath the hood this whole time,
namely an operating system called Linux.
So I keep alluding verbally, of course, to Macs and PCs
because almost all of us are running macOS or Windows on our desktops
or laptops nowadays.
But there's lots of other operating systems out there,
and one of the most popular one is called Linux.
And Linux is very often used on servers nowadays-- companies
that host email, companies that host websites or apps, more generally.
Certain computer scientists or computer science students
often like to brag that they run Linux just because that's a thing.
But it is really just an alternative to macOS or Windows
that provides you with both a GUI, if you want it,
but also an especially a command line environment.
Now, fun fact-- Windows and macOS do have terminal windows or the equivalent
thereof.
And eventually, you might use it on your own Mac or PC to solve some problem.
But Linux is really known for, along with other operating systems,
its command line environment, which, again, I distinguished earlier from GUI
as a Command Line Interface, or CLI.
And that refers, really, to the terminal window.
So if I go back to VS Code here, and let me, in fact, go ahead and close my tab
and focus entirely on the terminal window,
this terminal window is really just your command line interface
to your very own server in the cloud.
The term of art here is you each will have
your own container in the cloud, which is like your own computer
running somewhere on the internet with your own username and password
to which you have access and your own hard drive, if you will,
your own home folder that has all of your files for the class.
And it's only accessible to you unless you enable live sharing thereof.
So when you're typing commands here, it looks
like you're typing them, of course, on your own Mac or PC.
But they're actually being sent over the browser
to some server in the cloud where you are controlling, really,
your own account therein.
So it turns out that there are other commands that are worth knowing.
And we'll give you just a few of these today.
And over the coming weeks will you have opportunities
to play with others as well.
But these are some of the basics.
And they're all incredibly succinct because, indeed, for things you're
typing at the command line, humans generally have not
wanted to type out long commands.
So a lot of these are abbreviations here.
Now, perhaps the most common one I'll start with first
is ls, a lowercase l and a lowercase s that stands for, succinctly, list.
So if I go to my terminal window now where up until now,
I've only typed code, which is a VS Code thing for creating an opening files,
and make, which triggers the compilation of my code, what if I now type ls?
This will list all of the files in my current folder--
my hard drive in the cloud, if you will.
So if I hit Enter, you'll see a whole bunch of results.
Now, they're color coded too.
The white ones here end in .c.
Those are the source code files I've written
during class today-- agree.c, compare.c, hello.c, and meow.c.
And you can perhaps guess, the green ones here that just by convention
have an asterisk on the end to denote that they're special represent what?
One of the four others.
Yeah?
AUDIENCE: The machine code?
DAVID J. MALAN: Yeah, the machine code.
So those are my actual programs that are identically named minus the .c
extension.
And the asterisk means that they're executable.
That is in the world of macOS or Windows, you would double click.
But in the world of a command line environment,
that means you do ./ and then the name without the asterisk to execute or run
the code therein.
So if I open up my File Explorer-- and I'm hitting Command-B on my computer
here just as a keyboard shortcut-- you'll see the exact same thing.
So ls is the command line interface for listing the files in your account.
But here, because I'm using VS Code or any program like it,
I also get a graphical user interface as well.
So it's just two different places to be.
You're welcome to use whatever you're comfortable with.
But over time will you naturally get more comfortable
and capable with the terminal window alone.
Well, what else is on this list here?
Well, during the break, I saw that in at least one of you,
for instance, had created a file called hello instead of hello.c.
So you were in a situation where you did this accidentally and hit Enter.
And then you went ahead and typed in all of your code like this.
And then down in your terminal window, you
were trying to do make hello, Enter.
And this now didn't actually do anything.
I can't-- I'm hitting--
I'm trying to run the command.
I got permission denied, as at least one of you did.
Now, why is that?
Well, let's just do a quick check.
If I do ls, I see now hello, but hello has
no asterisk next to it, which means it's not executable.
That's my code.
Why?
Well, notice the top of my tab confirms, oh, I screwed up.
I didn't name my file hello.c, which it just has to be.
So what do you do?
Well, you could very hackishly copy this, create a new file, paste it in.
Or no, no, no.
We know how to rename things now here because that's one of our options.
Let me do this.
Let me do mv for move, hello, and then hello.c, and hit Enter.
You'll see the tab closes because hello no longer exists.
But if I now type ls, you'll see, ah, there is hello.c.
And if I open that file now, whew, there's all of my same code.
And now if I do make hello--
make hello-- now I do get an executable file where in the world is restored.
So mv, it's just a command not just for renaming, but it also turns out,
eventually for moving files as well.
You can also create directories or folders.
So for instance, if I go into VS Code again, and suppose
I hover over here and click not on the plus file icon but plus folder,
I can create a folder called, for instance, pset1 for problem
set 1 in the class.
And you'll see now that it's empty because all of my other files
are in the default folder of my account.
But I could also go in there like this.
And I could click on File, and now I can create a new file called mario.c,
which is one of the first problems, for instance.
But you'll notice now that mario.c is inside of the pset1 folder.
So if I zoom out and I type ls at my terminal window,
I won't see mario.c anywhere.
But I do see a pset1 folder.
And it's in light blue followed by a slash, which you don't have to type.
It just indicates that's a folder.
Now, I can visually at top left obviously see pwet1 contains mario.c.
But if I try to do something like make mario here,
no rule to make target mario.
It just doesn't seem to exist.
And that's because you're in the wrong directory.
So in a command line interface, it's not quite as simple
as just clicking on a folder, and voila, it opens.
You have to change into the directory or folder.
And cd is going to be the command there.
So if I want to actually change into that directory,
I can do cd, space, pset1, Enter.
And now you'll see my prompt changes.
And this is just a common convention, but it's not the only one out there.
Now I still have a dollar sign, which indicates where I can type commands.
But before it, I see a reminder constantly what folder I'm in.
And we put that there deliberately, like a lot of Linux users
do just to remind themselves where they are because unlike macOS or Windows,
where you have a nice, big window telling you where you are,
at the command line, you need to be reminded textually.
But now if I type ls and hit Enter, what should I see?
AUDIENCE: Mario.c
DAVID J. MALAN: Yeah, mario.c.
And now if I want to open it--
if I want to actually compile it, I can run make mario in this directory
once I actually type out all of the code.
Rest assured that in problem sets and labs,
we'll almost always-- certainly, in the first weeks of the class--
give you exactly the commands to type.
Odds are because it's new to many of you,
you will accidentally type the wrong commands.
No big deal.
Just remember that you have different ways to solve these problems.
You've got the graphical File Explorer, which
should feel a little more familiar.
But in time you'll start to know and, honestly,
probably prefer commands like these-- so cd for Change Directory,
cp for copy a file, ls for list, mkdir to make a directory--
create a new folder at the command line instead of with the button--
mv for move or rename, rm for--
AUDIENCE: Remove.
DAVID J. MALAN: Remove.
So be careful with that one.
Rmdir, remove directory.
And there's dozens, hundreds of other commands.
You won't need many of them, but we'll start
to scratch the surface all the more over time.
But ultimately, this command line interface
is going to be a more powerful mechanism, a more capable mechanism,
and ultimately, a more efficient mechanism
for writing code, running commands, solving problems, analyzing data more
generally, even though know there's going
to be some growing pains early on just because it's
probably so new for many of you.
So with that said, we have some problems still to solve,
but we promised cookies today.
So let's go ahead and take a 10-minute break.
Cookies are now served in the transept.
And we'll be back here in 10.
All right, we are back.
And up until now, each of the code examples in C we've done
have been designed to show one specific topic.
But we thought we'd try to take a step back and solve a more general problem
and give you a sense of when given a problem set, for instance, or just
a programming problem more generally, where you even begin
and how you go about approaching it when it's not obvious
what the point of the exercise is.
So one of my favorite games from yesteryear
is this one here, "Super Mario Brothers" that has come in so many
different forms since.
But in this original two-dimensional side scroller game,
there was a lot of artwork like this.
So for instance, up here in the sky were four question marks.
And we'll find that in C and a lot of programming languages initially,
it's a lot easier, a lot more accessible to focus really on black and white type
interactive programs textually as opposed
to full-fledged graphics and the like, but more
on the more graphical acoustic type of programs before long.
But for now let me go over and propose that we
try to just implement in ASCII art--
ASCII, again, being the code that maps numbers
to letters, at least for English, into a textual version of these for question
marks in the sky.
So for this, let me go over to VS Code.
I'll create my own version of mario.c that
will be different from the one you're challenged with in problem set 1.
Indeed, in problem set 1, you'll be challenged
to build a little something like this, albeit with hashtags
for ASCII art instead of graphics.
And in mario.c, I want to just solve this simple problem first.
So it's all involving output.
So I'll do include stdio.h so I can use printf.
I'll do my int main(void)-- more on why we keep doing that in future weeks.
And I'm just going to do something simple initially,
like 1, 2, 3, 4, backslash n.
This is about the simplest way I can implement four question
marks in the sky like these here using pure text like this.
So let me go ahead and do make mario, ./mario, and voila.
We have those four question marks.
But we've seen, of course, that there are better ways to do this.
And if you wanted to generalize this to be five question marks, six,
60 different question marks, loop was always the answer
for not repeating ourselves.
So maybe I should rewrite this a little bit more flexibly and say something
like this, for in i get 0, i less than 4, i++.
And then inside of a for loop, now I can just do a single question mark,
but I don't think what I've just done is correct.
Any one spot the aesthetic bug already?
Yeah, why is this wrong if I want to print the same thing?
Yeah?
AUDIENCE: The backslash n, you said to use it into [INAUDIBLE]..
So I don't think I want a backslash n after every question mark
because the goal is, again, this row of question marks in the sky.
So if I now recompile this, make mario, ./mario, OK, it's almost there.
But now I have that regression to where the dollar sign is not on its own line.
So I think I need a new line, but I don't think I want it here
because that was not going to end well.
Where do I want to instead?
Any instinct?
Yeah?
Yeah, so outside for loop.
So indeed, I can just go below line 8 and above line 9, creating a new one.
And now it's totally fine to just print a new line like that.
You don't have to print anything else with it.
It's indeed a character unto itself.
So let's do make mario one last time, ./mario.
OK, so now we're back in business there.
Well, what if we wanted to do some other scene from "Mario," such as this one
here where there's a lot of vertical obstacles like these bricks here?
If I wanted to print out now a column of three bricks--
and I'll use hashtags for these instead of anything graphical-- well,
I think we're almost there, right?
I think I can now--
it's almost maybe a little easier.
I can go back here, change the question mark
to something that looks more like a brick, like this hash symbol.
And I think now I do want the new line character because when I now do make
mario, ./mario, OK, there's my wall of four.
Oh, but wait.
I didn't want four.
I wanted to be consistent just with this particular scene here,
so I just want three.
So I can still change it in one place.
And here, again, is that paradigm.
Even whether you're using 4 or 3, if you get
into the habit of starting counting from 0,
you go on up to but not through the value you want to count up to.
So that's why I'm using less than instead of less than or equal to there.
So this would be the common paradigm, though you could count it
like we saw earlier in different ways.
But what if things escalate one level further?
And when you're in the underground version of "Super Mario Brothers,"
there's a lot of these underground obstructions,
including grids of bricks like this.
And let me conjecture that if you slice this up, it's roughly a 3
by 3 grid of bricks that all interlock prettily to give us just one
big, large brick like this.
So if I want to print out a 3 by 3 grid, now things are getting a little more
interesting because up until now, I've printed either one row horizontally
or one column vertically.
But we haven't really seen any code where
I'm printing or living in two different dimensions like the game would imply.
But let me propose that we could do this.
Let me go ahead and say, all right, suppose I want to print a 3
by 3 grid of bricks.
It's really that I want to print, what, three rows of bricks.
A grid is three rows.
So if I take the high-level idea and reduce it
to something a little simpler, how do I do that?
Well, let me get rid of the printf for a moment as I did.
And let me just stipulate that this for loop,
even though it doesn't do anything useful yet,
will do something how many times just by design?
All right, three times.
This for loop is good to go.
It will do something three times by just using i to do the counting.
All right, well, if I want to print out now a row of three bricks all
on the same line, that's pretty similar to what
we did earlier when I just wanted to print out
four question marks in the sky.
So we've seen a solution there.
And I daresay we can compose one into the other.
So if I want to print out a row of bricks,
I could just do this for in i get 0 i less than 3 i++,
and then inside of this inner loop, if you will,
let me print out a single brick like this.
And then I don't like where this is going but, I think I've taken two ideas
and I've combined them.
But what might be problematic about lines 5 and 7 at the moment?
What might be bad here?
Yeah, in back?
AUDIENCE: You used the same integer i.
DAVID J. MALAN: Yeah, I'm using the same integer i, which
I feel like could get me into trouble.
If I'm trying to count three things here,
but then I'm hijacking this variable and using it inside of the loop,
I feel like I should avoid this collision of names.
And so what's a good alternative to i?
Well, a programmer, if nesting loops in this way,
would pretty commonly go with j.
You could certainly change this to be rows and columns
if you want more descriptive variables.
But i and j is pretty canonical.
So I'm going to go ahead and do this, j++ instead of i++ everywhere.
And let me try compiling this.
So make mario, Enter, ./mario.
OK, so a couple of things are wrong here.
This is not a 3 by 3 grid.
But if you count these things, how many did I indeed print at least?
You can probably just guess logically.
AUDIENCE: Nine.
DAVID J. MALAN: Yeah, there's nine hashes there.
Unfortunately, they're all on the same line
instead of on three different lines.
So where logically can I fix this?
I'm definitely printing all the bricks.
They're just not on the right levels.
Yeah?
AUDIENCE: If you put a new line at the first loop,
then you'll get three separate lines.
So put a new line after the first loop, this inner loop, if you will,
the nested loop, if you will.
So let me go ahead and print out just a backslash n here.
And what's this doing?
Well, I think that's going to solve it by just
moving the cursor to the next line after you've done one row.
So let me go ahead and do make mario, Enter, ./mario,
and now we're in business.
So it's a very simplistic version of this same graphic,
but I'm leveraging two different ideas now--
or the same idea twice rather now.
I'm using one loop to control my cursor going row, by row, by row.
But then within that loop, I'm doing left to right,
dot, dot, dot, dot, dot, with printing out
each of these individual bricks like this.
Now, there's a little sloppiness here still.
If I want this to always be a square just because that's
what it looks like in the game, well, I could change it to be a 4 by 4
square by doing this or a 5 by 5 grid--
whoops-- by doing this.
Why is this perhaps not the best design to just keep changing the numbers when
I want to change the size?
Where could this go awry?
Yeah?
AUDIENCE: If it's a square, [INAUDIBLE]
If it's always going to be a square and height
is going to be the same as width, I'm just inviting trouble here, right?
Eventually, I'm going to screw up.
I'm going to change one but not the other.
Then it's going to come out to be a rectangle instead of a proper square.
So I should probably solve this a little differently.
So let me do that.
At the top of my main function here, let me go ahead and give myself
a variable called maybe n for the number of bricks I want horizontally
and vertically.
And I'll just initialize that to 3 initially.
And instead of putting 3 here, I'll literally just use n.
But I'll do it in both places so that now, henceforth,
if I ever want to change this and change it to 4, or 5, or anything else,
I'm all done.
It's better designed because there's a lower probability of mistakes.
But I could technically still screw up somehow.
I could technically accidentally write a line of code like n++,
or I could just change the value of that variable even though I don't want it
to ever change.
And maybe it's because I'm a bad programmer, I copy/pasted wrong,
I'm working with someone who doesn't know what n represents,
I can defend myself and my code against human error
like that by going up here to line 5.
And instead of just declaring a simple variable like we did in Scratch,
I can further harden my code, so to speak,
by declaring it to be a constant using the keyword const.
Now, this is just a feature of C and some other languages
to protect you against yourself by proactively saying,
n is a constant, specifically the number 5 or, previously, the number 3.
You cannot accidentally write code elsewhere that changes it.
The computer will throw an error and catch that error.
So it's just a way of programming a little more defensively.
Some languages have this.
Some languages don't.
But in general, it's a good practice.
It makes your code better designed because it's just
as less vulnerable to mistakes by you, colleagues, or anyone else
using the code.
So let me change this back to 3 just to be our default.
But now I'm using n in both places.
And if I do make mario, ./mario, we're back to where we originally started.
But the code is a little more better designed.
And let me note this too.
All this time, I've been mentioning that correctness is important.
Design is important.
There is also this matter of style.
I've been very deliberately writing pretty code,
if you will-- not just the syntax highlighting, which, is automatic.
But notice that I keep indenting everything nicely.
Any time I have curly braces, like on lines 4 and 14,
everything is indented one level.
When I have additional curly braces on lines 7 and 13,
everything is nicely indented as well.
Technically speaking, the computer does not care about that kind of whitespace,
so to speak.
And you could really make a mess of things
like this because you have a strange sense of style
or just because you're being a little sloppy.
But this code is actually still correct.
If I recompile it-- let me open up my terminal window--
make mario, no errors, ./mario, it works perfectly fine.
But you can imagine just how annoying this now
is to read, certainly for a TA, but certainly for you the next day,
certainly for a colleague who has to read your code.
This is just bad style.
It still works, and it's well designed in that you're writing code
defensively, you're using a constant.
But, my god, the style is atrocious.
Now, you'll often find that there's tools
that can help you format your code for you
in a manner consistent with a courses or a company's style.
But this is the kind of muscle memory you'll want to develop over time too.
Take these VS Code suggestions as it's outputting lines of code for you
because it's trying to format your code in a readable way.
And, oh, my god, if and when you do have bugs in your code
and things aren't even indented properly,
there's no way you the human are going to be able to wrap your mind around
what's happening and where.
You're just making the problem harder for yourself.
So do get into this habit too of manifesting good style as well.
All right, well, let me propose that we don't only want a 3 by 3 grid.
We want this to be a little more dynamic.
So suppose we moved away from a constant to just using an integer called n.
And let's ask the user for the size of this grid
as by prompting them with get_int, as we've done before.
And I'll store it in n here.
And then I can go ahead and, more dynamically,
run make mario to compile it-- whoops.
Oh, I screwed up accidentally.
What is it suggesting I do, albeit cryptically?
AUDIENCE: You have to include the cs50.h.
DAVID J. MALAN: Yeah, I forgot to include the CS50 header file up top.
And that's why it doesn't know that get_int is, in fact, valid.
So that's an easy fix.
I'm just going to go up here and include cs50.h.
Now I'm going to clear my terminal and rerun make mario.
Now we're good-- ./mario.
And now notice I'm prompted for size.
So if I type in 3, it's the same as before.
If I type in 10, it's even bigger, but it happens all now automatically.
But there are some things that we're not detecting.
For instance, suppose I type in cat.
Well, that's handled by the get_int function, as I claimed earlier.
That's one of the features of using a library.
You don't have to deal with erroneous input.
But we only designed a function called get_int to get you an integer.
We don't know if you want it to be positive, negative, zero,
or some combination thereof.
And it's kind of weird to allow the user to type in negative 1
for the size of the grid or negative 3 for the size of the grid.
And indeed, your code does nothing, so at least it's not crashing.
But that's kind of stupid, right?
It'd be nice to force the user if they want
a grid to give us a positive value.
So how could we do this?
Well, I could go up here and I could say something like if n is less than 1--
so if it's 0 or negative, which I don't want, what could I do?
Well, I could say, well, prompt the user again for the size.
And now notice, I'm not declaring n again because once it exists,
you don't have to mention the data type again.
We said that earlier.
But this is kind of stupid.
Why?
Because now when you've given the user a second chance, OK, now maybe
I'll do-- all right, if this version of n is less than 1, well,
let's just go and prompt the user a third time.
I mean, you can see where this is stupidly going.
This can't be the right solution to keep typing recursively
the same thing again and again.
Where would it stop?
You'd have to give them a finite number of chances
or just make a mess of your code.
So what would be intuitively a better solution here?
AUDIENCE: A while loop.
DAVID J. MALAN: Yeah, so some kind of loop.
We've seen a while loop.
We've seen a for loop, so maybe one of those.
So let me try this.
Let me delete this messiness and just go back to the first question.
And let me do this.
So while n is less than 1--
so while the number is not what we want--
let's just prompt the user in a loop this time for the size again.
Now, here too, this is better because it's only two requests for information.
But clearly, lines 6 and 9 are pretty much identical other than the int.
And if I went in and changed the size, if I
add this, if I change the wording here, change it to a different language,
I have to change it in two places.
That's bad.
Copy/paste, bad.
So what might be better?
Well, it turns out, there's another paradigm in C
that you can use that gets around this problem, this duplication of code.
It would be much nicer if I just write the code once.
And I can do that using a third type of loop called a do while loop.
So it turns out, in C, you can do this.
If you want to get the value of a variable like n,
first just to create the variable without an initial value.
So int n semicolon means we don't know what value it has, yes.
But that's OK.
We're going to add a value to it eventually.
Then I'm going to say this, do, literally.
I'm going to open my curly braces.
And what do I want to do?
I want to assign to n the return value of get_int,
prompting the user for size.
Well, when do you want to do that?
I want to do that while n is less than 1.
And this code now achieves the exact same goal,
but by never repeating myself.
Why?
Well, notice on these lines of code now, I'm literally saying on line 6,
give me a variable called n of type integer.
It doesn't have a value initially, but that's fine.
You can do that.
Line 7 says, do the following.
What do you want to do? get_int, prompting the user with the word size,
and just store that value in n.
But because C code runs top to bottom, left to right,
now it's reasonable on line 11 to ask that question, OK, is
the current value of n, which it definitely got on line 8, less than 1?
And if the user didn't cooperate-- they typed in 0, or negative 1,
or negative 3-- what's going to happen?
It's going to go back up here and repeat, repeat, repeat everything
in the do while loop.
So a do while loop in C--
which is not something some other languages have.
Python, if you know it, does not have a do while loop.
This is perhaps the cleanest way to achieve this,
even though it's a little weird that you have to declare your variable,
create your variable up top, and then check it down below.
But otherwise, it's similar to a while loop.
It just flips the order in which you're asking the question.
Any questions on this construct?
And do while, in general, is super useful when
you want to get input from the user and make
sure it meets certain requirements.
So all right, so now that we have this building block after that interlude.
How can I go about cleaning up this code?
And then let's conclude by taking a look at things that our code can't do
or can't do very well or correctly.
Let me propose that in a final version of Mario,
let me just add what are called now some comments.
So it turns out, in code in C, you can define
what are called comments, which are just notes to self.
Some of you discovered these in Scratch.
There's little yellow sticky notes you can
use to add citations or explanations.
In C, there's a couple of ways to write comments.
And in general, comments are notes for yourself, for your TA,
for your colleague as to what your code is doing and why or how.
It's a little explanatory note in English
or whatever your human language might be.
So for instance, what I might do here in my implementation
of this version of mario, I might first ask myself a question like--
I might first make a note to self like this on a new line,
above this first block of code, Get size of grid.
It's just an explanatory remark in any terse English
that generally explains the next six or so lines, the next chunk
or block of code, if you will.
It would be a little excessive to comment every single line.
At some point, the programmer should know what individual lines of code do.
But it's nice to be able to glance at this comment on line 6
that starts with two slashes, and it gets grayed out
because of syntax highlighting.
It's not logic.
It's just a note to self.
It generally gives me a little cheat sheet
as to what the following lines of code should be doing and/or why.
And then down here, well, there's a second block of code
that's a bunch of lines.
But together, this just, what, prints grid of bricks.
And so it's another comment to myself that
just makes it a little more understandable what
these 20-some-odd lines of code are doing by adding
some English explanations thereof.
But now that I have these, wouldn't it be nice
if I could abstract these pieces of functionality away, this
getting of the size and this printing of the grid?
In other words, suppose that you didn't know where to begin with this problem.
And the problem at hand were literally implement
a program that prints a grid of bricks of some variable size--
3, or 4, or 5, or whatever the human types in.
If you have really no idea where to start,
comments are actually a good way of getting
started because comments can be an approximation of what we call last week
pseudocode.
Pseudocode is terse English that gets your point across, like for the phone
book searching like last time.
So if you didn't really know where to begin,
you could do something like this.
I could, for instance, just say, Get size of grid as my first step
and then Print grid of bricks as my second step.
And that's it for my program thus far.
This is now implemented in pseudocode.
I have some massive placeholders there.
I still have work to be done.
But at least I have a high-level solution to the problem in comments.
And now I can even go this far.
I could say, well, let's suppose that there's just a function already
that exists called get size.
I could do something like this.
I could do int n equals get_size.
And now I just have to assume for the moment
that some abstraction called get_size exists.
It doesn't.
This does not come with the CS50 library.
But I could invent it, I bet.
How else might I proceed?
Well, let's just assume for the moment that there's also
a function called print_grid that just prints a grid of that size n.
So here too is an abstraction.
These puzzle pieces don't exist.
These functions don't yet exist.
But in C, just like in Scratch, I can create my own functions.
How do I do that?
Well, let me go down later in the file.
And by convention, you generally want to leave main at the top of your code.
Why?
Because it's the main function, and it's just where
the human eye is going to look to see what some file of code does.
And let me do this.
I want to create a function of my own called get_size whose purpose in life
is to get the size that the user wants.
I want this function to return an integer.
And the syntax for doing that is this, right,
similar to a variable, the data type that this function returns.
I don't need this function to take any inputs.
And so I'm going to use a new keyword that we've actually been using thus
far-- more on it another time-- just called
void, which just means this get_size function does not take any inputs.
It does have an output.
It outputs an int.
And this is just the weird order in which you write it.
You write the output format, the name of the function, and then the inputs,
if any, inside of parentheses.
And now I can implement get_size.
But I've already implemented get_size.
Or at least now at this point in the story,
I at least know concretely what to do.
And I could figure out eventually, with some trial and error
perhaps, all right, if I declare a variable
and I do the following n equals get_int, prompting the user for size,
and I keep doing that while n is less than 1, once that block of code
is done, here is a new keyword in C where you can return that value n.
So I keep referring to these values that some functions return as return values.
In C, there's literally a keyword called return
that will hand back to any function that uses
that function the value in question.
So in a nutshell, between lines 15 and 21 now,
here is some code identical to our solution earlier that gets a value
n from the user that is positive.
It's 1, or 2, or higher.
It's not 0, or it's not less than 1.
And as soon as we've got that value, we hand it back as a return value.
Notice how I'm using this function on line 7.
Just like with get_int, just like with get_string,
I'm calling the function-- nothing in the parentheses in this case.
But then I'm using the assignment operator
to copy whatever its return value is into my variable n.
And so now I have a function that didn't use
to exist called get_size that gets me a positive integer no matter what.
And now for the grid, how do I do this?
How do I invent a function called print_grid
that takes a single argument, a number and prints a grid of that size?
Well, let's go down here.
I'm going to write the name of this function print_grid.
This function just needs to print.
It has a side effect, as we keep saying.
So I'm just going to say it has no return value.
It's just void.
It doesn't have an output, per se.
It's just an aesthetic side effect.
But it does take in an argument.
An argument is an input, and the syntax for this in C
is to name the type of the input it takes and the name of the variable.
And I could call this anything I want.
I'll call it size.
I could call it n.
And it's OK to use the same variable in different functions,
but I'll call it size just to be distinct.
And then in this function, I'm just going
to copy from memory the same code is before. for int i get 0,
i less than size--
instead of 3-- i++, inside of this, for int j gets 0, j is less than size j++,
and inside of that, print out with printf a single hash,
print out after that loop a single new line, and that's it.
Now, I did this fast, admittedly.
But it's the same code that I wrote earlier.
But now, just like I did with Scratch, let
me just arbitrarily hit Enter a bunch of times
to move the code out of sight, out of mind.
Now I have abstractions.
I have puzzle pieces that now exist called get_size and print_grid,
syntax for which takes some getting used to, but they now just exist.
Except I do need to do one thing.
Because C is a little naive, if I try to do make mario now and hit Enter,
implicit declaration of function get_size is invalid.
And we've seen that before when I hadn't included a file.
When I hadn't included CS50 library, get_int didn't work.
But that's not the issue here because this is not from a library.
I just invented this.
C takes you literally.
And if you define these functions at the bottom of your file,
they don't exist on line 7 or 10.
So I could do this.
I could, all right, fine, well, let me just highlight all of this,
cut to my clipboard, and paste it up here.
This would solve the problem.
I could just move all of those functions at the top of my file.
That's annoying because now main is at the bottom of the file.
It's going to take longer to find it.
That's not a clean solution.
So let me put it back where it was at the bottom.
And let me do this.
This is the only time in CS50 and, really in C programming
where copy/paste is reasonable.
If you copy and paste the first line of code from each function
and then end it with a semicolon, you can tease the compiler
by giving it just enough of a hint at the top of the file
that, OK, these functions don't exist till down later.
But here's a hint that they will exist.
This is how you can convince the compiler to trust you.
So those other functions can still be lower in the file, below main.
But now when I do make mario--
oh, damn it.
Oh, I said print instead of printf.
That's my bad-- printf.
So if I do make mario, ./mario, now I can type in 3,
and we're back in business.
Now, this was a very heavy-handed way in long way
to get to a much more complicated solution.
But this solution, in some sense, is better designed.
Why?
Because now, especially without the comments,
I mean, look how short my code is.
My main function is literally two lines of code.
Why?
Well, I factored out the juicy stuff into its own functions.
And now, especially if I'm working with colleagues or others,
you could imagine splitting up large programs into smaller parts,
having different people implement different parts,
so long as you all agree in advance on what those inputs and those outputs
actually are.
All right, so let's now consider what computers can do well and not so well.
C indeed supports a whole bunch of operators, mathematically,
via which we can do addition, and subtraction, multiplication, division,
and even calculate the remainder when you divide one number by another.
In fact, why don't we go ahead and use these in a very simple program
and make our very own calculator?
So let me go over here to VS Code.
Let me go ahead and create a new file called calculator.c.
And in this file, let's go ahead and first include
a couple of now familiar header files-- cs50.h as well as stdio.h.
Let's go ahead then and declare main with int main(void).
And then inside of main, let's do something relatively simple.
Let's declare an int and call it x, and set
it equal to whatever the return value is of get int,
prompting the user for a value for x.
Let's then give ourselves a second variable.
We'll call it, say, y.
Set that equal to the return value of another call to get_int,
prompting the user this time for that value y.
And then let's very simply go ahead at the very end
and just print out, say, the sum of x plus y, a super simple calculator.
So I'll use printf, quote/unquote, %i for integer,
backslash n to give me the new line.
Then I'm going to go ahead and do x plus y to indeed print out the sum.
Let me go down to my terminal window now.
Let me do make calculator in order to compile the code.
No error messages, so that's good.
Let me do ./calculator.
And let's do something like 2 plus 2, which, of course, should equal 4.
And it does.
But it turns out that sometimes there are going to be limitations
that we bump up against.
And let me get a little more ambitious here.
Let me clear my terminal window.
And let me go ahead and rerun calculator again.
And this time, let's, oh, 2 billion for x, and let's type in the same for y.
And, of course, now the answer of 2 billion plus 2 billion
should, of course, be 4 billion.
And yet, it's not.
So curiously, we see, of all things, a negative number
here, which suggests that somehow the plus operator doesn't quite
work as well as we might like.
Now, why might this actually be?
Well, it turns out that inside of your computer is, of course, memory, or RAM,
Random Access Memory.
And depending on the size of your computer and the type of computer,
it might very well look a little something like this--
a little circuit board with these black little modules on it
that actually contain all of the bytes of your computer's memory.
Unfortunately, you and I only have a finite amount
of this memory inside of our computers, which
means no matter how high we want to count,
there's ultimately going to be a limitation on how high we
can count because we only have a finite amount of memory.
We don't have an infinite number of zeros and ones to play with.
We have to actually be bounded ultimately.
So what's the implication of this?
Well, it turns out that computers typically use
as many as 32 bits in zeros or ones to represent something
like an integer, or in C, in int.
So for instance, the smallest number we could
represent using 32 ints, of course, using 32 bits, of course,
would be zero--
32 zeros like this here.
And the biggest number we could represent
is by changing all of those zeros to ones, which, in this case,
will ideally give us a number that equals roughly 4 billion in total.
It's actually 4,294,967,295 maximally if you set all 32 of those bits to ones
and then do out the actual math.
The catch, though, is that we humans and computers in general
also sometimes want to and need to be able to represent negative numbers.
So if you want to represent negative numbers as well as positive numbers
in 0, you can't really just start counting at 0
and go all the way up to roughly 4 billion.
You've got to split the difference and maybe
allocate half of those patterns of zeros and ones two negative numbers
and the other half roughly to positive numbers.
So in fact, in practice, when you're using even as many as 32 bits,
the highest most computers could count, certainly in a program like this in C
using an int, would be roughly 2 billion.
That is 2,147,483,647.
But the flip side of that is that we could also now,
using different patterns of bits, represent negative numbers as low
as negative 2 billion, give or take.
But the implication then, of course, is that if we only
have a finite number of bits and can only count so high, at some point,
we're going to run out of bits, so to speak.
In other words, we encounter what's generally known as integer overflow
where you want to use more bits than you have available.
And as a result, you overflow the available space.
What does this mean, in fact, in real terms?
Well, let's suppose that you only have three bits,
but I'm going to gray out a fourth bit just
to convey where we'd like to put an additional bit ultimately.
If this of course, is 0, per week 0's discussion,
this is 1, 2, 3, 4, 5, 6, 7.
Now, ideally, in binary, if you want to add one more to this value 7,
you're going to have to carry the 1 mathematically,
and that would ideally give 1000.
But if you don't have four bits and your computer is only sophisticated enough
to have three bits, not even 32, but three,
the implication is that you're effectively representing not 1000,
but rather, 000.
There's just no room to store that fourth bit
that I've grayed out here, which is to say that your integer might overflow.
And as soon as you get to 7, the next number once you add 1
is actually going to be 0, or worse, as we've seen here
in my code, a negative value instead.
So what could we do to perhaps address this kind of concern?
Well, C does not have just integers or ints.
It also has longs, which, as the name suggests,
are just longer integers, which means they have more bits available to them.
So let me go back into my code here.
I'll clear the terminal window.
And let me go ahead and change my integers to
literally long here, long here.
I'm going to have to change my function in CS50's library
to be not get_int, but get_long.
And that's indeed another function we provide in the library.
Let me change this get_int to get_long as well.
I'll keep my variable names the same, but I do need to make one other change.
It turns out that printf also supports other format codes--
so not just %i for integers or %s for strings, but also, for instance,
%li for a long integer, as well as %f for floating-point values with
decimals.
So with that said, let's go ahead and change my printf line to be not %i,
but %li.
Now let me go ahead and do make calculator again, Enter--
no apparent errors now-- ./calculator.
And 2 plus 2 still equals 4 as before.
But now if I do calculator again, and let's do 2 billion
again as well as 2 billion for y, previously, we
overflowed the size of an integer and got some weird negative number
because the pattern was misinterpreted, if you will, as a negative number
instead.
But a long, instead of using 32 bits, conventionally
uses 64 bits, which means we have more than enough spare bits
to go when we add 2 billion plus 2 billion.
And now, in fact, we get the correct answer of 4 billion,
which does fit inside of the size of a long.
Now, a long can count up quite high.
And, in fact, it can count as high as this, 9 quintillion.
And so that will give us quite a bit more runway.
But, of course, it too is ultimately going to be finite.
So if you have numbers that need to go bigger than that,
you might still very well have a problem.
Now, there's another problem that we might run into as well.
And we can see it in the context of even this simple calculator.
Computers also suffer from potentially what's called truncation,
where especially when you're doing math involving floating-point values-- that
is numbers with decimals-- you might accidentally unknowingly truncate
the value-- that is lose everything after the decimal point.
So in fact, let me go back to VS Code here.
I'll clear my terminal window.
And let's still use longs, but let's go ahead and use
division instead of addition here.
So let me change this plus to a divide operator.
Let me go ahead and recompile the code down here with make calculator.
Let me go ahead and run ./calculator, and let me go ahead and do something
like 1 for x and 3 for y.
And we'll see that--
well, wait a minute.
1 divided by 3, I learned, should be 1/3.
But in a floating-point value, that should 0.33333, maybe
with a little line over it in grade school,
but, really, an infinite number of threes.
And yet, we seem to have lost even one of those threes after the decimal point
because the answer is coming back here as just 0.
So why might that be?
Well, if I know that two integers, when divided one by the other,
is supposed to give me a fraction, a floating-point value
with a decimal point, I can't continue to use integers or even,
in this case, longs, which do not have support for decimal points.
So let me go ahead and change this format code here from %li to %f,
which is, again, going to represent a floating-point value instead of a long
integer or even an integer.
And let me go ahead further and define maybe a third variable, z, as a float
itself.
So I'll give myself a variable z equals x divided by y.
And now rather than print x divided by y, let's just go ahead and print z.
So now I'm operating in a world of floating-point values
because I proactively that a long or an int divided
by another such value, if it's meant to have a fraction,
needs to be stored in a floating-point value, something with a decimal point.
Well, let me go down to my terminal window here and rerun make
of calculator-- seems to work OK-- ./calculator,
and let's do 1 divided by 3 again.
And still here, we see all zeros.
So we do at least see a decimal point, so we've made some progress Thanks
to the %f and the float.
But it seems that we've already truncated the value 1 divided by 3.
So how do we actually get around this issue?
Well, if you the programmer know that you're
dealing in a world that's going to give you floating point
values with decimal points, you might very well
need to use what's called a feature known
as typecasting-- that is convert one data type to another by explicitly
telling the compiler that you want to do so.
Now, how do I do this?
Well, let's go back to my code here.
And if the issue fundamentally is that C is still
treating x and y as integers-- or technically,
longs with no decimal point-- and dividing one by the other,
therefore has no room, so to speak, for any numbers after a decimal point,
why don't I proactively do this?
Let me, using a slightly new syntax with parentheses,
specify that I want to convert x proactively from a long to a float.
Let me specify proactively that I want to convert y from a long to a float
as well.
And now let me go ahead and trust that nz
should be the result of dividing not a long by a long or an int by an int,
but rather, a float by a float.
Let me clear my terminal window, run make calculator again--
seems to work OK-- ./calculator.
And now 1, 3, and hopefully now we actually see
that my code has outputted 0.333333.
And I think if we kept showing more numbers after the decimal point,
we'd theoretically see as many of those threes as we want.
But there is still one more catch.
And especially when we're manipulating numbers
in this way in a computer using a finite amount of memory,
another challenge we might run up against-- besides integer
overflow, besides truncation-- is this known as floating-point imprecision.
Just as we can't represent as big of an integer as we want using int
or long alone because there is going to be an upper bound,
there's similarly going to be a bound on just how precise our numbers can be.
And indeed, let's go back to VS Code here.
I'll clear my terminal window yet again.
And this time, let me use some slightly unlikely syntax to specify that I
don't want to see the default number of numbers after the decimal point,
which %f gives us automatically.
Let's go ahead and show me 20 decimal point numbers after the decimal point.
And the weird syntax for this is to do not %f,
but %.20 to indicate to see that I want to see 20 digits,
not the default after, now, the decimal point.
Let me rerun make calculator.
Let me do ./calculator again.
And let's do 1, let's do 3.
And now this is even weirder, right?
From grade school, you presumably learned that 1 divided by 3
is, of course, 1/3.
But that should be 0.33333, infinitely many times, or, on paper,
with a little line over it.
But the computer is doing some weird approximation here.
It's a whole bunch of 3's and then 4326744079590.
Well, what's really happening under the hood,
well, again, is this issue of floating-point imprecision.
If you only have a finite number of bits and, in turn,
a finite amount of memory, the computer can really only
be so precise intuitively.
Equivalently, the computer is decided on some way
of representing floating-point values.
But the catch is, per grade school math, there's
an infinite number of numbers out there and an infinite number
of floating-point values because you can keep adding more and more digits if you
want.
So the computer, given the way it's implementing these floating point
values, is essentially giving us the closest approximation that it can.
Now, how can we go about improving the situation?
Well, there is one alternative.
Instead of using float, I can use something
called a double, which, as the name suggests,
uses twice as many bits as a float.
So instead of 32 typically, it will use 64.
And that's just like the difference between a long and an int,
which gave us more bits.
But in this case, this will be used for more precision.
Let's go ahead and cast x to a double.
Let's cast y to a double.
And now let's go ahead and, using the same format code--
%.20f is still OK for doubles.
Let me do make calculator.
Let me do ./calculator.
And now let me do 1 divided by 3.
And we still have some of that imprecision.
And it's even more of it if we looked at more than just 20 digits.
But now we have more threes after the decimal point.
So it's at least more, and more, and more precise, but it's not perfect.
But it's at least more precise.
So these kinds of issues, then, are going
to be necessary to keep in mind any time you
do something numerically, scientifically, at least
with a language C where you're going to bump up
against these real-world limitations of hardware and, in turn, language.
Now, later in the semester, we'll transition to a language called Python.
And that's actually going to solve at least one of these problems
for us by just automatically giving us more bits,
so to speak, as we need them, at least for integers.
But even the issue of floating-point imprecision is going to remain.
Now, just how real-world are these issues?
Well, back in the year 1999, we got a taste
of this when the world realized in the years leading up to that date
that it might not have been the best idea to implement computers
and software therein by storing gears using just two digits.
Like, instead of storing 1999 to represent the year 1999,
a lot of computers, for reasons of space and cost,
were in the habit of cutting a corner and just using
two digits to keep track of the year.
The problem with that is that if systems were not updated by the year 1999
to support the year 2000, 2001, and so forth, is that, just like before
with integer overflow, some computers might
add 1 to the year in their memory, '99.
It should be the year 2000, but if they're only
using two digits to represent years, they
might mistake the year-- as some systems may very well have--
for the year 1900 instead, taking literally
a big step backwards, if you will.
Now, you'd like to think that kind of issue
is behind us, especially as we understand
all the more about the limitations of code and computing.
But we're actually going to run up against this very same type of issue
again in just a few years.
On January 19 in the year 2038, we will have run out of bits in most computers
right now to keep track of time.
It turns out, years ago, humans decided to use a 32-bit integer
to keep track of how many seconds had elapsed over time.
They chose a somewhat arbitrary date in the past--
January 1, 1970--
And they just started counting seconds from there on out.
And so if a computer stores some number of seconds,
that tells the computer how many seconds have
passed since that particular date, January 1, 1970.
Unfortunately, using a 32-bit integer, as we've
seen, you can only count so high, at which point,
you overflow the size of that variable.
And so potentially, if we don't get ahead of this as humans, as a society,
as computer scientists, on the date January 19, 2038,
that bit might flip over, thereby overflowing the size of those integers,
bringing us back computationally to December 13, 1901.
So this is to say now, with all of this computational ability and code
comes a responsibility to actually write correct code.
Next week, we'll peel back some of these layers.
But for now, this was week 1, and best of luck on problem set 1.
[APPLAUSE]
[MUSIC PLAYING]

Untitled

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Untitled

Uploaded by

Copyright:

Available Formats

[REEL-TO-REEL PLAYER STARTING]

DAVID J. MALAN: All right, this is CS50.

All right, we are back.

DAVID J. MALAN: Yeah.

You might also like