
Maria Teaca 343C3

01. [30p] Vmstat

[Screenshots: vmstat output before running the tasks and after running the tasks]

Clearly, the CPUs are getting worked well. They don’t have any idle or waiting time! They
should unionize.

/proc is actually a fake filesystem (a pseudo-filesystem with nothing behind it on disk). It
allows us, however, to access kernel data structures through the file interface, so that’s
quite nice.
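To make the "file interface" point concrete, here is a minimal sketch that reads /proc/meminfo like any other text file and turns it into a dict. The helper name `parse_meminfo` is made up for illustration, and the live read only works on Linux.

```python
from pathlib import Path


def parse_meminfo(text):
    """Turn /proc/meminfo-style 'Key:   value kB' lines into a dict of ints."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip anything that is not a 'Key: value' line
        info[key.strip()] = int(rest.split()[0])  # drop the optional 'kB' unit
    return info


if __name__ == "__main__":
    path = Path("/proc/meminfo")  # only exists on Linux
    if path.exists():
        mem = parse_meminfo(path.read_text())
        print("MemTotal (kB):", mem.get("MemTotal"))
```

No special API is involved: the kernel generates the contents when the file is read, and the rest is plain string handling.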

Ran vmstat with -d for disk info and -n to skip re-printing headers. Tail’s there to get that
header out of the way. Did a cheeky sort with regard to the second column (-k 2), in
descending numerical order.
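The exact command lived in the screenshot, so what follows is only a sketch of the same idea in Python: take the rows left after the header is dropped and sort them by the second column, numerically, largest first. The device names and numbers in the sample are made up.

```python
def sort_by_second_column(lines):
    """Split whitespace-separated rows and sort them by column 2, descending."""
    rows = [line.split() for line in lines if line.strip()]
    # Same idea as `sort -k 2` with numeric, reverse ordering.
    return sorted(rows, key=lambda row: int(row[1]), reverse=True)


if __name__ == "__main__":
    # Made-up rows in the shape "device total-reads ..." (not real measurements).
    sample = [
        "loop0   120   0   2400   15",
        "sda    9800  40  81234  310",
        "sdb     450   3   9120   42",
    ]
    for row in sort_by_second_column(sample):
        print(row[0], row[1])
```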

02. [30p] Mpstat

We can just increase the max recursion depth when we get to factorial and realise we’ll
exceed the current one. Add a heuristic offset to account for the calls already on the
interpreter’s stack.
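A minimal sketch of that fix, assuming a plain recursive factorial. The names `safe_factorial` and `HEADROOM`, and the value 256, are my own picks for illustration and not what the original script used.

```python
import sys

# Heuristic offset for frames already on the stack (assumed value).
HEADROOM = 256


def factorial(n):
    """Plain recursive factorial: one Python frame per step."""
    return 1 if n <= 1 else n * factorial(n - 1)


def safe_factorial(n):
    """Raise the recursion limit only when n would exceed the current one."""
    needed = n + HEADROOM
    if needed > sys.getrecursionlimit():
        sys.setrecursionlimit(needed)
    return factorial(n)


if __name__ == "__main__":
    # Number of digits of 1500!, which needs more than the default limit.
    print(len(str(safe_factorial(1500))))
```

The limit is only ever raised, never lowered, so calling this repeatedly with smaller values leaves the earlier setting in place.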

The process will move between CPUs as it runs, but it runs!



Here I was torturing my CPUs for a grade. Left 0 out as per the assignment.

Then I added insult to injury. 0’s not alone anymore, though.



Had to give up on fish because it’s got some issues with command substitution, but
eventually I got it right: create a taskset running on processors 1-ncpus with a step of 2 and
give it procs/2 tasks to run. Beautiful!
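The shell one-liner itself was in the screenshot, so this is only a sketch of how the pieces fit together: pick every second CPU starting from 1, join them into a list for `taskset -c`, and pass half the CPU count as the number of tasks. The worker script name is hypothetical.

```python
import os


def odd_cpus(ncpus):
    """CPUs 1, 3, 5, ... below ncpus (CPU numbering starts at 0)."""
    return list(range(1, ncpus, 2))


def taskset_command(ncpus, program):
    """Build a taskset invocation pinned to the odd CPUs, with ncpus // 2 tasks."""
    cpu_list = ",".join(str(cpu) for cpu in odd_cpus(ncpus))
    return ["taskset", "-c", cpu_list, *program, str(ncpus // 2)]


if __name__ == "__main__":
    ncpus = os.cpu_count() or 1
    # "worker.py" is a placeholder name, not the script from the assignment.
    print(" ".join(taskset_command(ncpus, ["python3", "worker.py"])))
```

On an 8-CPU machine this prints `taskset -c 1,3,5,7 python3 worker.py 4`.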

03. [15p] Zip with compression levels

The txt’s not zip’s favourite since not a lot of things get repeated. The time it takes to
compress it still goes up with the compression level.

Did the plot for RAY.bmp since that has the most satisfying increase in compression. There’s
a lot you can summarize about a bmp file, but it didn’t work as well for marbles, since the
colours on that one are a bit all over the place.
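For anyone wanting to redo the measurement without the zip binary, here is a rough sketch using Python’s zlib, which implements the same DEFLATE algorithm that zip uses by default. It is a stand-in for the actual zip runs, not how the plot was made, and `compression_profile` is a name I made up.

```python
import time
import zlib


def compression_profile(data):
    """Compress data at levels 1..9 and record (level, size, seconds) for each."""
    results = []
    for level in range(1, 10):
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results.append((level, len(compressed), elapsed))
    return results


if __name__ == "__main__":
    # Highly repetitive input, standing in for an image with large flat areas.
    data = b"abcd" * 50_000
    for level, size, seconds in compression_profile(data):
        print(f"level {level}: {size} bytes in {seconds * 1000:.2f} ms")
```

Swapping `data` for the bytes of a real file, read in binary mode, gives a size and time curve per level that can be plotted the same way.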

04. [25p] llvm-mca

Took 400MB to install clang (we should really get some VMs on fep so we can have access
to sudo and not have to do this on our own computers, right?). Had to rename the file to
compile it and then used -S to get the assembly.

Ran a nice analysis focused on pressure points. Look at that side effect on that return!

Isolated just the loop using llvm-mca’s region markers, i.e. asm comments that aren’t there
for documentation. Also, let’s all use spaces instead of tabs, please? My editor’s not happy
about it.
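llvm-mca restricts its analysis to whatever sits between `# LLVM-MCA-BEGIN` and `# LLVM-MCA-END` comments in the assembly. As a small sketch of what that looks like, the helper below (`mark_region`, a made-up name) wraps a range of lines in those markers. The assembly lines are an invented stand-in for the real clang output.

```python
def mark_region(asm_lines, first, last, name="loop"):
    """Return a copy of asm_lines with llvm-mca markers around lines first..last."""
    if not 0 <= first <= last < len(asm_lines):
        raise ValueError("region is outside the listing")
    return (
        asm_lines[:first]
        + [f"# LLVM-MCA-BEGIN {name}"]
        + asm_lines[first:last + 1]
        + ["# LLVM-MCA-END"]
        + asm_lines[last + 1:]
    )


if __name__ == "__main__":
    # Invented listing: prologue, a three-instruction loop, epilogue.
    listing = [
        "pushq %rbp",
        ".LBB0_1:",
        "addl $1, %eax",
        "cmpl $10, %eax",
        "jne .LBB0_1",
        "popq %rbp",
        "retq",
    ]
    print("\n".join(mark_region(listing, 1, 4)))
```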

With only the loop and a lot of iterations, we get a better uOps/cycle, since we get rid of
the push and return (which use more cycles). These metrics are still somewhat misleading,
since modern architectures are quite non-deterministic at runtime and hard to model, but
oh well.

[10p] Task C - In-depth examination

With O0, we have quite a lot of register dependencies.

With O1, we get rid of most of them: there’s just one left on eax and we get a brand new
resource interference.

With O2 and O3, the assembly becomes quite unreadable. There’s probably some loop
unrolling going on and, also, a lot of registers which we don’t study in IOCLA, so it’s all quite
hard to understand. Still, after looking at that eax dependency a bit more, it doesn’t look like
a data dependency (and sure it wouldn’t be, the compiler’s smart enough), so it must be
some physical thing I’m personally not privy to, since I’m quite comfortable as a software
engineer.

So yeah, not sure how to solve that.



06. [5p] Bonus - Feedback

Not sure how you check if this picture is mine or someone else’s. Maybe I should’ve included
a timestamp but welp.
