
Homework #6

6.41
You are writing a new 3D game that you hope will earn you fame and fortune. You are
currently working on a function to blank the screen buffer before drawing the next frame.
The screen you are working with is a 640 × 480 array of pixels. The machine you are
working on has a 64 KB direct-mapped cache with 4-byte lines. The C structures you are
using are as follows:

struct pixel {
    char r;
    char g;
    char b;
    char a;
};

struct pixel buffer[480][640];
int i, j;
char *cptr;
int *iptr;
Assume the following:
sizeof(char) = 1 and sizeof(int) = 4.
buffer begins at memory address 0.
The cache is initially empty.
The only memory accesses are to the entries of the array buffer. Variables i, j, cptr, and
iptr are stored in registers.

What percentage of writes in the following code will miss in the cache?

for (j = 0; j < 640; j++) {
    for (i = 0; i < 480; i++) {
        buffer[i][j].r = 0;
        buffer[i][j].g = 0;
        buffer[i][j].b = 0;
        buffer[i][j].a = 0;
    }
}

Solution:
Given the C structures and the memory layout, the code writes the buffer in column-major
order, so successive pixels touched by the loop are not contiguous in memory. However,
the pixel structure is 4 bytes long and buffer starts at address 0, so each pixel is
aligned to, and exactly fills, one 4-byte cache line. The first write to a pixel (the r
byte) therefore misses, and under a write-allocate policy the cache fetches that pixel's
4-byte line, which covers its r, g, b, and a bytes. The next three writes, to the g, b,
and a bytes of the same pixel, hit in the line just fetched. Note that because each line
holds only a single pixel, the traversal order does not matter here: there is no spatial
locality across pixels to exploit, and every pixel's first write is a compulsory miss.
Thus, of every group of four writes (to r, g, b, and a), exactly one misses, giving a
miss rate of 1 out of 4 writes. So the percentage of writes in the provided code that
will miss in the cache is 25%.

6.45
In this assignment, you will apply the concepts you learned in Chapters 5 and 6 to the
problem of optimizing code for a memory-intensive application. Consider a procedure to
copy and transpose the elements of an N × N matrix of type int. That is, for source
matrix S and destination matrix D, we want to copy each element S[i][j] to D[j][i].
This code can be written with a simple loop,

void transpose(int *dst, int *src, int dim)
{
    int i, j;

    for (i = 0; i < dim; i++)
        for (j = 0; j < dim; j++)
            dst[j*dim + i] = src[i*dim + j];
}
where the arguments to the procedure are pointers to the destination (dst) and source (src)
matrices, as well as the matrix size N (dim). Your job is to devise a transpose routine that
runs as fast as possible.

Solution:
The given code transposes the matrix with a double loop that accesses the src array in
row-major order but the dst array in column-major order. Each write to dst strides
dim * 4 bytes through memory, touching a different cache line on every iteration; for
large matrices, a fetched line is usually evicted before its other words are used, so
the writes get almost no benefit from spatial locality.
We will use blocking (or tiling) as the optimization technique. The matrix is divided
into smaller sub-matrices (blocks) small enough that a block of src and the
corresponding block of dst fit in the cache together, and each block is transposed
completely before moving on. BLOCK_SIZE is a tunable parameter, and we set it to the
size that gives the best performance on our specific cache architecture.
Here is an improved version of the transpose function using blocking:

#define BLOCK_SIZE 16

void transpose(int *dst, int *src, int dim)
{
    int i, j, m, n;

    for (i = 0; i < dim; i += BLOCK_SIZE) {
        for (j = 0; j < dim; j += BLOCK_SIZE) {
            /* Transpose one BLOCK_SIZE x BLOCK_SIZE tile; the "< dim"
               tests handle tiles that overhang the matrix edge. */
            for (m = i; m < i + BLOCK_SIZE && m < dim; ++m) {
                for (n = j; n < j + BLOCK_SIZE && n < dim; ++n) {
                    dst[n*dim + m] = src[m*dim + n];
                }
            }
        }
    }
}
