
CS 251 Quiz 4

(Python, Android Studio, Java)


Date: 10th November 2019 Max marks: 120
Please refer to the general instructions and submission guidelines at the end of
this document before submitting​.

The sites accessible during the quiz are as follows:


1. https://www.google.com
2. https://www.google.co.in
3. https://stackoverflow.com/
4. https://stackexchange.com/
5. https://www.python-course.eu/
6. https://docs.python.org/3/
7. https://scipy.org/
8. https://developer.android.com/
9. https://en.wikipedia.org/wiki/
10. https://www.w3schools.com/
11. https://www.tutorialspoint.com/
12. https://www.geeksforgeeks.org/
13. https://www.javatpoint.com/
14. https://scipy-lectures.org/
15. https://beginnersbook.com/

____________________________________________________________________________

q1 - Android
(13 Marks)
answers.txt

Specify your answers for the following questions in a text file named answers.txt such that
every line has just one letter from a, b, c and d. If you don't know the answer
to a question, put a - (hyphen) on the corresponding line.
That is, answers.txt should have content similar to the following, with no extra spacing and all in capitals.
$ cat answers.txt 





and so on …

This question will be auto-graded so make sure that you follow the submission format precisely.
Use ​Sublime Text for editing ​answers.txt so that you can match the respective question
number with the line number.

Marking Scheme​:
+1​ for every correct answer
-1​ for every wrong answer

1. An application's background service is managed by which of the following application components?
a. Activity
b. Broadcast Receiver
c. Services
d. Content Provider

2. An .apk file contains:
a. classes.dex (compiled Java classes)
b. res (resources)
c. assets
d. All of the above

3. Controlling the UI and managing user interaction is handled by which of the following application components?
a. Activity
b. Broadcast Receiver
c. Services
d. Content Provider

4. Layout of an activity is stored in


a. .json file
b. .png file
c. .xml file
d. .java file

5. The file that holds information about the SDK, versions, application ID, etc. is
a. build.gradle
b. res/values
c. res/layout
d. AndroidManifest.xml

6. An application's configuration file is managed by which of the following Android components?
a. Manifest
b. Fragment
c. View
d. intent

7. If you need to access the INTERNET in your Android application, where would you
specify the need for it?
a. build.gradle file
b. AndroidManifest.xml
c. There is no need to specify such need, one can directly use it.
d. MainActivity.java

8. Suppose you are on ActivityA of App1, and you tap a notification you received
for a WhatsApp message, so that WhatsApp opens. Which state will
ActivityA of App1 be in?
a. onPause()
b. onStop()
c. onDestroy()

9. Which callback makes the activity visible to the user?
a. onCreate()
b. onStart()
c. onResume()

10. Which callback is fired when the system first creates the activity?
a. onCreate()
b. onStart()
c. onResume()

11. The state in which the application interacts with the user
a. onCreate()
b. onStart()
c. onResume()

12. When you switch from ActivityA to ActivityB within the same application, in which state
would ActivityA be?
a. onResume()
b. onDestroy()
c. onStop()
d. onPause()

13. Intents are used for


a. Starting an Activity
b. Starting a Service
c. Delivering a broadcast
d. All of the above

____________________________________________________________________________

q2 - Trees and Mirrors [Basic Python]


(5 + 10 + 10 Marks)
Hints: (recursion, list comprehension)
A tree is an important data structure that you'll come across all the time.
We're sparing you the details of the data structures and algorithms involved - that's another course's
headache. All you need to solve these questions is knowledge of Python syntax and an understanding of
pseudo-code.

Here in node.py you are given a simple node structure designed for a binary tree. A node either has both
a left and a right child, or has None in place of both, indicating a leaf. No node in this binary tree has
just 1 child - either both or no children.

An internal node is defined as a node in the tree which has two children - these children may themselves
be leaves or internal nodes. Hence, by definition, a leaf is not an internal node, and a node
is either an internal node or a leaf.
Henceforth, a tree is said to be of size n if it has n internal nodes. Such a binary tree has
2n + 1 nodes in total, of which n + 1 are leaves. A tree as defined here has at least one node - its head.
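
As a quick check of the 2n + 1 relation, here is a minimal Python sketch. The Node class below is a hypothetical stand-in with only left and right fields; the class actually provided in node.py may look different.

```python
class Node:
    """Hypothetical stand-in for the class in node.py (an assumption)."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def count_nodes(node):
    """Return (internal_nodes, leaves) for the tree rooted at node."""
    if node.left is None and node.right is None:
        return (0, 1)  # a leaf contributes no internal nodes
    left_internal, left_leaves = count_nodes(node.left)
    right_internal, right_leaves = count_nodes(node.right)
    return (1 + left_internal + right_internal, left_leaves + right_leaves)


# A tree of size n = 2: the head and its right child are internal nodes.
tree = Node(Node(), Node(Node(), Node()))
internal, leaves = count_nodes(tree)
```

Here internal is 2 and leaves is 3, so the tree has 5 = 2n + 1 nodes in total.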

Function 1: ​Mirror a tree [5 marks]


Fill in the function def mirrorTree(node) using the explanation given below.
Arguments (1): head node of a tree.
Return value: Root (same as head) of a newly constructed tree that is the mirror image of the given tree.
Don't print anything. Do not change the name of or arguments to this function - it will be auto-graded.

Definition: The mirror image of a tree is defined as shown in figure 1 below. T1 is a mirror image of T2 if
the left child of the head node of T1 is the mirror image of the right child of the head node of T2, and
vice versa. If the head is a leaf, the tree is its own mirror image.
Figure 1: Examples of trees that are mirror images of each other, not symmetrical

Function 2: ​Find all trees with ‘n’ internal nodes [10 marks]
Complete the function def allTrees(n) using the explanation given below.
Don't print anything. Do not change the name of or arguments to this function - it will be auto-graded.
Arguments (1): A single integer - number of internal nodes
Return value: List of heads of all unique trees that have n internal nodes. The order does not matter.
Limits: 0 <= n <= 11
Note that the size of the list is exponential in n.
This function, on using list comprehension, can be completed with just 1 for loop, no nesting.
By a for loop, we mean an indented block of the form
for foo in bar:
    blah
# end
...
Hence an instance of list comprehension is not a for loop.
Hence you will be awarded ​full 10 marks ​for correct output and 1 for loop used, ​8 marks ​if the output is
correct but two for loops have been used (nested or not), ​6 marks ​if the output is correct but three loops
have been used (nested or not). ​5 marks ​for correct output and more than 3 loops. ​Zero marks if the
output is incorrect.
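
To make the distinction concrete, the short Python sketch below builds the same list once with a for loop and once with a list comprehension. The variable names are illustrative only and are unrelated to the tree functions.

```python
# One for loop, in the sense defined above: an indented block under "for ... in ...:".
squares_loop = []
for x in range(5):
    squares_loop.append(x * x)

# The same list as a list comprehension: zero for loops under that definition.
squares_comp = [x * x for x in range(5)]

# A comprehension may contain several "for" clauses and still uses no for loop.
pairs = [(a, b) for a in range(2) for b in range(3)]
```

Both squares_loop and squares_comp hold the same five values, and pairs holds all 2 x 3 combinations.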
Function 3: Find all trees with 'n' internal nodes that are symmetrical (i.e., they
exactly match their mirror images). [10 marks]
Complete the function def allSymTrees(n) using the explanation given below.
Don't print anything. Do not change the name of or arguments to this function - it will be auto-graded.
Arguments (1): A single integer - number of internal nodes
Return value: List of nodes that are heads of symmetrical trees. Again, the order does not matter.
This will be rather simple once you have completed allTrees.
Definition of equality: The equality of two trees is equivalent to the equality of their head nodes, which has
been defined in the Node class. Read it carefully.
Definition of symmetry: if node is the head of tree T, then T is symmetric iff node == mirrorTree(node).
The size of this list is also exponential in n. Limits: 0 <= n <= 23
This function, on using list comprehension, can be completed without for loops.
Hence you will be awarded ​full 10 marks ​for correct output and no for loops and 7 marks ​if the output is
correct but one for loop has been used (nested or not). ​5 marks ​for more than 1 loop. ​Zero marks if the
output is incorrect.
Hints for function 3:
1. What is the relation between all trees and all symmetric trees?
2. Even number of internal nodes - is such a symmetric tree possible?
3. What would happen if the left child of the head node is the mirror of the right child?
Just to reinforce what you might be thinking, and to make this question slightly easier, the following are
true. Use these to understand the approach for the last function.
a. len(allTrees(n)) == len(allSymTrees(2n + 1))
b. len(allSymTrees(n)) == 0 for even n > 0
____________________________________________________________________________

q3 - Image Resizing [Advanced Python]


(25 Marks)

Image resizing can be achieved by multiple techniques. What you have to implement here is Image
Resizing by ​Bilinear Interpolation​. Go through ​this link to have an understanding of how bilinear
interpolation is done (the very first algorithm itself in the specified link should be enough for
understanding). You need to implement this resizing feature using numpy.

Fig. 1. An illustration of image resizing using bilinear interpolation. The four red dots show the data
points and the green dot is the point at which we want to interpolate.
You have been provided with two images ​barbara-input.png and ​barbara-output.png​, two
python code files ​autograder.py and ​resizeImage.py and a numpy array file ​truth.npy in
quiz4-resources/q3/ 

Assume the pixel dimensions for the input image to be equal along both axes, i.e., assume an aspect ratio
of 1:1 for the axes. Consider the number of rows as M and the number of columns as N.
You need to resize the image to have the ​number of rows = 3M−2 and the ​number of columns = 2N−1​,
such that the first and last rows, and the first and last columns, in the original and resized images,
represent the same data.
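
The size requirement above can be illustrated with a small numpy sketch. It assumes that bilinear interpolation may be done as two passes of 1D linear interpolation (np.interp), first along rows and then along columns. The name resize_sketch is hypothetical, and this sketch has not been checked against autograder.py or truth.npy.

```python
import numpy as np


def resize_sketch(img):
    """Resize an M x N array to (3M - 2) x (2N - 1) by separable linear interpolation."""
    m, n = img.shape
    # Sample positions in original pixel coordinates; both endpoints are kept,
    # so the first and last rows and columns carry over unchanged.
    col_pos = np.linspace(0, n - 1, 2 * n - 1)   # step 1/2 between columns
    row_pos = np.linspace(0, m - 1, 3 * m - 2)   # step 1/3 between rows
    # Pass 1: interpolate across the columns of every row.
    wide = np.array([np.interp(col_pos, np.arange(n), row) for row in img])
    # Pass 2: interpolate down the rows of every (new) column.
    return np.array([np.interp(row_pos, np.arange(m), col) for col in wide.T]).T


small = np.array([[0.0, 2.0],
                  [6.0, 8.0]])
resized = resize_sketch(small)  # shape (4, 3)
```

For this 2 x 2 input the result has 3(2) - 2 = 4 rows and 2(2) - 1 = 3 columns, and its four corners equal the four original pixels.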

For this, you just need to code the function ​bilinearInterpolation() in ​resizeImage.py​. Its
input and output are as follows:
Input: A square matrix in the form of a 2D numpy array representing the original image
(barbara-input.png)

Output: The resized 2D numpy array with the specified number of rows and columns representing
the resultant image (​barbara-output.png​)
For testing your code, simply run ​autograder.py which will by itself call the function
bilinearInterpolation() and generate the resultant image. It will also report on the terminal
whether your output is correct or not. ​(DO NOT EDIT ​autograder.py​)
____________________________________________________________________________

q4 - N-gram Language Model [JAVA]


(57 Marks)
all java files specified in the submission instructions
Interfaces and Objects along with Collection framework and
Multithreading
Notes:
● If there is an exception, your program should exit with exit code -1.
● You have been provided boilerplate code. You just need to fill in your code in those
templates.

Models that assign probabilities to a sequence of words are called Language Models or LMs. In
this question, we will introduce you to the simplest model that assigns probabilities to
sentences or sequences of words, the n-gram. An n-gram is a sequence of n words: a 2-gram
(or bigram) is a two-word sequence like "please submit", "submit the" or "the quiz",
and a 3-gram (or trigram) is a three-word sequence like "please submit the" or "submit
the quiz". The final goal of this question is to get the unigram, bigram and trigram statistics of a
given corpus and use this information to predict the most likely next word, and its probability,
given a sequence of words.

You are provided with a data folder containing several text files (referred to as a corpus from
now on) in the q4 folder.

You need to implement the following tasks sequentially.

1. The function readFromFile in readFile.java. It takes the path to a file as an argument and
a. reads its contents into a string
b. replaces every character that is neither alphabetic nor whitespace with a single space
c. converts it into lowercase and returns it. [4 marks]

2. The function returnListsOfFiles in getListOfFiles.java. It takes the path
of a directory containing the corpus as an argument and
a. returns an array of String where each element of this array will be the path of a
file in the directory (order is not important). [4 marks]

3. The method get_ngrams_single_string in getNgramsFromSingleString.java. [6 marks]
It takes two arguments
i. body - a string containing the contents of a file
ii. n - the maximum r for which r-grams are to be considered
and returns
a. all the r-grams present in the body in a HashMap, for r = 1..n
b. the key of the HashMap will be an r-gram in the form of a List of Strings.
c. the value of the HashMap will be the corresponding count in the body.

Eg:
body - I am a good boy and a bad guy
n=3
Some of the elements of the hashmap will be :
I -> 1
a -> 2
good boy -> 1
and a bad -> 1
etc.,
Note :
1. No need to clean the body
2. Split using whitespace (multiple whitespaces are considered
equivalent to single whitespace)
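
The counting described in task 3 can be sketched as follows. The sketch is in Python purely for illustration (the task itself must be written in Java), and the name count_rgrams and the use of tuples as keys are assumptions of this sketch rather than part of the provided templates.

```python
from collections import Counter


def count_rgrams(body, n):
    """Count every r-gram in body for r = 1..n, keyed by a tuple of words."""
    # split() with no argument treats any run of whitespace as one separator.
    words = body.split()
    counts = Counter()
    for r in range(1, n + 1):
        for i in range(len(words) - r + 1):
            counts[tuple(words[i:i + r])] += 1
    return counts


counts = count_rgrams("I am a good boy and a bad guy", 3)
```

With the example body above, counts[("a",)] is 2 and counts[("and", "a", "bad")] is 1, matching the listed elements.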

4. The function get_ngrams_from_directory in the file ngrams_stats_st.java. [8 marks]
It takes two arguments
i. dir - path of the directory containing the corpus
ii. n - the maximum r for which r-grams are to be considered (same as in task 3)
and
a. computes the n-gram statistics (just as described in task 3) of each file present in
the corpus, combines all these HashMaps into a single HashMap, and returns
this final HashMap.

5. The function main in single_threaded_lm.java. [8 marks]
It takes two command-line arguments:
1. the first argument is the path of the directory containing the corpus (remove
leading and trailing whitespaces)
2. the second argument is the path of the file containing queries (1 query on each line)

For each query,
● print the most likely next word, along with its probability (one result on each line;
the word and its probability should be separated by a tab)
● use ngrams_stats_st.get_ngrams_from_directory with n = 3 (refer to
the Appendix)

6. The function get_ngrams_from_directory in the file ngrams_stats_mt.java. [12 marks]
It takes two arguments
i. dir - path of the directory containing the corpus
ii. n - the maximum r for which r-grams are to be considered (same as in task 3)
and computes the n-gram statistics (just as described in task 3) of each file present in
the corpus, combines all these HashMaps into a single HashMap, and returns this final
HashMap.
But, it should do its job in a parallelized way:
● Create a thread-safe queue (let's call it inqueue) (for example,
LinkedBlockingQueue​) for giving input to workers
● then push filepaths into the queue
● each worker will take filepath from the inqueue and get the statistics (using the
function ​getNgramsFromSingleString.get_ngrams_single_string​)
and push it into another thread-safe queue (let's call it outqueue)
● while these workers are putting their results into the outqueue, the main thread
should pop these intermediate dictionaries and combine these intermediate
dictionaries into the main dictionary (which will be empty initially);
● when all workers have done their job, return this main dictionary

NOTE: IMPLEMENT WORKERS' CODE IN WORKER.JAVA
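
The queue-based flow described in task 6 can be sketched as below, again in Python purely for illustration (the task requires Java and LinkedBlockingQueue). Several details are assumptions of this sketch: the inputs are in-memory strings rather than file paths, only unigrams are counted, and a None sentinel is used to stop each worker.

```python
import queue
import threading
from collections import Counter


def worker(inqueue, outqueue):
    """Take texts from inqueue, count their words, push each result to outqueue."""
    while True:
        text = inqueue.get()
        if text is None:  # sentinel (an assumption of this sketch): no more work
            break
        outqueue.put(Counter(text.split()))


def count_in_parallel(texts, num_workers=4):
    inqueue, outqueue = queue.Queue(), queue.Queue()  # thread-safe queues
    for text in texts:
        inqueue.put(text)
    for _ in range(num_workers):
        inqueue.put(None)  # one sentinel per worker
    threads = [threading.Thread(target=worker, args=(inqueue, outqueue))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    merged = Counter()  # the main dictionary, empty initially
    for _ in range(len(texts)):  # exactly one intermediate result per input
        merged.update(outqueue.get())
    for t in threads:
        t.join()
    return merged


result = count_in_parallel(["a b a", "b c"])
```

The main thread merges intermediate results while workers are still running, because outqueue.get() blocks until the next result arrives.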

7. The function run in the file Worker.java. The Worker class extends the Thread class. You
need to implement its run method (as described in ngrams_stats_mt.java). Its data
members are: [7 marks]
a. inqueue: it will be of type LinkedBlockingQueue<String>
b. outqueue: it will be of type LinkedBlockingQueue<HashMap<List<String>, Integer>>

8. The function main in the file multi_threaded_lm.java. [8 marks]
It takes two command-line arguments:
a. the first argument is the path of the directory containing the corpus (remove
leading and trailing whitespaces)
b. the second argument is the path of the file containing queries (1 query on each
line)

For each query,
● print the most likely next word, along with its probability (one result on each line;
the word and its probability should be separated by a tab)
● use ngrams_stats_mt.get_ngrams_from_directory with num_workers = 4,
n = 3 (refer to the Appendix)

Testing

To test if your submission is correct, run:

javac multi_threaded_lm.java && java multi_threaded_lm data queries.txt > myoutput.txt
<or>
javac single_threaded_lm.java && java single_threaded_lm data queries.txt > myoutput.txt
diff -bBw myoutput.txt output.txt

Appendix

We will use the frequency statistics of the unigrams, bigrams and trigrams in the given corpus to
estimate the next word in a sequence. For example, given the sequence "I am going to ", we need to predict
the next word, which could be school, market, theatre, etc. For this, we will use an
advanced technique called Backoff and Interpolation.

Given a sequence of words $w_1 w_2 w_3 \ldots w_{n-2} w_{n-1}$, we have to predict the next word $w_n$ in the
above sequence. So we use the following equation to find the next word.

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are parameters that need to be tuned based on the corpus provided.

For this question assume that

$\lambda_1 = 0.0196$
$\lambda_2 = 0.588$
$\lambda_3 = 0.3924$
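
For reference, linear interpolation of unigram, bigram and trigram estimates generally takes the form below. Which $\lambda$ multiplies which term is an assumption here, and the quiz's own equation may pair them differently. What can be checked is that the three given values sum to 1, as interpolation weights should.

```latex
\hat{P}(w_n \mid w_{n-1} w_{n-2})
  = \lambda_1 \, P(w_n)
  + \lambda_2 \, P(w_n \mid w_{n-1})
  + \lambda_3 \, P(w_n \mid w_{n-1} w_{n-2}),
\qquad
\lambda_1 + \lambda_2 + \lambda_3 = 0.0196 + 0.588 + 0.3924 = 1
```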

You can use the below formulae during your calculation.

1. $\text{total-unigram-count} = \sum_{i} \text{unigram-freq}(w_i)$

2. $\text{total-bigram-count} = \sum_{i,j} \text{bigram-freq}(w_i w_j)$

3. $\text{total-trigram-count} = \sum_{i,j,k} \text{trigram-freq}(w_i w_j w_k)$

4. $P(w_n) = \dfrac{\text{unigram-freq}(w_n)}{\text{total-unigram-count}}$

5. $P(w_n \mid w_{n-1}) = \dfrac{\text{bigram-freq}(w_{n-1} w_n)}{\text{unigram-freq}(w_{n-1})} \times \dfrac{\text{total-unigram-count}}{\text{total-bigram-count}}$

6. $P(w_n \mid w_{n-1} w_{n-2}) = \dfrac{\text{trigram-freq}(w_{n-2} w_{n-1} w_n)}{\text{bigram-freq}(w_{n-2} w_{n-1})} \times \dfrac{\text{total-bigram-count}}{\text{total-trigram-count}}$

Note that if anything in the denominator of the above formulae is 0 then set the corresponding
probability to 0.
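
Formulae 4 and 5, together with the zero-denominator rule, can be sketched as follows. The sketch is in Python for illustration only; the toy counts and the helper names total, ratio, p_unigram and p_bigram are hypothetical and do not come from the provided templates.

```python
from collections import Counter

# Toy counts keyed by tuples of words (hypothetical data, not from the corpus).
counts = Counter({
    ("a",): 3, ("b",): 2, ("c",): 1,
    ("a", "b"): 2, ("b", "a"): 2, ("a", "c"): 1,
})


def total(counts, r):
    """Sum the frequencies of all r-grams (formulae 1 to 3)."""
    return sum(c for gram, c in counts.items() if len(gram) == r)


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is 0, as the note requires."""
    return numerator / denominator if denominator != 0 else 0.0


def p_unigram(counts, w):
    """Formula 4."""
    return ratio(counts[(w,)], total(counts, 1))


def p_bigram(counts, prev, w):
    """Formula 5, written as a single fraction."""
    return ratio(counts[(prev, w)] * total(counts, 1),
                 counts[(prev,)] * total(counts, 2))
```

With these toy counts, p_unigram(counts, "a") is 3/6 and p_bigram(counts, "a", "b") is (2 x 6) / (3 x 5). A previous word that never occurs gives a zero denominator and hence probability 0.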
Now your final task would be to find the word which maximizes the above probability and print
the word and the corresponding probability (tab-separated).
Output: $\arg\max_{w_n} P(w_n \mid w_{n-1} w_{n-2})$ <tab> $\max_{w_n} P(w_n \mid w_{n-1} w_{n-2})$

(Appendix is helpful while implementing tasks 5 and 8.)

____________________________________________________________________________

General Instructions
● Make sure you know what you write, you might be asked to explain your code at
a later point in time.
● The submission will be graded automatically, so stick to the naming conventions
strictly.
● The deadline for the quiz is ​10th November, Sunday at 1:15 PM​.

Submission Instructions
The quiz4 directory given to you looks exactly like this, except that it is not named with your roll
number. Rename it as shown below and maintain this exact directory structure. Do not
move files around, and do not add or delete any of them. We will pull this directory as required.

quiz4-<roll_number>/
├── q1
│   └── answers.txt
├── q2
│   └── node.py
├── q3
│   ├── autograder.py
│   ├── resizeImage.py
│   ├── barbara-input.png
│   ├── barbara-output.png
│   └── truth.npy
└── q4
    ├── data/
    │   └── ... txt
    ├── getListOfFiles.java
    ├── readFile.java
    ├── Worker.java
    ├── ngrams_stats.java
    ├── ngrams_stats_st.java
    ├── ngrams_stats_mt.java
    ├── single_threaded_lm.java
    ├── multi_threaded_lm.java
    ├── getNgramsFromSingleString.java
    ├── output.txt
    └── queries.txt
