
CS 5150/6150 - Fall 2022

Nick de Jonge Problem Set 4 October 29, 2022

Question 1
(a) If there are b bees in the swarm at the start of round i, what is the expected number of bees
remaining at the start of round i + 1?
First, let the random variable Y be the number of flowers that had exactly one bee land on them. Second, let
\[
Y = x_1 + x_2 + \cdots + x_n, \quad \text{where } x_f =
\begin{cases}
1 & \text{if flower } f \text{ had one bee land on it} \\
0 & \text{otherwise}
\end{cases}
\]
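Now analyzing the expected value of Y:
\[
\begin{aligned}
E[Y] &= \sum_{f=1}^{n} E[x_f] \\
&= n \cdot E[x_f] \\
&= n \cdot \left(1 \cdot \Pr[\text{1 bee landed on flower } f] + 0 \cdot \Pr[\text{not 1 bee landed on flower } f]^{\star}\right) \\
&= n \cdot \Pr[\text{1 bee landed on flower } f] \\
&= n \cdot \left(\frac{1}{n} \cdot \frac{1}{n} \cdot \cdots \cdot \frac{1}{n}\right) \\
&= n \cdot \left(\frac{1}{n}\right)^{b} \\
&= \left(\frac{1}{n}\right)^{b-1}
\end{aligned}
\]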

⋆: either 0 bees or > 1 bees land on flower f.


After round i, an expected $\left(\frac{1}{n}\right)^{b-1}$ flowers have one bee land on them. Thus, at the start of round i + 1,
there are an expected total of $b - \left(\frac{1}{n}\right)^{b-1}$ bees looking for flowers.

(b)


Question 2
(a) Assume n people and m masks. There are k = n(n − 1) possible pairs (i, j) of
people i and j, indexed by t = 1, . . . , k. Y is a random variable that represents the number of
pairs (i, j) in which both people have mask 1.
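\[
X_t =
\begin{cases}
1 & \text{if both people in pair } t = (i, j) \text{ have mask 1} \\
0 & \text{otherwise}
\end{cases}
\]
\[
\begin{aligned}
E[Y] &= E[X_1 + X_2 + \cdots + X_k] \\
&= \sum_{t=1}^{k} E[X_t] \\
&= k \cdot \left(1 \cdot \Pr[\text{pair has mask 1}] + 0 \cdot \Pr[\text{pair has other mask}]\right) \\
&= k \cdot \Pr[\text{pair has mask 1}] \\
&= n(n-1) \cdot \left(\frac{1}{m}\right)^{2}
\end{aligned}
\]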

When n = m + 1, the expected number of matching pairs is (m + 1)m / m^2 = (m + 1)/m, which is approximately 1 for large m.
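As a quick arithmetic check of the expression derived above, the following sketch evaluates n(n − 1) · (1/m)^2 exactly with rational numbers. The function name `expected_mask1_pairs` is hypothetical and introduced only for illustration; the code evaluates the formula and does not simulate the mask assignment.

```python
from fractions import Fraction


def expected_mask1_pairs(n: int, m: int) -> Fraction:
    """Evaluate n(n - 1) * (1/m)^2 exactly, as derived in part (a)."""
    return Fraction(n * (n - 1), m * m)


# With n = m + 1 the expression simplifies to (m + 1) / m.
for m in (1, 10, 100):
    print(m, expected_mask1_pairs(m + 1, m))
```

For n = m + 1 the value is (m + 1)/m, which approaches 1 as m grows.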


(b) Y represents the number of houses that receive kit 1. The total number of houses is n = 200.
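\[
X_i =
\begin{cases}
1 & \text{if house } i \text{ receives kit 1} \\
0 & \text{otherwise}
\end{cases}
\]
\[
\begin{aligned}
E[Y] &= E[X_1 + X_2 + \cdots + X_n] \\
&= \sum_{i=1}^{n} E[X_i] \\
&= n \cdot \left(1 \cdot \Pr[\text{house has kit 1}] + 0 \cdot \Pr[\text{house has other kit}]\right) \\
&= n \cdot \Pr[\text{house has kit 1}] \\
&= n \cdot \frac{1}{1{,}000{,}000} \\
&= \frac{n}{1{,}000{,}000} \\
&= \frac{200}{1{,}000{,}000}
\end{aligned}
\]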
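Using Markov's inequality, we can upper-bound the probability that two houses got the same kit
and show that it is very low. Choosing t so that t · E[Y] = 2:
\[
2 = t \cdot \frac{200}{1{,}000{,}000} \longrightarrow t = 10{,}000
\]
\[
\begin{aligned}
\Pr\left[Y \ge t \cdot E[Y]\right] &\le \frac{1}{t} \\
\Pr\left[Y \ge 10{,}000 \cdot \frac{200}{1{,}000{,}000}\right] &\le \frac{1}{10{,}000} \\
\Pr\left[Y \ge 2\right] &\le 0.0001 < 0.05
\end{aligned}
\]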

As we can see, the chance of two houses having the same kit when there are a million possible
templates is at most 1/10,000.
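The arithmetic in the Markov bound above can be reproduced with a few lines of exact rational arithmetic. This is a minimal sketch; the helper name `markov_bound` is hypothetical, and it uses the equivalent form of Markov's inequality, Pr[Y ≥ a] ≤ E[Y]/a for a nonnegative random variable Y and a > 0.

```python
from fractions import Fraction


def markov_bound(expectation: Fraction, threshold: int) -> Fraction:
    """Markov's inequality: Pr[Y >= a] <= E[Y] / a for nonnegative Y and a > 0."""
    # A probability never exceeds 1, so cap the bound there.
    return min(Fraction(1), expectation / threshold)


n_houses = 200
n_kits = 1_000_000
expected_y = Fraction(n_houses, n_kits)  # E[Y] from the derivation above
bound = markov_bound(expected_y, 2)      # bound on Pr[Y >= 2]
print(bound, float(bound))
```

Setting a = t · E[Y] recovers the form Pr[Y ≥ t · E[Y]] ≤ 1/t used in the derivation.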


Question 3
(a) The algorithm for the multiple selection problem consists of one recursive function that relies
upon randomization to find the solution. The algorithm is similar to quicksort, which uses a
randomly chosen pivot value. First, we call the function multi-select() with the arrays A and
K as parameters. The algorithm first checks the size of the K array; if it is 0, it returns.
It then checks the sizes of arrays A and K together. If A has length 1 and K has length 1, it
returns the elements held in those single-value arrays. After these checks, we select a pivot
value y from A uniformly at random. Then we initialize four arrays:
A1, A2, K1, K2. A1 and A2 will hold the values from A that are at most the pivot value and
above the pivot value, respectively. K1 and K2 will hold the ki targets for the A1 and A2 arrays.
The next step is to divide the values in A into the A1 and A2 arrays according to whether they
are lower or higher than the pivot value. While we do this, we calculate the rank of y within
the array A, so that we can use the rank value as a pivot for the K1 and K2 arrays. After this
step, we do the same for the K array and use the rank of y as a pivot to split the lower and
higher k values into K1 and K2 respectively. The k values placed in K2 are reduced by the rank
of y so that they are ranks within A2. Once we have the split arrays, we check the lengths
of K1 and K2. If the checked array has length greater than 0, we call multi-select() with its
corresponding A and K arrays. The values returned from multi-select() are pushed into
a solution array in the form of value pairs (K[], A[]), with the rank of y added back to the
ranks returned for A2, so that we can keep track of the correct ranks and values found by the
recursive function calls.


(b) Pseudocode:

Algorithm 1: multi-select() function


Input: An array A[1 . . . n] of n unsorted, unique numbers and a sequence held in an array K
of m sorted integers k1 , k2 , . . . , km where 1 ≤ k1 < k2 < · · · < km ≤ n.
Output: Returns array of ki and ki -th value of A for all i = 1, 2, . . . , m as pairs.
multi-select(A, K) {
    if K.length == 0 then
        return []
    if A.length == 1 AND K.length == 1 then
        return [(K[1], A[1])]
    y = ⋆ random.pick(A)
    A1, A2, K1, K2 = []
    ranky = 0    // counts the elements of A that are ≤ y, i.e. the rank of y in A
    for (i = 1; i ≤ A.length; i++) do
        if A[i] ≤ y then
            A1.push(A[i])
            ranky += 1
        else
            A2.push(A[i])
    for (j = 1; j ≤ K.length; j++) do
        if K[j] ≤ ranky then
            K1.push(K[j])
        else
            K2.push(K[j] − ranky)    // rank relative to A2
    solution = []
    if (K1.length > 0) then
        for each (k, v) in multi-select(A1, K1) do
            solution.push((k, v))
    if (K2.length > 0) then
        for each (k, v) in multi-select(A2, K2) do
            solution.push((k + ranky, v))    // restore the rank relative to A
    return solution
} End Function
⋆ : random.pick() chooses an element uniformly at random from the passed array.
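The pseudocode can be turned into a short runnable sketch. The Python version below is one possible rendering, not a definitive implementation: the name `multi_select` and the list-of-pairs return type are choices made here for illustration. It assumes distinct values and 1-based target ranks, takes the rank of the pivot to be the number of elements that are at most the pivot, and shifts the ranks passed to the upper part by that amount (adding it back on return) so that every reported rank refers to the original array.

```python
import random


def multi_select(a, ks):
    """Return [(k, k-th smallest of a)] for each 1-based rank k in sorted ks.

    Assumes the values in a are distinct and 1 <= k <= len(a) for every k.
    """
    if not ks:
        return []
    if len(a) == 1:
        return [(ks[0], a[0])]  # the only valid rank in a one-element array is 1
    pivot = random.choice(a)  # chosen uniformly at random
    low = [x for x in a if x <= pivot]  # includes the pivot itself
    high = [x for x in a if x > pivot]
    rank = len(low)  # rank of the pivot within a
    k_low = [k for k in ks if k <= rank]
    k_high = [k - rank for k in ks if k > rank]  # ranks relative to high
    result = multi_select(low, k_low)
    # Shift ranks found in the upper part back to ranks in a.
    result += [(k + rank, v) for k, v in multi_select(high, k_high)]
    return result


data = [31, 4, 15, 92, 65, 35, 89, 79]
print(multi_select(data, [2, 5, 8]))  # prints [(2, 15), (5, 65), (8, 92)]
```

The result does not depend on which pivots are drawn; only the amount of work does.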

(c) We can be assured of the correctness of this algorithm because in each recursive call
to multi-select(), we check whether only one value remains in the passed array A.
This indicates that we have isolated the value in A that corresponds to the remaining K value.
The split preserves this correspondence: exactly rank(y) elements of A are at most y, so every
target rank in K1 refers to an element of A1 and every target rank in K2 refers to an element of A2.
The k-th ranks are preserved through each recursive function call because they are returned
to the calling function. All returned values are saved and returned from the
initial function call.

(d) We analyze the running time of this algorithm starting with the length checks of arrays
A and K, which are done in constant time. The initialization of the A1, A2, K1, K2 arrays,
the pivot y, and the rank of y is also done in constant time. The first for loop takes O(n)
time because we loop through the entire array A. The second for loop is completed in O(m)
time, and since m ≤ n, we do not need to be concerned with this for loop for the final
running time. Each instance of multi-select() makes at most two recursive calls to
multi-select(), on disjoint parts of A, so the calls at any one level of the recursion scan
at most n elements in total. Because the pivot is chosen uniformly at random, worst-case
pivots (the smallest or largest item in the array) are unlikely to be chosen repeatedly, and
parts of A that contain no target rank are discarded without further recursion. In
expectation this gives the equivalent of O(log m) levels of O(n) work. Therefore, the
expected running time of this algorithm is O(n log m), which is what we wanted to show.
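An informal way to look at this bound is to count how many array elements the partition loops touch and compare the count with n · log2(m). The sketch below does this for one arbitrary choice of n and m. The function name `scanned_elements`, the parameter values, and the seed are all assumptions made for illustration, and the experiment is an observation rather than a proof.

```python
import math
import random


def scanned_elements(a, ks):
    """Count the elements touched by the partition loops of multi-select."""
    if not ks or len(a) == 1:
        return 0  # base cases do no partitioning
    pivot = random.choice(a)
    low = [x for x in a if x <= pivot]
    high = [x for x in a if x > pivot]
    rank = len(low)
    k_low = [k for k in ks if k <= rank]
    k_high = [k - rank for k in ks if k > rank]
    # This call scans all of a, then recurses on the two parts.
    return len(a) + scanned_elements(low, k_low) + scanned_elements(high, k_high)


random.seed(1)  # fixed seed so repeated runs give the same numbers
n, m = 4096, 16
a = random.sample(range(10 * n), n)             # n distinct values
ks = sorted(random.sample(range(1, n + 1), m))  # m distinct target ranks
trials = [scanned_elements(a, ks) for _ in range(20)]
print(sum(trials) / len(trials), n * math.log2(m))
```

The two printed numbers are the average scan count over the trials and the reference value n · log2(m). Comparing them for several values of n and m gives a rough feel for the growth rate.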
