I Introduction
A sparse table is a data structure that supports range queries for operations in which
counting a value more than once does not affect the result, such as max, min, and gcd.
An example of such a problem is this:
You are given an array of integers 𝐴[𝑁] and 𝑄 queries. For each query, you must find
the maximum value within the segment 𝐴[𝑙, 𝑟] wherein 𝑙 ≤ 𝑟.
A common problem such as this has many solutions, such as the naïve method, which
has a runtime of 𝑂(𝑁) per query, and a segment tree, which has a runtime of 𝑂(log 𝑁)
per query. If there are updates, a segment tree would very likely be a good choice, but
since there are none, each query can actually be solved in 𝑂(1) using a sparse table.
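The 𝑂(1) claim above can be sketched as follows. This is a minimal illustration of a classic sparse table for range maximum, not code from the original notes: the struct name SparseTableMax and its layout are assumptions, and __builtin_clz is a GCC/Clang builtin.

```cpp
#include <algorithm>
#include <vector>

// Illustrative sparse table for range-maximum queries.
// table[j][i] holds max(A[i .. i + 2^j - 1]).
struct SparseTableMax {
    std::vector<std::vector<int>> table;

    explicit SparseTableMax(const std::vector<int>& a) {
        int n = static_cast<int>(a.size());
        int levels = 1;
        while ((1 << levels) <= n) ++levels;  // enough levels to cover length n
        table.assign(levels, std::vector<int>(n));
        table[0] = a;
        for (int j = 1; j < levels; ++j)
            for (int i = 0; i + (1 << j) <= n; ++i)
                table[j][i] = std::max(table[j - 1][i],
                                       table[j - 1][i + (1 << (j - 1))]);
    }

    // Maximum of A[l..r], 0-based inclusive, requires l <= r.
    int query(int l, int r) const {
        int j = 31 - __builtin_clz(static_cast<unsigned>(r - l + 1));  // floor(log2(length))
        // The two blocks of length 2^j may overlap, which is harmless for max.
        return std::max(table[j][l], table[j][r - (1 << j) + 1]);
    }
};
```

Preprocessing takes 𝑂(𝑁 log 𝑁) time and memory, and each query reads exactly two table entries.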
A sparse table uses dynamic programming to answer queries very efficiently. It also
relies on the fact that the answer to the query max(𝐴[𝑙, 𝑎], 𝐴[𝑏, 𝑟]) wherein
𝑙 ≤ 𝑏 ≤ 𝑎 ≤ 𝑟 is the same as max(𝐴[𝑙, 𝑘], 𝐴[𝑘 + 1, 𝑟]) for any 𝑘 with 𝑙 ≤ 𝑘 < 𝑟. This means
that repeating the segment [𝑏, 𝑎] in the query does not change the result. But what
if repeating a segment does affect the result? Take for example the following
problem:
You are given an array of positive integers 𝐴[𝑁] and 𝑄 queries. For each query, find the
sum of the subarray 𝐴[𝑙, 𝑟] wherein 𝑙 ≤ 𝑟.
This time, instead of the maximum value in the segment, the sum is queried.
Unfortunately, repeating a non-empty segment now changes the answer. For example,
let the segment to be queried be 𝐴[𝑙, 𝑟]. If one answers it the way one would answer
the previous problem, by combining two overlapping segments 𝐴[𝑙, 𝑎] and 𝐴[𝑏, 𝑟] with
𝑙 ≤ 𝑏 ≤ 𝑎 ≤ 𝑟, one would get
𝑠𝑢𝑚(𝐴[𝑙, 𝑎]) + 𝑠𝑢𝑚(𝐴[𝑏, 𝑟]) = 𝑠𝑢𝑚(𝐴[𝑙, 𝑟]) + 𝑠𝑢𝑚(𝐴[𝑏, 𝑎])
which counts the overlap 𝐴[𝑏, 𝑎] twice.
What if we want to answer queries in 𝑂(1) but the operation involved does not have
an inverse, for example, finding the result of a range matrix multiplication or of
multiplication modulo 𝑀 wherein 𝑀 can be any positive integer? Since these
operations do not have inverses, prefixes and suffixes cannot be used. How can these
be solved then? Simple: by using Disjoint Sparse Tables!
The idea is as follows. Let the boundaries of the query be [𝑙, 𝑟] and let 𝑚 be an integer
such that 𝑙 ≤ 𝑚 < 𝑟. For every index 𝑖 > 𝑚, we precompute
𝑑𝑠𝑡[𝑖] = 𝐴[𝑚 + 1] ⊗ 𝐴[𝑚 + 2] ⊗ ⋯ ⊗ 𝐴[𝑖]
and for every index 𝑖 ≤ 𝑚,
𝑑𝑠𝑡[𝑖] = 𝐴[𝑖] ⊗ 𝐴[𝑖 + 1] ⊗ ⋯ ⊗ 𝐴[𝑚]
wherein ⊗ denotes the operation to be used, which must be associative. Since 𝑙 ≤ 𝑚 < 𝑟,
the answer for the segment [𝑙, 𝑟] is just 𝑑𝑠𝑡[𝑙] ⊗ 𝑑𝑠𝑡[𝑟]. Since the values of 𝑑𝑠𝑡[ ] have
already been precomputed, answering a query takes only 𝑂(1).
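As a rough illustration of this idea for one fixed midpoint, the sketch below uses multiplication modulo 𝑀 as the operation. The helper names buildAroundMidpoint and queryAcross are hypothetical, and the code assumes non-negative values, 0 ≤ 𝑚 < 𝑛 − 1, and a modulus small enough that products fit in 64 bits.

```cpp
#include <vector>

// Hypothetical helper: fills dst around one fixed midpoint m so that any
// query [l, r] with l <= m < r equals dst[l] * dst[r] modulo mod.
std::vector<long long> buildAroundMidpoint(const std::vector<long long>& a,
                                           int m, long long mod) {
    int n = static_cast<int>(a.size());
    std::vector<long long> dst(n);
    // Left side: dst[i] = a[i] * a[i+1] * ... * a[m] (mod).
    dst[m] = a[m] % mod;
    for (int i = m - 1; i >= 0; --i)
        dst[i] = (a[i] % mod) * dst[i + 1] % mod;
    // Right side: dst[i] = a[m+1] * a[m+2] * ... * a[i] (mod).
    dst[m + 1] = a[m + 1] % mod;
    for (int i = m + 2; i < n; ++i)
        dst[i] = dst[i - 1] * (a[i] % mod) % mod;
    return dst;
}

// Answers a query that crosses the midpoint (l <= m < r) with one multiplication.
long long queryAcross(const std::vector<long long>& dst, int l, int r,
                      long long mod) {
    return dst[l] * dst[r] % mod;
}
```

No division or modular inverse is needed, which is why this works even when the modulus is composite. A single midpoint only covers queries that cross it; the recursive construction handles the rest.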
A Disjoint Sparse Table is a recursive data structure and is built quite similarly to a
segment tree. Initially, the segment to be precomputed is [0, 𝑁 − 1] with 𝑙 = 0, 𝑟 = 𝑁 − 1,
and 𝑚 = ⌊(𝑙 + 𝑟)⁄2⌋. Then, the segments [𝑙, 𝑚] and [𝑚 + 1, 𝑟] are recursively built until 𝑙 = 𝑟.
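const int MAXN = 1 << 17;  // example capacity (assumed value); a power of 2
const int K = 17;          // number of levels, log2(MAXN)
int DST[K][MAXN], arr[MAXN];

// The associative operation ⊗, to be supplied by the user.
int combine(int a, int b);

// Precomputes level `depth` for the segment [l, r], then recurses on both halves.
void build(int depth, int l, int r) {
    if (l == r) return;
    int m = (l + r) / 2;
    // Left half: DST[depth][i] = arr[i] ⊗ ... ⊗ arr[m]
    DST[depth][m] = arr[m];
    for (int i = m - 1; i >= l; i--)
        DST[depth][i] = combine(arr[i], DST[depth][i + 1]);
    // Right half: DST[depth][i] = arr[m+1] ⊗ ... ⊗ arr[i]
    // (m + 1 <= r always holds when l < r; the check is kept as a guard)
    if (m + 1 <= r) {
        DST[depth][m + 1] = arr[m + 1];
        for (int i = m + 2; i <= r; i++)
            DST[depth][i] = combine(DST[depth][i - 1], arr[i]);
    }
    build(depth + 1, l, m);
    build(depth + 1, m + 1, r);
}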
In order to query in 𝑂(1), one must be able to efficiently find the depth at which there
exists a midpoint 𝑚 such that 𝑙 ≤ 𝑚 < 𝑟. To make this easy, the length of 𝑑𝑠𝑡
must be the smallest power of 2 that is at least 𝑁. The extra space can be filled with
identity elements or ignored altogether.
𝐿𝑒𝑛𝑔𝑡ℎ = 2^⌈log₂ 𝑁⌉
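The padded length can be computed with a simple loop. The function name paddedLength is an assumption made for this sketch, which expects 1 ≤ 𝑛 ≤ 2^30 so that the shift does not overflow a 32-bit int.

```cpp
// Smallest power of two that is >= n (assumes 1 <= n <= 2^30).
int paddedLength(int n) {
    int length = 1;
    while (length < n) length <<= 1;  // double until the array fits
    return length;
}
```

For instance, an array of 20 elements is padded to 32, while an array of exactly 32 elements needs no padding.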
For a table of length 32, the intervals and midpoints at each depth are:

Depth  Intervals
0      [0,31]
1      [0,15] [16,31]
2      [0,7] [8,15] [16,23] [24,31]
3      [0,3] [4,7] [8,11] [12,15] [16,19] [20,23] [24,27] [28,31]

Depth  Midpoints
0      15
1      7, 23
2      3, 11, 19, 27
3      1, 5, 9, 13, 17, 21, 25, 29
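When converted to binary (five bits for a length of 32), the midpoints show an
interesting pattern:

Depth  Midpoints (binary)
0      01111
1      00111, 10111
2      00011, 01011, 10011, 11011
3      00001, 00101, 01001, 01101, 10001, 10101, 11001, 11101

As we can see, at a given depth every midpoint 𝑚 has a 0 at one specific bit with all
lower bits set, so 𝑚 + 1, the first index of the right half, always has that bit active.
This bit depends only on the depth and, quite interestingly, is easily calculable!
𝑏𝑖𝑡 = ⌈log₂ 𝑁⌉ − 𝑑𝑒𝑝𝑡ℎ − 1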
The problem that remains is how to find this bit efficiently for a given query. Before
that, we must first explain why this bit is important. If both 𝑙 and 𝑟 are represented in
binary, then this bit is the most significant bit of 𝑙 ⊕ 𝑟, where ⊕ is bitwise XOR (proof
omitted): it is the highest position at which 𝑙 and 𝑟 differ, so 𝑙 lies in the left half and 𝑟
in the right half of the same interval at that depth. Since it is possible to get the most
significant bit in 𝑂(1), we are done!
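The depth lookup described above can be sketched as a small helper. The name depthFor and the parameter logLen (the base-2 logarithm of the padded length) are assumptions for this illustration, and __builtin_clz is a GCC/Clang builtin whose result is undefined for 0, hence the requirement 𝑙 < 𝑟.

```cpp
// Depth of the interval whose midpoint separates l and r in a table of
// length 2^logLen. Requires 0 <= l < r < 2^logLen.
int depthFor(int l, int r, int logLen) {
    // Position of the highest bit in which l and r differ.
    int msb = 31 - __builtin_clz(static_cast<unsigned>(l ^ r));
    // Rearranged from bit = logLen - depth - 1.
    return logLen - 1 - msb;
}
```

With a length of 32 (logLen = 5), the query [5, 6] gives 5 ⊕ 6 = 3, whose most significant bit is bit 1, so the depth is 3. This matches the interval [4, 7] with midpoint 5.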
Since the segments are halved at every level, there are 𝑂(log 𝑁) levels, and since each
level takes 𝑂(𝑁), the total memory and preprocessing time is 𝑂(𝑁 log 𝑁).
Please note that since 𝑙 ≤ 𝑚 < 𝑟, the data structure cannot answer queries wherein
𝑙 = 𝑟. Fortunately, such queries are trivial. Please refer to the sample code for
implementation details.
#include <vector>
using namespace std;

// Disjoint sparse table for range sums (GCC/Clang: uses __builtin_clz).
class disjointSparseTable {
private:
    int size, levels;          // padded length, and __builtin_clz(size)
    vector<int> arr;
    vector<vector<int> > dst;  // dst[i][k]: value for index i at depth k

public:
    // Fills depth k for the segment [s, e], then recurses on both halves.
    void build(int k, int s, int e) {
        if (s == e) return;
        int m = (s + e) >> 1;
        dst[m][k] = arr[m];
        for (int i = m - 1; i >= s; i--) dst[i][k] = arr[i] + dst[i + 1][k];
        if (m + 1 <= e) {
            dst[m + 1][k] = arr[m + 1];
            for (int i = m + 2; i <= e; i++) dst[i][k] = dst[i - 1][k] + arr[i];
        }
        build(k + 1, s, m);
        build(k + 1, m + 1, e);
    }

    // Expects a non-empty input, since __builtin_clz(0) is undefined.
    disjointSparseTable(const vector<int>& in) {
        int n = (int)in.size();
        levels = __builtin_clz(n);
        size = 1 << (31 - levels);  // largest power of 2 not exceeding n
        if (n != size) {            // round up to the next power of 2
            levels--;
            size <<= 1;
        }
        arr = in;
        arr.resize(size, 0);        // pad with 0, the identity for addition
        int depths = 31 - levels;   // log2(size), the number of depths used
        dst.assign(size, vector<int>(depths > 0 ? depths : 1));
        build(0, 0, size - 1);
    }

    // Sum of arr[l..r]; indices are 0-based and inclusive, with l <= r.
    int query(int l, int r) {
        if (l == r) return arr[l];
        int tmp = __builtin_clz(l ^ r);
        int lvl = tmp - levels - 1;  // depth whose midpoint separates l and r
        return dst[l][lvl] + dst[r][lvl];
    }
};
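The sample code uses addition, which has an inverse. As a hedged sketch of the motivating case, the variant below uses multiplication modulo 𝑀 instead. The struct name ProductDST is an assumption, the table is built iteratively rather than recursively, the layout is dst[depth][i], and the values are assumed to be non-negative with a modulus small enough that products fit in 64 bits.

```cpp
#include <cstddef>
#include <vector>

// Sketch: disjoint sparse table for range products modulo mod.
struct ProductDST {
    int logLen = 0, length = 1;  // length = 2^logLen >= input size
    long long mod;
    std::vector<long long> arr;
    std::vector<std::vector<long long>> dst;  // dst[depth][i]

    ProductDST(const std::vector<long long>& in, long long m) : mod(m) {
        while (length < static_cast<int>(in.size())) { length <<= 1; ++logLen; }
        arr.assign(length, 1 % mod);  // pad with the identity element
        for (std::size_t i = 0; i < in.size(); ++i) arr[i] = in[i] % mod;
        dst.assign(logLen, std::vector<long long>(length));
        for (int depth = 0; depth < logLen; ++depth) {
            int half = length >> (depth + 1);
            // mid is the first index of each right half (midpoint + 1).
            for (int mid = half; mid < length; mid += 2 * half) {
                // Left half: products of arr[i..mid-1].
                dst[depth][mid - 1] = arr[mid - 1];
                for (int i = mid - 2; i >= mid - half; --i)
                    dst[depth][i] = arr[i] * dst[depth][i + 1] % mod;
                // Right half: products of arr[mid..i].
                dst[depth][mid] = arr[mid];
                for (int i = mid + 1; i < mid + half; ++i)
                    dst[depth][i] = dst[depth][i - 1] * arr[i] % mod;
            }
        }
    }

    // Product of arr[l..r] modulo mod, 0-based inclusive, requires l <= r.
    long long query(int l, int r) const {
        if (l == r) return arr[l];
        int msb = 31 - __builtin_clz(static_cast<unsigned>(l ^ r));
        return dst[logLen - 1 - msb][l] * dst[logLen - 1 - msb][r] % mod;
    }
};
```

With the input {2, 3, 4, 5, 7} and modulus 12, the query [2, 4] returns 8, since 4 · 5 · 7 = 140 and 140 mod 12 = 8. Prefix products could not answer this, because the prefix ending at index 1 is already 0 modulo 12.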