I recently learned the Mo algorithm for quadratic decomposition of queries in order to speed up the resolution of certain problems.
To practice the implementation, I tried to solve D. A powerful array (a problem of the past Codeforces contest) using this idea. The problem is this:
You have an array with
whole numbers
.
Consider an arbitrary subframe
. Define
as the number of occurrences of an integer
in this subarray. The power of the subarray is defined as the sum
for all integers
(note that there is only a positive number of terms for which it is not zero).
Answer
inquiries. Each query contains two integers
and
, calculate the power <img src = "/img/159f7cc3f7225d16796b6ae13aa83a62.png" alt = "subarray"> .
It contains:



Using the Mo algorithm, I wrote code that solves this problem offline in
. I am sure that this problem can be solved using this algorithm and time complexity, since I checked the accepted code of others, and they also use a similar algorithm.
My code, however, exceeded the time limit for exceeding the verdict .
Below is the code I wrote:
#include <ios> #include <iostream> #include <cmath> #include <algorithm> #include <vector> #include <utility> #include <map> int sqt; long long int ans = 0; long long int arr[200005] = {}; long long int cnt[1000005] = {}; long long int tans[200005] = {}; struct el { int l, r, in; }; bool cmp(const el &x, const el &y) { if (xl/sqt != yl/sqt) return xl/sqt < yl/sqt; return xr < yr; } el qr[200005]; int main() { std::ios_base::sync_with_stdio(false); std::cin.tie(NULL); std::cout.tie(NULL); int n, q, a, b; std::cin >> n >> q; sqt = sqrt((double)(n))+27; for (int i = 0; i < n; i++) std::cin >> arr[i]; for (int i = 0; i < q; i++) { std::cin >> a >> b; a--; b--; qr[i].l = a; qr[i].r = b; qr[i].in = i; } std::sort(qr, qr+q, cmp); int li = 0; //left iterator int ri = 0; //right iterator ans = arr[0]; cnt[arr[0]]++; for (int i = 0; i < q; i++) { while (li < qr[i].l) { ans -= cnt[arr[li]]*cnt[arr[li]]*arr[li]; cnt[arr[li]]--; ans += cnt[arr[li]]*cnt[arr[li]]*arr[li]; li++; } while (li > qr[i].l) { li--; ans -= cnt[arr[li]]*cnt[arr[li]]*arr[li]; cnt[arr[li]]++; ans += cnt[arr[li]]*cnt[arr[li]]*arr[li]; } while (ri < qr[i].r) { ri++; ans -= cnt[arr[ri]]*cnt[arr[ri]]*arr[ri]; cnt[arr[ri]]++; ans += cnt[arr[ri]]*cnt[arr[ri]]*arr[ri]; } while (ri > qr[i].r) { ans -= cnt[arr[ri]]*cnt[arr[ri]]*arr[ri]; cnt[arr[ri]]--; ans += cnt[arr[ri]]*cnt[arr[ri]]*arr[ri]; ri--; } tans[qr[i].in] = ans; } for (int i = 0; i < q; i++) std::cout << tans[i] << '\n'; }
Can you suggest any non-asymptotic (or perhaps even asymptotic) improvement that can speed up the program to pass the time limit?
I have already tried the following things to no avail:
- Using a vector instead of an array.
- Using two nested pairs instead of structure.
- Use only one pair, and then use the card to try to restore the correct order of answers.
- Adding multiple constants in
sqt (e.g.
in the code above). - Overload <comparison operator inside the
el structure itself.
I feel that I am missing something important, since the other codes that I checked seem to have a time limit by a small margin (about half a second). However, they seem to be using the same algorithm as my code.
Any help would be greatly appreciated!