Data structure problem

Given a sequence of integers, there are a number of queries. Each query has a range of [l, r], and you should find the median of a given range of [l, r]

The number of requests can reach 100,000. The length of the sequence can reach 100,000.

I wonder if any data structure can support such a query


My decision:

I am consulting my partner today, and he says to use the partition tree.

We can build a partition tree in nlog (n) time and respond to each request in log (n) time

The partition tree is actually a merge sort process, but for each node in the tree it stores the number of integers that go to the left subtree. Thus, we can use this information to process the request.

here is my code:

This program should find x in a given interval [l, r], which minimizes the following equation.

alt text http://acm.tju.edu.cn/toj/3556_01.jpg

Explanation:

seq keeps the sequence

pos retains position after sorting

ind stores index

cntL stores the number of integers that go to the left tree in a given range

#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;
#define N 100008
typedef long long LL;
int n, m, seq[N], ind[N], pos[N], next[N];
int cntL[20][N];
LL sum[20][N], sumL, subSum[N];

void build(int l, int r, int head, int dep)
{
    if (l == r)
    {
        cntL[dep][l] = cntL[dep][l-1];
        sum[dep][l] = sum[dep][l-1];
        return ;
    }
    int mid = (l+r)>>1;
    int hl = 0, hr = 0, tl = 0, tr = 0;
    for (int i = head, j = l; i != -1; i = next[i], j++)
    {
        cntL[dep][j] = cntL[dep][j-1];
        sum[dep][j] = sum[dep][j-1];
        if (pos[i] <= mid)
        {
            next[tl] = i;
            tl = i;
            if (hl == 0) hl = i;
            cntL[dep][j]++;
            sum[dep][j] += seq[i];
        }
        else
        {
            next[tr] = i;
            tr = i;
            if (hr == 0) hr = i;
        }
    }
    next[tl] = -1;
    next[tr] = -1;
    build(l, mid, hl, dep+1);
    build(mid+1, r, hr, dep+1);
}

int query(int left, int right, int ql, int qr, int kth, int dep)
{
    if (left == right)
    {
        return ind[left];
    }
    int mid = (left+right)>>1;
    if (cntL[dep][qr] - cntL[dep][ql-1] >= kth)
    {
        return query(left, mid, left+cntL[dep][ql-1]-cntL[dep][left-1], left+cntL[dep][qr]-cntL[dep][left-1]-1, kth, dep+1);
    }
    else
    {
        sumL += sum[dep][qr]-sum[dep][ql-1];
        return query(mid+1, right, mid+1+ql-left-(cntL[dep][ql-1]-cntL[dep][left-1]), mid+qr+1-left-(cntL[dep][qr]-cntL[dep][left-1]), \
                kth-(cntL[dep][qr]-cntL[dep][ql-1]), dep+1);
    }
}

inline int cmp(int x, int y)
{
    return seq[x] < seq[y];
}

int main()
{
    int ca, t, i, j, middle, ql, qr, id, tot;
    LL ans;
    scanf("%d", &ca);
    for (t = 1; t <= ca; t++)
    {
        scanf("%d", &n);
        subSum[0] = 0;
        for (i = 1; i <= n; i++) 
        {
            scanf("%d", seq+i);
            ind[i] = i;
            subSum[i] = subSum[i-1]+seq[i];
        }
        sort(ind+1, ind+1+n, cmp);
        for (i = 1; i <= n; i++)
        {
            pos[ind[i]] = i;
            next[i] = i+1;
        }
        next[n] = -1;
        build(1, n, 1, 0);
        printf("Case #%d:\n", t);
        scanf("%d", &m);
        while (m--)
        {
            scanf("%d%d", &ql, &qr);
            ql++, qr++;
            middle = (qr-ql+2)/2;
            sumL= 0;
            id = query(1, n, ql, qr, middle, 0);
            ans = subSum[qr]-subSum[ql-1]-sumL;
            tot = qr-ql+1;
            ans = ans-(tot-middle+1)*1ll*seq[id]+(middle-1)*1ll*seq[id]-sumL;
            printf("%lld\n", ans);
        }
        puts("");
    }
}
+3
source share
1 answer

This is called the range offset problem. The following document may make a difference: Towards median media in the optimal range . (Free link, thanks belisarius).

From the abstract of the article:

: n , , , . , O (nlogk + klogn) k . k = O (n). O (nlogn) O (n) . O (n) O (nlogn) , O (LOGN/loglogn). , O (log ^ 2n) O (nlogn) O ((logn/loglogn) ^ 2) O (nlogn/loglogn) , , , O (log ^ O (1) n) Q (logn/loglogn).

, , , - .

, O (n ^ 3) (, , O (n ^ 2logn)) O (n ^ 2), O ( 1) .

. , , r-l ? ....

+4

Source: https://habr.com/ru/post/1755982/


All Articles