How to efficiently find the 10 largest numbers among billions of numbers?

Task: Find the 10 largest numbers in a file that contains billions of numbers.

Input: 97911 98855 12345 78982 ..... .....

I came up with the solution below, which has:

  • best-case complexity O(n): when the numbers in the file are in descending order
  • worst-case complexity O(n*10) ~ O(n): when the numbers in the file are in ascending order
  • average-case complexity ~ O(n)

The space complexity is O(1) in all cases.

I read the file with a file reader and keep a sorted array that stores at most 10 numbers. For each line, I check whether the current number is larger than the smallest element in the array. If it is, it replaces that element and is moved to its correct position.

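import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class TopTen {
    public static void main(String[] args) throws IOException {
        // maxNum is kept sorted in descending order, so maxNum[9] is the smallest kept value.
        // It starts as all zeros, so this assumes the file holds positive numbers, one per line.
        int[] maxNum = new int[10];
        try (Scanner sc = new Scanner(new FileReader(new File("demo.txt")))) {
            while (sc.hasNextLine()) {
                int phoneNumber = Integer.parseInt(sc.nextLine().trim());
                if (phoneNumber > maxNum[9]) {
                    maxNum[9] = phoneNumber; // replace the current smallest
                    // Bubble the new value up; stop as soon as it is in place.
                    for (int i = 9; i > 0 && maxNum[i] > maxNum[i - 1]; i--) {
                        int temp = maxNum[i];
                        maxNum[i] = maxNum[i - 1];
                        maxNum[i - 1] = temp;
                    }
                }
            }
        }
    }
}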


+4
3

, , 10 . O (n) - , .

( ) maxNum -. , (, 100 ). , 10.

+4

. , . 20 20 10 . 10 20 ( 10), .

, , . , . , , . O (n), , (, t), n/t-. , t , .

, , - , .

+3

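In general, to find the K largest of N items:

  • Sort all N items in O(N lg N) time, then take the K largest. If the data does not fit in memory, use an external sort such as MergeSort.

  • Use a min-heap to hold the K largest of the N items. Scan the items, and whenever one is larger than the heap's minimum, replace the minimum with it. Running time: O(N lg K). The min-heap fits in memory.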

  • Use a selection algorithm to find the Kth largest value (the element at index N-K in ascending order) in expected O(N) time. The Quickselect algorithm, which uses the Quicksort partitioning step, also arranges the values so that the K largest end up on one side of that element. Expected running time: O(N). However, this selection algorithm works in memory, so all N values must fit there.
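
The heap and selection approaches from the list above can be sketched roughly as follows. This is only an illustration under some assumptions: the values are ints already loaded into an array (for a real file, the heap version could be fed one number at a time instead), and the names TopK, topKWithHeap and topKWithQuickselect are made up for this example, not taken from the answer.

```java
import java.util.Arrays;
import java.util.PriorityQueue;
import java.util.Random;

class TopK {
    // Min-heap approach: O(N lg K) time, O(K) extra space.
    static int[] topKWithHeap(int[] values, int k) {
        if (k <= 0) return new int[0];
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // smallest kept value on top
        for (int v : values) {
            if (heap.size() < k) {
                heap.add(v);
            } else if (v > heap.peek()) {
                heap.poll(); // drop the smallest of the current top k
                heap.add(v);
            }
        }
        int[] result = new int[heap.size()];
        // poll() returns values in ascending order, so fill from the back.
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    // Quickselect approach: expected O(N) time, but needs all values in memory.
    static int[] topKWithQuickselect(int[] values, int k) {
        int[] a = values.clone(); // leave the caller's array untouched
        k = Math.min(k, a.length);
        if (k <= 0) return new int[0];
        int target = a.length - k; // index of the k-th largest in ascending order
        int lo = 0, hi = a.length - 1;
        Random rnd = new Random();
        while (lo < hi) {
            int p = partition(a, lo, hi, lo + rnd.nextInt(hi - lo + 1));
            if (p == target) break;
            if (p < target) lo = p + 1; else hi = p - 1;
        }
        // Everything from index target onward is now one of the k largest.
        int[] result = Arrays.copyOfRange(a, target, a.length);
        Arrays.sort(result);
        for (int i = 0, j = result.length - 1; i < j; i++, j--) {
            swap(result, i, j); // reverse into descending order
        }
        return result;
    }

    // Lomuto partition: returns the final index of the chosen pivot.
    private static int partition(int[] a, int lo, int hi, int pivotIndex) {
        int pivot = a[pivotIndex];
        swap(a, pivotIndex, hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 9, 3, 7, 9, 2};
        System.out.println(Arrays.toString(topKWithHeap(data, 3)));        // [9, 9, 7]
        System.out.println(Arrays.toString(topKWithQuickselect(data, 3))); // [9, 9, 7]
    }
}
```

For a file with billions of numbers the heap version is probably the more practical of the two, since it only ever holds K values, while the Quickselect version needs the whole input in memory.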

+1

Source: https://habr.com/ru/post/1666823/

