Have I just proved that the Sieve of Eratosthenes is less efficient than trial division?

I tried to compare the execution speed of two algorithms: a brute-force C program that prints prime numbers (10,000 of them) and a C program implementing the Sieve of Eratosthenes (also 10,000 primes).

My measured run time for the sieve algorithm: 0.744 seconds

My measured run time for the brute-force algorithm: 0.262 seconds

However, I was told that the Sieve of Eratosthenes is more efficient than brute force, so I expected it to run faster. So either I am mistaken, or my program is broken (which I doubt).

Therefore, my question is: since my results are the opposite of what I expected, does this prove that the Sieve of Eratosthenes really is a less efficient algorithm, in terms of speed, than trial division?

I'm not sure if it makes any difference, but I'm using the Dev-C++ compiler on Windows 7.

+4
source share
6 answers

TL;DR: comparing the speed of code variants at just one input size is pointless; a comparison of empirical orders of growth truly reflects the algorithmic nature of the code and will be consistent across test platforms, for the same range of input sizes. Comparing absolute speed values makes sense only for code variants that exhibit the same asymptotic, or at least local, growth behavior.


It is not enough to measure the speed of your two implementations at just one input size. Several data points are usually needed to estimate the empirical order of growth of our code's run time (since the code can be run at various input sizes). It is defined as the logarithm of the ratio of run times divided by the logarithm of the ratio of the corresponding input sizes.

So even if at some input size code_1 runs 10 times faster than code_2, but its run time doubles with each doubling of the input size while code_2's grows only by 1.1x, code_2 will very soon become much faster than code_1.

Thus the real measure of an algorithm's efficiency is its run-time complexity (and its space complexity, i.e. memory requirements). And when we measure it empirically, we measure it only for the particular code at hand (over a certain range of input sizes), not for the algorithm itself, i.e. its ideal implementation.

In particular, the theoretical complexity of trial division is O(n^1.5 / (log n)^0.5), in the number n of primes produced, typically seen as an empirical order of growth of ~ n^1.40..1.45 (though it can be ~ n^1.3 initially, at smaller input sizes). For the Sieve of Eratosthenes it is O(n log n log (log n)), typically seen as ~ n^1.1..1.2. But of course there are suboptimal implementations of both trial division and the Sieve of Eratosthenes that run at ~ n^2.0 or worse.

So no, that doesn't prove anything. One data point is meaningless; at least three are needed to get the "big picture", that is, to predict with some confidence the run time (or space) needed for larger input sizes.

Prediction with known confidence is what the scientific method is all about.


By the way, your run times are very long. Computing 10,000 primes should be nearly instantaneous, much less than 1/100th of a second per program run on a fast box. Perhaps you are also measuring printing time. Don't. :)

+5
source

No, elapsed time is not a standard for measuring efficiency, since it varies from platform to platform. Saying "my algorithm runs in 10 seconds" gives almost no information about the algorithm itself. Beyond that, you would need to list the entire environment specification and every other process running at the same time, and that would be a huge mess. Hence the development of asymptotic notation (Big Oh, Little Oh, Omega, and so on).

Efficiency usually branches out into two sections:

  • Time efficiency.
  • Space efficiency.

... where one algorithm can be extremely time efficient but very space inefficient, and vice versa. Algorithms are analyzed based on their asymptotic behavior, i.e. how the number of instructions they need to execute scales with the input size n. This is a very high-level explanation of a field that has been studied meticulously by PhD computer scientists; I suggest you read about it here for the best low-level explanation you will find.

Note that I attached a link referring to Big Oh; all the sister notations can be found on the same Wikipedia page, and it is usually a good place to start. It also discusses the trade-off between space and time efficiency.

A small application of time efficiency with Big Oh:

Consider the following recursive function in Racket (it would be in Python if I knew it; this is the best pseudocode I can do):

 (define (fn_a input_a)
   (cond [(empty? input_a) empty]
         [(empty? (rest input_a)) input_a]
         [(> (first input_a) (fn_a (rest input_a))) (cons (first input_a) empty)]
         [else (fn_a (rest input_a))]))

... we see that empty? , rest , > and first are all O(1). We also note that, in the worst case, a call to fn_a on (rest input_a) is made in both the third and the fourth clause. We can then write the recurrence relation as T(n) = O(1) + 2T(n - 1). Looking this up in a table of common recurrence relations, we see that fn_a is of order O(2^n), since in the worst case two recursive calls are made per element.

It is also important to note that, under the formal definition of Big Oh, it is equally correct (but useless) to say that fn_a is O(3^n). Most algorithm analyses are stated using Big Oh, but in many cases it would be more appropriate to use Big Theta to tighten the bound, which essentially denotes the lowest, most accurate order of growth for the algorithm.

Be careful, read the formal definitions!

+6
source

Does a longer run time mean a less efficient algorithm?

Not necessarily. The efficiency of a program is measured not only by the time it takes but also by the resources it consumes. Space is another factor to consider when judging efficiency.

From the wiki:

For maximum efficiency we wish to minimize resource usage. However, different resources (e.g. time, space) cannot be compared directly, so which of two algorithms is considered more efficient depends on which measure of efficiency is considered most important, e.g. is the requirement high speed, minimal memory usage, or some other measure?

+2
source

In general: yes, but when you are down in the sub-1-second range there is a lot of noise that can confuse things...

Run each test many times and apply some statistics to the results (e.g. the mean, or the mean and deviation, depending on how much you care).

And/or make it do more work, e.g. find more primes.

+1
source

In short, yes, if you take time as your measure of efficiency. There are memory considerations too.

Be careful how you measure: make sure your timing tools are accurate.

  • Make sure you measure on the same computer while nothing else is running.
  • Make sure you measure several times and take the mean and variance for a worthwhile comparison.
  • Consider having someone review your code to verify that it is doing what you think it is doing.

+1
source

The efficiency of algorithms is usually judged by how well they handle large inputs. 10,000 numbers is not a very large input, so you may need to use a larger one before the Sieve of Eratosthenes starts to be faster.

Also, there may be a bug in one of your implementations.

Finally, the efficiency of an algorithm can also be measured by the amount of memory it needs (but this measure is less common, especially now that memory is so cheap).

+1
source

Source: https://habr.com/ru/post/1497401/
