I was expecting pypy to be an order of magnitude faster than python, but the results show that, once numpy is involved, pypy is actually slower than python.
I have two questions:
- Why is pypy significantly slower with numpy?
- Is there anything I can do to optimize my algorithm to make pypy (or python) faster?
Resulting timings:
Python 2.7.5
- # points: 16,777,216 (8 ** 3 * 32 ** 3)
- Xrange time: 1487.15 ms
- Xrange Numpy time: 2553.98 ms
- Point Gen Time: 6162.23 ms
- Numpy Creation Time: 13894.73 ms
Pypy 2.2.1
- # points: 16,777,216 (8 ** 3 * 32 ** 3)
- Transmission Time: 129.48 ms
- Xrange Numpy time: 4644.12 ms
- Point Gen Time: 4643.82 ms
- Numpy Creation Time: 44168.98 ms
Algorithm
The points are generated by the following function:
def generate(size=32, point=(0, 0, 0), width=32):
"""
generate points in space around a center point with a specific width and
number of divisions (size)
"""
X, Y, Z = point
half = width * 0.5
delta = width
scale = width / size
offset = scale * 0.5
X = X + offset - half
Y = Y + offset - half
Z = Z + offset - half
for x in xrange(size):
x = (x * scale) + X
for y in xrange(size):
y = (y * scale) + Y
for z in xrange(size):
z = (z * scale) + Z
yield (x, y, z)
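As a possible answer to the second question, the three nested loops can be pushed into numpy itself. This is a sketch rather than the original code; `np_generate` is a hypothetical name, and it assumes the same x-outer / z-inner ordering as `generate`:

```python
import numpy as np

def np_generate(size=32, point=(0, 0, 0), width=32):
    """Vectorized equivalent of generate(): returns a (size**3, 3) array."""
    scale = width / float(size)
    offset = scale * 0.5
    # same per-axis starting coordinate as X/Y/Z in generate()
    start = np.asarray(point, dtype=float) + offset - width * 0.5
    axes = [start[i] + scale * np.arange(size) for i in range(3)]
    # indexing='ij' keeps x as the outermost axis, z as the innermost,
    # so ravel() reproduces the loop order of the generator version
    xx, yy, zz = np.meshgrid(*axes, indexing='ij')
    return np.column_stack([xx.ravel(), yy.ravel(), zz.ravel()])
```

This allocates the whole array up front and fills it in C, so neither interpreter has to execute the inner loops at the Python level.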
I ran each step under both pypy and python. Some steps were timed separately to isolate where the cost comes from:
xrange
rsize = 8
csize = 32
number_of_points = rsize ** 3 * csize ** 3
[x for x in xrange(number_of_points)]
xrange numpy
rsize = 8
csize = 32
number_of_points = rsize ** 3 * csize ** 3
np.array([x for x in xrange(number_of_points)])
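As an aside, this particular benchmark never needs the Python-level list: `np.arange` builds the same array directly in C. A sketch (that it is meaningfully faster here is an assumption to verify, not something measured above):

```python
import numpy as np

rsize = 8
csize = 32
number_of_points = rsize ** 3 * csize ** 3

# same values as np.array([x for x in xrange(number_of_points)]),
# but allocated and filled without a Python-level loop
points = np.arange(number_of_points)
```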
point generation
rsize = 8
csize = 32
[p
for rp in generate(size=rsize, width=rsize*csize)
for p in generate(size=csize, width=csize, point=rp)]
numpy creation
rsize = 8 # size of region
csize = 32 # size of chunk
np.array([p
for rp in generate(size=rsize, width=rsize*csize)
for p in generate(size=csize, width=csize, point=rp)])
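One way to attack the numpy creation cost is to avoid materializing the intermediate list of tuples at all, feeding a flat iterator straight into `np.fromiter`. This is a sketch using a Python 3 `range`-based copy of `generate` and deliberately small sizes for illustration; whether it actually beats `np.array(list)` under pypy is an assumption to test:

```python
import itertools
import numpy as np

def generate(size=32, point=(0, 0, 0), width=32):
    # same algorithm as above, with range instead of xrange
    X, Y, Z = point
    half = width * 0.5
    scale = width / float(size)
    offset = scale * 0.5
    X, Y, Z = X + offset - half, Y + offset - half, Z + offset - half
    for x in range(size):
        for y in range(size):
            for z in range(size):
                yield (x * scale + X, y * scale + Y, z * scale + Z)

rsize, csize = 2, 4  # small sizes for illustration; the post uses 8 and 32

# a stream of (x, y, z) tuples, never collected into a list
tuples = itertools.chain.from_iterable(
    generate(size=csize, width=csize, point=rp)
    for rp in generate(size=rsize, width=rsize * csize))

# fill a typed C buffer straight from the flattened iterator
points = np.fromiter(itertools.chain.from_iterable(tuples),
                     dtype=np.float64).reshape(-1, 3)
```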
Some context:
I'm building a voxel engine. I originally prototyped it in Java/C++ and am now trying python (and pypy) to see how far they can be pushed.
The point data ultimately has to live in a numpy array, but as the timings above show, building that array from non-numpy data dominates the runtime, and the numpy creation step is dramatically slower under pypy.
To isolate the cost, I also benchmarked access and creation for several container types: tuples, lists, dicts, and numpy arrays. For dicts I additionally compared .get with __getitem__ (i.e. dictionary[lookup] vs. dictionary.get(lookup)).
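For reference, the two dictionary access styles compared here look like this (a minimal sketch; `d[k]` dispatches through `__getitem__`, while `d.get(k)` is an attribute lookup plus a method call):

```python
d = {i: i * 2 for i in range(10)}

# __getitem__: raises KeyError on a miss
value = d[3]

# .get: one extra attribute lookup and call, returns a default on a miss
value_or_default = d.get(3)
missing = d.get(99, None)
```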
timings...
Python 2.7.5
- Option 1: tuple access... 2045.51 ms
- Option 2: tuple access (again)... 2081.97 ms
- Option 3: list access... 2072.09 ms
- Option 4: dict access... 3436.53 ms
- Option 5: iterable creation... N/A
- Option 6: numpy array... 1752.44 ms
- Option 1: tuple creation... 690.36 ms
- Option 2: tuple creation (again)... 716.49 ms # sampling effect of cache
- Option 3: list creation... 684.28 ms
- Option 4: dict creation... 1498.94 ms
- Option 5: iterable creation... 0.01 ms
- Option 6: numpy creation... 3514.25 ms
Pypy 2.2.1
- Option 1: tuple access... 243.34 ms
- Option 2: tuple access (again)... 246.51 ms
- Option 3: list access... 139.65 ms
- Option 4: dict access... 454.65 ms
- Option 5: iterable creation... N/A
- Option 6: numpy array... 21.60 ms
- Option 1: tuple creation... 1016.27 ms
- Option 2: tuple creation (again)... 1063.50 ms # sampling effect of cache
- Option 3: list creation... 365.98 ms
- Option 4: dict creation... 2258.44 ms
- Option 5: iterable creation... 0.00 ms
- Option 6: numpy creation... 12514.20 ms
The data used for these timings:
dsize = 10 ** 7 # or 10 million data points
data = [(i, random.random()*dsize)
for i in range(dsize)]
lookup = tuple(int(random.random()*dsize) for i in range(dsize))
And the access loop:
for x in lookup:
data_of_specific_type[x]
where data_of_specific_type is the same data converted to the container under test (e.g. tuple(data), list(data), dict(data), np.array(data)).
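Pulling that together, a compact version of the harness might look like the sketch below (not the original benchmark; it uses `time.time` and a smaller `dsize` for illustration):

```python
import random
import time

dsize = 10 ** 5  # scaled down from 10 ** 7 for illustration
data = [(i, random.random() * dsize) for i in range(dsize)]
lookup = tuple(int(random.random() * dsize) for _ in range(dsize))

# the same data in each container under test
containers = {
    'tuple': tuple(data),
    'list': list(data),
    'dict': dict(data),
}

for name, container in containers.items():
    start = time.time()
    for x in lookup:
        container[x]  # index lookup for tuple/list, key lookup for dict
    elapsed = (time.time() - start) * 1000
    print('%s access... %.2f ms' % (name, elapsed))
```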