Slice end points invisibly truncated

    >>> import sys
    >>> class Potato(object):
    ...     def __getslice__(self, start, stop):
    ...         print start, stop
    ...
    >>> sys.maxint
    9223372036854775807
    >>> x = sys.maxint + 69
    >>> print x
    9223372036854775876
    >>> Potato()[123:x]
    123 9223372036854775807

Why doesn't the __getslice__ call respect the stop I passed in, silently substituting 2^63 - 1 instead? Does this mean that implementing __getslice__ for your own syntax will generally be unsafe with longs?

In any case, I can do everything I need with __getitem__; I'm just wondering why __getslice__ seems to be broken.

Edit: Where is the code in CPython that truncates the slice? Is this part of the Python (language) specification or just a "feature" of CPython (the implementation)?

1 answer

The Python C code that handles slicing for objects that implement the sq_slice slot cannot handle integers over PY_SSIZE_T_MAX (== sys.maxsize). The sq_slice slot is the C-API equivalent of the __getslice__ special method.

For a two-value slice, Python 2 uses one of the SLICE+* opcodes; this is then handled by the apply_slice() function. That function uses _PyEval_SliceIndex to convert Python index objects (int, long, or anything that implements the __index__ method) to a Py_ssize_t integer. The function carries the following comment:

    /* Extract a slice index from a PyInt or PyLong or an object with the
       nb_index slot defined, and store in *pi.
       Silently reduce values larger than PY_SSIZE_T_MAX to PY_SSIZE_T_MAX,
       and silently boost values less than -PY_SSIZE_T_MAX-1 to -PY_SSIZE_T_MAX-1.
       Return 0 on error, 1 on success.
    */

This means that any slicing in Python 2 that uses the two-value syntax is limited to values in the sys.maxsize range when the sq_slice slot is provided.
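The clamping described in that comment can be modelled in a few lines of Python. This is only a sketch of the behaviour, not the CPython source: the helper name clamp_slice_index is made up for illustration, and sys.maxsize stands in for PY_SSIZE_T_MAX.

```python
import operator
import sys


def clamp_slice_index(value):
    """Model of the clamping: squeeze an index into the Py_ssize_t range."""
    # operator.index() accepts ints and anything implementing __index__.
    n = operator.index(value)
    # sys.maxsize plays the role of PY_SSIZE_T_MAX here (an assumption of this sketch).
    return max(-sys.maxsize - 1, min(n, sys.maxsize))


x = sys.maxsize + 69
print(clamp_slice_index(123))                     # small values pass through unchanged
print(clamp_slice_index(x) == sys.maxsize)        # True: silently reduced
print(clamp_slice_index(-x) == -sys.maxsize - 1)  # True: silently boosted
```

This is the same substitution that turns the stop value in the question into 9223372036854775807.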

Slicing with the three-value form (item[start:stop:stride]) instead uses the BUILD_SLICE opcode (followed by BINARY_SUBSCR), which creates a slice() object that is not limited to sys.maxsize.

If the object does not implement the sq_slice slot (so there is no __getslice__), the apply_slice() function also falls back to using a slice() object.
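Put together, the dispatch described above behaves roughly like the following Python model. It is a simplified sketch, not a translation of the C code: apply_slice_model and both demo classes are hypothetical names, the adjustment of negative indices is left out, and __getslice__ is called explicitly so the model also runs on interpreters that no longer have that hook.

```python
import operator
import sys


def _clamp(value):
    # Squeeze an index into the Py_ssize_t range; sys.maxsize stands in
    # for PY_SSIZE_T_MAX (an assumption of this sketch).
    return max(-sys.maxsize - 1, min(operator.index(value), sys.maxsize))


def apply_slice_model(obj, start, stop):
    """Rough model of how Python 2 evaluates obj[start:stop]."""
    if hasattr(type(obj), "__getslice__"):
        # The type provides the slice hook: missing bounds get defaults
        # and both bounds are clamped before the hook sees them.
        low = 0 if start is None else _clamp(start)
        high = sys.maxsize if stop is None else _clamp(stop)
        return obj.__getslice__(low, high)
    # No slice hook: build a slice object, which keeps the values intact.
    return obj[slice(start, stop)]


class WithGetslice(object):
    def __getslice__(self, start, stop):
        return (start, stop)


class WithGetitemOnly(object):
    def __getitem__(self, key):
        return key


x = sys.maxsize + 69
print(apply_slice_model(WithGetslice(), 123, x))     # stop arrives clamped
print(apply_slice_model(WithGetitemOnly(), 123, x))  # stop survives inside the slice
```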

As for whether this is an implementation detail or part of the language: the Slicings expression documentation distinguishes between simple_slicing and extended_slicing; the former only allows the short_slice form. For simple slicing, the indices must be plain integers:

"The lower and upper bound expressions, if present, must evaluate to plain integers; defaults are zero and the sys.maxint, respectively."

This suggests that the Python 2 language restricts indexes to sys.maxint values, disallowing long integers. In Python 3, simple slicing has been removed from the language altogether.

If your code must support slicing with values greater than sys.maxsize, and you need to inherit from a type that implements __getslice__, then your options are:

  • use the three-value syntax, with None for the stride:

     Potato()[123:x:None]

  • create slice() objects explicitly:

     Potato()[slice(123, x)]
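Both options hand the same slice() object to __getitem__, with the large stop value intact. A minimal sketch, assuming a hypothetical Recorder class that returns whatever key it receives and defines __getslice__ only to show that neither form reaches it:

```python
import sys


class Recorder(object):
    def __getslice__(self, start, stop):
        # Only the two-value form on Python 2 would end up here.
        raise AssertionError("not reached by either workaround")

    def __getitem__(self, key):
        # Return the key untouched so the caller can inspect it.
        return key


x = sys.maxsize + 69
via_syntax = Recorder()[123:x:None]     # three-value syntax
via_object = Recorder()[slice(123, x)]  # explicit slice() object

print(via_syntax == via_object)  # True
print(via_syntax.stop == x)      # True: nothing was truncated
```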

slice() objects can handle long integers just fine; however, the slice.indices() method still cannot handle lengths over sys.maxsize:

    >>> import sys
    >>> s = slice(0, sys.maxsize + 1)
    >>> s
    slice(0, 9223372036854775808L, None)
    >>> s.stop
    9223372036854775808L
    >>> s.indices(sys.maxsize + 2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: cannot fit 'long' into an index-sized integer
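
If a length beyond sys.maxsize really is needed, the clamping that slice.indices() performs can be redone with ordinary integer arithmetic. The following is a sketch under that assumption: big_indices is a hypothetical helper, not part of the standard library, and it only accepts None or integer-like values for the slice fields.

```python
import operator
import sys


def big_indices(s, length):
    """Pure-Python stand-in for s.indices(length) using unbounded integers."""
    length = operator.index(length)
    step = 1 if s.step is None else operator.index(s.step)
    if length < 0:
        raise ValueError("length should not be negative")
    if step == 0:
        raise ValueError("slice step cannot be zero")

    # Bounds that start and stop are clamped to, depending on direction.
    lower = -1 if step < 0 else 0
    upper = length - 1 if step < 0 else length

    def resolve(value, default):
        if value is None:
            return default
        value = operator.index(value)
        # Negative values count from the end before being clamped.
        return max(value + length, lower) if value < 0 else min(value, upper)

    start = resolve(s.start, upper if step < 0 else lower)
    stop = resolve(s.stop, lower if step < 0 else upper)
    return start, stop, step


s = slice(0, sys.maxsize + 1)
print(big_indices(s, sys.maxsize + 2) == (0, sys.maxsize + 1, 1))  # True
```

For lengths that do fit, the helper is meant to agree with the built-in method, so it can be checked against s.indices() on small inputs.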

Source: https://habr.com/ru/post/1201525/

