Identity quirk with string split()

>>> 'hi'.split()[0] is 'hi'
True
>>> 'hi there'.split()[0] is 'hi'
False
>>> 'hi there again'.split()[0] is 'hi'
False

My hypothesis is:

The first line has only one element in the split result, while the other two have more than one. I believe that although Python primitives such as str are stored in memory by value within a function, separate allocations are made across functions to simplify memory management. I think split() is one of these functions, and it usually allocates new strings. But it also handles the edge case of input that needs no splitting (for example 'hi'), where the original string reference is simply returned. Is my explanation correct?

+4
3 answers

, , Python, str, , .

Python . "", , - , , .

The relevant code in CPython's split implementation is:

Py_LOCAL_INLINE(PyObject *)
STRINGLIB(split_whitespace)(PyObject* str_obj,
                           const STRINGLIB_CHAR* str, Py_ssize_t str_len,
                           Py_ssize_t maxcount)
{
    ...
#ifndef STRINGLIB_MUTABLE
        if (j == 0 && i == str_len && STRINGLIB_CHECK_EXACT(str_obj)) {
            /* No whitespace in str_obj, so just use it as list[0] */
            Py_INCREF(str_obj);
            PyList_SET_ITEM(list, 0, (PyObject *)str_obj);
            count++;
            break;
        }
#endif
    ...
}

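This is an implementation detail and should not be relied on; it may differ between Python versions and between Python implementations.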
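The fast path in that C code can be checked from Python itself. Here is a minimal sketch, assuming CPython, since the identity result is an implementation detail. `MyStr` and `first_part_is_same_object` are hypothetical names made up for this illustration. The string is built at run time so that interning of the 'hi' literal plays no role.

```python
import sys


class MyStr(str):
    """Hypothetical str subclass, used only to fail the exact-type check."""


def first_part_is_same_object(s):
    """Return True if s.split() hands back s itself as its first element."""
    return s.split()[0] is s


print(sys.implementation.name)               # the results below assume 'cpython'

# Build the string at run time so that literal interning is not involved.
plain = "".join(["h", "i"])
print(first_part_is_same_object(plain))      # True on CPython: the fast path reuses the object

# A subclass instance is not an exact str, so split() builds a new plain str.
sub = MyStr("hi")
print(first_part_is_same_object(sub))        # False
print(type(sub.split()[0]).__name__)         # str
```

The subclass case corresponds to the STRINGLIB_CHECK_EXACT condition: the shortcut only applies to exact str objects.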

+1

To compare the strings by value, use == instead of is:

>>> 'hi there again'.split()[0] == 'hi'
True

, - .
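A short sketch of the difference between the two operators, using names (`words`, `first`, `expected`) invented for this example. `==` compares the characters, while `is` asks whether both names refer to the same object, which is the same question as comparing `id()` values.

```python
words = "hi there again".split()
first = words[0]
expected = "hi"

# Value comparison: both strings contain the same characters.
print(first == expected)                  # True

# Identity comparison: are these the very same object in memory?
# The result depends on the interpreter, so it is not stated here.
print(first is expected)

# `is` is equivalent to comparing id() of two live objects.
print((first is expected) == (id(first) == id(expected)))   # True
```

For checking string contents, `==` is the operator to rely on; `is` is best kept for singletons such as None.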

0

All data in Python is stored by reference (a PyObject* in the C implementation). What you found is that .split() simply reuses self as the only list element, as an optimization, when no separator is found. When a separator is found, it has to create a separate string object for each part, so those are distinct objects.

(This is unlike Java, which has distinct data types for "primitives" and "reference types / classes" and treats them differently.)
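To illustrate the by-reference point, here is a small sketch. `original`, `alias`, and `identity_of` are hypothetical names, and the last comparison reflects typical CPython behaviour rather than a language guarantee.

```python
def identity_of(obj):
    """Return the identity of the object the caller passed in."""
    return id(obj)


original = "".join(["hi", " ", "there"])   # built at run time
alias = original                            # binds a second name; nothing is copied

print(alias is original)                    # True
print(identity_of(original) == id(original))  # True: the function receives the same object

parts_a = original.split()
parts_b = original.split()
print(parts_a == parts_b)                   # True: equal contents
print(parts_a is parts_b)                   # False: each call builds a new list

# Each call also creates its own 'hi' string on CPython, so this is
# typically False there, but that is an implementation detail.
print(parts_a[0] is parts_b[0])
```

Passing a string to a function or assigning it to another name never copies it; new objects only appear when an operation such as a real split has to build them.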

0

Source: https://habr.com/ru/post/1614793/
