Are PyArg_ParseTuple() "s" format specifiers useful in the Python 3.x C API?

I am trying to write a Python C extension that processes byte strings, and I have something that basically works for Python 2.x and Python 3.x.

For the Python 2.x code, near the start of my function, I currently have a line:

    if (!PyArg_ParseTuple(args, "s#:in_bytes", &src_ptr, &src_len))
    ...

I've noticed that the s# format specifier accepts both Unicode strings and byte strings. I just want it to accept byte strings and reject Unicode. For Python 2.x, this might be “good enough”: the standard hashlib module seems to do the same, accepting Unicode as well as byte strings. However, Python 3.x is meant to clean up the Unicode/byte string mess and not let the two be interchangeable.

So I was surprised to find that in Python 3.x, the s format specifiers for PyArg_ParseTuple() still seem to accept Unicode and provide a "default encoded string version" of the Unicode. This seems to go against the principles of Python 3.x, making the s format specifiers unusable in practice. Is my analysis correct, or am I missing something?

Looking at the implementation of hashlib for Python 3.x (for example, see md5module.c, function MD5_update() and its use of the GET_BUFFER_VIEW_OR_ERROUT() macro), I see that it avoids the s format specifiers and simply takes a generic object (the O specifier) and then does various explicit type checks using the GET_BUFFER_VIEW_OR_ERROUT() macro. Is this what we have to do?


- , C API Python 3 , Python. , , , " " - - API Python C ( , , , , -.)


Source: https://habr.com/ru/post/1736431/

