How to provide a custom function for Python Blaze with the SQLite backend?

I connect to a SQLite database in Blaze using df = bz.Data("sqlite:///<mydatabase>") and everything works fine, but I don't know how to use custom functions when working with df. df has a column named IP, which is text containing IP addresses. I also have a function toSubnet(x, y) that accepts an IP address (x) in text format and returns its /y subnet. For example:

 out = toSubnet('1.1.1.1',24)
 out
 1.1.1.0/24

Now, if I want to map all IP addresses to their /14 subnets, I use:

 df.IP.map(lambda x:toSubnet(x,14),'string') 

This works when the backend is CSV, but with the SQLite backend I get a NotImplementedError. What is wrong here?

1 answer

NB: This does not tell you exactly how to do what you want, but it explains why this does not work and suggests a possible next step toward getting it to work with SQLite.

The problem you are facing is that it is very difficult to efficiently execute arbitrary Python code against an arbitrary SQL database.

Blaze takes user code and translates as much of it as possible into SQL using SQLAlchemy, which I think has no way to translate an arbitrary Python function such as your lambda.

Since almost every database has a different way of handling user-defined functions (UDFs), it would take quite a lot of work to create an API that allows the following:

  • The user defines a function in Python.
  • Blaze turns this pure Python function into a database-specific UDF.

However, the Python interface for SQLite has a way to register Python functions that can be executed in an SQL statement:

https://docs.python.org/2/library/sqlite3.html#sqlite3.Connection.create_function
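
As a rough illustration of that registration step, here is a minimal sketch using only the standard library. It assumes Python 3 (for the ipaddress module), and it talks to SQLite directly rather than through Blaze. The to_subnet function is a hypothetical stand-in for the toSubnet from the question, and the hosts table and its contents are made up for the example.

```python
import ipaddress
import sqlite3


def to_subnet(ip, prefix_len):
    """Hypothetical stand-in for the question's toSubnet(x, y)."""
    # strict=False lets ip_network accept an address with host bits set
    # and masks them off, e.g. '1.1.1.1/24' -> '1.1.1.0/24'.
    network = ipaddress.ip_network("{}/{}".format(ip, prefix_len), strict=False)
    return str(network)


# Made-up in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (IP TEXT)")
conn.executemany(
    "INSERT INTO hosts (IP) VALUES (?)",
    [("1.1.1.1",), ("10.200.3.4",)],
)

# Register the Python function under the SQL name to_subnet, taking 2 arguments.
conn.create_function("to_subnet", 2, to_subnet)

# The function can now be called from SQL on this connection.
rows = conn.execute(
    "SELECT IP, to_subnet(IP, 14) FROM hosts ORDER BY IP"
).fetchall()
for ip, subnet in rows:
    print(ip, subnet)
```

Note that the registration is per connection: the function only exists on the connection object that create_function was called on, so it would have to be registered on whatever connection actually runs the query.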

There is currently no way to express a UDF in Blaze with the SQL backend, although this could be implemented as a new type of expression that lets the user register a function through the underlying DB-API connection.


Source: https://habr.com/ru/post/1012439/
