How do you avoid strings for SQLite table / column names in Python?

The standard approach to using variable values ​​in SQLite queries is the "question mark style", for example:

import sqlite3 with sqlite3.connect(":memory:") as connection: connection.execute("CREATE TABLE foo(bar)") connection.execute("INSERT INTO foo(bar) VALUES (?)", ("cow",)) print(list(connection.execute("SELECT * from foo"))) # prints [(u'cow',)] 

However, this only works for substituting values ​​into queries. It does not work with table or column names:

 import sqlite3 with sqlite3.connect(":memory:") as connection: connection.execute("CREATE TABLE foo(?)", ("bar",)) # raises sqlite3.OperationalError: near "?": syntax error 

Neither the sqlite3 module nor PEP 249 mention a function to escape names or values. Presumably this prevents users from collecting their queries with strings, but that leaves me at a loss.

Which function or method is most suitable for using variable names for columns or tables in SQLite? I would prefer to do this without any other dependencies, as I will use it in my own shell.

I searched, but could not find, a clear and complete description of the corresponding part of the SQLite syntax to use to write my own function. I want to be sure that this will work for any identifier allowed by SQLite, so the trial and error solution is too vague for me.

SQLite uses " to quote identifiers , but I'm not sure it's easy to avoid them. The PHP function documentation sqlite_escape_string assumes that some binary data can also be escaped, but this may be a fad of the PHP library.

+47
python sqlite sqlite3
Jun 28 '11 at 23:33
source share
9 answers

To convert any string to a SQLite identifier:

  • Make sure the string can be encoded as UTF-8.
  • Make sure the string does not contain any NUL characters.
  • Replace everything with "" .
  • Wrap everything in double quotes.

Implementation

 import codecs def quote_identifier(s, errors="strict"): encodable = s.encode("utf-8", errors).decode("utf-8") nul_index = encodable.find("\x00") if nul_index >= 0: error = UnicodeEncodeError("NUL-terminated utf-8", encodable, nul_index, nul_index + 1, "NUL not allowed") error_handler = codecs.lookup_error(errors) replacement, _ = error_handler(error) encodable = encodable.replace("\x00", replacement) return "\"" + encodable.replace("\"", "\"\"") + "\"" 

Given a single string argument, it will run away and correctly point it or throw an exception. The second argument can be used to specify any error handler registered in the codecs module . Built-in:

  • 'strict' : throw an exception in case of an encoding error
  • 'replace' : replace invalid data with a suitable replacement marker, e.g. '?' or '\ufffd'
  • 'ignore' : ignore invalid data and continue without further notice
  • 'xmlcharrefreplace' : replace the corresponding XML character reference (for encoding only)
  • 'backslashreplace' : replace executable escape sequences (encoding only)

This does not check for reserved identifiers, so if you try to create a new SQLITE_MASTER table, it will not stop you.

Usage example

 import sqlite3 def test_identifier(identifier): "Tests an identifier to ensure it handled properly." with sqlite3.connect(":memory:") as c: c.execute("CREATE TABLE " + quote_identifier(identifier) + " (foo)") assert identifier == c.execute("SELECT name FROM SQLITE_MASTER").fetchone()[0] test_identifier("'Héllo?'\\\n\r\t\"Hello!\" -☃") # works test_identifier("北方话") # works test_identifier(chr(0x20000)) # works print(quote_identifier("Fo\x00o!", "replace")) # prints "Fo?o!" print(quote_identifier("Fo\x00o!", "ignore")) # prints "Foo!" print(quote_identifier("Fo\x00o!")) # raises UnicodeEncodeError print(quote_identifier(chr(0xD800))) # raises UnicodeEncodeError 

Observations and links

  • SQLite TEXT identifiers, not binary.
    • SQLITE_MASTER schema in FAQ
    • The Python 2 SQLite API shouted at me when I gave it bytes, it could not decode as text.
    • The Python 3 SQLite API requires str s queries, not bytes .
  • Double quotes in SQLite identifiers are escaped as two double quotes.
  • SQLite identifiers are case-sensitive, but they are not case-sensitive ASCII letters. You can enable unicode case insensitivity.
  • sqlite3 can handle any other Unicode string if it can be correctly encoded in UTF-8. Incorrect strings can cause crashes between Python 3.0 and Python 3.1.2 or so. Python 2 accepted these invalid strings, but this is considered a bug.
+30
Jul 15 2018-11-11T00:
source share

The psycopg2 documentation explicitly recommends using the usual python% or {} formatting to replace table and column names (or other bits of dynamic syntax) in the names, and then use the parameter mechanism to substitute values ​​into the query.

I disagree with everyone who says: "Never use dynamic table / column names, you are doing something wrong if you need to." Every day I write programs to automate working with databases, and I do this all the time. We have many databases with a large number of tables, but all of them are based on repeating patterns, so the general code for processing them is extremely useful. Manually recording requests each time would be much more error prone and dangerous.

It comes down to what “safe” means. The usual wisdom is that using the usual python string manipulation to enter values ​​into your queries is not "safe". This is due to the fact that everything that can go wrong if you do this, and such data very often comes from the user and is not under your control. You need a 100% reliable way to avoid these values ​​properly so that the user cannot inject SQL into the data value and execute its database. Thus, library authors do this work; you should never.

If, however, you write general utility code for working with things in databases, these considerations do not apply so much. You implicitly grant anyone who can call such a code access to everything in the database; that point helper code . Therefore, now the security problem is that the data created by the user should never be used in such code. This is a common coding security issue, and it is the same problem as blind exec entering a user input string. This is a great problem from pasting values ​​into your queries because you want to be able to safely process user input.

So, my recommendation: do what you want to dynamically collect your requests. Use the usual python template line to add tables and columns to the names, stick where clauses and joins are, everything is good (and terrible for debugging). But make sure that you know that any values ​​that affect such code should come from you , not from your users [1]. You then use the SQLite replacement functionality to safely insert user-entered values ​​into your queries as values.

[1] If (as is the case with a lot of code that I write), your users are people who in any case have full access to the databases, and the code is to simplify their work, then this consideration really does not apply ; you are probably collecting queries in user tables. But you should still use SQLite parameter substitution to save yourself from the inevitable genuine value that ultimately contains quotes or percent signs.

+27
Jun 29 '11 at 3:32
source share

If you are absolutely sure that you need to specify column names dynamically, you should use a library that can do this safely (and complains about what's wrong). SQLAlchemy is very good at this.

 >>> import sqlalchemy >>> from sqlalchemy import * >>> metadata = MetaData() >>> dynamic_column = "cow" >>> foo_table = Table('foo', metadata, ... Column(dynamic_column, Integer)) >>> 

foo_table now represents a table with a dynamic schema, but you can use it only in the context of a real database connection (so sqlalchemy knows the dialect and what to do with the generated sql).

 >>> metadata.bind = create_engine('sqlite:///:memory:', echo=True) 

You can then issue CREATE TABLE ... with echo=True , sqlalchemy will record the generated sql, but in general sqlalchemy does everything possible to generate sql from your hands (so that you do not use it for evil purposes).

 >>> foo_table.create() 2011-06-28 21:54:54,040 INFO sqlalchemy.engine.base.Engine.0x...2f4c CREATE TABLE foo ( cow INTEGER ) 2011-06-28 21:54:54,040 INFO sqlalchemy.engine.base.Engine.0x...2f4c () 2011-06-28 21:54:54,041 INFO sqlalchemy.engine.base.Engine.0x...2f4c COMMIT >>> 

and yes, sqlalchemy will take care of any column names that need special handling, for example when the column name is a reserved word sql

 >>> dynamic_column = "order" >>> metadata = MetaData() >>> foo_table = Table('foo', metadata, ... Column(dynamic_column, Integer)) >>> metadata.bind = create_engine('sqlite:///:memory:', echo=True) >>> foo_table.create() 2011-06-28 22:00:56,267 INFO sqlalchemy.engine.base.Engine.0x...aa8c CREATE TABLE foo ( "order" INTEGER ) 2011-06-28 22:00:56,267 INFO sqlalchemy.engine.base.Engine.0x...aa8c () 2011-06-28 22:00:56,268 INFO sqlalchemy.engine.base.Engine.0x...aa8c COMMIT >>> 

and can save you from possible failure:

 >>> dynamic_column = "); drop table users; -- the evil bobby tables!" >>> metadata = MetaData() >>> foo_table = Table('foo', metadata, ... Column(dynamic_column, Integer)) >>> metadata.bind = create_engine('sqlite:///:memory:', echo=True) >>> foo_table.create() 2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec CREATE TABLE foo ( "); drop table users; -- the evil bobby tables!" INTEGER ) 2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec () 2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec COMMIT >>> 

(apparently some weird things are perfectly legal identifiers in sqlite)

+16
Jun 29 2018-11-11T00:
source share

the first thing to understand is that table and column names cannot be escaped in the same sense as you can avoid rows stored as database values.

The reason is that you either have to:

  • accept / reject potential table / column name, i.e. it is not guaranteed that a row is a valid column / table name, unlike a row that should be stored in some database; or,
  • misinform a string that will have the same effect as creating a digest: the surjective function used , not bijective (again, the inverse is true for the string that should be stored in some database); so you not only cannot be sure to switch from the sanitized name to the original name, but you run the risk of inadvertently trying to create two columns or tables with the same name.

Having understood that the second thing you need to understand is that how you end up “escaping” the table / column names depends on your specific context, and therefore there is more than one way to do this, but whatever in a way, you will need to dig to determine exactly what is or is not a valid column / table name in sqlite.

To get started, here is one condition:

Table names starting with "sqlite_" are reserved for internal use. Error creating table with name starting with "sqlite _".

Moreover, using certain column names can have unintended side effects:

Each row of each SQLite table has a 64-bit signed integer key that uniquely identifies the row in its table. This is an integer commonly called a "rowid". The rowid value can be obtained using one of the special case-independent names "rowid", "oid" or "rowid" in the place of the column name. If the table contains a user-defined column named "rowid", "oid" or "rowid", then that name always refers to an explicitly declared column and cannot be used to retrieve an integer rowid value.

Both texts quoted are from http://www.sqlite.org/lang_createtable.html

+7
Jul 13 '11 at 23:24
source share

From sqlite faq, question 24 (the wording of the question, of course, does not make it clear that the answer may be useful for your question):

SQL uses double quotes around identifiers (column or table names) that contain special characters or keywords. Thus, double quotes are a way to escape identifier names.

If the name itself contains double quotes, escape that double quote with another.

+6
Jul 13 2018-11-11T00:
source share

Placeholders are for values ​​only. Column and table names are structural and similar to variable names; You cannot use placeholders to fill them.

You have three options:

  • Accordingly, avoid / quote the column name wherever you use it. It is fragile and dangerous.
  • Use an ORM like SQLAlchemy , which will take care of speeding / citing for you.
  • Ideally, there are simply no dynamic column names. Tables and columns are for structure; all dynamic data should be in the table, not in it.
+5
Jun 29 2018-11-11T00:
source share

Use a function specifically defined for this:

quote (X)

The quote (X) function returns the text of an SQL literal, which is the value of its argument and is suitable for inclusion in an SQL statement. Lines are surrounded by single quotes, escaped with inner quotes as needed. BLOBs are encoded as hexadecimal literals. Strings with embedded NUL characters cannot be represented as string literals in SQL, and therefore the returned string literal is truncated to the first NUL.

Source: https://www.sqlite.org/draft/lang_corefunc.html#quote

Usage example:

 db = sqlite3.connect('foo.sqlite3') cur = db.cursor() escaped_symbol = cur.execute('SELECT quote(?);', [unescaped_symbol]).fetchone()[0] 
0
Jul 03 '19 at 22:29
source share

If you find that you need a variable name (either relvar or a field), you are probably doing something wrong. an alternative template would be to use a property map, for example:

 CREATE TABLE foo_properties( id INTEGER NOT NULL, name VARCHAR NOT NULL, value VARCHAR, PRIMARY KEY(id, name) ); 

Then you simply provide the name dynamically when creating the insert instead of the column.

-one
Jun 29 2018-11-11T00:
source share

Starting with psycopg2 version 2.7 (released in February 2017), column names and table names (identifiers) can be created on-the-fly in a secure way using psycopg2.sql . Here is a link to the documentation with examples: http://initd.org/psycopg/docs/sql.html .

So the way to write a query in your question would be as follows:

 import sqlite3 from psycopg2 import sql with sqlite3.connect(":memory:") as connection: query = sql.SQL("CREATE TABLE {}").format("bar") connection.execute(query) 
-one
Mar 21 '17 at 15:39
source share



All Articles