Constructing Valid SQL WHERE Clauses with Python Variables
Specifying Conditions with Python Variables
Python variables can be used within SQL statements to substitute concrete values in place of placeholders. This parameterization technique allows developers to construct reusable queries where the conditions can be changed by altering the Python code rather than having to modify the SQL statements themselves.
For example, if we wanted to select records from a customers table where the country matched a value defined in Python, we could write:
    country = "Canada"
    sql_query = "SELECT * FROM customers WHERE country = %s"
    cursor.execute(sql_query, (country,))
Here, the %s placeholder will be replaced by the actual string “Canada” when the SQL query is executed. This prevents us from having to embed the value directly in the query string.
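To try this pattern without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It rests on two assumptions that are not in the example above: sqlite3 uses ? as its placeholder where drivers such as psycopg2 use %s, and the customers table and its rows are invented for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Alice", "Canada"), ("Bob", "France"), ("Chloe", "Canada")],
)

def customers_in(country):
    """Return the names of customers in the given country, sorted by name."""
    # sqlite3 uses "?" where psycopg2 and similar drivers use "%s"
    rows = conn.execute(
        "SELECT name FROM customers WHERE country = ? ORDER BY name",
        (country,),
    )
    return [name for (name,) in rows]

canada = customers_in("Canada")
france = customers_in("France")
```

The query text never changes; only the tuple of parameters does, which is what makes the query reusable.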
Using placeholders and parameter substitution
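SQL parameters (placeholders such as %s) allow values to be passed to queries securely. Using parameters helps prevent SQL injection attacks, and the driver converts each Python value to a matching SQL type. When placeholders are used, the DB-API driver takes care of quoting and escaping special characters in the parameter values. The placeholder style depends on the driver: psycopg2 and most MySQL drivers use %s, while sqlite3 uses ?.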
    user_id = request.values['user_id']
    account = request.values['account']
    query = "SELECT * FROM transactions WHERE user_id = %s AND account = %s"
    cursor.execute(query, (user_id, account))
The %s placeholders will be replaced with the quoted/escaped user_id and account values before execution.
Escaping special characters
When using parameters, special characters like quotes and backslashes do not need manual escaping as the DB API library handles quoting. However, if manually inserting values, careful escaping is important:
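    name = "O'Reilly"
    # Wrong: no quotes around the value, so the SQL is invalid
    query = f"SELECT * FROM users WHERE name = {name}"
    # Still wrong: the apostrophe in the name ends the string literal early
    query = f"SELECT * FROM users WHERE name = '{name}'"
    # Manual escaping: double the apostrophe, then quote the value
    escaped = name.replace("'", "''")
    query = f"SELECT * FROM users WHERE name = '{escaped}'"
Wrapping {name} in single quotes is not enough on its own: the apostrophe in O'Reilly would prematurely terminate the SQL string. In standard SQL, an apostrophe inside a string literal is escaped by doubling it. Manual escaping is easy to get wrong, so parameters remain the preferred approach.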
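As a sketch of how a bound parameter handles the apostrophe, the following uses the built-in sqlite3 module. This is an assumption made for runnability: its placeholder is ? rather than %s, and the users table is made up. Interpolating the raw value between quotes produces invalid SQL, while the bound parameter needs no manual handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")  # hypothetical table
conn.execute("INSERT INTO users VALUES (?)", ("O'Reilly",))

name = "O'Reilly"

# Naive interpolation: the apostrophe closes the string literal early,
# so the database rejects the statement
naive_query = f"SELECT name FROM users WHERE name = '{name}'"
try:
    conn.execute(naive_query)
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True

# Bound parameter: the value is passed separately from the SQL text
found = conn.execute(
    "SELECT name FROM users WHERE name = ?", (name,)
).fetchall()
```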
Data types and casting
SQL and Python data types do not always directly map to one another. For instance, a column that stores dates as VARCHAR in SQL could cause trouble when used in Python datetime operations.
    from datetime import date
    today = date(2022, 6, 15)
    # created_on is a VARCHAR column holding values such as "2022-01-01"
    query = "SELECT * FROM orders WHERE created_on > %s"
    cursor.execute(query, (today,))
This may fail or give wrong results because the created_on column holds strings, while today is a date object. An explicit cast is required:
    from datetime import date
    today = date(2022, 6, 15)
    query = "SELECT * FROM orders WHERE CAST(created_on AS DATE) > %s"
    cursor.execute(query, (today,))
Now both sides of the comparison are dates, allowing proper evaluation.
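One way to see why the types matter is to compare date strings directly in Python. This is a small sketch with made-up values: text is compared character by character, so date strings only sort chronologically when they are in ISO YYYY-MM-DD form.

```python
from datetime import date, datetime

# Non-ISO date strings sort incorrectly when compared as text:
# the comparison looks at "9" versus "1" and stops there
as_text = "9/1/2022" > "10/1/2022"

# Parsed into date objects, the comparison is chronological
d1 = datetime.strptime("9/1/2022", "%m/%d/%Y").date()
d2 = datetime.strptime("10/1/2022", "%m/%d/%Y").date()
as_dates = d1 > d2

# ISO-formatted strings convert cleanly with date.fromisoformat
created_on = date.fromisoformat("2022-01-01")
is_before = created_on < date(2022, 6, 15)
```

Here the text comparison claims September 1 comes after October 1, while the date comparison gets the order right.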
Boolean Logic and Complex Expressions
It is common to chain together multiple conditions in WHERE using boolean logic operators like AND and OR to target data more precisely. By combining multiple criteria, you can construct flexible queries to filter out records at varying degrees of specificity.
Combining conditions with AND/OR
AND and OR logical operators allow specifying multiple criteria a row must meet to qualify for selection:
    category = 'Toys'
    min_price = 50
    max_price = 250
    query = "SELECT * FROM products WHERE category = %s AND price BETWEEN %s AND %s"
    cursor.execute(query, (category, min_price, max_price))
Now rows must both match the category and fall within the price range to be selected.
Alternatively, the OR operator can capture records matching one condition or another:
    last_name = "Smith"
    first_name = "John"
    query = "SELECT * FROM customers WHERE last_name = %s OR first_name = %s"
    cursor.execute(query, (last_name, first_name))
Using parentheses for precedence
Adding parentheses around sub-expressions lets you control how logical conditions are grouped:
    category = "Electronics"
    min_price = 100
    coupon_used = True
    query = "SELECT * FROM orders WHERE category = %s AND (price > %s OR coupon_used = %s)"
    cursor.execute(query, (category, min_price, coupon_used))
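Here the parentheses group the price and coupon conditions, so a row must match the category and also satisfy at least one of the two. Without them, AND binds more tightly than OR, and the clause would be read as (category = %s AND price > %s) OR coupon_used = %s.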
Parentheses help remove ambiguity in complex logic.
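The effect of grouping can be checked with a small experiment. The sketch below uses the built-in sqlite3 module, which is an assumption made for runnability: its placeholder is ? instead of %s, and the three sample orders are invented. The same parameters are run through the clause with and without parentheses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, category TEXT, price REAL, coupon_used INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Electronics", 150, 0),  # right category, price above the minimum
        (2, "Electronics", 50, 1),   # right category, coupon used
        (3, "Toys", 20, 1),          # wrong category, coupon used
    ],
)
# coupon_used is stored as 0/1 because SQLite has no separate boolean type
params = ("Electronics", 100, 1)

grouped = conn.execute(
    "SELECT id FROM orders "
    "WHERE category = ? AND (price > ? OR coupon_used = ?) ORDER BY id",
    params,
).fetchall()

# Without parentheses, AND is applied before OR
ungrouped = conn.execute(
    "SELECT id FROM orders "
    "WHERE category = ? AND price > ? OR coupon_used = ? ORDER BY id",
    params,
).fetchall()
```

The ungrouped version also returns the Toys order, because the coupon condition alone is enough to satisfy the OR.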
Working with NULL values
Comparing anything with NULL using operators like = or > will result in NULL rather than True/False. Special handling of NULLs is required:
    last_purchase = None  # Could come from outer join
    query = (
        "SELECT name, email FROM customers "
        "LEFT JOIN orders ON customers.id = orders.customer_id "
        "WHERE last_purchase IS NULL"
    )
    cursor.execute(query)
IS NULL tests for NULL directly, whereas an equality or comparison operator applied to NULL never evaluates to true.
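When the value comes from a Python variable that may be None, the condition itself has to change, not just the parameter. Here is a minimal sketch using sqlite3 (so the placeholder is ? rather than %s). The helper null_safe_condition and the sample table are hypothetical names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, last_purchase TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Alice", "2022-03-01"), ("Bob", None)],
)

def null_safe_condition(column, value):
    """Return (sql_fragment, params) for an equality test that handles None.

    `column` must be a trusted identifier written by the developer,
    never user input, because it is placed directly in the SQL text.
    """
    if value is None:
        return f"{column} IS NULL", ()
    return f"{column} = ?", (value,)

# Binding None to "=" compares against NULL and matches no rows
equals_null = conn.execute(
    "SELECT name FROM customers WHERE last_purchase = ?", (None,)
).fetchall()

fragment, params = null_safe_condition("last_purchase", None)
is_null = conn.execute(
    f"SELECT name FROM customers WHERE {fragment}", params
).fetchall()
```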
Validation and Error Handling
Carefully constructing queries and handling errors avoids issues down the line with retrieving data or inserting records in an unstable state.
Catching errors from malformed queries
If a parameterized SQL query fails, it will raise an exception that can be handled in Python:
    import psycopg2

    user_id = "not-an-integer"
    try:
        query = "SELECT * FROM users WHERE id = %s"
        cursor.execute(query, (user_id,))
    except psycopg2.DataError:
        print(f"Invalid user ID value: {user_id}")
Catching database exceptions where the query runs lets the application report a clear message instead of failing later with a cryptic database error.
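The same idea can be sketched without a PostgreSQL server by using the built-in sqlite3 module, whose exceptions follow the same DB-API hierarchy. The wrapper run_query is a hypothetical helper, and the malformed statement is contrived to trigger an error. SQLite is loosely typed, so this sketch demonstrates a syntax error rather than a bad integer value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "Ada"))

def run_query(sql, params=()):
    """Run a query; return (rows, None) on success or (None, message) on failure."""
    try:
        return conn.execute(sql, params).fetchall(), None
    except sqlite3.Error as exc:  # base class for sqlite3's DB-API exceptions
        return None, str(exc)

ok_rows, ok_error = run_query("SELECT name FROM users WHERE id = ?", (1,))

# Malformed on purpose: the WHERE clause ends with a dangling AND
bad_rows, bad_error = run_query("SELECT name FROM users WHERE id = ? AND", (1,))
```

Returning the message alongside the rows is one design choice among several; re-raising a domain-specific exception would work equally well.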
Parameterizing queries to prevent injection
Using parameters prevents SQL injection attacks:
    user_input = request.values['username']
    # Vulnerable:
    query = f"SELECT * FROM users WHERE name = '{user_input}'"
    # Parameterized:
    query = "SELECT * FROM users WHERE name = %s"
    cursor.execute(query, (user_input,))
If user_input contained malicious SQL code, the parameterized version would safely escape it rather than allow injection.
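To make the difference concrete, here is a small demonstration with the built-in sqlite3 module (an assumption for runnability; note the ? placeholder in place of %s). The table contents and the malicious input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Input crafted to turn the WHERE clause into a condition that is always true
user_input = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{user_input}'"
).fetchall()

# Parameterized: the input is treated purely as a value to compare against
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()
```

The vulnerable query returns every row in the table, while the parameterized one returns nothing because no user has that literal name.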
Testing and debugging
Test SQL queries with simple inputs first, then try edge cases and different data types. Print out queries or use logging to inspect issues:
    start_date = request.values['start']
    end_date = request.values['end']
    # For inspection only; never execute a query built this way
    print(f"SELECT * FROM events WHERE event_date BETWEEN {start_date} AND {end_date}")
    # Logs parameter issues:
    logger.debug("Query parameters: %s, %s", start_date, end_date)
Testing thoroughly and logging values helps catch errors before they reach production.
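Many type-related bugs show up faster when the log records each parameter's type as well as its value. The helper below is a hypothetical sketch, not part of any library: it formats the SQL template and its parameters into one debug line without ever merging the values into the SQL.

```python
import logging
from datetime import date

logger = logging.getLogger("query_debug")  # hypothetical logger name

def describe_query(sql, params):
    """Return a debug string showing the SQL template and each parameter's type."""
    kinds = ", ".join(f"{p!r} ({type(p).__name__})" for p in params)
    return f"{sql} -- params: {kinds}"

message = describe_query(
    "SELECT * FROM events WHERE event_date BETWEEN %s AND %s",
    ("2022-01-01", date(2022, 6, 15)),
)
logger.debug(message)
```

In this call the first parameter is a str and the second a date, a mismatch that is easy to miss when only the values are printed.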
Example Code Snippets
The following Python and SQL snippets show typical patterns for querying with variables in WHERE clauses.
Basic variable substitution
    # Python
    user_id = 4
    # SQL
    query = "SELECT * FROM users WHERE user_id = %s"
    cursor.execute(query, (user_id,))
Building a complex WHERE clause
    # Python
    min_age = 13
    max_age = 18
    paid = True
    # SQL
    query = "SELECT * FROM users WHERE age BETWEEN %s AND %s AND paid = %s"
    cursor.execute(query, (min_age, max_age, paid))
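When some filters are optional, the WHERE clause can be assembled from a list of conditions while the values stay in a separate parameter list. The function build_where below is a hypothetical helper sketched for illustration. It assumes the condition templates are written by the developer and never come from user input.

```python
def build_where(filters):
    """Build a WHERE clause and parameter tuple from a dict of optional filters.

    Keys are SQL condition templates written by the developer (never user
    input); values are the parameters to bind. Entries whose value is None
    are skipped, so a missing filter simply drops out of the clause.
    """
    conditions, params = [], []
    for template, value in filters.items():
        if value is None:
            continue
        conditions.append(template)
        # A tuple or list supplies several placeholders, e.g. for BETWEEN
        params.extend(value if isinstance(value, (list, tuple)) else [value])
    clause = (" WHERE " + " AND ".join(conditions)) if conditions else ""
    return clause, tuple(params)

clause, params = build_where({
    "age BETWEEN %s AND %s": (13, 18),
    "paid = %s": True,
    "country = %s": None,  # filter not supplied, so it is omitted
})
query = "SELECT * FROM users" + clause
```

The resulting query and params can then be passed to cursor.execute(query, params) as in the snippet above.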
Handling issues with data types
    # Python
    from datetime import date
    joined_after = date(2020, 12, 1)
    # SQL
    query = "SELECT * FROM users WHERE CAST(joined_as AS DATE) > %s"
    cursor.execute(query, (joined_after,))