sethserver / MySQL

How to Prevent SQL Injection

I'm really sad that SQL injection still remains one of the most pervasive attacks against web applications. Statistia has a nice breakdown called Distribution of web application critical vulnerabilities worldwide as of 2022. I feel like as a Python developer - or any computer programmer for that matter - it's really important to understand what SQL injection is, how to prevent it, and how to encourage others write more secure code.

What is SQL Injection?

SQL injection is a method that lets attackers execute malicious SQL code against your database in an unexpected way. This can lead to unauthorized access to sensitive data, destruction of data, or worse. (What's worse? I don't know, but I'd rather not find out.) SQL injection vulnerabilities happen when an application uses raw user input to construct SQL queries without properly validating or escaping that input.

SQL Injection Example

Let's look at simple example of SQL injection. We'll use a simple table of users with the following schema. Keep in mind, this is a example of a bad way to do things, please don't do this.

CREATE DATABASE sql_injection_example;

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL
);

INSERT INTO users (username, first_name, last_name) VALUES
    ('sethers', 'seth', 'black'),
    ('davey', 'david', 'black'),
    ('ollie', 'olivia', 'black');

Let's say we want to write a Python function that checks if a user exists in our database. We'll use the mysql.connector library to connect to our MySQL (or better MariaDB) instance.

import mysql.connector

def user_exists(username):
    conn = mysql.connector.connect(
        host="localhost",
        database="sql_injection_example",
        user="username",
        password="averysecurepassword"
    )
    cursor = conn.cursor()
    cursor.execute(f"SELECT first_name, last_name FROM users WHERE username = '{username}'")
    return cursor.fetchone() is not None

Looks good, right? Let's test it out.

>>> user_exists("sethers")

True

>>> user_exists("davey")

True

>>> user_exists("francois")

False

It works! But what happens if we pass in a username with some nefarious SQL code sprinkled in?

>>> user_exists("francois' OR '1' = '1")

True

Whoopsie! Our function returns True. Even though we passed in a username that doesn't exist, it still worked!? This is because our SQL query is now:

SELECT * FROM users WHERE username = 'francois' OR '1' = '1';

Since '1' = '1' is always True, our query returns True no matter what. This is an example of a SQL injection attack. Like I said before, please don't do this. It's bad.

How to Prevent SQL Injection in Python

A great rule to keep in mind when working with any database code is: separate the data from the query. The most practical way do this is by using parameterized queries. Practically all database libriaries allow you to use parameterized queries, where you use placeholders for user input in your SQL statements. Let's update our function from above to use a parameterized query.

import mysql.connector

def user_exists(username):
    conn = mysql.connector.connect(
        host="localhost",
        database="sql_injection_example",
        user="username",
        password="averysecurepassword"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
    return cursor.fetchone() is not None

Notice how we use %s as a placeholder for the username. The second argument to the execute method is a tuple of values that will be substituted for the placeholders in the query. This is a much safer way to write database code. The database library will take care of escaping the user input for us.

Let's test this new function out using the same inputs from above.

>>> user_exists("sethers")

True

>>> user_exists("davey")

True

>>> user_exists("francois")

False

It works the same way it did before. Now, what happens when we pass in that nefarious SQL code?

>>> user_exists("francois' OR '1' = '1")

False

It works! Our function now returns False, as expected. This is because our SQL query is now:

SELECT * FROM users WHERE username = 'francois\' OR \'1\' = \'1';

Notice how the single quotes in the username are escaped. This is what we want. The database library takes care of this for us, and makes our query significantly more secure. Hopefully this helps you write more secure and reliable SQL code.

Good luck out there and happy databasing!

-Sethers