Why and How to Use Prepared Statements
“If it ain’t broke, don’t fix it.”
SQL injection is one of the most common issues facing many PHP developers. You’ve finally got that login form working with your database script and now someone is telling all your hard work is “vulnerable.” It’s enough to make you want to smash a keyboard, but the fix is actually a lot easier than you probably think.
Let’s Learn a Little Hacking!
To better understand the problem, let’s learn how to do it! Let’s say your code looks like this:
<?php
/* login.php
*
* DISCLAIMER:
* The below code and overall approach is intended to mimic how a beginner
* might write a simple login form. There are several big problems with it
* and it should NOT be used to make your own login form.
*/
session_start();
if(isset($_GET["username"]))
{
$db = new mysqli("localhost","lieutenant_dan","n0_legs","myapp");
$username = $_GET["username"];
$password = md5($_GET["password"]);
$query = "SELECT * FROM `users` WHERE `username` = '$username' AND `password` = '$password'";
$result_set = $db->query($query);
if($result_set->num_rows > 0)
{
// Success!
$row = $result_set->fetch_assoc();
$_SESSION["user_id"] = $row["id"];
}
}
<!doctype html>
<html>
<body>
<form action='login.php' method='GET'>
Username: <input type='text' name='username'><br />
Password: <input type='password' name='password'><br />
<input type='submit'>
</form>
</body>
</html>
So you type in your username john and password and you end up going to a page like this:
login.php?username=john&password=abc123
The resulting $query variable becomes:
SELECT * FROM `users` WHERE `username` = ‘john‘ AND `password` = ‘abc123‘
And it works – you get back your user record and you log the person in and all is right in the world.
…
But then hackers had to go and ruin everything. They look at that URL and change it something like this:
login.php?username=john&password=abc123′ OR ”=’
Suddenly, your nice $query variable that was behaving so nicely before now looks like this:
SELECT * FROM `users` WHERE `username` = ‘john‘ AND `password` = ‘abc123 OR ”=’‘
Ruh-roh.
If you look closely, this is actually a valid SQL statement, and that “OR ”=”” piece at the end basically overrides all the criteria that came before it, so suddenly this query is returning ALL users. And in almost every system that uses a database with logins, the administrator account is typically the first one you set up, so it’ll be the first one returned in such a query.
Since your code doesn’t do any additional checking to make sure the returned row matches the username that was attempted, but just grabs the user ID from the first, row, suddenly Hacker Badman is logged into the administrator account without ever knowing the password.
This is SQL injection – the act of some kind of input changing the SQL query to run in some way other than how you intended for it to run.
What’s worse is that you can do a lot of stuff with databases. There is a lot more functionality than the simply querying and inserting of data. Some databases will even let you execute system commands and run applications, so when you are vulnerable to SQL injection, you’re often giving the hacker the ability to run ANY query he/she wants to run.
“But…” you protest, “It’s only this one query! They couldn’t access anything outside the users table!”
It’s good to think positively, but in this case, a hacker can use different tricks to run multiple queries. For example, did you know you can run multiple queries in one command by utilizing a semicolon like this?
$db->query("QUERY #1 HERE; QUERY #2 HERE");
There are even more methods besides the semicolon, but with a carefully-crafted attack, a hacker can suddenly make your code run queries that you never intended to run.
Fix Attempt #1: Escaped Characters
The common way most programmers started to try and fix SQL injection was to escape special characters.
An “escaped” character simply means it has a backslash \ in front of it that tells the database, “Hey, this next character is part of the VALUE and is NOT part of the query itself!” You would commonly use escaped characters if you had values that had apostrophes, like this:
SELECT * FROM `users` WHERE `last_name` = 'O\'Reilly'
So instead of the database treating the ‘ in O’Reilly as if it were the end of the text value, it just sees that ‘ as part of the last name being looked up.
Many of us just used the mysqli_real_escape_string() PHP function on our values to do all the necessary escaping:
$last_name = $db->real_escape_string("O'Reilly");
$query = "SELECT * FROM `users` WHERE `last_name` = '$last_name'";
echo $query; // OUTPUT: SELECT * FROM `users` WHERE `last_name` = 'O\'Reilly'
This works MOST of the time, but then hackers again showed up and figured out they could use some special characters and encoding to get around the escaping in some situations, so it was effective as long as the hacker didn’t know ALL the tricks.
But that’s not really something you want to rely on, so then most people switched over to just using prepared statements (if they weren’t already doing so).
Fix Attempt #2: Prepared Statements
The reason that SQL injection works is that the database has to parse out the already-combined query and values to figure out which pieces are part of the query and which pieces are values, so it cannot tell if SQL injection has occurred or if you actually meant to send the query like that.
With prepared statements, you’re splitting this whole process into two different steps:
- You send over the query without the values FIRST (just using placeholders like question marks “?”), which lets the database parse it without interference from any values, and see exactly what the final query SHOULD do.
- You send over the values SECOND, and the database simply plugs them into the right spots.
An example query for step 1 would look like this:
SELECT * FROM `users` WHERE `username` = ? AND `password` = ?
If you told the database to “prepare” that query, it would parse it out just like it’s shown and it would say, “I see two ? question marks, so I’ll expect two values to be provided – the first will be matched against the username field, and the second value will be matched against the password field.
So step 2, you would “bind” two variables to the statement, one for the username and one for the password, and then you could execute that statement.
Let’s look at the code and compare it to the previous way of doing it:
$username = $_GET["username"];
$password = md5($_GET["password"]);
// Old way, vulnerable to SQL injection
$result_set = $db->query("SELECT * FROM `users` WHERE `username` = '$username' AND `password` = '$password'");
// Prepared statement, safe from SQL injection
$statement = $db->prepare("SELECT * FROM `users` WHERE `username` = ? AND `password` = ?");
$statement->execute([$username, $password]);
$result_set = $statement->get_result();
It’s more code, but it’s not THAT much more code. Some people have even created their own functions to simplify those 3 lines down into one line to make it even easier and shorter.
NOTE: The technique of passing in an array of parameters to the execute() method is new to the mysqli extension and is only available in PHP 8.1 or later. If you have an earlier version, then you’ll need to use bind_param() as shown in the following section.
Variations
Sometimes developers prefer to bind their parameters separately and specify data types, like this:
$statement = $db->prepare("SELECT * FROM `users` WHERE `username` = ? AND `password` = ?");
$statement->bind_param('ss', $username, $password);
$statement->execute();
That bind_param method says, “I’m passing in 2 ‘s‘tring variables, $username, and $password.”
The code in this post uses the mysqli extension simply because it tends to be the one that most developers switched to after the mysqli extension was deprecated. However, the PDO library has some features that make prepared statements even easier, such as named parameters instead of using question marks:
$statement = $pdo->prepare("SELECT * FROM `users` WHERE `username` = :un AND `password` = :pw");
$statement->execute(array(':un' => $username, ':pw' => $password));
Named parameters can be useful if you’re binding your variables separately and you have a lot of them.
If you’re starting a new project from scratch, I’d strongly recommend using the PDO library instead of mysqli.
Q: Are there any other benefits to prepared statements?
Absolutely! There are additional performance and bandwidth benefits to prepared statements, especially if you are executing the same query multiple times but with different parameters.
For example, if you are running a bulk update of data and you’re updating 100,000 records, you can turn off the autocommit behavior on the database, set up the prepared statement, then execute it 100,000 times with different variables, and then run a commit at the end, and it will be a LOT faster than 100,000 full UPDATE queries and utilize less bandwidth, too (because you’re only sending the parameter values with each execution and not the entire query).
Q: Are prepared statements a MySQL thing?
No. They got introduced to MySQL in version 4.1, but prepared statements are available on almost any major database, and in some cases, the database class (e.g. PDO) can even emulate prepared statements for you if the underlying database doesn’t directly support them.
Q: Is this just for login forms?
SQL injection can impact more than just login forms. The example in this post is just one example. Another example is many management-type systems rely on passing parameters like IDs:
.../edit_post.php?post_id=123
Any user input that is used in code as part of a query has the potential to be SQL injection material if you don’t use prepared statements.
Q: What if I just use POST on my form instead of GET?
I used the GET form submission method in this post to make it easier to see the values, but it’s very easy to manipulate any data that is being submitted to the web server. A hacker can use the browser’s developer tools (e.g. F12 key on most PC-based major browsers) or a proxy tool like Fiddler to mess with the data and change it to whatever they want. So changing your form to POST or using JavaScript or trying to implement security on the end user’s side is never completely safe.
Leave a Reply