Is It Worth Using ChatGPT to Help Secure Source Code?

To cut straight to the point, I’m not going to say we need to rely on AI tools to help us write truly secure code. This isn’t to say we can’t use tools like ChatGPT to help secure source code, but given how the underlying LLMs are trained, there’s only so much they can provide. In short, don’t shortchange security analysts who are, by nature, trained in this very thing.

That said, I’ve been using tools such as ChatGPT and other AI developer tools to help make recommendations on making code more secure. I’ll give a basic example of what I provided, what it produced, and my thoughts on it.

Note this is but a single, isolated, and simple case so there’s only so much that it can recommend. The bottom line, for me, is not so much the code it recommended but how it offered to change what I’d originally written (and then researching why it opted to use certain function calls over others).

The Intent of the Code

In the function I’m going to share, the purpose is to look at data in the REQUEST_URI of the $_SERVER array, separate the components of the URI, and then analyze them to determine the best course of action to take.

Namely, do we break it apart and redirect to another page, or do we duck out of the functionality because it’s malformed or doesn’t fit with the scheme of the rewrite rule I have in place?

The Code

The Initial Function

I wrote the initial function like this:

add_action('template_redirect', function() {
  $requestUri = $_SERVER['REQUEST_URI'];
  $requestUriParts = array_filter(explode('/', $requestUri));

  if (
    count($requestUriParts) === 2 &&
    $requestUriParts[0] === 'report' &&
    is_numeric($requestUriParts[1])) {
        $userId = intval($requestUriParts[1]);
        include plugin_dir_path(__FILE__) . 'templates/report.php';
        exit;
    }
});

Granted, I made some concessions to keep it relatively simple. That is, there is no exception handling and there are no conditionals to handle the case when indexes of the array are invalid.

But part of this is to see just how much AI can improve the code above. The key things to take away, despite it not handling situations outside of the conditional, are:

I’m verifying the array has the proper number of parts,
The first index of the array is report,
The second index of the array is an integer,
And if all of the above is true, then we’ll redirect to a custom template.

Again, there are some immediate issues that you can spot with this (such as the case of strings, the range of integers for user IDs, and whether the file even exists).

But I digress. This is what I fed into ChatGPT to help secure source code.

The Secured Code

Given the code above, this is what ChatGPT produced:

add_action('template_redirect', function () {
    $requestUri = filter_var($_SERVER['REQUEST_URI'], FILTER_SANITIZE_URL);
    $requestUriParts = array_values(array_filter(explode('/', $requestUri)));

    if (
        count($requestUriParts) === 2 &&
        strtolower($requestUriParts[0]) === 'report' &&
        ctype_digit($requestUriParts[1])
    ) {
        $userId = intval($requestUriParts[1]);

        // Validate $userId and ensure it's within an appropriate range
        if ($userId > 0 && $userId <= 1000000) { // Adjust the upper limit as needed
            $templatePath = plugin_dir_path(__FILE__) . 'templates/report.php';

            // Check if the template file exists before including it
            if (file_exists($templatePath)) {
                include $templatePath;
                exit;
            } else {
                // Handle the case where the template file is missing
                die('Report template not found.');
            }
        } else {
            // Handle invalid user IDs
            die('Invalid user ID.');
        }
    } else {
        // Handle invalid URLs
        die('Invalid URL.');
    }
});

Some of the things that immediately jump out at me in terms of improving the code are:

FILTER_SANITIZE_URL
strtolower on the report component of the URL
ctype_digit rather than is_numeric
Verifying a proper range of user IDs
Verifying the template file exists
Handling each case where it could fail
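
One change that isn’t in the list above, and that I nearly missed, is the array_values wrapper around array_filter. As far as I can tell it matters more than it looks: array_filter keeps the original array keys, and because a request path starts with a slash, the first element produced by explode is an empty string that gets filtered out. A minimal sketch, using a hard-coded sample path rather than the real $_SERVER value:

```php
<?php
// A request path such as "/report/42" starts with a slash, so explode()
// yields an empty string as its first element.
$parts = explode('/', '/report/42');     // ['', 'report', '42']

// array_filter() drops the empty string but keeps the original keys.
$filtered = array_filter($parts);        // [1 => 'report', 2 => '42']

// array_values() renumbers the keys starting from zero.
$reindexed = array_values($filtered);    // [0 => 'report', 1 => '42']

var_dump(isset($filtered[0]));   // bool(false)
var_dump($reindexed[0]);         // string(6) "report"
```

In other words, without the re-indexing there is no index 0 to compare against 'report' for a path like this one.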

Now in terms of security, I don’t know where this would fall given that it’s not writing or reading data so much as sanitizing and validating it before redirecting a user to a page that should exist.

But I did like the steps that it took as they are things that we should be implementing naturally as engineers. Namely, sanitizing the URL, verifying files exist, and making sure ID ranges are acceptable.
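
There is one thing I’d adjust before using the generated version, though. Because template_redirect fires on every front-end request, the final else branch would call die('Invalid URL.') on pages that have nothing to do with reports. Here’s a minimal sketch of how I might pull the validation out of the hook so that anything that isn’t a report URL is simply left alone. The helper name parse_report_user_id and its $maxUserId parameter are my own placeholders, not something ChatGPT produced or that WordPress provides, and the upper limit is the same arbitrary one used above.

```php
<?php
/**
 * Hypothetical helper (not from ChatGPT's output and not a WordPress API):
 * returns the user ID for paths shaped like /report/<digits>, or null for
 * anything else so the caller can leave the request untouched.
 */
function parse_report_user_id(string $requestUri, int $maxUserId = 1000000): ?int
{
    // Same sanitizing and splitting steps as the generated code.
    $sanitized = (string) filter_var($requestUri, FILTER_SANITIZE_URL);
    $parts = array_values(array_filter(explode('/', $sanitized)));

    if (
        count($parts) !== 2 ||
        strtolower($parts[0]) !== 'report' ||
        !ctype_digit($parts[1])
    ) {
        return null; // Not a report URL.
    }

    $userId = (int) $parts[1];

    // Out-of-range IDs are treated the same as a non-matching URL.
    return ($userId > 0 && $userId <= $maxUserId) ? $userId : null;
}

var_dump(parse_report_user_id('/report/42'));   // int(42)
var_dump(parse_report_user_id('/about/'));      // NULL
var_dump(parse_report_user_id('/report/-5'));   // NULL
```

Inside the template_redirect callback, the idea would be to return early when this gives back null and only include the template (after the file_exists check) when it gives back an ID.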

This is what struck me as the most interesting part though:

is_numeric. Determines if the given variable is a number or a numeric string. (A PHP string is considered numeric if it can be interpreted as an int or a float.)
ctype_digit. Checks if all of the characters in the provided string, text, are numerical.

Given the definitions above, we can verify that is_numeric('-5') returns true whereas ctype_digit('-5') returns false. Further, is_numeric('5.5') will be true and ctype_digit('5.5') will be false. (I’ve quoted the values because the parts of a URI arrive as strings, and ctype_digit is meant to be given a string.) This is important, especially when you’re working with non-negative whole numbers such as those that represent user IDs in a system such as WordPress.
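
To see the difference side by side, here is a small sketch that runs both functions over a handful of sample strings. The sample values are my own picks for illustration, not anything taken from a real request.

```php
<?php
// URI segments arrive as strings, so the sample inputs are strings too.
$samples = ['42', '-5', '5.5', '1e3', ' 42', ''];

$results = [];
foreach ($samples as $value) {
    $results[$value] = [
        'is_numeric'  => is_numeric($value),
        'ctype_digit' => ctype_digit($value),
    ];

    // Print one line per sample, e.g. '42' is_numeric=true ctype_digit=true
    printf(
        "%-6s is_numeric=%s ctype_digit=%s\n",
        var_export($value, true),
        var_export($results[$value]['is_numeric'], true),
        var_export($results[$value]['ctype_digit'], true)
    );
}
```

Of these, only '42' passes ctype_digit, while is_numeric accepts everything except the empty string, including the negative, the decimal, the exponent form, and the value with a leading space.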

I’m not recommending writing lazy code (like my example code above 🙃), feeding it into an AI system, and letting it do work for you. But if you’ve written something as strong and secure as you believe you can, then feeding that to an AI makes more sense as it can help take you a little further. And if you have a security analyst on your team, don’t hesitate to reach out to them for a code review.

For all the talk of AI replacing humans, we’re not there yet, at least not in this field. But that’s not a discussion I care to have right now. If nothing else, using AI tools such as GitHub Copilot and ChatGPT to help secure source code isn’t a bad idea, but it’s not the best idea and it doesn’t replace someone who’s on your team. AI is going to be truly limited by its contextual knowledge of the environment and constraints of the system.

If anything, perhaps they are code assistants and nothing more.