Preventing SQL Injection Attacks in ArcPy Geoprocessing Scripts

The Dangers of SQL Injection

SQL injection attacks allow malicious users to execute arbitrary SQL commands against the databases an application accesses. This can lead to data loss, corruption, unauthorized access to sensitive information, and other harmful outcomes. Geoprocessing scripts in ArcGIS that query or update enterprise geodatabases using SQL statements can be vulnerable to SQL injection if user input is not properly validated and sanitized.

For example, consider an arcpy script that builds a SQL query based on user-supplied input parameters to select features from a feature class table. If input validation does not occur, a user could supply specially crafted text containing additional SQL statements that get incorporated into the SQL query that is executed, potentially granting the user unauthorized privileges or access to confidential data.

Anatomy of an SQL Injection Attack

A typical SQL injection attack works by appending extra SQL syntax to user-supplied input parameters. For example, consider the following SQL query that is dynamically built in a script:

query = "SELECT * FROM Users WHERE Name = '" + user_input + "'"

If the user_input parameter is not properly validated or sanitized, a malicious user could supply the following text as the input value:

' OR 1=1--

This would cause the final dynamic SQL query to be:

SELECT * FROM Users WHERE Name = '' OR 1=1--'

The additional OR 1=1 condition added by the attacker causes the WHERE clause to always evaluate to true, allowing the attacker to bypass authentication and retrieve all user records without authorization.
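The attack above can be reproduced safely with Python's built-in sqlite3 module. The following is a minimal sketch, not arcpy code: the Users table, its rows, and the function names are made up for illustration. It contrasts the concatenated query with one that passes the value as a bound parameter, which is an option when a script talks to a database through a DB-API driver; arcpy where clauses are generally plain strings, so this contrast is illustrative only.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a small, made-up Users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (Name TEXT, Role TEXT)")
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "editor"), ("carol", "viewer")],
    )
    return conn

def find_user_unsafe(conn, user_input):
    # Vulnerable: the input is concatenated straight into the SQL text.
    query = "SELECT * FROM Users WHERE Name = '" + user_input + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, user_input):
    # Safer: the value is passed as a bound parameter and never parsed as SQL.
    return conn.execute(
        "SELECT * FROM Users WHERE Name = ?", (user_input,)
    ).fetchall()

conn = build_demo_db()
payload = "' OR 1=1--"
unsafe_rows = find_user_unsafe(conn, payload)
safe_rows = find_user_safe(conn, payload)
print(len(unsafe_rows), len(safe_rows))  # prints "3 0"
```

The concatenated version returns every row in the table, while the bound-parameter version treats the payload as an ordinary name and matches nothing.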

Impacts of SQL Injection

Successful SQL injection attacks can have many devastating impacts:

Bypass authentication and authorization to access non-public data
Modify or delete confidential or sensitive information
Execute administration operations like shutting down the DBMS
Issue requests to other back-end databases and applications on the network
Install malware or backdoors to establish persistent unauthorized access

It is crucial that geoprocessing scripts properly validate and sanitize all externally supplied parameters that get incorporated into SQL queries to protect against SQL injection vulnerabilities.

Validating User Input in arcpy

An effective way to prevent SQL injection attacks in arcpy scripts is to validate and sanitize all externally supplied input before dynamically building SQL queries. The arcpy site package contains functions that assist with this:

The Validate Functions

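The arcpy.ValidateTableName and arcpy.ValidateFieldName functions should be used to check user input that will be used directly as SQL identifiers. Each takes a name and an optional workspace and returns a valid version of the name, with invalid characters replaced by underscores. Comparing the result with the original input shows whether the input was acceptable:

import arcpy

table_name = get_user_supplied_table_name()

# Validate input: the returned name differs if any character was invalid.
# With no workspace argument, the current workspace is used.
if arcpy.ValidateTableName(table_name) != table_name:
    raise ValueError("Invalid table name provided")

# Proceed to build query using table name
query = f"SELECT * FROM {table_name}"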
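Identifier checks can also be written without arcpy, which makes them easy to unit test. The sketch below is one possible policy, not an arcpy rule: it assumes identifiers start with a letter, contain only letters, digits, and underscores, and are at most 64 characters long. The function names and the list of known tables are hypothetical.

```python
import re

# Assumed policy: a letter first, then letters, digits, or underscores (max 64).
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,63}")

def is_safe_identifier(name):
    """Return True only if name fits the assumed identifier pattern."""
    return isinstance(name, str) and _IDENTIFIER.fullmatch(name) is not None

def pick_table(name, known_tables):
    """Return name if it is well formed and present in known_tables."""
    if not is_safe_identifier(name) or name not in known_tables:
        raise ValueError("Invalid table name provided")
    return name

known = {"Parcels", "Roads"}  # hypothetical list of tables the script may query
print(pick_table("Parcels", known))
```

Checking membership in a known set is the stronger of the two tests, since a syntactically valid name can still refer to a table the script was never meant to expose.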

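Allow-Listing Field Names

arcpy does not provide a general-purpose SQL sanitizing function; there is no arcpy.da.ParseSQL. When user input names a field, check it against the fields that actually exist in the table, then let arcpy.AddFieldDelimiters add the delimiters the data source expects:

import arcpy

table = "Users"  # table name or path used throughout this example
field_name = get_user_supplied_field_name()

# Accept only names that exist in the table
valid_fields = {f.name for f in arcpy.ListFields(table)}
if field_name not in valid_fields:
    raise ValueError("Unknown field name provided")

# Add the delimiters required by the underlying data source
delimited_field = arcpy.AddFieldDelimiters(table, field_name)
query = f"SELECT {delimited_field} FROM Users"

Checking against a list of known names rejects injected syntax outright instead of trying to remove or escape it. String values placed in a WHERE clause still need their single quotes doubled, and numeric values should be converted with int() or float() before use.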
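Values placed inside a query, as opposed to identifiers, need their own handling. The following is a minimal sketch of escaping by hand, with hypothetical helper names. It assumes the standard SQL convention of doubling single quotes inside string literals; some databases also give backslashes special meaning, so treat it as a starting point rather than a complete defense.

```python
import math

def quote_sql_string(value):
    """Wrap value in single quotes, doubling any embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"

def sql_number(value):
    """Coerce value to a finite number, raising ValueError for anything else."""
    number = float(value)  # raises ValueError for non-numeric text
    if not math.isfinite(number):
        raise ValueError("Not a finite number")
    return int(number) if number.is_integer() else number

def name_equals_clause(field, user_value):
    # field is assumed to have been validated as an identifier already
    return f"{field} = {quote_sql_string(user_value)}"

print(name_equals_clause("Name", "' OR 1=1--"))
```

With the quote doubled, the payload stays inside the string literal and is compared as text instead of being parsed as an extra condition.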

Using Parameter Objects Securely

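In addition to proper input validation and sanitization, it is recommended to define tool inputs as arcpy parameter objects rather than reading free-form strings. Parameter objects do not escape values for SQL, but they give every input a declared name, data type, and optional filter, which the geoprocessing framework checks before the script runs. Instead of:

table_name = get_user_supplied_table_name()
query = f"SELECT * FROM {table_name}"

You should use:

table_param = arcpy.Parameter(
    displayName="Input Table",
    name="in_table",
    datatype="GPString",
    parameterType="Required",
    direction="Input")

# The framework fills in the value when the tool runs.
# It is still user input, so validate it before building the query.
table_name = table_param.valueAsText
if arcpy.ValidateTableName(table_name) != table_name:
    raise ValueError("Invalid table name provided")

query = f"SELECT * FROM {table_name}"

The parameter object does not sanitize its value. It returns the text the user entered, so the value must still be validated before it is integrated into the SQL query.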

Benefits of Parameter Objects

Using parameter objects instead of variables for user input provides the following security advantages:

Type checking and type conversion by the geoprocessing framework
Enforcement of value constraints, like drop-down lists
Field parameters that can be limited to the fields of a chosen input table
Easier to identify inputs when auditing

Parameter objects do not escape values for use in SQL, so they complement validation rather than replace it. Used together, the two greatly reduce the possibility of missed validation and SQL injection.

Example Script with Parameter Objects

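Here is an example Python toolbox tool that uses parameter objects together with validation to protect against SQL injection attacks. The class would be listed in the tools attribute of a Toolbox class in a .pyt file:

import arcpy

class ExportNonNull(object):
    """Copy the features whose chosen field is not NULL."""

    def __init__(self):
        self.label = "Export Non-Null Features"
        self.description = "Copies features whose selected field is not NULL."

    def getParameterInfo(self):
        # Define input parameters
        table_param = arcpy.Parameter(
            displayName="Input Features",
            name="in_features",
            datatype="GPFeatureLayer",
            parameterType="Required",
            direction="Input")

        field_param = arcpy.Parameter(
            displayName="Field Name",
            name="field_name",
            datatype="Field",
            parameterType="Required",
            direction="Input")
        # Offer only the fields of the chosen input
        field_param.parameterDependencies = [table_param.name]

        output_fc = arcpy.Parameter(
            displayName="Output FC",
            name="out_fc",
            datatype="DEFeatureClass",
            parameterType="Required",
            direction="Output")

        return [table_param, field_param, output_fc]

    def execute(self, parameters, messages):
        in_features = parameters[0].valueAsText
        field_name = parameters[1].valueAsText
        out_fc = parameters[2].valueAsText

        # Re-check the field against the input's actual fields before use
        if field_name not in {f.name for f in arcpy.ListFields(in_features)}:
            raise ValueError("Unknown field name provided")

        # Build the where clause from the validated, delimited field name
        field = arcpy.AddFieldDelimiters(in_features, field_name)
        where_clause = f"{field} IS NOT NULL"

        # Execute query and write output
        arcpy.analysis.Select(in_features, out_fc, where_clause)

Key things to note:

User inputs are defined using parameter objects
Parameter data types let the framework reject values of the wrong type
The Field parameter depends on the input, so only its fields are offered
The field name is checked against the input's actual fields and delimited before use

This helps prevent raw user input from being embedded into SQL queries.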

Additional Steps to Secure Scripts

Properly validating input and using parameters are the best defenses against SQL injection. Additional recommendations for securing arcpy geoprocessing scripts include:

Least Privilege Connections

When connecting to enterprise geodatabases, use the principle of least privilege and connect using an account with minimum permissions required rather than highly privileged administrator accounts.

Input Filtering

Consider narrowing inputs to specific domains where possible using validation code or parameters with restricted choice lists.
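One way to narrow inputs is to let the user pick from named choices and map each choice to a query fragment written by the script author, so no user text reaches the SQL at all. The sketch below assumes a hypothetical ZONE_TYPE field and made-up choice names.

```python
# Hypothetical choices; the field name and values are placeholders.
ALLOWED_FILTERS = {
    "residential": "ZONE_TYPE = 'RES'",
    "commercial": "ZONE_TYPE = 'COM'",
    "all": "1 = 1",
}

def where_clause_for(choice):
    """Map a user choice to a predefined clause and reject anything else."""
    try:
        return ALLOWED_FILTERS[choice.strip().lower()]
    except (KeyError, AttributeError):
        raise ValueError(f"Unsupported filter choice: {choice!r}") from None

print(where_clause_for(" Commercial "))  # prints "ZONE_TYPE = 'COM'"
```

Because the returned clause always comes from the dictionary, an attacker can at most select one of the predefined filters.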

Automated Testing

Create automated unit tests that methodically submit invalid input values to try breaking validation routines and identify gaps.
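Such tests can be written with Python's built-in unittest module. The sketch below tests a stand-in validator, is_safe_identifier, which is an assumption for illustration; in a real project the tests would import the script's own validation routine instead. The payload list is a small sample, not an exhaustive set.

```python
import re
import unittest

def is_safe_identifier(name):
    """Stand-in validator under test (assumed policy: letters, digits, underscores)."""
    return (isinstance(name, str)
            and re.fullmatch(r"[A-Za-z][A-Za-z0-9_]{0,63}", name) is not None)

INJECTION_PAYLOADS = [
    "' OR 1=1--",
    "Users; DROP TABLE Users",
    "name' UNION SELECT password FROM Users--",
    "",
    " ",
    "a" * 1000,
]

class IdentifierValidationTests(unittest.TestCase):
    def test_rejects_injection_payloads(self):
        for payload in INJECTION_PAYLOADS:
            with self.subTest(payload=payload):
                self.assertFalse(is_safe_identifier(payload))

    def test_accepts_ordinary_names(self):
        for name in ("Parcels", "road_centerlines", "Zone2"):
            with self.subTest(name=name):
                self.assertTrue(is_safe_identifier(name))

def run_tests():
    """Run the suite programmatically and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(IdentifierValidationTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() from a build script, or running the file through a test runner, makes the check repeatable whenever the validation code changes.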

Static Analysis

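Use a Python code scanning tool, such as Bandit, which flags SQL statements built from strings, to detect potential injection flaws prior to deployment. ArcGIS Pro does not ship a static analyzer of its own, so the scanner has to be run separately.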

Logging & Auditing

Extensive logging related to security events provides monitoring visibility and supports future audits and forensic analysis if an incident were to occur.
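A small helper built on Python's logging module can record rejected input in a consistent form. This is a sketch under assumptions: the logger name and the 80-character cap are arbitrary choices, and handlers are expected to be configured elsewhere in the script.

```python
import logging

security_log = logging.getLogger("gp_script.security")  # hypothetical logger name

def log_rejected_input(parameter_name, raw_value, reason):
    """Record a rejected value without writing unbounded attacker text to the log."""
    # repr() escapes newlines and control characters; the slice caps the length.
    preview = repr(raw_value)[:80]
    security_log.warning(
        "Rejected input for %s: %s (reason: %s)", parameter_name, preview, reason
    )
    return preview

log_rejected_input("in_table", "' OR 1=1--", "failed identifier check")
```

Escaping and truncating the value keeps an attacker from forging extra log lines or flooding the log with a very long input.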

Obfuscating Sensitive Information

In addition to securing code against injection attacks, sensitive credentials, connection strings, and other confidential configuration items embedded in arcpy scripts should be protected through obfuscation techniques.

Encryption

Symmetric encryption algorithms like AES can be used to encrypt confidential strings that can only be decrypted at runtime by scripts holding the secret key.

External References

Store credentials and connection strings outside the script, in a secure vault, a protected configuration file, or environment variables, and have the script read them at run time.
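Reading credentials from environment variables needs only the standard library. In this sketch the variable names GDB_USER and GDB_PASSWORD are hypothetical; the function accepts any mapping so it can be tested without touching the real environment.

```python
import os

def load_db_credentials(env=os.environ):
    """Read credentials from environment variables (names are hypothetical)."""
    try:
        return {
            "user": env["GDB_USER"],
            "password": env["GDB_PASSWORD"],
        }
    except KeyError as missing:
        # Report which setting is absent without echoing any secret values.
        raise RuntimeError(f"Missing required setting: {missing.args[0]}") from None

creds = load_db_credentials({"GDB_USER": "gis_reader", "GDB_PASSWORD": "example"})
print(creds["user"])  # prints "gis_reader"
```

Failing fast with a clear message when a variable is missing avoids the script falling back to a hard-coded default.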

Access Control

If storing unencrypted sensitive strings directly in code is unavoidable, use filesystem permissions, access control lists, authentication, and physical control measures to prevent unauthorized access.

Source Control

Do not commit unobfuscated credentials or connection strings to source control repositories. Maintain them securely elsewhere.

With multiple overlapping controls, the exposure of sensitive scripting artifacts can be minimized while still supporting automated execution of arcpy processes.

Testing Scripts against Injection Attacks

Comprehensive testing of scripts using exploitation techniques commonly employed in SQL injection assists in verifying defenses prior to operational use. Types of testing include:

Fuzz Testing

Fuzz testing evaluates input validation routines by submitting random, unexpected, or invalid data to parameters to observe how failures are handled.
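A very small fuzzer can be built from the random module. The sketch below is an assumption-laden illustration: it generates random printable strings, calls a validator supplied by the caller, and reports any accepted string that contains characters with special meaning in SQL, as well as any input that makes the validator crash. The strict validator shown is a stand-in for a script's real routine.

```python
import random
import re
import string

def fuzz_validator(validator, rounds=1000, seed=0):
    """Feed random printable strings to validator and collect suspicious results."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    suspicious = []
    for _ in range(rounds):
        length = rng.randint(0, 40)
        candidate = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            accepted = validator(candidate)
        except Exception as exc:  # a crash is also a finding worth recording
            suspicious.append((candidate, repr(exc)))
            continue
        # Flag accepted values containing characters with meaning in SQL.
        if accepted and any(ch in candidate for ch in "'\";-/* \t\n"):
            suspicious.append((candidate, "accepted"))
    return suspicious

def strict(name):
    """Stand-in validator: letters, digits, and underscores only."""
    return re.fullmatch(r"[A-Za-z][A-Za-z0-9_]{0,63}", name) is not None

print(len(fuzz_validator(strict)))
```

An empty result does not prove the validator is correct; it only means this particular random sample found nothing, so fuzzing works best alongside targeted tests.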

Negative Testing

Negative or failure testing focuses on directly attempting common SQL injection attacks like ' OR 1=1-- against parameters to confirm protections work.

Orthogonal Methods

Testing validation routines using different methods (e.g., manually, via automation, or by fuzzing) and from different perspectives improves coverage.

Static Code Analysis

Static code analysis tools scan source code to algorithmically identify potential injection vulnerabilities for remediation.

Verifying through managed testing that scripts can repel SQL injection attacks reduces the risk that vulnerabilities slip into production, and provides another layer of defense for secure coding.