Automated BOLA Detection in REST APIs: A Python Approach
Broken Object Level Authorization (BOLA), also known as Insecure Direct Object Reference (IDOR), remains a prevalent and critical vulnerability in REST APIs. It allows an attacker to bypass authorization checks by simply changing the value of a parameter that directly refers to an object, leading to unauthorized access to other users' data or functionality. Manual detection is often time-consuming and prone to oversight, making an automated scripting approach essential for efficient penetration testing. This document details a Python-based methodology and script for identifying potential BOLA vulnerabilities.
Understanding the BOLA Attack Surface
BOLA typically manifests when an application directly uses user-supplied input to query a database or access a resource without sufficient authorization checks. Imagine a scenario where a user can access their profile via GET /api/v1/users/{user_id}. If an attacker simply changes {user_id} to another user's ID, and the API responds with the other user's data, BOLA is present. This vulnerability isn't limited to GET requests; it can impact PUT, POST, and DELETE operations as well, enabling data modification, creation, or deletion on behalf of other users.
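To make the missing check concrete, here is a minimal, framework-free sketch. The in-memory PROFILES store and both handler functions are hypothetical stand-ins for a real API handler and are not taken from any particular framework; they only illustrate the difference between looking an object up and verifying that the caller owns it.

```python
# Hypothetical in-memory store standing in for a database table.
PROFILES = {
    "101": {"owner": "101", "email": "alice@example.com"},
    "102": {"owner": "102", "email": "bob@example.com"},
}


def get_profile_vulnerable(authenticated_user_id, requested_id):
    """Return whatever object the caller asks for (no ownership check: BOLA)."""
    profile = PROFILES.get(requested_id)
    if profile is None:
        return 404, None
    return 200, profile


def get_profile_checked(authenticated_user_id, requested_id):
    """Same lookup, but verify the caller owns the object before returning it."""
    profile = PROFILES.get(requested_id)
    if profile is None:
        return 404, None
    if profile["owner"] != authenticated_user_id:
        return 403, None
    return 200, profile
```

Calling get_profile_vulnerable("101", "102") hands user 101 the profile of user 102, while get_profile_checked("101", "102") returns a 403. The detection script later in this document looks for the first behavior from the outside.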
Before scripting, you need to identify candidate API endpoints. Internet-wide reconnaissance tools such as Zondex can help by pinpointing exposed services and API documentation, which provide initial targets for testing. Once targets are identified, the next step is to probe these endpoints systematically.
Prerequisites for BOLA Scripting
To effectively automate BOLA detection, several key pieces of information are required:
- Target API Base URL: The root URL for the API endpoints (e.g., https://api.example.com/v1/).
- Authentication Token: A valid authentication token (e.g., JWT, OAuth bearer token) for a standard, non-privileged user. This is crucial because we want to test if *this user* can access *other users'* resources, not if an unauthenticated user can. The more restricted the token, the better for pinpointing BOLA.
- Known Valid Object ID: An ID that the authenticated user legitimately owns or has access to (e.g., their own user ID, an order ID they placed). This serves as a baseline.
- Target Endpoints: The specific API paths that incorporate object IDs in their URL structure (e.g., /users/{id}, /orders/{id}, /documents/{document_uuid}).
- Object ID Type: Understanding whether the IDs are numeric, UUIDs, alphanumeric strings, etc., is vital for effective fuzzing.
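When the ID type is not documented, it can often be guessed from a sample ID. The helper below is a small sketch of that idea; infer_id_type is a hypothetical name, and its return values are chosen to match the id_type strings used in the script's configuration ('numeric', 'uuid', 'alphanumeric').

```python
import uuid


def infer_id_type(sample_id):
    """Guess whether a sample ID is 'numeric', 'uuid', or 'alphanumeric'."""
    # Plain ASCII digits only, e.g. "12345".
    if sample_id.isascii() and sample_id.isdigit():
        return "numeric"
    try:
        # uuid.UUID accepts any 32 hex digits, with or without dashes,
        # so a bare 32-character hex string is also reported as 'uuid'.
        uuid.UUID(sample_id)
        return "uuid"
    except ValueError:
        return "alphanumeric"
```

A guess like this is only a starting point: an ID that looks numeric may still be validated server-side as a string, so it is worth confirming against the API documentation where available.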
Script Architecture: Pentester's Field Notes
Our Python script will follow a modular architecture:
- Configuration: Centralized dictionary for API base URL, headers, and a list of target endpoints, each with its own specific details.
- Session Management: Utilizing requests.Session for persistent headers (like the Authorization token) across multiple requests, improving efficiency.
- ID Fuzzing Logic: Functions to generate variations of the known valid object ID based on its type (numeric, UUID, etc.). This includes sequential increments/decrements, random UUID generation, and character alteration.
- Request Execution: A generic function to send HTTP requests (GET, POST, PUT, DELETE) to the target API.
- Response Analysis: Logic to evaluate the HTTP status code and response body for indicators of unauthorized access or, conversely, unexpected successful access.
- Reporting: Accumulating and presenting potential BOLA findings.
For traffic analysis and chaining requests, especially during larger engagements, routing our script's traffic through a proxy like GProxy allows for easy interception with tools like Burp Suite or OWASP ZAP. This can be invaluable for debugging the script's behavior and manually verifying suspicious findings.
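As a rough sketch of the proxy setup, the helper below builds the proxies mapping that the requests library expects. The function name build_proxy_config is hypothetical, and the default address 127.0.0.1:8080 is an assumption (a common local listener address for intercepting proxies); substitute whatever address your proxy actually listens on.

```python
def build_proxy_config(host="127.0.0.1", port=8080):
    """Build a proxies mapping in the {"scheme": "proxy URL"} form used by requests.

    The default host and port are assumptions; adjust them to your own proxy.
    """
    proxy_url = f"http://{host}:{port}"
    # Route both plain HTTP and HTTPS traffic through the same proxy.
    return {"http": proxy_url, "https": proxy_url}
```

The mapping could then be applied to the script's session with session.proxies.update(build_proxy_config()). Intercepting HTTPS traffic generally also requires trusting the proxy's CA certificate, for example by pointing session.verify at the exported certificate file.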
The Python BOLA Detector Script
Here's a breakdown of the Python script designed for automated BOLA detection. Remember to replace placeholder values like YOUR_JWT_TOKEN_HERE with actual, valid data for your target.
import requests
import json
import uuid
import random
from urllib.parse import urljoin
# --- Configuration Section ---
API_CONFIG = {
    "base_url": "https://api.example.com/v1/",  # IMPORTANT: Update this to your target API base URL
    "headers": {
        "Authorization": "Bearer YOUR_JWT_TOKEN_HERE",  # IMPORTANT: Replace with a valid, low-privileged user's token
        "Content-Type": "application/json",
        "Accept": "application/json"
    },
    "endpoints": [
        {
            "name": "User Profile Access",
            "path": "users/{id}",  # Path with {id} placeholder
            "method": "GET",
            "valid_id": "b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3c",  # A valid UUID accessible by the token user
            "id_type": "uuid",  # 'uuid', 'numeric', 'alphanumeric'
            "unauth_indicators": ["Access Denied", "Forbidden", "not authorized", "permission denied"],  # Case-insensitive phrases
            "expected_status_code": 200,  # Expected status if authorized (for baseline check)
            "id_fuzz_strategy": {
                "uuid_alter_chars": 2  # Number of random characters to alter in a UUID for fuzzing
            }
        },
        {
            "name": "Order Details Access",
            "path": "orders/{id}",
            "method": "GET",
            "valid_id": "12345",  # A valid numeric ID accessible by the token user
            "id_type": "numeric",
            "unauth_indicators": ["Permission denied", "invalid order ID", "unauthorized"],
            "expected_status_code": 200,
            "id_fuzz_strategy": {
                "numeric_delta": [-1, 1, 10, 100]  # Values to add/subtract for numeric fuzzing
            }
        },
        # Add more endpoints here as needed.
        # Example for a PUT request:
        # {
        #     "name": "Update User Settings",
        #     "path": "settings/{id}",
        #     "method": "PUT",
        #     "valid_id": "b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3c",
        #     "id_type": "uuid",
        #     "unauth_indicators": ["Access Denied", "Forbidden"],
        #     "expected_status_code": 200,
        #     "id_fuzz_strategy": {
        #         "uuid_alter_chars": 2
        #     },
        #     "request_body": {"theme": "dark"}  # Example body for PUT/POST
        # }
    ]
}
# --- Helper Functions ---
def generate_fuzzed_id(original_id, id_type, strategy):
    """Generates a list of fuzzed IDs based on the original ID, type, and strategy."""
    fuzzed_ids = []
    # Strategy for numeric IDs
    if id_type == "numeric":
        try:
            num_id = int(original_id)
            for delta in strategy.get("numeric_delta", [-1, 1]):
                fuzzed_ids.append(str(num_id + delta))
            # Add a completely random numeric ID of similar length
            random_num_str = ''.join(random.choices('0123456789', k=len(original_id)))
            if random_num_str != original_id:
                fuzzed_ids.append(random_num_str)
        except ValueError:
            print(f"Warning: Numeric ID expected but got '{original_id}'. Skipping numeric fuzzing.")
    # Strategy for UUIDs
    elif id_type == "uuid":
        # Generate entirely new UUIDs
        fuzzed_ids.append(str(uuid.uuid4()))
        fuzzed_ids.append(str(uuid.uuid4()))
        # Alter characters in the existing UUID
        alter_chars = strategy.get("uuid_alter_chars", 1)
        original_uuid_str = original_id
        for _ in range(2):  # Try altering twice
            altered_uuid = list(original_uuid_str)
            for __ in range(alter_chars):
                idx = random.randint(0, len(altered_uuid) - 1)
                # Leave dashes in place so the result keeps the UUID layout;
                # a skipped position means fewer characters are altered this round.
                if original_uuid_str[idx] == '-':
                    continue
                altered_uuid[idx] = random.choice('0123456789abcdef')
            fuzzed_ids.append("".join(altered_uuid))
    # Generic fuzzing for all ID types
    fuzzed_ids.extend(["non_existent_id", "0", "test_id", "admin", "null", "undefined"])
    # Filter out duplicates and the original ID
    return list(set([id_val for id_val in fuzzed_ids if id_val != original_id]))
def make_request(session, method, url, data=None):
    """Generic function to make HTTP requests using the provided session."""
    try:
        if method.upper() == "GET":
            response = session.get(url, timeout=10)
        elif method.upper() == "POST":
            response = session.post(url, json=data, timeout=10)
        elif method.upper() == "PUT":
            response = session.put(url, json=data, timeout=10)
        elif method.upper() == "DELETE":
            response = session.delete(url, timeout=10)
        else:
            print(f"Unsupported method: {method}")
            return None
        return response
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
# --- Main Detection Logic ---
def detect_bola():
    """Main function to perform BOLA detection across configured endpoints."""
    print("Initiating BOLA detection...")
    print(f"Target Base URL: {API_CONFIG['base_url']}\n")
    session = requests.Session()
    session.headers.update(API_CONFIG['headers'])
    bola_findings = []
    for endpoint_config in API_CONFIG['endpoints']:
        name = endpoint_config['name']
        path_template = endpoint_config['path']
        method = endpoint_config['method']
        valid_id = endpoint_config['valid_id']
        id_type = endpoint_config['id_type']
        unauth_indicators = [ind.lower() for ind in endpoint_config['unauth_indicators']]
        expected_status_code = endpoint_config['expected_status_code']
        id_fuzz_strategy = endpoint_config.get('id_fuzz_strategy', {})
        request_body = endpoint_config.get('request_body', None)  # For PUT/POST
        print(f"[*] Testing Endpoint: {name} ({method} {path_template})")
        # --- Step 1: Verify access to the valid ID (baseline) ---
        valid_url = urljoin(API_CONFIG['base_url'], path_template.format(id=valid_id))
        print(f" [+] Baseline check for valid ID '{valid_id}': {valid_url}")
        resp_valid = make_request(session, method, valid_url, data=request_body)
        # Compare against None explicitly: a requests.Response is falsy for 4xx/5xx statuses,
        # so a plain truthiness test would treat a 403 or 404 as "no response".
        if resp_valid is not None and resp_valid.status_code == expected_status_code:
            print(f" [OK] Baseline request successful (Status: {resp_valid.status_code}).")
        elif resp_valid is not None:
            print(f" [FAIL] Baseline request failed (Status: {resp_valid.status_code}). Check token validity or endpoint configuration.")
            continue
        else:
            print(f" [FAIL] Baseline request for valid ID '{valid_id}' failed. No response.")
            continue
        # --- Step 2: Fuzz with different IDs ---
        fuzzed_ids = generate_fuzzed_id(valid_id, id_type, id_fuzz_strategy)
        print(f" [*] Fuzzing with {len(fuzzed_ids)} generated IDs...")
        for fuzzed_id in fuzzed_ids:
            target_url = urljoin(API_CONFIG['base_url'], path_template.format(id=fuzzed_id))
            print(f" [>] Testing fuzzed ID: '{fuzzed_id}' at {target_url}")
            resp_fuzzed = make_request(session, method, target_url, data=request_body)
            if resp_fuzzed is None:
                print(f" [!] No response for ID '{fuzzed_id}'.")
                continue
            response_body_lower = resp_fuzzed.text.lower() if resp_fuzzed.text else ""
            # Primary BOLA indicator: Unexpected 200 OK for an unauthorized resource
            if resp_fuzzed.status_code == expected_status_code:
                # If we get the expected success status code, but it's not the original ID
                # and the response doesn't contain explicit unauthorized indicators
                if not any(indicator in response_body_lower for indicator in unauth_indicators):
                    description = "Potential BOLA: Unexpected 200 OK for fuzzed ID with no explicit authorization denial. Manual review required to confirm access to unauthorized data."
                    bola_findings.append({
                        "endpoint": name,
                        "fuzzed_id": fuzzed_id,
                        "url": target_url,
                        "status_code": resp_fuzzed.status_code,
                        "description": description,
                        "response_snippet": resp_fuzzed.text[:500]  # Capture a larger snippet
                    })
                    print(f" [!!! BOLA SUSPECT !!!] {description} Response: {resp_fuzzed.text[:150]}...")
            elif resp_fuzzed.status_code in (401, 403):
                print(f" [OK] Access for ID '{fuzzed_id}' explicitly denied (Status: {resp_fuzzed.status_code}).")
            elif resp_fuzzed.status_code == 404:
                # 404 for an existing resource could be an IDOR via enumeration,
                # but for direct access, 404 usually means the ID doesn't exist, which is good.
                # However, if 404 is returned instead of 403 for an *existing* but unauthorized resource,
                # it's still information leakage. This script focuses on *access*, not just leakage.
                print(f" [Info] Resource for ID '{fuzzed_id}' not found (Status: {resp_fuzzed.status_code}).")
            else:
                # Any other unexpected status code warrants attention, but might not be a direct BOLA.
                print(f" [Info] Fuzzed ID '{fuzzed_id}' returned unexpected Status: {resp_fuzzed.status_code}. Response snippet: {resp_fuzzed.text[:150]}...")
    print("\n--- BOLA Detection Complete ---")
    if bola_findings:
        print("\n!!! BOLA Findings Detected (Manual Review Recommended) !!!")
        for i, finding in enumerate(bola_findings):
            print(f"\nFinding {i+1}:")
            print(f" Endpoint: {finding['endpoint']}")
            print(f" Fuzzed ID: {finding['fuzzed_id']}")
            print(f" URL: {finding['url']}")
            print(f" Status Code: {finding['status_code']}")
            print(f" Description: {finding['description']}")
            print(f" Response Snippet: {finding['response_snippet']}")
    else:
        print("\nNo explicit BOLA findings based on current rules. Further manual testing advised.")
if __name__ == "__main__":
    detect_bola()
Execution and Interpretation
To run the script, save it as a .py file (e.g., bola_detector.py) and execute it from your terminal:
python3 bola_detector.py
The script will iterate through each configured endpoint, first performing a baseline check with a known valid ID. This confirms the endpoint is reachable and the authentication token is active. Following this, it will generate a series of fuzzed IDs and attempt to access resources using them.
The core of BOLA detection in this script relies on identifying an "unexpected 200 OK" status code. If the script uses a fuzzed (unauthorized) ID and receives a 200 OK response, particularly if the response body doesn't contain any explicit "access denied" indicators, it flags this as a potential BOLA vulnerability. This means the API likely served data that the authenticated user shouldn't have access to. While platforms like Secably offer comprehensive vulnerability scanning, custom scripts provide granular control for specific, logic-based flaws like BOLA, allowing for highly tailored tests.
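The decision rules described above can be isolated into a small pure function, which makes them easier to reason about and to unit test without sending any traffic. This is a sketch only: classify_response and its verdict labels are hypothetical names, and the default indicator phrases are examples rather than a complete list.

```python
def classify_response(status_code, body, expected_status=200,
                      unauth_indicators=("access denied", "forbidden")):
    """Map a response to a fuzzed ID onto a coarse verdict label."""
    body_lower = (body or "").lower()
    if status_code == expected_status:
        # Success status, but the body itself says access was refused.
        if any(indicator in body_lower for indicator in unauth_indicators):
            return "denied_in_body"
        # Success status with no denial text: flag for manual review.
        return "bola_suspect"
    if status_code in (401, 403):
        return "denied"
    if status_code == 404:
        return "not_found"
    return "unexpected"
```

For example, classify_response(200, '{"id": 2}') yields "bola_suspect", while classify_response(200, '{"error": "Access Denied"}') yields "denied_in_body". A "bola_suspect" verdict still only means the response deserves a closer look, since some APIs return a success status with an empty or generic body for objects the caller cannot see.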
Example Output Snippet:
Initiating BOLA detection...
Target Base URL: https://api.example.com/v1/
[*] Testing Endpoint: User Profile Access (GET users/{id})
[+] Baseline check for valid ID 'b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3c': https://api.example.com/v1/users/b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3c
[OK] Baseline request successful (Status: 200).
[*] Fuzzing with 7 generated IDs...
[>] Testing fuzzed ID: 'non_existent_id' at https://api.example.com/v1/users/non_existent_id
[Info] Resource for ID 'non_existent_id' not found (Status: 404).
[>] Testing fuzzed ID: '0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d' at https://api.example.com/v1/users/0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d
[!!! BOLA SUSPECT !!!] Potential BOLA: Unexpected 200 OK for fuzzed ID with no explicit authorization denial. Manual review required to confirm access to unauthorized data. Response: {"id": "0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d", "username": "other_user", "email": "[email protected]"...
[>] Testing fuzzed ID: 'b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3d' at https://api.example.com/v1/users/b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3d
[OK] Access for ID 'b1b3b5b7-1d1e-1f1a-2b2c-3d3e3f3a3b3d' explicitly denied (Status: 403).
--- BOLA Detection Complete ---
!!! BOLA Findings Detected (Manual Review Recommended) !!!
Finding 1:
Endpoint: User Profile Access
Fuzzed ID: 0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d
URL: https://api.example.com/v1/users/0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d
Status Code: 200
Description: Potential BOLA: Unexpected 200 OK for fuzzed ID with no explicit authorization denial. Manual review required to confirm access to unauthorized data.
Response Snippet: {"id": "0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d", "username": "other_user", "email": "[email protected]", "first_name": "Jane", "last_name": "Doe", "role": "user", "created_at": "2023-01-01T12:00:00Z", "updated_at": "2023-01-01T12:00:00Z", "address": {"street": "123 Main St", "city": "Anytown", "zip": "12345"}}
Beyond the Basics: Advanced Considerations
- Response Content Analysis: A more sophisticated script might capture the baseline response (for the valid ID) and compare structural elements or specific field values against fuzzed ID responses to detect subtle differences that indicate data leakage without a clear 200 OK. For instance, if the user_id field in the response body doesn't match the authenticated user's ID but returns 200 OK, that's a strong BOLA indicator.
- Different Authentication Schemes: The current script assumes a simple Bearer token. Adapt the headers in API_CONFIG for other schemes like API keys, session cookies, or more complex OAuth flows.
- POST/PUT/DELETE BOLA: The script includes a placeholder for request_body for non-GET requests. To test BOLA on these methods, you'd need to provide a sample valid request body and then modify the ID within that body (if applicable) or the URL ID, similar to GET requests.
- Error Message Variance: Pay close attention to how the API responds to genuinely invalid IDs versus unauthorized access. Sometimes, a 404 Not Found for an existing but unauthorized resource can be a form of information leakage (IDOR via enumeration), even if not a direct BOLA (access).
- Rate Limiting: Aggressive fuzzing can trigger rate limits or IP bans. Implement delays (e.g., time.sleep()) or use rotating proxies for larger-scale assessments.
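The response content analysis idea from the list above might be sketched as follows. The function compare_to_baseline is a hypothetical helper, and the assumption that the object's identifier lives in a top-level JSON field (named "id" by default here) will not hold for every API; adjust owner_field to match the real response schema.

```python
import json


def compare_to_baseline(baseline_text, fuzzed_text, owner_field="id"):
    """Compare a fuzzed-ID response body against the baseline response body.

    Flags the case where both bodies are JSON objects with the same keys
    but a different value in owner_field, i.e. the same kind of object
    belonging to someone else.
    """
    try:
        baseline = json.loads(baseline_text)
        fuzzed = json.loads(fuzzed_text)
    except (json.JSONDecodeError, TypeError):
        # Non-JSON or missing bodies cannot be compared structurally.
        return {"comparable": False}
    if not isinstance(baseline, dict) or not isinstance(fuzzed, dict):
        return {"comparable": False}
    same_keys = set(baseline) == set(fuzzed)
    owner_differs = (
        owner_field in baseline
        and owner_field in fuzzed
        and baseline[owner_field] != fuzzed[owner_field]
    )
    return {
        "comparable": True,
        "same_structure": same_keys,
        "owner_differs": owner_differs,
        "suspicious": same_keys and owner_differs,
    }
```

Comparing only top-level keys keeps the sketch short; nested objects or optional fields would need a more tolerant comparison, so treat a "suspicious" result as a prompt for manual review rather than a confirmed finding.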
While this script provides a powerful starting point for automated BOLA detection, remember that the final confirmation of a vulnerability often requires manual review of the captured responses. Automated tools are excellent for identifying suspicious behavior, but human intelligence is crucial for verifying the impact and true severity.