Writing a Python Script to Automate Broken Object Level Authorization (BOLA) Detection in REST APIs

Automating Broken Object Level Authorization (BOLA) Detection in REST APIs

Detecting Broken Object Level Authorization (BOLA), also known as Insecure Direct Object Reference (IDOR), is critical for securing REST APIs. This post outlines a Python script approach to automate the identification of BOLA vulnerabilities by systematically manipulating object identifiers in API requests and analyzing responses for unauthorized data access. The goal is to move beyond manual proxy manipulation and establish a repeatable, script-driven methodology.

Understanding the BOLA Attack Surface

BOLA manifests when an API endpoint accepts an object ID, and the server-side authorization check fails to verify if the requesting user is genuinely authorized to access that specific object. Attackers typically exploit this by enumerating or guessing object IDs to access data belonging to other users or sensitive system resources. Common scenarios include user profiles, order details, document access, and transaction histories.
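
To make the missing check concrete, here is a minimal server-side sketch. It is not taken from any particular framework: the ORDERS store and both handler functions are hypothetical, and they only illustrate the difference between looking an object up and verifying that the caller owns it.

```python
# Hypothetical in-memory store: order ID -> order record (illustration only)
ORDERS = {
    "order_1": {"owner": "user_456", "total": 42},
    "order_2": {"owner": "user_123", "total": 99},
}

def get_order_vulnerable(current_user, order_id):
    """Returns any existing order; ownership is never checked (BOLA)."""
    order = ORDERS.get(order_id)
    if order is None:
        return 404, None
    return 200, order

def get_order_fixed(current_user, order_id):
    """Same lookup, plus an object-level check against the caller."""
    order = ORDERS.get(order_id)
    if order is None:
        return 404, None
    if order["owner"] != current_user:
        return 403, None
    return 200, order
```

With user_456 as the caller, the vulnerable handler returns order_2 (owned by user_123) with a 200, while the fixed handler answers 403.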

Identifying Target Endpoints

The initial phase involves identifying API endpoints that handle object IDs. These are often found in URL paths (e.g., /users/{id}/profile, /orders/{order_id}), query parameters (e.g., ?documentId=abc-123), or JSON request bodies (e.g., {"resource_id": "xyz-789"}). Any endpoint where a numerical, UUID, or string identifier points to a specific resource is a potential BOLA target.
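
As a rough aid for this phase, the three locations above can be scanned for ID-like values. The sketch below is heuristic: the ID_LIKE pattern and the find_id_candidates helper are assumptions made for illustration, and the pattern will need tuning for the ID formats your target actually uses.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Heuristic: pure numbers, UUIDs, or prefixed tokens that contain a digit
ID_LIKE = re.compile(
    r"\d+"
    r"|[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
    r"|(?=.*\d)[A-Za-z]+[_-][A-Za-z0-9-]+"
)

def find_id_candidates(url, json_body=None):
    """Return (location, name, value) tuples for values that look like IDs."""
    candidates = []
    parts = urlsplit(url)

    # Path segments: report the preceding segment as the "name"
    segments = [s for s in parts.path.split("/") if s]
    for previous, segment in zip([None] + segments, segments):
        if ID_LIKE.fullmatch(segment):
            candidates.append(("path", previous, segment))

    # Query parameters
    for name, value in parse_qsl(parts.query):
        if ID_LIKE.fullmatch(value):
            candidates.append(("query", name, value))

    # Top-level JSON body fields (nested objects are not inspected)
    for name, value in (json_body or {}).items():
        if isinstance(value, (str, int)) and not isinstance(value, bool):
            if ID_LIKE.fullmatch(str(value)):
                candidates.append(("body", name, value))
    return candidates
```

Anything it reports is only a candidate; whether the value really maps to a per-user resource still has to be confirmed by hand.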

Extracting Object IDs

During a normal authenticated interaction with the application, object IDs are transmitted. A proxy like Burp Suite is invaluable for capturing these legitimate requests. IDs can be sequential (1, 2, 3...), GUIDs/UUIDs (a1b2c3d4-e5f6-7890-1234-567890abcdef), or even custom alphanumeric strings. Understanding the ID format is key to effective enumeration. We're looking for parameters that uniquely identify a resource.
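
Because the enumeration strategy depends on the ID format, a small classifier can help decide how to build the candidate list. This is a rough sketch; classify_id and its three labels are hypothetical names chosen for this example.

```python
import uuid

def classify_id(value):
    """Label an ID as 'numeric', 'uuid', or 'custom' (anything else)."""
    if value.isdigit():
        return "numeric"
    try:
        uuid.UUID(value)  # raises ValueError for non-UUID strings
        return "uuid"
    except ValueError:
        return "custom"
```

Numeric IDs are natural candidates for range-based enumeration, while UUIDs and custom strings usually have to be harvested from other responses.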

The Automation Pipeline: Python Script Design

The core of the automation pipeline involves capturing a legitimate request, parameterizing the object ID, iterating through a list of potential unauthorized IDs, and analyzing the server's response. This requires a robust Python script capable of handling HTTP requests, parsing responses, and performing conditional checks.

Initial Request Capture (Burp Suite)

Begin by capturing a legitimate request in Burp Suite that involves an object ID belonging to your test user, such as a request for your own user profile or a document you own. Right-click the request in Burp's HTTP history and select "Copy as curl command". If you have the Copy As Python-Requests extension installed from the BApp Store, you can copy the request as Python requests code instead, which provides a solid starting point for the script's request structure, including headers and body.


# Example cURL captured from Burp Suite
curl 'https://api.example.com/v1/users/user_456/profile' \
  -H 'accept: application/json' \
  -H 'authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/116.0' \
  --compressed

Crafting the Python Script

The Python script will leverage the requests library for HTTP communication. A session object is preferable for maintaining headers like authentication tokens across multiple requests.

Sending Authenticated Requests

The script needs to send requests with the appropriate authentication headers. These are typically JWT Bearer tokens, API keys, or session cookies. Using requests.Session() simplifies header management.


import requests
import json

# --- Configuration ---
BASE_URL = "https://api.example.com/v1"
AUTH_TOKEN = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..." # Replace with your valid Bearer token
TARGET_ENDPOINT_TEMPLATE = "/users/{object_id}/profile"
YOUR_USER_ID = "user_456" # The ID associated with your AUTH_TOKEN

# List of object IDs to test. This could be generated, known IDs, or common patterns.
# For demonstration, we'll use a small list. In a real scenario, this would be much larger
# and potentially derived from other requests or common ID patterns.
TEST_OBJECT_IDS = [
    "user_123",
    "user_789",
    "admin_001",
    "user_457", # Just an incremented ID
    "nonexistent_id"
]

# --- Setup Session ---
session = requests.Session()
session.headers.update({
    "Authorization": f"Bearer {AUTH_TOKEN}",
    "Accept": "application/json",
    "User-Agent": "Mozilla/5.0 (Automated BOLA Scanner/1.0)"
})

ID Enumeration Logic

The script iterates through TEST_OBJECT_IDS, dynamically constructing the request URL or modifying the request body. The key is to replace the known, legitimate ID (YOUR_USER_ID) with the test ID.

For path-based IDs, string formatting or f-strings are effective:


def test_bola(object_id_to_test):
    # Construct the full URL for the current test ID
    full_url = f"{BASE_URL}{TARGET_ENDPOINT_TEMPLATE.format(object_id=object_id_to_test)}"
    
    print(f"[INFO] Testing URL: {full_url}")
    try:
        response = session.get(full_url, allow_redirects=False)
        return response
    except requests.exceptions.RequestException as e:
        print(f"[ERROR] Request failed for {full_url}: {e}")
        return None

If the object ID is in the request body (e.g., for POST or PUT requests), the JSON payload needs to be dynamically adjusted.


# Example for body-based ID (if applicable for your target endpoint)
def test_bola_post_body(object_id_to_test):
    full_url = f"{BASE_URL}/update_resource"  # Example endpoint; BASE_URL already ends in /v1
    payload = {"resource_id": object_id_to_test, "data": "some_data"}
    print(f"[INFO] Testing POST with payload: {payload}")
    try:
        response = session.post(full_url, json=payload, allow_redirects=False)
        return response
    except requests.exceptions.RequestException as e:
        print(f"[ERROR] Request failed for {full_url}: {e}")
        return None
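
Query parameters, the third location an ID can appear in, can be handled by rewriting the URL before it is sent. The helper below is a sketch using only the standard library; swap_query_id and the /documents URL in the usage note are illustrative names, not part of the captured request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def swap_query_id(url, param, new_id):
    """Return url with the value of query parameter `param` replaced by new_id."""
    parts = urlsplit(url)
    query = [
        (name, new_id if name == param else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, swap_query_id("https://api.example.com/v1/documents?documentId=abc-123", "documentId", "abc-124") yields the same URL with documentId=abc-124, which can then be passed to session.get(url, allow_redirects=False) in the same way as in test_bola.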

Unauthorized Access Detection

This is the most critical part. A BOLA vulnerability is confirmed if an unauthorized ID results in a successful response (typically 200 OK) containing sensitive or unexpected data. The script needs to differentiate between legitimate access, proper authorization failures (401/403), and BOLA.

200 OK with relevant data: If the response is 200 OK and the body contains data clearly belonging to object_id_to_test (e.g., their username, email, sensitive details), it's a BOLA.
200 OK with generic/empty data: Some APIs return 200 OK with an empty object or a generic "not found" message instead of a 401/403. No foreign data is exposed, so this is less severe than direct exposure, but it is still worth flagging: if the response differs between existing and nonexistent IDs, it leaks which IDs are valid, and the inconsistent status handling suggests the authorization logic deserves a closer manual look.
401 Unauthorized / 403 Forbidden: These are expected and indicate proper authorization controls.


def analyze_response(response, object_id_tested):
    if response is None:
        print(f"[RESULT] No response for {object_id_tested}.")
        return

    print(f"[RESULT] ID: {object_id_tested} | Status: {response.status_code}")

    if response.status_code == 200:
        try:
            data = response.json()
            # This is a critical point: how do we *know* it's unauthorized access?
            # 1. Check for the requested object_id in the response data.
            # 2. Check if the returned data is different from an expected "empty" or "not found" response for unauthorized access.
            # 3. Look for specific fields that indicate successful retrieval of another user's data.

            # Example: Check if the returned profile ID matches the requested ID, and it's not our own ID
            if "id" in data and data["id"] == object_id_tested and object_id_tested != YOUR_USER_ID:
                print(f"  [VULNERABILITY DETECTED] BOLA: Retrieved profile for unauthorized ID '{object_id_tested}'!")
                print(f"  Response Data: {json.dumps(data, indent=2)}")
            elif "message" in data and "not found" in data["message"].lower():
                print(f"  [INFO] Resource '{object_id_tested}' not found (but still 200 OK).")
            else:
                # Further analysis might be needed here to confirm if it's unauthorized data.
                # This could involve comparing against a baseline of YOUR_USER_ID's data.
                print(f"  [POSSIBLE BOLA] 200 OK for {object_id_tested}. Manual review required.")
                print(f"  Partial Response: {json.dumps(data, indent=2)[:500]}...") # Print first 500 chars
        except ValueError:  # base class of the JSON decode errors response.json() can raise
            print(f"  [WARNING] 200 OK, but non-JSON response for {object_id_tested}. Manual review required.")
            print(f"  Response Content (first 200 chars): {response.text[:200]}...")
    elif response.status_code in (401, 403):
        print(f"  [OK] Proper authorization denial for {object_id_tested}.")
    else:
        print(f"  [INFO] Unexpected status code {response.status_code} for {object_id_tested}.")
        print(f"  Response: {response.text[:200]}...")

# --- Main Execution Loop ---
print(f"[START] Beginning BOLA detection for endpoint: {TARGET_ENDPOINT_TEMPLATE}")
for obj_id in TEST_OBJECT_IDS:
    response = test_bola(obj_id)
    analyze_response(response, obj_id)

print("[END] BOLA detection complete.")
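
The comments in analyze_response mention comparing a 200 OK against a baseline of your own data. One possible way to do that is sketched below; compare_to_baseline and its three labels are hypothetical, and the interpretation of each label is a heuristic rather than proof.

```python
def compare_to_baseline(baseline, candidate):
    """Compare a parsed JSON response with the response for our own ID.

    'identical'       -> the server may be ignoring the ID and returning our own object
    'same_shape'      -> same fields, different values: likely another object's data
    'different_shape' -> probably an error or "not found" payload
    """
    if candidate == baseline:
        return "identical"
    if (
        isinstance(baseline, dict)
        and isinstance(candidate, dict)
        and baseline.keys() == candidate.keys()
    ):
        return "same_shape"
    return "different_shape"
```

A "same_shape" result for an ID other than YOUR_USER_ID is the case that most deserves a manual look, since it suggests a full record for a different object came back.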

Practical Implementation Walkthrough

Setup and Dependencies

Ensure you have the requests library installed:


pip install requests

You'll need a valid authentication token for the application you're testing. Capture this from a legitimate login in Burp Suite. This token should belong to a standard user account, not an administrator, to simulate a typical attacker's perspective.
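
If the token is a JWT, its payload can be decoded locally to confirm which account it belongs to before scanning. This is a minimal sketch, assuming a standard three-part JWT whose payload carries the user ID in the sub claim (claim names vary between applications). The helper name decode_jwt_payload is made up for this example, and the signature is not verified.

```python
import base64
import json

def decode_jwt_payload(token):
    """Decode the (unverified) payload segment of a JWT into a dict."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; restore it before decoding
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Comparing decode_jwt_payload(AUTH_TOKEN).get("sub") with YOUR_USER_ID is a quick sanity check that the configured token and baseline ID actually match.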

Example Scenario and Code

Consider an API endpoint /v1/users/{user_id}/profile which returns a user's profile details. We'll assume user_456 is our authenticated user and we want to test if we can access user_123 or user_789's profiles.

The full script consolidates the configuration, session setup, request function, and analysis logic:


import requests
import json

# --- Configuration ---
# IMPORTANT: Replace these placeholders with actual values from your target API
BASE_URL = "https://api.example.com/v1"
AUTH_TOKEN = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ1c2VyXzQ1NiIsImlhdCI6MTUxNjIzOTAyMn0.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c" # Replace with your valid Bearer token
TARGET_ENDPOINT_TEMPLATE = "/users/{object_id}/profile" # Adjust if your endpoint is different
YOUR_USER_ID = "user_456" # The ID associated with your AUTH_TOKEN. Use a known ID.

# List of object IDs to test. This can be:
# - Incrementing numbers (1, 2, 3...)
# - Known user IDs from other leaked data or enumeration
# - UUIDs if applicable
# - Common admin IDs (e.g., "admin", "1")
TEST_OBJECT_IDS = [
    "user_123",
    "user_789",
    "admin_001",
    "user_457",
    "user_500",
    "nonexistent_id_xyz",
    YOUR_USER_ID # Always test your own ID to establish a baseline
]

# --- Setup Session ---
session = requests.Session()
session.headers.update({
    "Authorization": f"Bearer {AUTH_TOKEN}",
    "Accept": "application/json",
    "User-Agent": "Mozilla/5.0 (Automated BOLA Scanner/1.0; PentesterFieldNotes)"
})

def test_bola(object_id_to_test):
    """
    Constructs and sends a GET request to the target endpoint with the specified object ID.
    """
    full_url = f"{BASE_URL}{TARGET_ENDPOINT_TEMPLATE.format(object_id=object_id_to_test)}"
    
    print(f"[INFO] Testing URL: {full_url}")
    try:
        response = session.get(full_url, allow_redirects=False)
        return response
    except requests.exceptions.RequestException as e:
        print(f"[ERROR] Request failed for {full_url}: {e}")
        return None

def analyze_response(response, object_id_tested):
    """
    Analyzes the HTTP response to detect potential BOLA vulnerabilities.
    """
    if response is None:
        print(f"[RESULT] No response for {object_id_tested}.")
        return

    print(f"[RESULT] ID: {object_id_tested} | Status: {response.status_code}")

    if response.status_code == 200:
        try:
            data = response.json()
            # Heuristic for BOLA:
            # 1. If the response contains an 'id' field that matches 'object_id_tested'
            #    AND 'object_id_tested' is NOT the authenticated user's ID.
            # 2. Further checks might involve looking for specific sensitive fields
            #    or comparing the response structure/content to an expected "access denied" state.
            
            # Simple check for ID matching and not being our own user
            if "id" in data and data["id"] == object_id_tested and object_id_tested != YOUR_USER_ID:
                print(f"  [VULNERABILITY DETECTED] BOLA: Retrieved profile for unauthorized ID '{object_id_tested}'!")
                print(f"  Response Data: {json.dumps(data, indent=2)}")
            elif ("message" in data and "not found" in str(data["message"]).lower()) or "resource not found" in response.text.lower():
                print(f"  [INFO] Resource '{object_id_tested}' not found (200 OK). This could be an 'implicit deny' or weak authorization.")
            elif object_id_tested == YOUR_USER_ID:
                print(f"  [OK] Successfully retrieved own profile for '{YOUR_USER_ID}'. (Baseline)")
                # Optional: Store this response as a baseline for comparison with other 200 OKs
            else:
                # This block catches 200 OKs that don't immediately fit known BOLA or OK patterns.
                # Manual review is crucial here.
                print(f"  [POSSIBLE BOLA / INFO LEAK] 200 OK for {object_id_tested}. Manual review required.")
                print(f"  Partial Response: {json.dumps(data, indent=2)[:500]}...")
        except ValueError:  # base class of the JSON decode errors response.json() can raise
            print(f"  [WARNING] 200 OK, but non-JSON response for {object_id_tested}. Manual review required.")
            print(f"  Response Content (first 200 chars): {response.text[:200]}...")
    elif response.status_code in (401, 403):
        print(f"  [OK] Proper authorization denial (Status {response.status_code}) for {object_id_tested}.")
    else:
        print(f"  [INFO] Unexpected status code {response.status_code} for {object_id_tested}. Manual review recommended.")
        print(f"  Response: {response.text[:200]}...")

# --- Main Execution Loop ---
print(f"--- Starting BOLA detection for endpoint: {TARGET_ENDPOINT_TEMPLATE} ---")
for obj_id in TEST_OBJECT_IDS:
    response = test_bola(obj_id)
    analyze_response(response, obj_id)

print("--- BOLA detection complete ---")

Analyzing Results

The script's output provides immediate feedback. Look for the [VULNERABILITY DETECTED] BOLA messages. If you see a 200 OK response for an ID that isn't your own, and the response body contains data clearly belonging to that ID, you've likely found a BOLA. Even 200 OK responses that return generic "not found" messages when a 403 Forbidden or 404 Not Found would be more appropriate indicate potential information leakage or weak authorization, warranting further investigation.


[INFO] Testing URL: https://api.example.com/v1/users/user_123/profile
[RESULT] ID: user_123 | Status: 200
  [VULNERABILITY DETECTED] BOLA: Retrieved profile for unauthorized ID 'user_123'!
  Response Data: {
    "id": "user_123",
    "username": "jane.doe",
    "email": "jane.doe@example.com",
    "role": "standard"
  }
[INFO] Testing URL: https://api.example.com/v1/users/user_456/profile
[RESULT] ID: user_456 | Status: 200
  [OK] Successfully retrieved own profile for 'user_456'. (Baseline)

Expanding Capabilities

Handling Different ID Formats

The TEST_OBJECT_IDS list is currently static. For robust testing, consider dynamically generating IDs:

Sequential IDs: Use range(start, end) to generate numerical IDs.
UUIDs: Capture known UUIDs and then modify a few characters, or generate random UUIDs for fuzzing (less effective for direct hits, but can reveal poor validation).
Enumeration from other endpoints: If another API call lists items (e.g., /v1/posts), extract those IDs to test against a different BOLA-vulnerable endpoint (e.g., /v1/posts/{id}/edit).
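
The first two strategies can be sketched with the standard library alone. The helper names and the "user_" prefix below are assumptions for illustration; adjust the prefix, range, and padding to the ID format you observed.

```python
import uuid

def sequential_ids(prefix, start, end, width=0):
    """Build IDs like user_455, user_456, ... for start <= n < end.

    width zero-pads the numeric part (e.g. width=3 gives 001).
    """
    return [f"{prefix}{str(n).zfill(width)}" for n in range(start, end)]

def random_uuids(count):
    """Generate random version-4 UUID strings for fuzzing."""
    return [str(uuid.uuid4()) for _ in range(count)]
```

For instance, TEST_OBJECT_IDS.extend(sequential_ids("user_", 450, 460)) would cover the neighbours of user_456.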

Regular expressions can be used to dynamically extract object IDs from a given URL or body pattern if the script needs to identify the ID parameter itself from a generic template.
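
One possible way to do this is to turn an endpoint template into a regular expression with a named group per placeholder. The sketch below assumes templates in the same {name} style as TARGET_ENDPOINT_TEMPLATE and IDs that never contain a slash; both helper names are hypothetical.

```python
import re
from urllib.parse import urlsplit

def template_to_regex(template):
    """Convert '/users/{object_id}/profile' into a compiled regex."""
    # re.split with a capture group alternates literal text and placeholder names
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        f"(?P<{part}>[^/]+)" if index % 2 else re.escape(part)
        for index, part in enumerate(parts)
    )
    return re.compile(pattern + "$")

def extract_template_values(template, url):
    """Return a dict of placeholder values found in url's path, or None."""
    match = template_to_regex(template).search(urlsplit(url).path)
    return match.groupdict() if match else None
```

Applied to the captured profile URL, extract_template_values("/users/{object_id}/profile", ...) recovers the ID that was actually sent, which can then serve as YOUR_USER_ID.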


# Example: Extracting IDs from a list response
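def extract_ids_from_list_endpoint(list_url, id_field="id"):
    """Return the id_field value of every item in a JSON list response."""
    try:
        response = session.get(list_url, allow_redirects=False)
        data = response.json() if response.status_code == 200 else None
    except (requests.exceptions.RequestException, ValueError) as e:
        print(f"[ERROR] Could not list IDs from {list_url}: {e}")
        return []
    if isinstance(data, list):
        # Skip entries that are not JSON objects or that lack the ID field
        return [item[id_field] for item in data if isinstance(item, dict) and id_field in item]
    return []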

# Usage:
# found_user_ids = extract_ids_from_list_endpoint(f"{BASE_URL}/users", "id")
# TEST_OBJECT_IDS.extend(found_user_ids)

Integrating with Reporting

For large-scale assessments, integrate this script with a reporting framework. Log discovered vulnerabilities to a file (CSV, JSON) or a security issue tracker. Tools like DefectDojo can ingest findings via API, making continuous BOLA testing part of your CI/CD pipeline or regular pentest routines.
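
As a starting point for file-based logging, findings can be collected as records and serialized with the standard library. The Finding fields below are an assumed schema chosen for this sketch, not a format required by DefectDojo or any other tracker.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass

@dataclass
class Finding:
    endpoint: str
    object_id: str
    status_code: int
    verdict: str  # e.g. "BOLA" or "POSSIBLE BOLA"

FIELDS = ["endpoint", "object_id", "status_code", "verdict"]

def findings_to_csv(findings):
    """Serialize findings to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for finding in findings:
        writer.writerow(asdict(finding))
    return buffer.getvalue()

def findings_to_json(findings):
    """Serialize findings to a JSON array."""
    return json.dumps([asdict(f) for f in findings], indent=2)
```

In the script, analyze_response could append a Finding wherever it currently prints a vulnerability message, and the serialized text could be written to a file at the end of the run.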