Writing a Custom Burp Suite Extension (Bambda) for Automated API Parameter Discovery

Automating the discovery of API parameters is a core reconnaissance task for any penetration tester assessing web applications and their underlying services. Manually identifying every parameter across a sprawling API surface is time-consuming and error-prone. A custom Burp Suite extension written in Python (Jython), referred to throughout this article as a "Bambda", can intercept and parse HTTP traffic on the fly, systematically extracting and cataloging API parameters from requests and, optionally, responses. The result is a more complete picture of the attack surface.

The Case for Automated API Parameter Discovery

Modern web applications increasingly rely on complex APIs, often exposing numerous endpoints with a multitude of parameters. These parameters, whether in the URL query string, request body, or HTTP headers, define the application's functionality and are prime targets for manipulation and vulnerability exploitation. Without a comprehensive list of parameters, security testing efforts remain incomplete. Manual enumeration falls short quickly as API complexity grows. An automated solution provides consistency and speed, allowing pentesters to focus on analyzing the impact of parameters rather than merely finding them. Automating this discovery process with a custom Bambda extension complements broader security testing initiatives, especially when integrating with platforms like Secably for comprehensive vulnerability scanning and automated web security assessments.

Burp Extensibility: Jython and Bambda

Burp Suite's extensibility framework allows security professionals to tailor its functionality to specific testing needs. Extensions can be written in Java, Ruby (via JRuby), or Python (via Jython, a Python 2.7 implementation that runs on the Java Virtual Machine). Jython offers a rapid development cycle, making it a popular choice for custom Burp tools. This article calls these Python-based extensions "Bambdas"; note that PortSwigger's documentation uses the term Bambda for short Java snippets that run inside Burp, which is a separate feature from the Jython extensions built here. An extension that interacts with HTTP traffic implements the `IBurpExtender` and `IHttpListener` interfaces.

Setting Up Your Burp Extension Environment

Before writing code, ensure your Burp Suite environment is configured for Jython:

1. **Download Jython Standalone JAR**: Obtain the latest Jython standalone JAR file from the official Jython website.
2. **Configure Burp**: In Burp Suite, navigate to `Extender > Options`. Under the "Python Environment" section, click "Select file..." and point it to your downloaded Jython standalone JAR.
3. **Load Your Extension**: Once your Python script is ready, go to `Extender > Extensions > Add`. Select "Python" as the extension type and choose your `.py` file. Any output from `print()` statements in your script will appear in the "Output" tab.

Core Components of a Bambda Listener

A minimal Burp extension that listens to HTTP traffic must implement `IBurpExtender` and `IHttpListener`. The `registerExtenderCallbacks` method is where your extension initializes itself, sets its name, and registers the HTTP listener. The `processHttpMessage` method is the workhorse, called for every HTTP request and response passing through Burp.

from burp import IBurpExtender
from burp import IHttpListener
import json
import re
from urlparse import urlparse, parse_qsl

class BurpExtender(IBurpExtender, IHttpListener):

    def registerExtenderCallbacks(self, callbacks):
        self._callbacks = callbacks
        self._helpers = callbacks.getHelpers()
        self._callbacks.setExtensionName("API Parameter Discoverer")
        self._callbacks.registerHttpListener(self)
        self.discovered_params = set()
        print("[+] API Parameter Discoverer loaded.")

    def processHttpMessage(self, toolFlag, messageIsRequest, messageInfo):
        # We only care about requests for parameter discovery
        if messageIsRequest:
            self.analyze_request_for_params(messageInfo)
        # Optionally, you could analyze responses for parameters in links or embedded JSON
        # else:
        #     self.analyze_response_for_params(messageInfo)

    def analyze_request_for_params(self, messageInfo):
        # Implementation for analyzing request will go here
        pass

    # def analyze_response_for_params(self, messageInfo):
    #     # Implementation for analyzing response will go here
    #     pass

In the code above:

* `IBurpExtender` is the fundamental interface for all extensions.
* `IHttpListener` enables the extension to receive HTTP requests and responses.
* `_callbacks` provides access to Burp's API.
* `_helpers` offers utility methods for parsing and building HTTP messages.
* `setExtensionName` gives your extension a display name in the Extender tab.
* `registerHttpListener(self)` registers this class instance to receive HTTP messages.
* `discovered_params` is a Python `set` to store unique parameter names, preventing duplicates.

Extracting Parameters: A Deep Dive

The core logic resides within `analyze_request_for_params` (and potentially `analyze_response_for_params`). We'll focus on request analysis here, covering URL query parameters, JSON body parameters, and form-encoded body parameters.

URL Query Parameters

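Extracting query parameters involves parsing the URL. Burp's `IRequestInfo` object, obtained via `_helpers.analyzeRequest`, exposes the full URL through `getUrl()` when it is built from the complete `messageInfo` object rather than from the raw request bytes alone. Python's `urlparse` module then handles the parsing.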

    def extract_url_params(self, requestInfo):
        http_service = requestInfo.getUrl()
        parsed_url = urlparse(str(http_service))
        query_params = parse_qsl(parsed_url.query)
        for name, value in query_params:
            self.add_param(name, "URL_QUERY")

The `urlparse` function breaks down a URL into components, and `parse_qsl` specifically parses the query string into key-value pairs.
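
The parsing step can be tried outside Burp. The snippet below is a minimal sketch, assuming a made-up URL on `api.example.test`; `split_url` is a hypothetical helper name, and the import fallback is only there so the same code runs on both Jython 2.7 and Python 3.

```python
try:
    from urlparse import urlparse, parse_qsl  # Python 2 / Jython 2.7
except ImportError:
    from urllib.parse import urlparse, parse_qsl  # Python 3


def split_url(url):
    """Return the path and the list of (name, value) query pairs of a URL."""
    parsed = urlparse(url)
    return parsed.path, parse_qsl(parsed.query)


# Hypothetical URL, used only for illustration
path, pairs = split_url("https://api.example.test/v1/users?id=42&sort=asc")
print(path)
print([name for name, _ in pairs])
```

Here `path` is `/v1/users` and the parameter names are `id` and `sort`, which is the information `extract_url_params` passes to `add_param`.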

JSON Body Parameters

For API requests, JSON is a prevalent format. Extracting parameters from a JSON body requires parsing the body content and recursively identifying all keys.

    def extract_json_params(self, request_body):
        try:
            # Decode byte array to string, then parse JSON
            json_data = self._helpers.bytesToString(request_body)
            parsed_json = json.loads(json_data)
            self._recursive_json_param_extract(parsed_json)
        except ValueError:
            # Not valid JSON, or other parsing error
            pass

    def _recursive_json_param_extract(self, obj):
        if isinstance(obj, dict):
            for key, value in obj.items():
                self.add_param(key, "JSON_BODY")
                self._recursive_json_param_extract(value)
        elif isinstance(obj, list):
            for item in obj:
                self._recursive_json_param_extract(item)

Here, `json.loads` converts the JSON string to a Python dictionary or list, then a recursive function traverses the structure to find all keys.
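
To see the traversal on its own, here is a standalone sketch of the same recursion that returns the keys instead of calling `add_param`. The function name `collect_json_keys` and the sample body are assumptions made for illustration.

```python
import json


def collect_json_keys(obj, found=None):
    """Recursively gather every dictionary key in a parsed JSON structure."""
    if found is None:
        found = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            found.add(key)
            collect_json_keys(value, found)  # descend into nested objects
    elif isinstance(obj, list):
        for item in obj:
            collect_json_keys(item, found)  # lists carry no keys themselves
    return found


# Hypothetical request body with nesting inside both an object and a list
sample_body = '{"user": {"id": 7, "roles": [{"name": "admin"}]}, "active": true}'
keys = collect_json_keys(json.loads(sample_body))
print(sorted(keys))
```

For this body the function finds five names: `active`, `id`, `name`, `roles`, and `user`. Keys nested inside the `roles` list are picked up because the list branch recurses into each element.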

Form-Encoded Body Parameters

Traditional HTML form submissions or API requests using `application/x-www-form-urlencoded` require a different parsing approach. Burp's helper functions or `urlparse.parse_qsl` can handle this.

    def extract_form_params(self, request_body):
        # For form-urlencoded, the body is essentially a query string
        form_params = parse_qsl(self._helpers.bytesToString(request_body))
        for name, value in form_params:
            self.add_param(name, "FORM_BODY")
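
One detail worth checking when relying on `parse_qsl`: by default it drops fields whose value is empty, so a body field such as `token=` would go unrecorded unless `keep_blank_values=True` is passed. The sketch below illustrates the difference; `form_param_names` and the sample body are hypothetical.

```python
try:
    from urlparse import parse_qsl  # Python 2 / Jython 2.7
except ImportError:
    from urllib.parse import parse_qsl  # Python 3


def form_param_names(body, keep_blank=True):
    """Return unique field names from a form-encoded body, in first-seen order."""
    seen = []
    for name, _ in parse_qsl(body, keep_blank_values=keep_blank):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical body: "token" has an empty value and "role" is repeated
body = "username=alice&token=&role=user&role=admin"
default_names = form_param_names(body, keep_blank=False)
all_names = form_param_names(body)
print(default_names)
print(all_names)
```

With the default behaviour only `username` and `role` are reported; with blank values kept, `token` appears as well. For parameter discovery the second behaviour is usually the one you want, since an empty field is still an accepted parameter name.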

Headers

While HTTP headers aren't typically "parameters" in the same mutable sense as query or body parameters, sensitive information or API tokens are often passed in headers (e.g., `Authorization`, `X-API-Key`). Collecting header names can still be valuable.

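    def extract_header_names(self, requestInfo):
        headers = requestInfo.getHeaders()
        # The first entry is the request line (e.g. "GET /path HTTP/1.1"), so skip it
        for header_line in list(headers)[1:]:
            # Headers are typically "Name: Value"
            if ':' in header_line:
                # split() returns a list; element 0 is the header name
                name = header_line.split(':', 1)[0].strip()
                self.add_param(name, "HEADER")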

The `requestInfo.getHeaders()` method returns a list of header strings.
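
In Burp, the first entry of that list is the request line rather than a header, and a request line can itself contain a colon (for example inside a query string). The standalone sketch below shows one way to handle this; `header_names` and the sample lines are assumptions for illustration, not part of Burp's API.

```python
def header_names(header_lines):
    """Return header names from a list of raw header strings, skipping the request line."""
    names = []
    for line in list(header_lines)[1:]:  # entry 0 is the request line
        name, sep, _ = line.partition(":")
        if sep and name.strip():
            names.append(name.strip())
    return names


# Hypothetical header list shaped like the output of getHeaders()
sample = [
    "GET /v1/items?range=1:10 HTTP/1.1",
    "Host: api.example.test",
    "Authorization: Bearer abc",
    "X-API-Key: 123",
]
names = header_names(sample)
print(names)
```

The result contains `Host`, `Authorization`, and `X-API-Key`. Without skipping the first entry, the colon in `range=1:10` would cause part of the request line to be recorded as a header name.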

Storing and Presenting Findings

Maintaining a unique list of discovered parameters is essential, and a Python `set` handles the de-duplication. Printing each newly discovered parameter to Burp's extension output (`stdout`) keeps the tester informed.

# ... (inside BurpExtender class) ...

    def add_param(self, name, param_type):
        # Jython implements Python 2.7, so use %-formatting rather than f-strings
        param_identifier = "%s (%s)" % (name, param_type)
        if param_identifier not in self.discovered_params:
            self.discovered_params.add(param_identifier)
            print("Discovered: %s" % param_identifier)

    def analyze_request_for_params(self, messageInfo):
        requestInfo = self._helpers.analyzeRequest(messageInfo)
        
        # Extract URL parameters
        self.extract_url_params(requestInfo)

        # Extract Header names
        self.extract_header_names(requestInfo)

        # Extract Body parameters based on content type
        body_bytes = messageInfo.getRequest()[requestInfo.getBodyOffset():]

        # Find the Content-Type header; each entry is a "Name: Value" string
        content_type = ""
        for header in requestInfo.getHeaders():
            if header.lower().startswith('content-type:'):
                # split() returns a list; element 1 is the header value
                content_type = header.split(':', 1)[1].strip().lower()
                break

        if body_bytes:
            if "json" in content_type:
                self.extract_json_params(body_bytes)
            elif "x-www-form-urlencoded" in content_type:
                self.extract_form_params(body_bytes)
            # Add other content types as needed (e.g., multipart/form-data, XML)

The Bambda in Action: Full Code Example

This combined script demonstrates a functional API parameter discovery extension.

from burp import IBurpExtender
from burp import IHttpListener
import json
import re
from urlparse import urlparse, parse_qsl

class BurpExtender(IBurpExtender, IHttpListener):

    def registerExtenderCallbacks(self, callbacks):
        self._callbacks = callbacks
        self._helpers = callbacks.getHelpers()
        self._callbacks.setExtensionName("API Parameter Discoverer")
        self._callbacks.registerHttpListener(self)
        self.discovered_params = set()
        print("[+] API Parameter Discoverer loaded. Monitoring HTTP traffic for parameters.")

    def processHttpMessage(self, toolFlag, messageIsRequest, messageInfo):
        # Only process requests from the Proxy or Repeater tools for active discovery
        # This prevents flooding the output with Scanner or Spider traffic unless desired
        if messageIsRequest and toolFlag in (self._callbacks.TOOL_PROXY, self._callbacks.TOOL_REPEATER):
            self.analyze_request_for_params(messageInfo)

    def analyze_request_for_params(self, messageInfo):
        # Pass the full messageInfo (not just the request bytes) so that getUrl() is available
        requestInfo = self._helpers.analyzeRequest(messageInfo)
        
        # 1. Extract URL Query Parameters
        self.extract_url_params(requestInfo)

        # 2. Extract Header Names (useful for API keys, custom headers)
        self.extract_header_names(requestInfo)

        # 3. Extract Body Parameters based on Content-Type
        request_body_bytes = messageInfo.getRequest()[requestInfo.getBodyOffset():]
        
        if request_body_bytes:
            content_type = ""
            for header in requestInfo.getHeaders():
                if header.lower().startswith('content-type:'):
                    content_type = header.split(':', 1)[1].strip().lower()
                    break

            if "json" in content_type:
                self.extract_json_params(request_body_bytes)
            elif "x-www-form-urlencoded" in content_type:
                self.extract_form_params(request_body_bytes)
            # Add more handlers for other content types (e.g., XML, multipart/form-data) here
            # For XML, you'd need an XML parser (e.g., from xml.etree.ElementTree)
            # For multipart, it's more complex, often involving regex or specific libraries

    def extract_url_params(self, requestInfo):
        http_service = requestInfo.getUrl()
        parsed_url = urlparse(str(http_service))
        query_params = parse_qsl(parsed_url.query)
        for name, value in query_params:
            self.add_param(name, "URL_QUERY")

        # Basic path parameter detection (heuristic: segments that look like UUIDs or numeric IDs)
        path_segments = parsed_url.path.split('/')
        for segment in path_segments:
            if re.match(r'^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$', segment, re.IGNORECASE): # UUID
                self.add_param("UUID_PATH_PARAM", "PATH")
            elif re.match(r'^\d+$', segment): # Numeric ID
                self.add_param("NUMERIC_PATH_PARAM", "PATH")
            # More sophisticated path parameter detection could involve analyzing Swagger/OpenAPI specs

    def extract_json_params(self, request_body_bytes):
        try:
            json_data = self._helpers.bytesToString(request_body_bytes)
            parsed_json = json.loads(json_data)
            self._recursive_json_param_extract(parsed_json)
        except ValueError:
            pass # Not valid JSON

    def _recursive_json_param_extract(self, obj):
        if isinstance(obj, dict):
            for key, value in obj.items():
                self.add_param(key, "JSON_BODY")
                self._recursive_json_param_extract(value)
        elif isinstance(obj, list):
            for item in obj:
                self._recursive_json_param_extract(item)

    def extract_form_params(self, request_body_bytes):
        form_params = parse_qsl(self._helpers.bytesToString(request_body_bytes))
        for name, value in form_params:
            self.add_param(name, "FORM_BODY")

    def extract_header_names(self, requestInfo):
        headers = requestInfo.getHeaders()
        # The first entry is the request line, so skip it
        for header_line in list(headers)[1:]:
            if ':' in header_line:
                name = header_line.split(':', 1)[0].strip()
                self.add_param(name, "HEADER")

    def add_param(self, name, param_type):
        # %-formatting is used because Jython (Python 2.7) does not support f-strings
        param_identifier = "%s (%s)" % (name, param_type)
        if param_identifier not in self.discovered_params:
            self.discovered_params.add(param_identifier)
            self._callbacks.issueAlert("New Parameter Discovered: %s" % param_identifier)
            print("Discovered: %s" % param_identifier)

This script logs newly discovered parameters to Burp's Extender output tab and also issues an alert, providing immediate feedback.
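
The path heuristic in the script records only that a UUID or numeric segment was seen. A possible variation, sketched below outside Burp, rewrites the path into a template so that the position of each dynamic segment is preserved. The function name `template_path` and the `{id}` / `{uuid}` placeholders are assumptions for this example; the regular expressions are the ones used in the script.

```python
import re

UUID_RE = re.compile(
    r'^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$',
    re.IGNORECASE,
)
NUMERIC_RE = re.compile(r'^\d+$')


def template_path(path):
    """Replace UUID-like and all-digit path segments with placeholders."""
    out = []
    for segment in path.split('/'):
        if UUID_RE.match(segment):
            out.append('{uuid}')
        elif NUMERIC_RE.match(segment):
            out.append('{id}')
        else:
            out.append(segment)  # static segment, keep as-is
    return '/'.join(out)


# Hypothetical path used only for illustration
templated = template_path('/v1/users/123/orders/3f2b8c1e-9a4d-4e6f-8b7a-1c2d3e4f5a6b')
print(templated)
```

For the sample path this yields `/v1/users/{id}/orders/{uuid}`. Note that `v1` is left alone because it is not made up entirely of digits.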

Refinements and Next Steps

This basic framework can be extended significantly. Consider:

* **Response Analysis**: Extracting parameters from JSON or XML responses, particularly when they contain links or dynamic data that might hint at new API endpoints or parameters.
* **Context Menus**: Add a context menu item to send a selected request to Repeater or Intruder with discovered parameters pre-filled.
* **Filtering**: Implement host-based or path-based filtering to focus on specific targets.
* **Passive Scanning**: Integrate with Burp's `IScannerCheck` to report identified parameters as passive scan issues.
* **Output Formats**: Save discovered parameters to a file in a structured format (e.g., CSV, JSON) for later use by other tools.
* **Advanced Parameter Types**: Heuristic detection for path parameters (e.g., UUIDs, numeric IDs) and more complex body types.

This Bambda works effectively whether Burp is directly proxying traffic or operating in a more complex setup involving proxy chains, such as those facilitated by GProxy to route traffic through various intermediary points for advanced reconnaissance or obfuscation. Such setups are common in advanced penetration testing scenarios where traffic needs to be routed through multiple layers before reaching the target.
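
As one illustration of the output-format refinement, the sketch below turns a set of identifiers in the `name (TYPE)` form used by `add_param` into a JSON document. The function name `params_to_json` and the record layout are assumptions; writing the string to a file is left out.

```python
import json


def params_to_json(discovered):
    """Serialize 'name (TYPE)' identifiers as a sorted JSON array of records."""
    records = []
    for identifier in sorted(discovered):
        # Split on the last " (" so names containing parentheses stay intact
        name, _, kind = identifier.rpartition(" (")
        records.append({"name": name, "type": kind.rstrip(")")})
    return json.dumps(records, indent=2, sort_keys=True)


# Hypothetical contents of self.discovered_params
sample = {"id (URL_QUERY)", "Authorization (HEADER)"}
output = params_to_json(sample)
print(output)
```

Sorting the identifiers first keeps the exported file stable between runs, which makes it easier to diff results from different test sessions.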