Detecting Publicly Exposed Cloud Storage with Custom Nuclei Templates
Pinpointing publicly exposed cloud storage buckets is a critical task in any red team engagement or vulnerability assessment. While off-the-shelf tools offer broad coverage, true efficacy often demands custom Nuclei templates. These templates allow for highly specific detection logic, targeting unique naming conventions, application-specific exposure patterns, and subtle response indicators that generic scanners might miss. Directly probing the attack surface presented by misconfigured S3 buckets, GCS buckets, or Azure Blob storage accounts can uncover significant data leakage or unauthorized write access.
Why Custom Templates?
Generic vulnerability scanners, while useful, often rely on broad signatures or common misconfigurations. Cloud environments, however, frequently feature bespoke naming schemes, application-specific subdomains, and unique deployment quirks. A custom Nuclei template provides the granular control necessary to craft requests tailored to these specifics. Instead of scanning for every possible S3 bucket, you can focus on permutations relevant to your target organization, drastically reducing noise and improving detection rates. This targeted approach is essential when dealing with the vast and often ambiguous landscape of cloud assets, especially once initial reconnaissance has identified candidate services and hostnames.
Understanding Cloud Storage Exposure Patterns
Each major cloud provider has distinct URLs and response characteristics for their storage services:
- AWS S3: Buckets are typically accessed via https://<bucket-name>.s3.amazonaws.com/ or a custom CNAME. Publicly listable buckets often return XML containing <ListBucketResult>, <Contents>, and <Key> tags when accessed with a GET request to the root.
- Google Cloud Storage (GCS): GCS buckets are found at https://storage.googleapis.com/<bucket-name>/. A public listing might return an HTML page showing "Index of /" or JSON if API access is permitted.
- Azure Blob Storage: These are structured as https://<account-name>.blob.core.windows.net/<container-name>/. Listing a container often involves appending ?restype=container&comp=list and results in XML output detailing the blobs within.
Detecting exposure hinges on understanding these access patterns and the typical HTTP responses for both open and closed configurations. A 403 Forbidden is common for private buckets, while a 200 OK with specific content is a strong indicator of exposure.
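These response conventions can be verified by hand before committing them to a template. The sketch below is a minimal helper (the function name and the probe command in the comment are illustrative, not part of any tool) that maps a probe's HTTP status code onto the interpretations above:

```shell
# Sketch: interpret the status code returned when probing a bucket root.
# classify_bucket is a hypothetical helper; feed it the code captured by,
# e.g.: curl -s -o /dev/null -w '%{http_code}' "https://<bucket>.s3.amazonaws.com/"
classify_bucket() {
  case "$1" in
    200) echo "open: listing or content returned" ;;
    403) echo "exists, but anonymous access denied" ;;
    404) echo "no such bucket" ;;
    *)   echo "inconclusive status: $1" ;;
  esac
}

classify_bucket 403   # prints: exists, but anonymous access denied
```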
Nuclei Template Fundamentals
Nuclei templates are YAML-based files that define HTTP requests and how to match against their responses. Key components include:
- id: A unique identifier for the template.
- info: Metadata like name, author, severity, and description.
- requests: Defines the HTTP request(s) to send (method, path, headers, body, etc.).
- matchers: Specifies conditions to meet for a successful detection (status code, words, regex, size, etc.).
Here’s a skeleton to illustrate the structure:
id: basic-template-example

info:
  name: Basic Template Example
  author: your-name
  severity: info
  description: "A very basic Nuclei template structure."

requests:
  - method: GET
    path:
      - "{{BaseURL}}"
    matchers:
      - type: status
        status:
          - 200
Crafting a Basic S3 Bucket Enumeration Template
Our first custom template targets AWS S3. We'll look for publicly accessible buckets that return a 200 OK status code and contain specific XML tags indicative of an open directory listing. This initial check aims to identify read access.
id: aws-s3-bucket-listable

info:
  name: AWS S3 Publicly Listable Bucket
  author: your-name
  severity: high
  description: "Detects AWS S3 buckets with public listing enabled, indicating potential data exposure."
  reference:
    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-walkthroughs-s3-access-control.html
  tags: [cloud, aws, s3, exposure, misconfiguration, enumeration]

requests:
  - method: GET
    path:
      - "{{BaseURL}}"
    headers:
      User-Agent: "Nuclei-Cloud-Scanner/1.0"
    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200
      - type: word
        part: body
        condition: and
        words:
          - "<ListBucketResult"  # opening tag carries an xmlns attribute, so omit the closing bracket
          - "<Name>"
          - "<Key>"
      - type: word
        part: header
        words:
          - "x-amz-request-id"
To use this template, save it as aws-s3-listable.yaml. Then, create a targets.txt file with potential S3 bucket URLs (e.g., https://mycompany-dev.s3.amazonaws.com, https://app-data-prod.s3.amazonaws.com). Execute Nuclei:
nuclei -t aws-s3-listable.yaml -l targets.txt
An example output indicating a hit might look like this:
[aws-s3-bucket-listable] [http] [high] https://example-public-bucket.s3.amazonaws.com/
Detecting Open Directory Listings (Read Access)
Building on the previous template, we can refine the detection of read access. While the ListBucketResult element is a strong indicator, sometimes buckets are configured differently or serve static content without a formal listing page. The key is to look for common HTML elements or XML tags that imply a directory listing or easily browsable content. For GCS, this often means checking for an "Index of /" title in the HTML response.
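A reusable matcher fragment along these lines can be dropped into a template's request block to catch such browsable responses. The indicator strings below are illustrative examples, not an exhaustive signature set:

```yaml
# Illustrative matcher fragment for generic open-listing indicators.
# Add under a request block; the strings are examples, not a complete
# list of listing signatures.
matchers:
  - type: word
    part: body
    condition: or
    words:
      - "<title>Index of /</title>"   # classic autoindex page
      - "<ListBucketResult"           # S3-style XML listing
      - "Parent Directory"            # common in HTML directory indexes
```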
Identifying Writeable Buckets
Discovering a writeable bucket is a significant find. This means an attacker could upload malicious files, deface websites, or inject ransomware. Testing for write access involves sending a PUT request with a small test file. We look for a 200 OK or 204 No Content status code, indicating success. This is a destructive test, so always exercise caution and ensure authorization.
id: aws-s3-bucket-writeable

info:
  name: AWS S3 Publicly Writeable Bucket
  author: your-name
  severity: critical
  description: "Detects AWS S3 buckets with public write access, allowing unauthorized file uploads."
  reference:
    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-walkthroughs-s3-access-control.html
  tags: [cloud, aws, s3, exposure, misconfiguration, write]

requests:
  - method: PUT
    path:
      - "{{BaseURL}}/nuclei-test-{{randstr}}.txt"
    headers:
      Content-Type: "text/plain"
      User-Agent: "Nuclei-Cloud-Scanner/1.0"
    body: |
      This is a test file uploaded by Nuclei to check for write permissions.
      {{randstr}}
    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200
          - 204  # No Content often indicates success
      - type: word
        part: header
        words:
          - "x-amz-request-id"
Remember to clean up any test files uploaded if the bucket is confirmed writeable and you are authorized to modify it.
Google Cloud Storage (GCS) Specifics
GCS buckets have a different structure and often return HTML for public listings. Our template needs to adapt to these response bodies and potential error messages for non-existent buckets.
id: gcs-bucket-open-listing

info:
  name: Google Cloud Storage Public Listing
  author: your-name
  severity: high
  description: "Detects GCS buckets with public listing enabled, indicating potential data exposure."
  reference:
    - https://cloud.google.com/storage/docs/access-control/making-data-public
  tags: [cloud, gcp, gcs, exposure, misconfiguration, enumeration]

requests:
  - method: GET
    path:
      - "{{BaseURL}}"
    headers:
      User-Agent: "Nuclei-Cloud-Scanner/1.0"
    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200
      - type: word
        part: body
        condition: or
        words:
          - "<title>Index of /</title>"
          - "googleusercontent.com"
      - type: word
        part: header
        condition: or
        words:
          - "x-goog-hash"
          - "x-goog-meta-owner"
Azure Blob Storage Considerations
Azure Blob storage requires a slightly different approach, often needing the restype=container&comp=list query parameter to trigger a listing response. The response is typically XML.
id: azure-blob-container-listable

info:
  name: Azure Blob Storage Publicly Listable Container
  author: your-name
  severity: high
  description: "Detects Azure Blob Storage containers with public listing enabled, indicating potential data exposure."
  reference:
    - https://learn.microsoft.com/en-us/rest/api/storageservices/list-blobs
  tags: [cloud, azure, blob, exposure, misconfiguration, enumeration]

requests:
  - method: GET
    path:
      - "{{BaseURL}}?restype=container&comp=list"
    headers:
      User-Agent: "Nuclei-Cloud-Scanner/1.0"
    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200
      - type: word
        part: body
        condition: and
        words:
          - "<EnumerationResults"  # opening tag carries ServiceEndpoint/ContainerName attributes
          - "<Blobs>"
          - "<Name>"
      - type: word
        part: header
        words:
          - "x-ms-request-id"
Integrating with Recon Workflows
To effectively use these templates, you need a solid list of potential targets. This often starts with subdomain enumeration and then filtering for relevant cloud storage patterns. Tools like subfinder and httpx are invaluable here. You can pipe outputs to create target lists for Nuclei.
# Example recon workflow to generate potential cloud storage targets
DOMAIN="example.com"
subfinder -d "$DOMAIN" -silent |
  grep -E "\.s3\.amazonaws\.com|\.storage\.googleapis\.com|\.blob\.core\.windows\.net" |
  httpx -silent -mc 200,403,404 -o cloud_targets.txt
# -status-code is omitted so the output file holds bare URLs that nuclei -l can ingest

# Then run Nuclei against the generated list
nuclei -l cloud_targets.txt -t aws-s3-listable.yaml,aws-s3-bucket-writeable.yaml,gcs-bucket-open-listing.yaml,azure-blob-container-listable.yaml
The httpx command filters for live endpoints returning the matched status codes, trimming the list to targets genuinely worth storage enumeration. Even a 403 Forbidden can be worth investigating with a custom Nuclei template, as it might signify an improperly secured bucket that only requires specific headers or parameters for access.
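To act on that 403 case, a small existence-check sketch can distinguish a private but real S3 bucket (403 with an AccessDenied error code) from a non-existent one (404 with NoSuchBucket). The id and severity below are illustrative choices:

```yaml
# Sketch: flag S3 buckets that exist but deny anonymous access.
# A 403 body containing <Code>AccessDenied</Code> means the bucket is real
# and may still be reachable with different headers, credentials, or paths.
id: aws-s3-bucket-exists-private

info:
  name: AWS S3 Bucket Exists (Access Denied)
  author: your-name
  severity: info
  description: "Confirms an S3 bucket exists even though anonymous listing is denied."

requests:
  - method: GET
    path:
      - "{{BaseURL}}"
    matchers-condition: and
    matchers:
      - type: status
        status:
          - 403
      - type: word
        part: body
        words:
          - "<Code>AccessDenied</Code>"
```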
Best Practices and Advanced Matchers
When crafting templates, consider the following:
- Specificity: Use matchers-condition: and to combine multiple checks (e.g., status code AND specific words) for fewer false positives.
- Part Selection: Always specify the part (body, header, all) where your matcher should look.
- Regex Matchers: For more complex patterns, utilize type: regex, for example to extract bucket names or match specific file types.
- Fuzzing Bucket Names: Combine Nuclei with wordlists for common bucket name permutations ({{BaseURL}} or {{Hostname}} coupled with a list of common prefixes/suffixes).
- Rate Limiting: Use Nuclei's built-in rate limiting (the -rl flag) to avoid overwhelming targets or triggering WAFs.
An advanced matcher might look for the presence of specific file extensions (e.g., .env, .git/config) within a listed directory, indicating even more critical exposure than a simple listing.
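As a sketch of that idea, a regex matcher like the following could be appended to any of the listing templates above. The patterns are illustrative, not a complete inventory of sensitive filenames:

```yaml
# Illustrative regex matcher: flag listings whose <Key> entries expose
# sensitive files. Patterns are examples only.
matchers:
  - type: regex
    part: body
    condition: or
    regex:
      - "<Key>[^<]*\\.env</Key>"         # exposed environment files
      - "<Key>[^<]*\\.pem</Key>"         # private keys
      - "<Key>[^<]*\\.git/config</Key>"  # leaked git metadata
```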
The ability to write custom Nuclei templates transforms a basic scanner into a highly specialized, context-aware reconnaissance tool. Mastering this skill allows pentesters and security engineers to move beyond generic checks and uncover deep-seated misconfigurations in cloud environments, significantly improving the efficacy of security assessments.