Writing Custom Nuclei Templates to Detect Exposed Git Repositories

Detecting exposed Git repositories is a critical task in web application penetration testing. Misconfigured web servers can inadvertently expose the .git/ directory, allowing attackers to access sensitive information like source code, commit history, internal configurations, and potentially credentials. Writing custom Nuclei templates provides a flexible and efficient method to automate the discovery of such exposures, enabling targeted and scalable vulnerability assessment.

Understanding Exposed Git Repositories

A Git repository's .git/ directory contains all the information needed for version control. When this directory is unintentionally placed within a web root and accessible via HTTP, it presents a significant security risk. Attackers can leverage this exposure to reconstruct the entire repository, often with tools like git-dumper.

Key Files to Target within `.git/`

HEAD: Contains a reference to the current branch, e.g., ref: refs/heads/main. This is often the first file checked for Git exposure.
config: Stores repository-specific configuration options, including remote URLs (which might contain credentials or internal network paths).
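index: The staging area. It is a binary file, so it is not readable as plain text, but it lists the tracked file paths and its presence suggests a full repository.
logs/HEAD: The reflog for HEAD. It records each update to HEAD with the old and new commit hashes, the author, a timestamp, and a message.
objects/: This directory stores the actual content of the repository (commits, trees, blobs) as zlib-compressed objects. Loose objects live in a subdirectory named after the first two characters of their SHA-1 hash, with the remaining 38 characters as the file name.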

Nuclei Template Fundamentals

Nuclei templates are YAML-based files that define HTTP requests, matching rules, and extraction logic to identify vulnerabilities or misconfigurations.

Template Structure

A basic Nuclei template consists of several key blocks:

id: A unique identifier for the template.
info: Metadata about the template, including name, author, severity, and description.
http: Defines the HTTP requests to be sent.
matchers: Nested inside each http request; specifies conditions that must be met in the HTTP response for a detection to be considered positive.
extractors: Also nested inside each http request; captures specific data from the response.

Crafting a Basic Git Exposure Template

The simplest way to detect an exposed Git repository is to check for the presence of the .git/HEAD file. This file almost always exists in a Git repository and has a predictable format.

Initial Template for `.git/HEAD`

This template sends a GET request to /.git/HEAD and checks if the response body contains the string ref: refs/heads/ and if the status code is 200.

id: git-head-exposure
info:
  name: Exposed .git/HEAD File
  author: your-name
  severity: high
  description: Detects publicly accessible .git/HEAD file, indicating potential Git repository exposure.
  reference:
    - https://book.git-scm.com/
  tags: git,exposure,config,misconfiguration

http:
  - method: GET
    path:
      - "{{BaseURL}}/.git/HEAD"

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: word
        words:
          - "ref: refs/heads/"
        part: body
        
      # Optional: append this negative matcher to the matchers list above
      # to avoid false positives on generic 404 pages served with a 200 status
      # - type: word
      #   part: body
      #   words:
      #     - "<!DOCTYPE html>"
      #     - "The requested URL was not found on this server"
      #   condition: or
      #   case-insensitive: true
      #   negative: true

Here, {{BaseURL}} is a Nuclei variable that will be replaced with the target URL during the scan. The matchers-condition: and ensures both the 200 status code and the specific string are present.
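Because .git/HEAD has such a predictable format, the body check can be tightened so that an HTML page that merely mentions the string does not match. The following is a minimal sketch, not a definitive template: the id and name are made up for illustration, and it assumes a SHA-1 repository, where a detached HEAD holds a bare 40-character commit hash instead of a ref: line.

```yaml
id: git-head-strict-format   # hypothetical id, chosen for this example
info:
  name: Exposed .git/HEAD File - Strict Format Check
  author: your-name
  severity: high
  description: Matches .git/HEAD only when the whole body looks like a symbolic ref or a bare commit hash.
  tags: git,exposure,misconfiguration

http:
  - method: GET
    path:
      - "{{BaseURL}}/.git/HEAD"

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      # The anchors (^ and $) require the entire body to be one of the two forms.
      - type: regex
        part: body
        condition: or
        regex:
          - '^ref: refs/heads/\S+\s*$'   # normal case: HEAD points at a branch
          - '^[0-9a-f]{40}\s*$'          # detached HEAD: a bare SHA-1 commit hash
```

Anchoring the whole body trades a little recall for precision: a server that wraps the file in extra content would be missed, but soft-404 pages are much less likely to match.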

Enhancing the Template – Deeper Probes

While .git/HEAD is a good start, a comprehensive check requires looking for other critical files. We can extend the template to check for .git/config and .git/logs/HEAD, which often contain more immediately useful information.

Multi-Path Git Exposure Template

This template uses multiple paths within a single HTTP request block. Nuclei will iterate through these paths for each target. Incorporating internet-wide reconnaissance tools like Zondex can help identify a broad range of targets to apply these enhanced templates against, maximizing coverage for exposed services.

id: git-repository-deep-exposure
info:
  name: Exposed Git Repository - Deep Scan
  author: your-name
  severity: critical
  description: Detects multiple publicly accessible .git files, indicating a highly exposed Git repository.
  reference:
    - https://portswigger.net/daily-swig/source-code-disclosure-via-exposed-git-directory-leads-to-rce
  tags: git,exposure,config,logs,source-code,sensitive

http:
  - method: GET
    path:
      - "{{BaseURL}}/.git/HEAD"
      - "{{BaseURL}}/.git/config"
      - "{{BaseURL}}/.git/logs/HEAD"
      - "{{BaseURL}}/.git/index"

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: word
        part: body
        words:
          - "ref: refs/heads/" # For .git/HEAD
          - "[core]" # For .git/config
          - "[remote" # For .git/config
          - "commit" # For .git/logs/HEAD (a weak indicator on its own)
          - "DIRC" # Signature at the start of the binary .git/index file
        condition: or

This template combines its two matchers with and, while the word matcher itself uses an or condition. A path is therefore flagged when it returns a 200 status code and its body contains any one of the listed words. With matchers-condition set to or instead, a bare 200 response would be enough to trigger a finding, which produces false positives on any server that answers every path with 200. Checking for several different tell-tale signs increases the likelihood of detection. For instance, the [core] and [remote strings are strong indicators of a Git configuration file, whereas a generic word such as commit is best treated as supporting evidence.
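When several files are probed by one template, it can help triage to know which file produced the match. One possible approach, sketched below under the assumption that the installed Nuclei version supports dsl matchers with the contains and starts_with helpers, is to give each file its own named matcher that combines the status check with a file-specific indicator. The id and matcher names are invented for this example.

```yaml
id: git-files-named-matchers   # hypothetical id, chosen for this example
info:
  name: Exposed Git Files - Named Matchers
  author: your-name
  severity: high
  description: Reports which .git file was exposed by giving each file its own named matcher.
  tags: git,exposure,misconfiguration

http:
  - method: GET
    path:
      - "{{BaseURL}}/.git/HEAD"
      - "{{BaseURL}}/.git/config"
      - "{{BaseURL}}/.git/index"

    # Any one named matcher is enough; each already includes the status check.
    matchers-condition: or
    matchers:
      - type: dsl
        name: head
        dsl:
          - 'status_code == 200 && contains(body, "ref: refs/heads/")'

      - type: dsl
        name: config
        dsl:
          - 'status_code == 200 && contains(body, "[core]")'

      - type: dsl
        name: index
        dsl:
          - 'status_code == 200 && starts_with(body, "DIRC")'
```

Folding the status check into each dsl expression keeps the or condition safe, because no matcher can succeed on the status code alone.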

Extracting Information from `.git/config`

When .git/config is exposed, it's often valuable to extract the remote repository URL. Nuclei's extractors can achieve this using regular expressions.

id: git-config-remote-extractor
info:
  name: Exposed Git Config - Remote URL Extraction
  author: your-name
  severity: high
  description: Detects and extracts remote repository URL from a publicly accessible .git/config file.
  reference:
    - https://docs.github.com/en/github/getting-started-with-github/managing-remote-repositories
  tags: git,exposure,config,credentials,recon

http:
  - method: GET
    path:
      - "{{BaseURL}}/.git/config"

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: word
        words:
          - "[remote"
          - "url ="
        condition: and
        part: body

    extractors:
      - type: regex
        part: body
        regex:
          - 'url\s*=\s*((?:https?|git|ssh|ftps?)://\S+|[\w.-]+@[\w.-]+:\S+)'
        group: 1 # Return only the URL, not the leading "url ="
        name: git_remote_url

The extractors block uses a regex to pull out any URL following url = within the response body, naming it git_remote_url. This extracted information will be displayed in the Nuclei output, providing immediate context for further investigation.

Advanced Techniques and Deployment

For more sophisticated scenarios, you might consider chaining requests or using a workflow. While basic Git exposure can be found with direct requests, a multi-step workflow could, for example, first confirm .git/HEAD, then use extracted branch names to fetch more specific `refs` files.

Chaining Requests (Workflow Example)

Nuclei supports workflows that run templates conditionally, and a single template can also chain several requests. This allows for more dynamic scanning. For instance, you could extract the branch name from .git/HEAD and then use that branch name to fetch .git/refs/heads/<branch_name> to get the latest commit hash.

The core of the technique is a named extractor with internal: true in a first request, which makes the captured value available as a variable in a subsequent request of the same template.
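The branch-name idea might look roughly like the sketch below. It is illustrative only: the id, the extractor names branch and commit_hash, and the overall layout are assumptions, and it relies on the installed Nuclei version passing internal extractor values from one request to later requests in the same template.

```yaml
id: git-head-ref-chain   # hypothetical id, chosen for this example
info:
  name: Exposed Git Repository - Branch Ref Chain
  author: your-name
  severity: high
  description: Reads the branch name from .git/HEAD, then fetches that branch's ref file to obtain the latest commit hash.
  tags: git,exposure,misconfiguration

http:
  # Step 1: capture the current branch name without reporting it.
  - method: GET
    path:
      - "{{BaseURL}}/.git/HEAD"

    extractors:
      - type: regex
        name: branch        # referenced as {{branch}} in the next request
        part: body
        internal: true      # keep the value as a variable instead of printing it
        group: 1
        regex:
          - 'ref: refs/heads/(\S+)'

  # Step 2: request the ref file for the extracted branch.
  - method: GET
    path:
      - "{{BaseURL}}/.git/refs/heads/{{branch}}"

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: regex
        part: body
        regex:
          - '^[0-9a-f]{40}\s*$'   # a ref file holds a single SHA-1 commit hash

    extractors:
      - type: regex
        name: commit_hash
        part: body
        regex:
          - '[0-9a-f]{40}'
```

If the repository has packed its refs, the hash lives in .git/packed-refs rather than in a per-branch file, so the second request can return 404 even though the repository is exposed.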

Integrating with Other Tools

During larger engagements, it's common to pipe targets from broader reconnaissance efforts into Nuclei. Tools like GProxy can be invaluable for routing your Nuclei scans through specific proxies, which might be necessary for bypassing IP-based rate limiting or accessing targets from particular geographic locations. Additionally, for automated and continuous security validation, integrating custom Nuclei templates into a platform like Secably allows for systematic scanning across an organization's assets and a centralized view of findings.

Running Custom Templates

To run your custom Nuclei templates, save them as .yaml files (e.g., git-head-exposure.yaml) and pass either a single template file or a directory of templates to the -t flag.

# Scan a single target with a specific custom template
nuclei -u https://example.com -t /path/to/your/templates/git-head-exposure.yaml

# Scan a list of targets with all templates in a custom directory
nuclei -l targets.txt -t /path/to/your/custom-git-templates/

# Illustrative output for a detected exposure (exact formatting varies by Nuclei version)
# [git-head-exposure] [http] [high] https://example.com/.git/HEAD
# [git-config-remote-extractor] [http] [high] https://example.com/.git/config ["https://github.com/org/repo.git"]

The -l flag accepts a list of targets, one per line, making it suitable for mass scanning.

By developing and deploying these tailored Nuclei templates, pentesters gain a powerful capability to proactively identify and report exposed Git repositories, so that a common and high-impact misconfiguration can be remediated before it is exploited.