OSINT Module

The OSINT (Open Source Intelligence) module gathers publicly available information about the target without direct interaction. This passive intelligence gathering helps understand the target's digital footprint.


Module Overview

| Function | Purpose | Tools Used |
| --- | --- | --- |
| google_dorks | Find exposed files/pages via Google | dorks_hunter |
| github_dorks | Search GitHub for leaked secrets | gitdorks_go |
| github_repos | Analyze organization repositories | enumerepo, gitleaks, trufflehog |
| metadata | Extract document metadata | metagoofil, exiftool |
| apileaks | Detect exposed APIs | porch-pirate, SwaggerSpy |
| emails | Harvest email addresses | EmailHarvester, LeakSearch |
| domain_info | WHOIS and domain intelligence | whois, msftrecon, scopify |
| third_party_misconfigs | Third-party service misconfigs | misconfig-mapper |
| spoof | Email spoofing vulnerability | spoofy |
| mail_hygiene | SPF/DMARC analysis | dig |
| cloud_enum_scan | Cloud storage enumeration | cloud_enum |
| ip_info | IP intelligence (for IP targets) | WhoisXML API |


Configuration Options

# In reconftw.cfg

# Master toggle
OSINT=true

# Individual toggles
GOOGLE_DORKS=true
GITHUB_DORKS=true
GITHUB_REPOS=true
METADATA=true
EMAILS=true
DOMAIN_INFO=true
IP_INFO=true
API_LEAKS=true
THIRD_PARTIES=true
SPOOF=true
MAIL_HYGIENE=true
CLOUD_ENUM=true

# Limits
METAFINDER_LIMIT=20  # Max documents to fetch (max 250)
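
To see how these toggles could combine, the sketch below parses NAME=value lines like the ones above and lists which OSINT functions would run. The toggle-to-function mapping and the rule that OSINT=false disables everything are assumptions inferred from the names and the "Master toggle" comment, not taken from the reconFTW source; parse_cfg and enabled_functions are hypothetical helpers.

```python
import re

# Assumed mapping from toggle names to the functions they gate
# (inferred from naming only).
TOGGLE_TO_FUNCTION = {
    "GOOGLE_DORKS": "google_dorks",
    "GITHUB_DORKS": "github_dorks",
    "GITHUB_REPOS": "github_repos",
    "METADATA": "metadata",
    "EMAILS": "emails",
    "DOMAIN_INFO": "domain_info",
    "IP_INFO": "ip_info",
    "API_LEAKS": "apileaks",
    "THIRD_PARTIES": "third_party_misconfigs",
    "SPOOF": "spoof",
    "MAIL_HYGIENE": "mail_hygiene",
    "CLOUD_ENUM": "cloud_enum_scan",
}

# NAME=value, stopping at whitespace or an inline "#" comment.
ASSIGNMENT = re.compile(r"^\s*([A-Z_][A-Z0-9_]*)=([^#\s]*)")


def parse_cfg(text):
    """Return {NAME: value} for simple NAME=value lines, skipping comments."""
    settings = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip("\"'")
    return settings


def enabled_functions(settings):
    """List functions whose toggle is true, assuming OSINT gates them all."""
    if settings.get("OSINT") != "true":
        return []
    return [
        function
        for toggle, function in TOGGLE_TO_FUNCTION.items()
        if settings.get(toggle) == "true"
    ]
```

Calling enabled_functions(parse_cfg(open("reconftw.cfg").read())) would give a quick preview of what a run covers, under the assumptions stated above.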

Google Dorks

What It Does

Searches Google for sensitive files, pages, and information exposure using predefined dork queries.

How It Works

Example Dorks Searched

  • site:example.com filetype:pdf

  • site:example.com filetype:sql

  • site:example.com inurl:admin

  • site:example.com intitle:"index of"

  • site:example.com ext:log
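
The example dorks above all follow the same shape: a site: restriction plus one search operator. A minimal sketch of generating them for any domain is shown below. It assumes only the five templates listed here; the real query set shipped with dorks_hunter is larger and is not reproduced. build_dorks and search_url are hypothetical names.

```python
from urllib.parse import quote_plus

# The five example dorks from this page, with the domain left as a placeholder.
DORK_TEMPLATES = [
    "site:{domain} filetype:pdf",
    "site:{domain} filetype:sql",
    "site:{domain} inurl:admin",
    'site:{domain} intitle:"index of"',
    "site:{domain} ext:log",
]


def build_dorks(domain):
    """Fill each template with the target domain."""
    return [template.format(domain=domain) for template in DORK_TEMPLATES]


def search_url(query):
    """URL-encode a dork so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for dork in build_dorks("example.com"):
        print(dork, "->", search_url(dork))
```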

Output

Sample Output:

Configuration

Note: Google may rate-limit or block automated queries. Results vary based on Google's indexing.


GitHub Analysis

GitHub Dorks (github_dorks)

Searches GitHub for secrets, credentials, and sensitive information related to the target.

Requires: GitHub tokens stored in the file that $GITHUB_TOKENS points to

How It Works:

Dork Categories:

  • API keys and tokens

  • Passwords and credentials

  • Configuration files

  • Database connection strings

  • Private keys

Output: osint/gitdorks.txt

Sample Output:

GitHub Repos (github_repos)

Analyzes organization repositories for leaked secrets using multiple detection tools.
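
As a rough illustration of what pattern-based secret detection looks like, the sketch below scans text line by line against three well-known token formats. This is not how gitleaks or trufflehog are implemented; both use far larger rule sets plus entropy checks or live verification. RULES and scan_text are hypothetical names.

```python
import re

# Three widely documented secret formats; real scanners ship hundreds.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-personal-access-token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs; the secret itself is never echoed."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                hits.append((lineno, rule))
    return hits
```

Reporting only the line number and rule name keeps discovered secrets out of logs, which fits the credential-handling advice later on this page.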

How It Works:

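Output: osint/github_company_secrets.json

Sample Output:

Configuration

Enable or disable these functions with GITHUB_DORKS and GITHUB_REPOS in reconftw.cfg.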

Creating Token File:

Tip: Multiple tokens help avoid rate limiting. Create tokens with read:org and repo scopes.
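
To illustrate why several tokens help, the sketch below loads a token file and hands tokens out in round-robin order so that consecutive requests draw on different rate-limit budgets. It assumes one token per line with blank lines and # comments ignored; whether gitdorks_go rotates tokens in exactly this way is not confirmed here. load_tokens and token_rotation are hypothetical helpers.

```python
from itertools import cycle


def load_tokens(text):
    """Parse token-file contents, assuming one token per line."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tokens.append(line)
    return tokens


def token_rotation(tokens):
    """Yield tokens endlessly in round-robin order."""
    if not tokens:
        raise ValueError("token file is empty")
    return cycle(tokens)
```

With two tokens, successive next() calls on the rotation alternate between them, so each token sees roughly half the requests.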


Metadata Extraction

What It Does

Downloads indexed documents (PDF, DOCX, XLSX) and extracts metadata that may reveal:

  • Author names (potential usernames)

  • Email addresses

  • Software versions

  • Internal paths

  • Creation/modification dates
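
The fields above can be pulled out of exiftool's JSON output (exiftool -json FILE...), which is a list with one object per file. The sketch below collects a few tags across all files. The tag names are common for PDF and Office documents but vary by file type, and summarize_metadata is a hypothetical helper, not part of reconFTW.

```python
import json

# Tags that often carry usernames, software versions, or dates.
INTERESTING_TAGS = ("Author", "Creator", "Producer", "CreateDate", "ModifyDate")


def summarize_metadata(exiftool_json):
    """Map each interesting tag to the sorted, de-duplicated values seen."""
    summary = {tag: set() for tag in INTERESTING_TAGS}
    for record in json.loads(exiftool_json):
        for tag in INTERESTING_TAGS:
            value = record.get(tag)
            if value:
                summary[tag].add(str(value))
    # Drop tags that never appeared so the result stays compact.
    return {tag: sorted(values) for tag, values in summary.items() if values}
```

De-duplicating across documents is the useful step: one author name repeated in many files is a stronger username candidate than a one-off.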

How It Works

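Output

Results are written to osint/metadata_results.txt.

Sample Output:

Configuration

Enable or disable this function with METADATA in reconftw.cfg.

Note: Downloading many documents can be slow. METAFINDER_LIMIT (20 in the sample configuration, maximum 250) caps how many documents are fetched; lower it for faster runs.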


API Leaks

What It Does

Searches for exposed API documentation and collections in:

  • Postman - Public workspaces and collections

  • SwaggerHub - Public API specifications

How It Works

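Output

Results are written to osint/postman_leaks*.txt and osint/swagger_leaks*.txt.

Sample Postman Output:

Configuration

Enable or disable this function with API_LEAKS in reconftw.cfg.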


Email Harvesting

What It Does

Discovers email addresses associated with the target domain through:

  • Search engine results

  • Public databases

  • Leaked credential databases
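
Whatever the source, harvesting comes down to pulling addresses for the target domain out of raw text. A minimal sketch follows, assuming a simplified address pattern (not full RFC 5322) and treating subdomains as in scope; extract_emails is a hypothetical helper and does not reflect how EmailHarvester or LeakSearch work internally.

```python
import re


def extract_emails(text, domain):
    """Return sorted, lowercased addresses at the domain or its subdomains."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*"
        + re.escape(domain)
        # Reject look-alikes such as example.com.evil.org or example.community,
        # while still allowing a sentence-ending period after the address.
        + r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})
```

The two lookaheads matter for scope control: without them, addresses at unrelated domains that merely start with the target name would be reported as findings.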

How It Works

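Output

Results are written to osint/emails.txt and osint/passwords.txt.

Sample Output:

Configuration

Enable or disable this function with EMAILS in reconftw.cfg.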

⚠️ Ethics: Handle leaked credentials responsibly. Only use for authorized testing.


Domain Intelligence

What It Does

Gathers domain registration and ownership information:

  • WHOIS registration data

  • Microsoft 365/Azure tenant domains

  • Related domains via Scopify
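
WHOIS responses are free-form text, so turning them into fields takes a little parsing. The sketch below handles the common "Key: Value" layout and keeps repeated keys such as name servers as lists. WHOIS formats differ between registries, so this is an approximation; parse_whois is a hypothetical helper.

```python
def parse_whois(raw):
    """Parse 'Key: Value' lines into {key: [values]}, skipping notices."""
    fields = {}
    for line in raw.splitlines():
        stripped = line.strip()
        # Registries prefix legal notices and footers with these markers.
        if stripped.startswith(("%", "#", ">>>")) or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if key and value:
            fields.setdefault(key, []).append(value)
    return fields
```

Splitting on the first colon only keeps timestamps such as 1995-08-14T04:00:00Z intact in the value.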

How It Works

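Output

Results are written to osint/domain_info_*.txt and osint/scopify.txt.

Sample WHOIS Output:

Configuration

Enable or disable this function with DOMAIN_INFO in reconftw.cfg.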


Third-Party Misconfigurations

What It Does

Checks for misconfigurations in third-party services used by the target:

  • Atlassian (Jira, Confluence)

  • Slack

  • Zendesk

  • HubSpot

  • And many more...

How It Works

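Output

Results are written to osint/3rdparts_misconfigurations.txt.

Sample Output:

Configuration

Enable or disable this function with THIRD_PARTIES in reconftw.cfg.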


Email Spoofing Check

What It Does

Analyzes whether the domain is vulnerable to email spoofing attacks by checking:

  • SPF record configuration

  • DMARC policy strength

  • DKIM presence
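
The SPF and DMARC checks can be reasoned about with a few rules: a missing record, an SPF record that passes every sender, or a DMARC policy of none all leave room for spoofing. The sketch below encodes just those rules. It is a simplification, spoofy's actual decision logic may differ, and DKIM is left out because checking it requires knowing a selector. spoofing_findings is a hypothetical helper.

```python
def parse_tags(record):
    """Split a 'k=v; k=v' DMARC record into a lowercase dict."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip().lower()
    return tags


def spoofing_findings(spf, dmarc):
    """Return a list of weaknesses; an empty list means none were detected."""
    findings = []
    if not spf:
        findings.append("no SPF record")
    else:
        terms = spf.lower().split()
        # A bare "all" is equivalent to "+all" because "+" is the default qualifier.
        if "+all" in terms or "all" in terms:
            findings.append("SPF uses +all (any sender passes)")
        elif "?all" in terms:
            findings.append("SPF uses ?all (neutral, no enforcement)")
    if not dmarc:
        findings.append("no DMARC record")
    elif parse_tags(dmarc).get("p", "none") == "none":
        findings.append("DMARC policy is p=none (monitoring only)")
    return findings
```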

How It Works

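Output

Results are written to osint/spoof.txt.

Sample Output:

Configuration

Enable or disable this function with SPOOF in reconftw.cfg.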


Mail Hygiene

What It Does

Performs a quick check of email security DNS records:

  • TXT records (SPF)

  • DMARC records
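
The two lookups correspond to dig +short TXT example.com for SPF and dig +short TXT _dmarc.example.com for DMARC. The sketch below shows which names to query and how to pick the relevant record out of dig's quoted answers, including long records that dig prints as several quoted chunks. The helper names are hypothetical, and the exact dig invocation used by mail_hygiene is not confirmed here.

```python
import re


def mail_dns_queries(domain):
    """Return (record_type, name) pairs for the SPF and DMARC lookups."""
    return [("TXT", domain), ("TXT", "_dmarc." + domain)]


def unquote_txt(answer):
    """Join the quoted chunks of one TXT answer line into a single string."""
    chunks = re.findall(r'"([^"]*)"', answer)
    return "".join(chunks) if chunks else answer.strip()


def pick_record(txt_answers, prefix):
    """Return the first TXT value starting with prefix, e.g. 'v=spf1'."""
    for answer in txt_answers:
        value = unquote_txt(answer)
        if value.lower().startswith(prefix.lower()):
            return value
    return None
```

A domain's apex usually carries several unrelated TXT records (site verification strings and similar), so filtering by the v=spf1 or v=DMARC1 prefix is what isolates the mail policy.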

How It Works

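Output

Results are written to osint/mail_hygiene.txt.

Sample Output:

Configuration

Enable or disable this function with MAIL_HYGIENE in reconftw.cfg.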


Cloud Enumeration

What It Does

Searches for exposed cloud storage buckets across multiple providers:

  • Amazon S3

  • Azure Blob Storage

  • Google Cloud Storage

  • DigitalOcean Spaces
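
Bucket enumeration generally works by guessing names derived from a keyword and probing each provider's public URL pattern. The sketch below generates such candidates for the four providers listed. The suffix list is a made-up sample, the default region is arbitrary, and nothing here reflects cloud_enum's real wordlists or probing logic; both function names are hypothetical.

```python
import re

# Illustrative mutations only; real wordlists are much longer.
SUFFIXES = ("", "-dev", "-staging", "-prod", "-backup")


def candidate_names(keyword):
    """Combine the keyword with each suffix."""
    return [keyword + suffix for suffix in SUFFIXES]


def candidate_urls(name, region="nyc3"):
    """Build the public endpoint for one candidate name on each provider."""
    # Azure storage account names allow only lowercase letters and digits
    # (3 to 24 characters), so strip everything else.
    azure_account = re.sub(r"[^a-z0-9]", "", name.lower())[:24]
    return {
        "Amazon S3": f"https://{name}.s3.amazonaws.com",
        "Azure Blob Storage": f"https://{azure_account}.blob.core.windows.net",
        "Google Cloud Storage": f"https://storage.googleapis.com/{name}",
        "DigitalOcean Spaces": f"https://{name}.{region}.digitaloceanspaces.com",
    }
```

The sketch stops at URL generation on purpose: actually requesting these endpoints is active probing and should stay within the authorized scope.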

How It Works

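Output

Results are written to osint/cloud_enum.txt.

Sample Output:

Configuration

Enable or disable this function with CLOUD_ENUM in reconftw.cfg.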


IP Information

What It Does

For IP address targets, gathers:

  • Reverse IP lookups (domains on same IP)

  • WHOIS information

  • Geolocation data

Requires: WHOISXML_API key
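
Because this function only applies to IP targets, a scan has to decide first whether the target is an IP address or a domain. One way to make that decision is sketched below with Python's standard ipaddress module. The three labels and the choice to separate private addresses (for which public WHOIS and reverse-IP data are not meaningful) are assumptions for illustration, not reconFTW behavior.

```python
import ipaddress


def classify_target(target):
    """Return 'domain', 'private-ip', or 'public-ip' for a target string."""
    try:
        address = ipaddress.ip_address(target.strip())
    except ValueError:
        # Not parseable as IPv4 or IPv6, so treat it as a hostname.
        return "domain"
    return "private-ip" if address.is_private else "public-ip"
```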

How It Works

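Output

Results are written to osint/ip_*_*.txt.

Sample Output:

Configuration

Enable or disable this function with IP_INFO in reconftw.cfg.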


Running OSINT Only

This executes all enabled OSINT functions without subdomain enumeration or vulnerability scanning.
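
For scripted runs, the OSINT-only invocation can be assembled as below. The sketch assumes reconFTW's -d (target domain) and -n (OSINT mode) flags as described in the project's README and a script located at ./reconftw.sh; confirm both against ./reconftw.sh -h for your installed version. osint_only_command is a hypothetical helper.

```python
import shlex
import subprocess


def osint_only_command(domain, script="./reconftw.sh"):
    """Build the argument list for an OSINT-only run (flags assumed: -d, -n)."""
    return [script, "-d", domain, "-n"]


if __name__ == "__main__":
    command = osint_only_command("example.com")
    print("Running:", shlex.join(command))
    # check=True raises CalledProcessError if reconFTW exits non-zero.
    subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems if the target ever comes from user input.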


Output Summary

| Function | Output File(s) |
| --- | --- |
| google_dorks | osint/dorks.txt |
| github_dorks | osint/gitdorks.txt |
| github_repos | osint/github_company_secrets.json |
| metadata | osint/metadata_results.txt |
| apileaks | osint/postman_leaks*.txt, osint/swagger_leaks*.txt |
| emails | osint/emails.txt, osint/passwords.txt |
| domain_info | osint/domain_info_*.txt, osint/scopify.txt |
| third_party_misconfigs | osint/3rdparts_misconfigurations.txt |
| spoof | osint/spoof.txt |
| mail_hygiene | osint/mail_hygiene.txt |
| cloud_enum_scan | osint/cloud_enum.txt |
| ip_info | osint/ip_*_*.txt |


Best Practices

  1. Configure API Keys: Set up GitHub tokens and a WhoisXML API key before scanning; github_dorks and ip_info require them

  2. Rate Limiting: Google may block automated searches; space out scans

  3. Legal Considerations: OSINT is generally legal, but respect each service's terms of service

  4. Credential Handling: Handle any discovered credentials responsibly

  5. Verification: Always verify OSINT findings with additional sources


