Scraping (JS/Source code)


Source Code Recon

Modern web applications use JavaScript files to provide dynamic content, and these files contain various functions & events. Almost every website includes JS files, and they are a great resource for finding the internal subdomains used by an organization.
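To see why JS files are worth mining, here is a minimal sketch that downloads a single JS file and greps it for anything that looks like a subdomain of the target. The URL, file name, and example.com domain are placeholders for illustration only:

curl -s https://www.example.com/static/app.js -o app.js
grep -oE '[a-zA-Z0-9._-]+\.example\.com' app.js | sort -u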

Tools: 🛠

1) Gospider

  • Author: jaeles-project

  • Language: Go

Gospider is a fast web spidering tool capable of crawling an entire website in a short amount of time. This means gospider will visit/scrape each and every URL mentioned in the JS files and source code. Since source code & JS files make up a website, they may also contain links to other subdomains.

Installation:

go get -u github.com/jaeles-project/gospider
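Note: on newer Go toolchains (1.17+), go get no longer installs binaries. Assuming the module path above is unchanged, the module-aware equivalent is:

go install github.com/jaeles-project/gospider@latest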

This is a long process, so brace yourself!!! 💪

Running:

This process is divided into 3⃣ steps:

1) Web probing subdomains

  • Since we are crawling websites, gospider expects us to provide URLs, i.e. in the form of http:// or https://.

  • So, let's first web probe the subdomains. For this purpose, we will use httpx:

cat subdomains.txt | httpx -random-agent -retries 2 -no-color -o probed_tmp_scrap.txt
  • Now that we have web-probed URLs, we can send them to gospider for crawling.

gospider -S probed_tmp_scrap.txt --js -t 50 -d 3 --sitemap --robots -w -r > gospider.txt

Caution: This generates huge traffic on your target (a gentler variant is shown after the flag list below).

Flags:

  • S - Input file containing the list of sites

  • js - Find links in JavaScript files

  • t - Number of threads (run sites in parallel) (default 1)

  • d - Crawl depth (depth 3 means links found in second-level JS files are also scraped)

  • sitemap - Try to crawl sitemap.xml

  • robots - Try to crawl robots.txt

  • w - Include subdomains crawled from third-party sources

  • r - Also include (and crawl) URLs found from other sources
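If the traffic caution above is a concern, the same run can be throttled simply by lowering the thread count and depth; this is just a re-parameterised version of the command above, not a different invocation:

gospider -S probed_tmp_scrap.txt --js -t 10 -d 2 --sitemap --robots -w -r > gospider.txt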

2) Cleaning the output

The path portion of a URL shouldn't have more than 2048 characters. Since gospider's output can contain such abnormally long lines, we delete every line longer than 2048 characters:

sed -i '/^.\{2048\}./d' gospider.txt
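If you'd rather not modify gospider.txt in place, an equivalent filter with awk keeps only lines of 2048 characters or fewer (the output file name here is just an assumption):

awk 'length($0) <= 2048' gospider.txt > gospider_trimmed.txt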

The point to note here is that, so far, we have gathered URLs from JS files & source code. We are only concerned with subdomains, so we must extract just the subdomains from the gospider output.

cat gospider.txt | grep -Eo 'https?://[^ ]+' | sed 's/]$//' | unfurl -u domains | grep "\.example\.com$" | sort -u > scrap_subs.txt

Breakdown of the command:

  • grep - Extract the links that start with http/https

  • sed - Remove the trailing " ] " at the end of the line

  • unfurl - Extract the domain/subdomain from the URLs. This is Tomnomnom's tool; it takes a list of URLs as input and extracts the subdomain/domain part from them. You can install it with go get -u github.com/tomnomnom/unfurl

  • grep - Only select subdomains of our target

  • sort - Avoid duplicates
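To see exactly what unfurl contributes to this pipeline, feed it a single URL and it prints just the host portion (the URL below is a made-up example):

echo "https://dev.api.example.com/assets/main.js?v=2" | unfurl domains
# prints: dev.api.example.com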

3) Resolving our target subdomains

  • Now that we have all the subdomains of our target, it's time to DNS resolve them and check for valid subdomains.

(Hopefully you have seen the previous techniques and know how to run puredns.)

puredns resolve scrap_subs.txt -w scrap_subs_resolved.txt -r resolvers.txt 
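Putting the three steps together, a small wrapper script might look like the sketch below. It assumes the same file names used above (subdomains.txt, resolvers.txt) and the placeholder target example.com, so adjust both to your own setup:

#!/bin/bash
# 1) Web probe the known subdomains
cat subdomains.txt | httpx -random-agent -retries 2 -no-color -o probed_tmp_scrap.txt

# 2) Crawl JS files / source code with gospider
gospider -S probed_tmp_scrap.txt --js -t 50 -d 3 --sitemap --robots -w -r > gospider.txt

# Drop abnormally long lines and extract in-scope subdomains
sed -i '/^.\{2048\}./d' gospider.txt
cat gospider.txt | grep -Eo 'https?://[^ ]+' | sed 's/]$//' | unfurl -u domains | grep "\.example\.com$" | sort -u > scrap_subs.txt

# 3) DNS resolve the scraped subdomains
puredns resolve scrap_subs.txt -w scrap_subs_resolved.txt -r resolvers.txt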

I love this technique, as it also finds hidden Amazon S3 buckets used by the organization. If such buckets are open and expose sensitive data, then it's a WIN-WIN situation for us. Also, the output of this can be sent to the SecretFinder tool, which can find hidden secrets, exposed API tokens, etc.
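As a follow-up, individual JS URLs from the crawl can be fed to SecretFinder. The flags below are an assumption based on that tool's README, so check its help output on your install; the URL is a placeholder:

python3 SecretFinder.py -i https://www.example.com/static/app.js -o cli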

