Photon – Incredibly Fast Web Crawler

CyberPunk Information Gathering

Last Release: 04/05/2019 Last Commit: 03/16/2022

Introduction

Photon is lightning fast web crawler, designed for recon, which extracts URLs, files, intel & endpoints from a target. It has a Ninja Mode, so you can use over 100 threads and everything will go smoothly.

Photon: Lightning Fast Web Crawler

Photon is very flexible web crawler with various options that allows you to control timeout, delay, add seeds, exclude URLs matching a regex pattern, etc. This extensive range of Photon options gives you freedom to crawl a web as you wish.

In addition, it has a smart thread management and refined logic that gives you high performance. Even besides this, it’s not all that Photon can offer you. You can jump in Ninja Mode (--ninja), and to feel real speed boost, because you’ll have for clients making requests to the same server simultaneously. So, you don’t have to worry about slow connection.

Features (data extraction):

URLs: in-scope & out-of-scope,
URLs with parameters.
Intel: emails, social media accounts, amazon buckets, etc.
Files: pdf, png, xml, etc.

Secret keys: auth/API keys & hashes,
JavaScript files & Endpoints present in them
Strings matching custom regex pattern
Subdomains & DNS related data

Plugins:

wayback, dnsdumpster, Exporter

Supported Platforms:

Linux (Arch, Debian, Ubuntu), Termux, Windows, Mac

Compatibility & Dependencies

Photon is compatible with python 2.x - 3.x. Beside standard python libraries, which comes preinstalled with a python interpreter, Photon requires:

requests
urllib3
argparse

Install

You can launch Photon using a lightweight Python-Alpine (103MB) Docker image:

$ git clone https://github.com/s0md3v/Photon.git
$ cd Photon
$ docker build -t photon
$ docker run -it --name photon photon:latest -u google.com

To view results you head over to the local docker volume, just run:

$ docker inspect photon

or by mounting the target loot folder:

$ docker run -it --name photon -v "$PWD:/Photon/google.com" -u google.com

If you want to install it as a library, just do as follows:

$ pip install photon --user

Update

Photon is under heavy development and updates are frequent and regular. To check for updates, simply type --update.

It has seamless update capabilities which means you can update Photon without losing any of your saved data.

Usage

usage: photon.py [options]

  -u --url              root url
  -l --level            levels to crawl
  -t --threads          number of threads
  -d --delay            delay between requests
  -c --cookie           cookie
  -r --regex            regex pattern
  -s --seeds            additional seed urls
  -e --export           export formatted result
  -o --output           specify output directory
  -v --verbose          verbose output
  --keys                extract secret keys
  --exclude             exclude urls by regex
  --stdout              print a variable to stdout
  --timeout             http requests timeout
  --ninja               ninja mode
  --update              update photon
  --dns                 enumerate subdomains & dns data
  --only-urls           only extract urls
  --wayback             Use URLs from archive.org as seeds
  --user-agent          specify user-agent(s)\

To crawl a single website, simply type -u or --url:

$ python photon.py -u "http://example.com"

Crawler linux macOS Python Windows