linkprobe
Find broken links before your users do.
A CLI tool that recursively crawls your site, checks every URL in parallel, and exports a clean CSV report — with optional email alerts.
Everything you need
Built on the Python standard library. Zero dependencies for the core tool — just clone and scan.
Recursive crawl
BFS traversal from the start URL through all internal pages. Visited-URL tracking prevents infinite loops on any site structure.
External links checked
External URLs are verified for their HTTP status without being crawled — full coverage with no recursion noise.
Parallel requests
Thread pool with configurable workers (default: 10). Scans large sites fast without hammering your server.
CSV + Markdown output
Each scan writes a machine-readable CSV and a human-readable
Markdown summary, organised by timestamp under
scans/.
Status-code filtering
--keep-status-codes 404,500
trims the report to only the codes you care about — no noise,
straight to the problems.
Email notifications
Receive a post-scan alert via the Resend API when broken links are found. Optional 3xx inclusion for redirect awareness.
How it works
A three-stage pipeline — crawl, check, report.
Crawl
BFS from the start URL. Internal links are followed; external links are collected but not recursed into. Fragments are stripped and URL normalisation prevents duplicates.
Check
Every discovered URL is status-checked in parallel. HEAD is attempted first; 405 responses fall back to GET. 3xx codes are recorded as-is — redirects are not followed.
Report
Results are sorted by
(referrer, link) and written to
CSV. A Markdown summary highlights all non-200 links.
Optionally, a Resend email is dispatched.
Quick start
Python 3.10+ required. Core tool needs no packages.
1 — Clone
git clone https://github.com/JeremieLitzler/linkprobe.git
cd linkprobe
# Optional: Resend SDK (only needed for --notify-email)
pip install -r requirements.txt
2 — Run a scan
# Auto-saves to scans/[WEBSITE]/[TIMESTAMP]/ python src/checker.py https://example.com # Custom output file, more workers python src/checker.py https://example.com \ --output report.csv \ --workers 20 # Show only broken links python src/checker.py https://example.com \ --keep-status-codes 404,500
3 — Email alerts (optional)
export RESEND_API_KEY=re_xxxx
export RESEND_FROM_ADDRESS=scanner@yourdomain.com
python src/checker.py https://example.com \
--notify-email you@example.com
CLI options
All flags are optional except
start_url.
| Option | Default | Description |
|---|---|---|
| start_url | required |
The URL to crawl from — must use
http or
https
|
| --output, -o | scans/[site]/[ts]/ | Path to the output CSV file |
| --workers, -w | 10 | Number of parallel threads for status checking |
| --timeout, -t | 10 | Per-request timeout in seconds |
| --user-agent | linkprobe/1.0 | User-Agent header sent with every request |
| --keep-status-codes | all |
Comma-separated codes to retain in the report, e.g.
404,500
|
| --notify-email | off | Recipient address for a post-scan email (requires Resend env vars) |
| --include-3xx-status-code | off | Include 3xx redirects in email notifications |
Output format
Results sorted by referrer then
link — broken links are
immediately actionable.
results.csv
link,referrer,http_status_code https://example.com/about/,https://example.com/,404 https://example.com/blog/,https://example.com/,200 https://example.com/contact/,https://example.com/,200 https://external.com/missing,https://example.com/contact/,404 https://external.com/page,https://example.com/contact/,200
The URL that was checked
The page containing the link — tells you exactly where to fix it
HTTP code or
ERROR:<ExceptionName> for
network failures
Email notifications
Get a summary email whenever broken links are detected. Powered by Resend — set two environment variables and you're done.
| Variable | Description |
|---|---|
| RESEND_API_KEY | API key from your Resend account |
| RESEND_FROM_ADDRESS | Verified sender address in Resend |
export RESEND_API_KEY=re_xxxx
export RESEND_FROM_ADDRESS=scanner@yourdomain.com
python src/checker.py https://example.com \
--notify-email you@example.com \
--include-3xx-status-code
Need help setting it up?
Contact Me