A command-line tool written in Go that extracts URLs from XML sitemaps. It supports:
- Standard XML sitemaps
- Sitemap index files
- Gzipped sitemaps
- Export to file or stdout
# Clone the repository
git clone https://github.com/chandlerroth/sitemap-extract
cd sitemap-extract
# Build the binary
go build -o sitemap
Basic usage to print URLs to stdout:
./sitemap https://example.com/sitemap.xml
Save URLs to a file:
./sitemap -o urls.txt https://example.com/sitemap.xml
Features:
- Extracts URLs from standard XML sitemaps
- Supports sitemap index files (sitemapindex)
- Handles gzipped sitemaps automatically
- Can export URLs to a file or print to stdout
- Follows the Sitemaps XML format protocol
Requirements:
- Go 1.21 or later
Released under the MIT License.