This project is a command-line utility that generates an HTML snapshot for each URL in a sitemap.xml file.
It is useful for pre-generating HTML files for website crawlers that do not run JavaScript.
Just in case you need a quick win and don't want to rewrite your whole stack to be server-side rendered :)
npm install sitemap-to-html --save-dev
sitemap-to-html --sitemap sitemap.xml --output build
Provide a regular sitemap.xml file and an output path.
The tool will:
- visit a link found in sitemap.xml with puppeteer
- wait for the page to fully load (no network requests for at least 500ms)
- create an HTML snapshot of the loaded page
- save it in the folder provided in the --output flag (the build folder by default)
- repeat from the first step until all links have been visited
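
The steps above can be sketched roughly as follows. This is only an illustration of the loop, not the tool's actual source: `extractUrls` and `snapshotAll` are hypothetical names, the regex-based sitemap parsing assumes a simple sitemap, and saving the site root as `index.html` is an assumption. The puppeteer `networkidle0` option is used because it waits until there have been no network connections for at least 500 ms.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Hypothetical helper: pull every <loc> value out of a sitemap.xml string.
// A regex is enough for a simple sitemap; a real XML parser is more robust.
function extractUrls(sitemapXml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Hypothetical driver: visit each URL, wait for it to load, save a snapshot.
async function snapshotAll(sitemapFile, outputDir = 'build') {
  // Required lazily so extractUrls can be used without puppeteer installed.
  const puppeteer = require('puppeteer');
  const urls = extractUrls(await fs.readFile(sitemapFile, 'utf8'));
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    for (const url of urls) {
      // Resolves once the network has been idle for at least 500 ms.
      await page.goto(url, { waitUntil: 'networkidle0' });
      const html = await page.content();
      // Assumption: the site root is stored as index.html.
      const name = new URL(url).pathname.replace(/^\/+|\/+$/g, '') || 'index';
      const target = path.join(outputDir, `${name}.html`);
      await fs.mkdir(path.dirname(target), { recursive: true });
      await fs.writeFile(target, html);
    }
  } finally {
    await browser.close();
  }
}
```

Calling `snapshotAll('sitemap.xml', 'build')` would correspond to the command shown earlier, under the assumptions listed.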
In case of nested links, for example website.com/home/page/hello, the output will be saved in build/home/page/hello.html
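
A minimal sketch of how such a mapping from URL to output file could look. `outputPathFor` is a hypothetical name, not part of the tool, and mapping the site root to `index.html` is an assumption; only the nested-link case is taken from the example above.

```javascript
const path = require('path');

// Hypothetical helper: map a page URL to the snapshot file path.
function outputPathFor(url, outputDir = 'build') {
  // Accept URLs written without a scheme, such as website.com/home/page/hello.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(url);
  const parsed = new URL(hasScheme ? url : `https://${url}`);
  const segments = parsed.pathname.split('/').filter(Boolean);
  // Assumption: the site root has no path segments, so name it "index".
  if (segments.length === 0) segments.push('index');
  const fileName = `${segments.pop()}.html`;
  // path.posix keeps forward slashes; use path.join for platform separators.
  return path.posix.join(outputDir, ...segments, fileName);
}

console.log(outputPathFor('website.com/home/page/hello'));
// prints "build/home/page/hello.html"
```

The intermediate segments become nested folders, so they need to exist (for example via `fs.mkdir` with `recursive: true`) before the file is written.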