Today I discovered a pretty neat command-line tool, wkhtmltopdf, that prints a website (or a local HTML document) to PDF while preserving the original page style (CSS, images, tables, etc.). In my experience, the result is much truer to the original than what a regular browser produces. It can also run headless, so it integrates nicely into other automation workflows.
It’s very easy to use:
- Run the following command with just two parameters: the source (a URL or a local file path) and the destination file. Add more parameters if you need to tweak things a little:
wkhtmltopdf www.apple.com ~/Desktop/test.pdf
- Enjoy your PDF document, preferably without printing it on paper: by default this tool preserves the original page background, so a printout is likely not the most environmentally friendly choice.
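If you want to call the tool from a script, here is a minimal Python sketch of how that might look. The function names `build_command` and `save_pdf` are my own invention for illustration, not part of wkhtmltopdf. The two optional flags, `--no-background` and `--page-size`, are the ones I understand the tool to accept for dropping the page background and choosing the paper format; check `wkhtmltopdf --help` on your installed version before relying on them.

```python
import os
import shutil
import subprocess


def build_command(source, dest, background=True, page_size=None):
    """Return the wkhtmltopdf argument list for one conversion.

    Hypothetical helper: 'source' is a URL or local HTML path,
    'dest' is the PDF file to write.
    """
    cmd = ["wkhtmltopdf"]
    if not background:
        # Assumed flag: skip the page background (friendlier to printers).
        cmd.append("--no-background")
    if page_size:
        # Assumed flag: paper format such as "A4" or "Letter".
        cmd += ["--page-size", page_size]
    # subprocess does not expand "~" the way a shell does, so do it here.
    cmd += [os.path.expanduser(source), os.path.expanduser(dest)]
    return cmd


def save_pdf(source, dest, **options):
    """Run wkhtmltopdf and raise if it is missing or exits with an error."""
    if shutil.which("wkhtmltopdf") is None:
        raise FileNotFoundError("wkhtmltopdf was not found on PATH")
    subprocess.run(build_command(source, dest, **options), check=True)
```

Calling `save_pdf("www.apple.com", "~/Desktop/test.pdf")` should then be equivalent to the command above, and passing `background=False` gives a version that wastes less ink if you do end up printing it.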
For comparison, here’s the apple.com homepage saved with this tool versus the same page printed to PDF from Safari:
Pretty cool, huh?