I built this because I wanted to programmatically generate invoices and automatically tailor my resume to jobs, but had no good way of generating well-formatted PDFs. I ended up building a templating-engine-to-Chromium rendering pipeline to generate PDFs and, given the amount of engineering effort involved, turned it into a tool for others who might want to do the same. There's a built-in API (https://htmldocs.com/docs/documents) that you can call to turn JSON into PDFs in a single call.
htmldocs is different from other tools like wkhtmltopdf and WeasyPrint in that it uses Chromium to generate PDFs, meaning that it supports the most modern CSS features and there's minimal drift between the rendered HTML document and the PDF.
Will also consider open sourcing if there's enough interest in the project!
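For anyone curious what the "JSON in, HTML out" half of such a pipeline might look like, here is a minimal sketch using only the Python standard library. It is not htmldocs' actual templating engine or API: the template, the `render_invoice` function, and the JSON field names (`number`, `customer`, `items`) are all made up for illustration.

```python
import json
from html import escape
from string import Template

# Hypothetical invoice template; htmldocs' real template syntax may differ.
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
<style>
  @page { size: A4; margin: 20mm; }
  table { width: 100%; border-collapse: collapse; }
  td, th { border-bottom: 1px solid #ccc; padding: 4px; }
</style>
</head>
<body>
<h1>Invoice $number</h1>
<p>Billed to: $customer</p>
<table>
<tr><th>Item</th><th>Amount</th></tr>
$rows
<tr><th>Total</th><th>$total</th></tr>
</table>
</body>
</html>
""")


def render_invoice(payload: str) -> str:
    """Turn a JSON payload (assumed shape, see above) into an HTML document."""
    data = json.loads(payload)
    # Escape user-supplied strings so they cannot inject markup.
    rows = "\n".join(
        f"<tr><td>{escape(item['name'])}</td><td>{item['amount']:.2f}</td></tr>"
        for item in data["items"]
    )
    total = sum(item["amount"] for item in data["items"])
    return INVOICE_TEMPLATE.substitute(
        number=escape(str(data["number"])),
        customer=escape(data["customer"]),
        rows=rows,
        total=f"{total:.2f}",
    )
```

The resulting HTML string could then be written to a file and handed to headless Chromium for printing (for instance via its `--print-to-pdf` flag), which is the step this sketch leaves out.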
https://www.w3.org/TR/css-page-3/
When I looked into it several years ago, browser support for some critical features wasn't there yet. Not sure whether this has improved.
In principle, this would be a great alternative to proprietary PDF rendering libraries which require you to design your document completely in code (e.g. Java), and to the typical LaTeX approach. You really appreciate the elegance of HTML+CSS once you've had the misfortune of having to do a simple fucking table in LaTeX.
For simple stuff, sure HTML and CSS is great, but when I want something print-perfect I use LaTeX.
A simple table in LaTeX is no harder than in HTML, assuming the same level of knowledge in the tool.
But then when you throw in a simple requirement like "left margins on even pages and right margins on odd pages need to be larger", HTML becomes hell to work in.
HTML and CSS, even with a media query for print, sucks.
@page :left {
margin-left: 20mm;
margin-right: 10mm;
}
@page :right {
margin-left: 10mm;
margin-right: 20mm;
}
instead of needing to install/import different LaTeX packages. I think for most of my current use-cases level 2 has been sufficient but I can see if I can include Paged.js polyfill support by default so there's more support for customizability on that front.
But it’s old. We need something modern, that represents current ways to make documents from code. And that way is CSS. I bet within my lifetime most scientific publications will move to a PDF tool that ingests CSS. We just need to find something open-source and clean that has no missing functionality.
:nth-of-type(2n+1)
Even keeping a BasicTeX environment working requires effort that HTML does not.
WeasyPrint implements its own rendering engine and might support some specific @page properties that Chromium does not but given the complexity of CSS and web browsers, you're generally better off sticking with Chromium due to its history.
Also – an added benefit is that we can use JS libraries like FontAwesome, Bootstrap, KaTeX, etc. which is where other HTML to PDF libraries fall flat.
WeasyPrint is implemented as a from-scratch and specific-purpose rendering engine, so yeah, it’s different. But wkhtmltopdf uses WebKit, meaning it’s much the same as htmldocs, just backed by a different browser engine.
It’s important to realise, though, that using an existing browser engine doesn’t mean everything’s hunky-dory: in fact, when it comes to some of the things you care about with producing PDFs, some things will be worse, and the WeasyPrint approach has significant advantages. Because browsers don’t care about your use case at all. From time to time they’ll improve things incidentally, but all browsers are missing things like a lot of CSS Paged Media stuff, stuff that’s often been specified for a decade or more, where things like WeasyPrint have had them for years and years.
Not at liberty to elaborate on exact details, but not so long ago I had to deal with wkhtmltopdf, when it turned out to be the (still preferred/recommended) PDF rendering solution as part of a major popular web middleware framework, at a large corporate client. I was rather shocked to see a top-tier prestigious international institution working with such outdated tech (albeit certainly in ignorance), but never mind that.
What struck me most was the nature of the bugs I encountered. Probably one of the most baffling: seemingly random changes in the formatting of the output. In the end it turned out to be a Windows-specific problem, where multiple administrators logged onto the Windows Server hosting the web application. Because of different workstation display geometries on their end, they effectively kept changing the display DPI settings of that server (a headless machine, only accessed through RDP). That in turn affected the rendering internals of wkhtmltopdf. Rather hilarious when I finally figured it out. That's when I learned it's best never to use wkhtmltopdf again on any Windows system (if anywhere at all, for that matter).
Wasn't the WebKit core even older than 2014? Maybe something about it being older but then just maintained independently until 2014 .. or something like that? Or maybe my memory is just messed up and failing me.
Either way, what I do remember is my amazement at seeing this (at best) 10-year-old code (arguably of questionable quality to start with and certainly outdated by now) still in use as a go-to solution for rendering PDF from HTML. Ended up replacing it with a puppeteer-based solution. Arguably with its own problems, but less of a black hole than wkhtmltopdf. Especially considering it was (also) rendering user-supplied data. What could ever go wrong, right?
Any suggestions on a web solution that allows non-devs to make great templates would be appreciated.
Historically I've built something simple with Tiny and added a preview button to render, but that's super clunky.
For those interested in giving it a try, please contact us at support at cx-reports dot com. We are providing complimentary licenses to users who are eager to collaborate with us during this phase.
Screenshot: https://pasteboard.co/th5f4s0uVWJH.png
Thanks for your long lived contribution.
It's fun and fast to generate things, but when you get to 1000+ pages for something like a year-long planner, the document can quickly balloon past 70 MB.
So far I have kept my 750-1300 page planner between 3-7 MB.
I will give this a try and compare.
Overall this looks very promising!
Quite frankly, htmldocs is the exact project I've been looking for for months. I'm tired of Word and similar alternatives and wanted something where I can write HTML and CSS3 and convert it to PDF.
You do that, and in a beautiful way!
Some questions: I want to use your product, but I also need to be sure my docs will be available in the future. What's your plan?
- open source? - community/enterprise? - closed source but with a Docker version to run on premise?
Thanks for your answer and, by the way, very good work!
The web version is just the initial step, and will likely open source for people who want to self-host and to increase adoption. For pricing, will probably adopt a model similar to Overleaf where it’s free for most users and maybe charge for team collaboration or have an enterprise license.
But I'll be giving this a try!
> You acknowledge that the Website is a general-purpose search engine and tool. Specifically, but without limitation, the Website allows you to search multiple websites for music. Moreover, the Website is a general-purpose tool that allows you to convert audio files from videos and audio from elsewhere on the Internet. The Website may only be used in accordance with law. We do not encourage, condone, induce or allow any use of the Website that may be in violation of any law.
TOS is pretty general and just there for legal reasons, tl;dr feel free to use as you see fit as long as it's not for anything illegal.
What I like about Typst is that I can use it completely offline and with my editor of choice. Is this planned for htmldocs too?
Typst is great but there are a few key differences:
- full CSS support, allowing for more customizability and familiar styling without a learning curve
- less tailored towards academic use-cases and more towards personal/business use-cases like resumes, invoices, reports. Also adding a template gallery in the near future.
- has a templating engine and API baked in
- can use JS packages and ecosystem (ex. icon sets, fonts, etc.)
I had a similar itch to scratch and I found Quarto (https://quarto.org/) - free, open-source and doesn’t depend on Chrome (admittedly it has other dependencies, but at least not Chrome).
"Typeset and Generate PDFs with HTML/CSS"
really should be the H1 on that page for clarity.
- having a web editor interface without needing to re-run a CLI command
- people who don't want to deal with packages/dependencies on their own system
- users that just want an affordable and easy-to-use API to generate PDF documents instead of having to set up their own server <> task queue <> worker system which would usually cost more
- in the future, teams that want to collaborate together on a document
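To give a rough idea of the "server <> task queue <> worker" setup mentioned in that list, here is a toy in-process version in Python. It only shows the shape of the plumbing: `render_pdf` is a stand-in that returns placeholder bytes instead of driving a real browser, and a production setup would normally use a separate queue service and worker processes rather than threads.

```python
import queue
import threading


def render_pdf(html: str) -> bytes:
    # Placeholder for the real rendering step (e.g. headless Chromium).
    # Returns stub bytes, not a valid PDF.
    return b"%PDF-stub:" + html.encode("utf-8")


def worker(jobs: queue.Queue, results: dict) -> None:
    # Pull jobs until a None sentinel says there is no more work.
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, html = job
        results[job_id] = render_pdf(html)


def run_jobs(documents: dict, num_workers: int = 2) -> dict:
    """Render every document in `documents` (job id -> HTML) using worker threads."""
    jobs: queue.Queue = queue.Queue()
    results: dict = {}
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for job_id, html in documents.items():
        jobs.put((job_id, html))
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Even this toy version needs job IDs, shutdown signalling, and somewhere to collect results, which hints at why a hosted API can be attractive for small projects.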
But
could use some performance improvements...
Maybe you tested this on your state of the art workstation but my i7 8th gen is getting choked to death...
EDIT:
It doesn't seem to work: I picked the resume template and set the margin and padding on `li` to 0 but nothing changed in the output.
> htmldocs.com has not completed the Google verification process. The app is currently being tested, and can only be accessed by developer-approved testers. If you think you should have access, contact the developer.
Uses React Native-like components and styling.
WYSIWYG: https://react-pdf.org/repl
2005: https://alistapart.com/article/boom/
2015: https://www.smashingmagazine.com/2015/01/designing-for-print...
Today, this.
I think you’re onto something here.
I work on an OSS reporting and analytics tool (https://github.com/evidence-dev/evidence) and the amount of time and effort that goes into a really good “print pdf”, and how valuable people find it, has been one of the more surprising parts of the project to me.
I’d also like to have a mode — the default? — where all rendering was being done locally. There are documents I write that just can’t be stored on a third party computer.
Overleaf, to some extent, has shown the appeal for letting people focus on writing 100% of the time and running software 0% of the time. I’d prefer to use your tool though to write a quick letter to go in a parcel as a packing slip.
By the way: on mobile, your animated headline causes the page to jump up and down.
To your point that htmldocs supports more modern CSS features, I can see the advantage. Although the most complicated things I needed - aside from @page rules - were a replacement for a 2nd page header (solved w/ Flexbox) and automatic page breaks that would display correctly (a few lines of CSS processed w/ WeasyPrint came out better and didn't require me to print to PDF manually.)
I've used https://github.com/xpublisher/weasyprint-rest for many years but am now planning to switch away from it, as it has been archived by the maintainer.
All the best for the future of htmldocs.
https://www.msweet.org/htmldoc/
It’s crazy fast, but doesn’t handle Unicode well or have much support for CSS.
One small thing, when signing up with Gmail, it still asks for my Name and Surname when supposedly it's something you can get out of the OAuth APIs.
About 20 years ago, I did something along similar lines, except going the HTML-XML-XSL/FO-Apache FOP route.
That was ... "fun".
I also just open sourced part of my app that does PDF generation, but it’s completely bare bones.
I basically wanted a wrapper around the PDF-generating part of Chromium so that I could generate PDFs on pages within my own app.
It's here if anyone wants to check it out: https://github.com/awongh/bakso-doc