Heck you can even integrate a full-on requirements management system in them using sphinx-needs https://sphinx-needs.readthedocs.io/en/latest/
Also works with HTML documents produced in other ways.
Wait: also, how is what you're saying different from the built-in singlehtml builder? https://www.sphinx-doc.org/en/master/usage/builders/index.ht...
I always thought CHM files were a nice self-contained option for multi-page HTML docs. (Though they'd happily execute whatever JavaScript the author embedded in there... Maybe that's why they fell out of favor?)
Also if you're going to embed a giant binary blob, please ship a way to extract it.
Getting data from my Python analysis into the reports is tedious at best, and updating numbers at the last minute is hair-pullingly frustrating.
But because of the good wysiwyg I can cheat on my adjustments when I need a graph to go “just there”, I can edit my paragraph wording such that I don’t get an almost completely blank page in between sections, etc, etc, which is important to make a good looking report, imho.
How do you go about that with rst? I’d love to write a templated rst file that can be fed from my Excel sheets and Python scripts, but how do I go about final layout adjustments?
Another (non-Sphinx) thing you can do is just write (portions of) your docx reports directly from Python using python-docx [1]. I use this approach when people give me strict docx templates that need to be filled in from Python in a very specific way. It can drop data-generated tables in at special placeholder sections and everything.
[1] https://python-docx.readthedocs.io/en/latest/
I will say that I've been more and more happy with just using sphinx straight to pdf for very professional looking reports. Given some latex preamble work in the config you can get it looking quite nice. I haven't personally struggled recently with too many egregious formatting issues on the sphinx-built latex stuff. You do have to swap over to landscape mode for large tables, etc. so it takes some work. But you're right that in many cases, formatting issues do still happen, so YMMV.
Another neat trick in sphinx is the csv-table directive [2], which loads table data directly from a csv file you have around, which you can obviously get from your xlsx.
[2] https://docutils.sourceforge.io/docs/ref/rst/directives.html...
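A minimal sketch of that workflow, assuming the analysis results are already available as a list of rows. The helper name write_csv_table, the file name, and the numbers are made up for illustration; only the csv-table directive and its :file: and :header-rows: options come from docutils. The script writes the CSV and returns the directive text you would paste (or generate) into the .rst file:

```python
import csv
from pathlib import Path


def write_csv_table(rows, header, csv_path, title):
    """Write rows to csv_path and return a csv-table directive that loads it.

    Hypothetical helper: it assumes the CSV ends up next to the .rst file,
    so the directive refers to it by bare file name.
    """
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)   # becomes the table header row
        writer.writerows(rows)    # one CSV line per data row
    return (
        f".. csv-table:: {title}\n"
        f"   :file: {csv_path.name}\n"
        f"   :header-rows: 1\n"
    )
```

Calling write_csv_table([("baseline", 0.81), ("tuned", 0.87)], ("model", "accuracy"), "results.csv", "Model accuracy") from the analysis script regenerates the table data, so a last-minute number change only needs a rebuild of the docs.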
1. Great markdown editor with both source and WYSIWYG views
2. Render to a wide range of formats including html, pdf, epub, docx
3. Generate books, web sites, single page docs, presentations
4. Incorporate code (like jupyter) except the source is plain text with fenced blocks
5. Supports code in a number of languages including Python and R.
6. Can use other editors too (iirc there's a plugin for VS Code though never tried it).
7. Built in support for MathJax for mathematical formulae and Mermaid for text-based diagramming with auto inline preview
I prefer it to Word for writing and jupyter for notebooks. No affiliation to Posit, the company that develops both Quarto & RStudio. Just a fan of the products.
--
[0]: https://quarto.org/ [1]: https://posit.co/download/rstudio-desktop/
It senses changes to any file and auto-updates the doc lightning fast - it's far better than LaTeX IMO
[1] https://www.sphinx-doc.org/en/master/usage/restructuredtext/...
Looks like AsciiDoc supports similar latex math blocks [2]. Are there reasons you can't stick with that when doing math?
[2] https://docs.asciidoctor.org/asciidoc/latest/stem/#block
And it's really tweakable, especially with HTML output, where you can provide your own templates, or add in your own CSS, scripts, or even manual tags.
E.g. I don't care about a configurable formatting for bibliography, but I would want a pre-made template that implements the APA bibliography guidelines with all the tiny nuances correctly. I don't want to configure margins for columns, I want a template that does the IEEE formatting standard exactly. (95% compatibility doesn't work, if a single missing feature means the tool can't produce the required document because it's wrong at one spot on page 3, then I'd need to abandon the tool and pick something that works). And crucially, I want the separation between content and formatting so that I can easily take a blob of content that was formatted for one layout and just copy it in a completely different template and have it match the new formatting guidelines, e.g. automatically moving all the image captions to the other side, changing how they're numbered and referenced, etc.
Latex has all this baggage solved, almost everyone who wants a specific format from me will provide a Latex template with their weird typesetting fetishes included, and I just need to provide the content - while any upcoming tool has an uphill battle to become compatible and provide the same things, at the very least pre-made (and well made) templates for all the major formats (each discipline of science generally uses something different).
I have enjoyed including inline code using the literalinclude directive, which allows you to include sections of code directly from a file on disk. This is great because you can cover your example code with unit tests while also talking about it in docs without duplication. You can even use little border comments to mark snippet sections so that it's not sensitive to specific line numbers.
https://www.sphinx-doc.org/en/master/usage/restructuredtext/...
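To make the border-comment idea concrete, here is a small sketch of what marker-based extraction does. It only imitates the effect of literalinclude's :start-after: and :end-before: options and is not Sphinx's own code; the marker strings and the example source are invented for illustration:

```python
def extract_snippet(source, start_after, end_before):
    """Return the lines strictly between the two marker comments.

    Illustration only: mimics the effect of :start-after: / :end-before:.
    """
    lines = source.splitlines()
    # First line that contains the start marker.
    start = next(i for i, line in enumerate(lines) if start_after in line)
    # First later line that contains the end marker.
    end = next(i for i, line in enumerate(lines) if i > start and end_before in line)
    return "\n".join(lines[start + 1:end])


# Hypothetical example file with border comments around the snippet.
EXAMPLE = '''\
import math

# [area-start]
def circle_area(r):
    return math.pi * r ** 2
# [area-end]

def unrelated():
    pass
'''

# The matching directive in an .rst file would look like:
#
#   .. literalinclude:: example.py
#      :start-after: [area-start]
#      :end-before: [area-end]
```

Because the snippet is located by its markers, adding imports or other functions above it does not change what ends up in the docs.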
I don't like the language, the ecosystem is too big, complicated and breaks, but the end result is hard to do any other way.
This applies to both the equations part and the text reflow part (I think of them as separate things, but they usually go together).
It should be possible to write text in HTML or markdown, and write the equations in latex or asciimath, and turn it into a beautiful/article style pdf, but sadly it is not.
Although CSS (colored and rounded boxes and such) + MathJax-SVG also can look nice.
There are so many different ways people could want characters printed on a sheet of virtual paper that the problem is virtually unconstrained in its difficulty.
TeX was a major theoretical advance, and LaTeX is a nice enough UI layer on TeX that has gotten significant traction. But even outside of TeX, it feels like even software like MS Word is impossibly complex and clunky.
You can make something nicer by dramatically simplifying or cutting the feature set. I think that's probably how Google Docs has a pretty simple interface. But I'm not convinced there's a real replacement for the incumbents that simply tries to improve UI without having a deep technical insight about document layout the way Knuth had with TeX.
Unfortunately typst seems to have replicated the primary one: inventing a new Turing-complete programming language rather than piggybacking off an existing one.
It's possible to conceptualize a much better latex but it would take years to build properly and build the ecosystem around it to do all the odd things people need when doing markup requiring 1000-2000 community packages.
Sometimes it feels like they're only using LaTeX because they "learned it in college." You ever notice that? So many people in LaTeX threads say they learned it in college, or they've been using the same setting since college, or whatever. People learn LaTeX to make college papers look nice, and then they never need to configure it again? Isn't that strange?
The worst part, though, is that people complain if you call it latex. Which I think says quite a lot about its userbase.
Of course it will take years to replace LaTeX, but we need to begin working on it.
https://news.ycombinator.com/item?id=39027543
Talks about "htmldocs" (which shows maths formulas on one of their templates) but there are also various other alternatives mentioned in the discussion.
Today, when I saw that I got an invitation to repost this article from the mods, I thought I'd take the time to try it out.
The two commands that the article suggests can be combined into one:
latexmlpost --dest=mydoc.html --format=html5 <(latexml mydoc.tex)
I did a comparison[1] of pdflatex and latexml using some old assignments, and it looks like compiling to HTML isn't fully there yet: the spacing was off in some places, and manual line breaks didn't work. But, I remain hopeful. If this gets polished, viewing LaTeX documents on phones would be much nicer.
LaTeXML is maintained by a team at NIST, and they are actively responding to the bug reports on GitHub issues.
The LaTeX team headed by Frank Mittelbach is also working to add more structural information to the output of LaTeX, which will make compiling to HTML much easier.
Looks like a pipe is also supported; you just need to pass `-` as the name of the file to `latexmlpost`.
latexml mydoc.tex | latexmlpost --dest=mydoc.html --format=html5 -
The mods personally invited you to repost a year later?
But apart from math typesetting, my latex documents were usually very simple: They just used sections, paragraphs, some theorem environments and references to those, perhaps similar to what the stack project uses [3]. Simple latex such as this corresponds relatively directly to HTML (except for the math formulas of course). But many latex to html tools try to implement a full tex engine, which I believe means that they lower the high-level constructs to something more low level (or that's at least my understanding). This results in very complicated HTML documents from even simple latex input documents.
So what would've been needed for me was a tool that can (1) render all math that pdflatex can render, but that apart from math only needs to (2) support a very limited set of other latex features. In a hacky way, (1) can be accomplished by simply using pdflatex to render each formula of a latex document in isolation to a separate pdf, then converting this pdf to svg, and then including this svg in the output HTML in the appropriate position. And (2) is simply a matter of parsing this limited subset of latex. I've prototyped a tool like that here [1]. An example output can be found here [2].
Of course, SVGs are not exactly great for accessibility. But my understanding is that many blind mathematicians are very good at reading latex source code, so perhaps an SVG with alt text set to the latex source for that image is already pretty good.
[1] https://github.com/mbid/latex-to-html
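The per-formula approach can be sketched roughly as follows. This is not the code of the linked prototype: the function name, the file naming scheme, and the use of the standalone document class are assumptions made for illustration, and only inline $...$ math is handled. The sketch prepares one small .tex document per formula and emits an img tag whose alt text is the LaTeX source; actually running pdflatex and a PDF-to-SVG converter on each job is left out:

```python
import html
import re

# Matches $...$ inline math only; display math and escaped \$ are ignored
# to keep the sketch short.
INLINE_MATH = re.compile(r"\$([^$]+)\$")

# Assumed wrapper: one tiny document per formula, cropped to its content.
STANDALONE_TEMPLATE = (
    "\\documentclass[preview]{standalone}\n"
    "\\usepackage{amsmath}\n"
    "\\begin{document}\n"
    "$%s$\n"
    "\\end{document}\n"
)


def split_formulas(paragraph):
    """Return (html_paragraph, tex_jobs) for one paragraph of simple LaTeX.

    tex_jobs maps the SVG file name each img tag points at to the LaTeX
    document that would have to be compiled and converted to produce it.
    """
    tex_jobs = {}
    parts = []
    pos = 0
    for match in INLINE_MATH.finditer(paragraph):
        formula = match.group(1)
        name = f"formula-{len(tex_jobs)}.svg"
        tex_jobs[name] = STANDALONE_TEMPLATE % formula
        # Plain text before the formula is escaped for HTML.
        parts.append(html.escape(paragraph[pos:match.start()]))
        # The LaTeX source doubles as alt text for screen readers.
        parts.append(f'<img src="{name}" alt="{html.escape(formula, quote=True)}">')
        pos = match.end()
    parts.append(html.escape(paragraph[pos:]))
    return "<p>" + "".join(parts) + "</p>", tex_jobs
```

Keeping the LaTeX source in the alt attribute is one way to get the accessibility fallback mentioned above without any extra tooling.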
Also check the diagrams: https://stacks.math.columbia.edu/tag/001U
If anyone can explain to me, a complete noob regarding html, how they achieve this result with html, css and whichever latex engine they use, I would be grateful. I want to make a personal webpage in this style.
Though it's in the roadmap!
Edit: yeah it’s managed through the Eclipse Foundation now. They’re slowly working towards a formal spec, haven’t hit 1.0 yet.
Details here https://gitlab.eclipse.org/eclipse/asciidoc-lang/asciidoc-la...
I personally find asciidoc easier to write manually.
I'm sure with Sphinx and reStructuredText you can get that large-scale document tracking stuff, but with LaTeX it just works for the most part and you don't need to juggle a bunch of different side-projects and extensions. Plus you get things like automatic index generation (for a physical book).
I wrote my thesis (50 pages) and multiple published papers this way. Maybe it seems janky but honestly my experience with Latex and its 10 incompatible compilers and thousands of semi-incompatible packages has been much worse.
I also don't understand why (academic) publishing is so PDF focused. It's a horrible format to read on screens (think multi-column PDFs, and scrolling / jumping up and down to find references), and who actually prints stuff anymore?
The thing I love most about Pandoc is that my notes can just slowly turn into a fully fledged document. Like bullet points - The syntax in Latex is far too verbose to make taking notes with it comfortable.
It's also much easier to extend, I wrote a simple tool that automatically converts URLs into full and correctly formatted citations, so I don't even need a citation manager to get the same results:
The GAN was first introduced in [@gan](https://papers.nips.cc/paper/5423-generative-adversarial-nets).
Turns into https://github.com/phiresky/pandoc-url2cite/blob/master/exam...
Another great project with similar structure is Manubot [3], though the PDFs there are not generated by LaTeX.
[1]: https://pandoc.org/ [2]: https://github.com/phiresky/pandoc-url2cite [3]: https://manubot.org/
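As a rough illustration of the first step such a tool has to perform (this is not how pandoc-url2cite itself is implemented; the regular expression and function name are assumptions), the [@key](url) pairs can be collected from the Markdown source and reduced to ordinary citation keys, leaving a key-to-URL map from which bibliographic metadata could then be fetched:

```python
import re

# [@key](url): a pandoc-style citation key used as link text, followed by a URL.
CITE_LINK = re.compile(r"\[@([\w:.-]+)\]\((https?://[^)\s]+)\)")


def collect_url_citations(markdown):
    """Return (rewritten_markdown, {key: url}).

    The rewritten text keeps only the plain [@key] citation; looking up
    metadata for each URL is deliberately left out of this sketch.
    """
    urls = {}

    def strip_url(match):
        key, url = match.group(1), match.group(2)
        urls[key] = url
        return f"[@{key}]"

    return CITE_LINK.sub(strip_url, markdown), urls
```

Applied to the example sentence above, this leaves "[@gan]" in the text and records the NeurIPS URL under the key gan.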
Because academics still often publish physical books.
You prefer to have lots of tools and write custom extensions to programs. And you'll have to maintain those tools forever, and migrate them when the upstream software breaks, or the links you use die. Most academic authors don't want to do that, and with latex they can take the same typeset equations and diagrams (without learning any new tools):
- Publish a paper
- Write a talk
- Publish a book
- Manage a unified bibliography across all of these
My memory of LaTeX has weakened over the years, since I am not writing long texts with lots of figures and such, but I know it's more than this statement lets on in the article: "Something that is more modern than learning a hundred bits of print typesetting that your student will never, ever need?"
What exactly, in the end, is 'modern'? Is it because there is less syntax in Markdown to remember and the Modern is syntax-averse? :D Aren't there editors for these in the first place to avoid the daily grind of remembering syntax?
I have not been following the development of mathjax, pandoc, etc. carefully, so I'm wondering: Have the main issues been solved? By these I mean
(1) support for most popular packages,
(2) automatically breaking long outputs into small pages that don't overheat my laptop or crash my browser and yet reference each other properly,
(3) printability (without lines broken in half, senseless overflows and the likes) or cross-compilability with a regular PDF compiler?
I know the ar5iv project is getting closer and closer to (1) and (3), but is that available to regular users?
(I've been trying to do 'math on the web' (ish) since 2002, and it's always sucked in some way; all that time, images/PDF have Just Worked(TM). The emphasis in the OP on how much you'll have to report/chip in/fix is telling...)
Why? If you're just printing to read on the train or whatever, wouldn't you just discard after reading?
I use lwarp to make https://tikz.dev/, an HTML version of the TikZ manual, which is probably one of the most complicated LaTeX documents in existence.
Yes, there should be easy ways to display math on the web. No, this doesn't mean that LaTeX is obsolete.
Besides, what about references, both external and internal? Probably needs more "modern" tooling.
That’s a horrible way to go about it. Already in the 90s it was clear that varying display sizes were a problem, and it has gotten orders of magnitude worse since then.
The concept of a single set layout that is suitable for everyone is utterly absurd.
https://pdf2htmlex.github.io/pdf2htmlEX/
Example of a paper with equations:
I would use CSS+HTML for layout, but what do I do about automatically generating tables of contents and indexes?
I guess I could write my own tool for that. Hmm.
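A small tool for the table-of-contents half of that is not much code. The sketch below uses only Python's standard html.parser; it assumes every heading already carries an id attribute, and it produces a flat, indented list rather than properly nested lists. Index generation would need its own markup convention and is not covered. The class and function names are made up for illustration:

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [level, id, text parts] while inside a heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._current = [int(tag[1]), dict(attrs).get("id", ""), []]

    def handle_data(self, data):
        # Text inside nested tags such as <em> is collected too.
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == f"h{self._current[0]}":
            level, anchor, parts = self._current
            self.headings.append((level, anchor, "".join(parts).strip()))
            self._current = None


def build_toc(html_source):
    """Return a <ul> with one link per heading, indented by heading level."""
    collector = HeadingCollector()
    collector.feed(html_source)
    lines = []
    for level, anchor, text in collector.headings:
        indent = "  " * (level - 1)
        lines.append(f'{indent}<li><a href="#{anchor}">{escape(text)}</a></li>')
    return "<ul>\n" + "\n".join(lines) + "\n</ul>"
```

Running build_toc over the finished HTML as a post-processing step keeps the document itself plain HTML+CSS.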
[1] https://github.com/retorquere/zotero-better-bibtex
[2] https://sphinxcontrib-bibtex.readthedocs.io/en/latest/usage....
It's faster than MathJax and also can be pre-rendered on the server (or in your SSG!).
For my notes, for anything that needs to be "live" I use org-mode because:
- it's a far more natural markup than anything else
- it's rendered INLINE, no need to jump between a source form and a rendered one, a thing MD lovers fail to understand
- it's an outlining tool, another thing most other tools fail miserably to understand
- it easily incorporates live things in other languages (org-babel), a thing no modern REPL-alike DocUI like Jupyter can do
Long story short, I prefer the best tool for the job. HTML might be the least common denominator tool, making it the worst in essentially all cases. XML, and SGML in general, are good for machine usage, but they are very impractical in current usage; just see the actual crappy state of things for e-invoicing with XML/XAdES docs + XSL to render them in the end as PDF for the human. They are a good tool in some cases, but again not the best for any specific case.
I was thinking about this recently. If you get pedantic enough* about it, the typesetting quality you can get from a LaTeX+PDF is strictly better than what can be achieved using (sane) HTML.
I wanted to blog in LaTeX, and to solve the screen-size issue I thought I'd pre-bake to a wide range of page geometries, and then serve up an appropriate one to the client using pdf.js.
Fortunately for everyone, I decided against it in the end and continued blogging in markdown+html (with mathml support)
*well beyond what most readers would possibly care about
If you don't need to convert entire LaTeX documents, MathJax and KaTeX are really good at rendering a subset of LaTeX as MathML/SVG. I run MathJax + an xypic extension for commutative diagrams with server-side rendering on my website, and it works great in practice.
A lot of this is just because latex has been a standard for publishers in my field since I started (approximately a thousand years ago).
When writing for journals, latex saves a lot of work. Publishers provide latex templates that ensure that articles have a prescribed format and scope of content. Being able to see a good facsimile of the final published form is quite handy for authors. Oh, this paragraph is going on for over a column -- I'll break it up. That sort of thing.
This still applies when writing for longer things, such as textbooks and course notes, but another factor (for me, the larger one) is that latex (more properly, the tex upon which latex sits) is a programming language. Macros can be written to do lots of things that would be a pain if done manually, and once a macro is written, altering an entire text is easy. I did this in a book I wrote a while back, writing macros to colourize text that would be indexed, add margin notes for things I wanted to return to, categorize paragraphs by function, and so on. I could turn all those macros on and off by uncommenting a line. This is really quite helpful in writing something that takes months to years to complete. Frankly, I use this macro approach even in memos written in markdown. Inside almost all of my markdown documents, there are latex commands.
As for reading things on a small screen, which I guess is really the topic here, I must admit that this is something I rarely do within my own field. Sure, I do it if reading one of those 10-km overview articles in Science or Nature. But when it comes to my own field, things are technical and demand long periods of study. I don't try to read this stuff on the bus or in a coffee queue. I need time (hours or days) and I need to be able to take notes.
Another reason I prefer PDF is that it is fixed. My brain puts information into a sort of spatial framework. Somehow, if I look at a paper I first read 40 years ago, I still know what information is on which page, which of the diagrams summarizes the whole thing, and which of the citations is key. This may be a flaw in my brain functioning, but I just don't find these sorts of memories forming when I read content that has a plastic format, with paragraph breaks changing if I adjust my view. But maybe this is just my age talking, I suppose.
We're almost there for skipping LaTeX entirely, but in my experience, Google Docs and Overleaf still offer vastly superior collaborating tools. Now if we could just edit {.md; .rmd; .ipynb} files directly on Overleaf, with comments and track changes, we'd be in business...
HTML rendering of LaTeX is a godsend, and imnsho asciidoc is a workaround for not fully having that.