Boy do I. One of my biggest annoyances is receiving an invoice in PDF format where I either cannot select the text at all, or cannot select it cleanly, i.e. when I try to select something it somehow half-highlights the line above as well, so I am not sure what is on my clipboard and need to paste it temporarily into a text editor, then select what I need ... etc.
Super nice when they list IBAN numbers for payment in a tiny font size as well.
Maybe I should vibecode a little helper tool to visually select a rectangle, perform OCR, and detect IBAN numbers or show a popup with the proper text so I can make my subselection.
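For what it's worth, the IBAN-detection half of such a helper is small. Below is a rough Python sketch, assuming the OCR step has already produced plain uppercase text. The names `IBAN_CANDIDATE`, `is_valid_iban` and `find_ibans` are made up for illustration. It only checks the generic ISO 13616 mod-97 checksum and the overall length, not the per-country length, so treat a hit as "probably an IBAN" rather than a guarantee.

```python
import re

# Candidate shape: country code, two check digits, then 11 to 30 more
# alphanumerics, optionally separated by single spaces as printed on invoices.
IBAN_CANDIDATE = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]){11,30}\b")


def is_valid_iban(candidate: str) -> bool:
    """Check the generic mod-97 checksum (no per-country length table)."""
    iban = candidate.replace(" ", "").upper()
    if not 15 <= len(iban) <= 34:
        return False
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]+", iban):
        return False
    # Move the first four characters to the end, map A..Z to 10..35,
    # and the resulting number must leave remainder 1 when divided by 97.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list[str]:
    """Return checksum-valid IBANs found in text, without spaces."""
    found: list[str] = []
    for match in IBAN_CANDIDATE.finditer(text):
        groups = match.group(0).split()
        # The greedy pattern can swallow a trailing word such as "BIC",
        # so drop groups from the right until the checksum passes.
        for end in range(len(groups), 0, -1):
            compact = "".join(groups[:end])
            if is_valid_iban(compact):
                if compact not in found:
                    found.append(compact)
                break
    return found
```

Calling `find_ibans(ocr_text)` on the text of the selected rectangle would give you the list to show in the popup. The checksum is the useful part here, since it rejects most OCR misreads of a single character.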
I wanted a way to make submitting Inventory Changes at work easier, so I took the PDF and used StirlingPDF to convert it to an HTML bundle in a zip. I converted the .png with the form border and symbols to base64, and then wrote a PowerShell script to replace <p> tags with variable data from a CSV export of our inventory data. (I tried to use ODBC to extract it, but once a dev showed me the logical, physical, and views that made up our Inventory lookup, I went back to using an xlsx export that is built into our environment and letting the ps1 trim and sanitize the input.) As the conversion places text with absolute positioning, I was able to fine-tune the layout and spacing. I then used my local AI, qwen3.6-27b, to convert my ps1 to a single-file HTML webapp with HTML/CSS/JS and no external framework; two JS scripts are loaded via CDN for now.
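In case anyone wants to try the same trick, here is a minimal Python sketch of the idea (my real script is PowerShell, so this is an illustration, not what I actually run). It assumes the converted HTML has been hand-edited so that each absolutely positioned <p> holds a `{{column}}` placeholder and the background image source is the literal token `BACKGROUND_SRC`. Both conventions, and all function names, are made up for this example.

```python
import base64
import csv
import html
import io
import re

# Hypothetical placeholder convention: {{column_name}} inside each <p>.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def png_data_uri(png_bytes: bytes) -> str:
    """Inline the form background so the page needs no external image."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


def fill_template(template: str, row: dict) -> str:
    """Replace each placeholder with the trimmed, HTML-escaped CSV value."""
    def substitute(match: re.Match) -> str:
        value = row.get(match.group(1)) or ""
        return html.escape(value.strip())

    return PLACEHOLDER.sub(substitute, template)


def render_forms(template: str, csv_text: str, png_bytes: bytes) -> list[str]:
    """Produce one filled-in HTML page per CSV row."""
    # BACKGROUND_SRC is an assumed marker in the template, not a standard.
    page = template.replace("BACKGROUND_SRC", png_data_uri(png_bytes))
    reader = csv.DictReader(io.StringIO(csv_text))
    return [fill_template(page, row) for row in reader]
```

Because the text elements keep their absolute positions from the conversion, only the text inside the tags changes, so the layout you tuned once stays put for every row.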
Inspired by how well that worked, I vibecoded a drag-and-drop editor to build forms for other processes: I upload a PNG, which gets converted to base64, and then I can drag and drop text elements to where they need to be and export.
I know how many people feel about AI-coded projects, so these are really only for me. I didn't expect my coworkers to adopt them or anything, but they did.
Recently I released the https://polotno.com/render-tag/ library to render rich text into a 2d canvas context. It turns out it was very easy to adapt it to work with the pdflib library (via a 2d canvas <-> pdf context proxy). I was able to render a good set of rich text features. I'm thinking of making that bridge open source as well. Maybe you would be interested in that?
I'm months into building a pasteboard transform library that normalises provider-specific data from VS Code, Google Docs, PDFs and a bunch of Chromium apps, so I can start pasting everything everywhere exactly how I want it. It's much, much messier than I expected.
Apps put different UTTypes on the pasteboard that are not really compatible with each other. Usually there's a plain text fallback, then rich text/HTML, then provider-specific data. You show how much insane work is needed just to make text selectable with glyph mappings, layout, links, code blocks, rendered styles, etc. But once you copy from that PDF, most viewers still only expose raw text, and often broken raw text at that...
We have responsive and open standards like HTML and EPUB (zipped XHTML) and they work great. arXiv has HTML papers, and Libgen and Anna's Archive often have EPUB versions of books. The issue for me with EPUB is the lack of good readers now.
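To make the "zipped XHTML" point concrete, here is a small Python sketch that opens an EPUB with nothing but the standard library and lists its content documents in reading order. It assumes a well-formed EPUB (a `META-INF/container.xml` pointing at an OPF package file) and does no error handling. The function name `spine_documents` is just for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub: zipfile.ZipFile) -> list[str]:
    """Return the zip paths of the content documents in reading order."""
    # container.xml names the package (OPF) file inside the archive.
    container = ET.fromstring(epub.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    opf_path = rootfile.get("full-path")

    # The manifest maps ids to files; the spine lists ids in reading order.
    package = ET.fromstring(epub.read(opf_path))
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
    }
    base = posixpath.dirname(opf_path)  # hrefs are relative to the OPF file
    return [
        posixpath.join(base, manifest[ref.get("idref")])
        for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
    ]
```

Each returned path can be read with `epub.read(path)` and is ordinary XHTML, which is why a browser engine gets you most of the way to a reader.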
There's even a package (cmarker) that can translate Markdown to Typst, which could be enough for an MVP.