Don't write just in plain text (longevity vs. authenticity)

(blog.miris.design)

138 points · yumiris · 4y ago · 128 comments

109 comments · 35 top-level

This essay is actually deeper than its surface appearance, about text versus other formats. It's about semantics and richness of content, although I am not sure Miris fully grasps what s/he is wrestling with.

The author invokes the concept of "authenticity", and that's where it gets interesting.

I used to set my students a question about information content in a class on the philosophy of procedural representation.

We had a very high resolution photo of the aviation pioneer Amelia Earhart, and a short grainy video clip of her getting into a plane and smiling and waving.

My question was: Which one of these two media conveys more information about Amelia?

One gave extraordinary detail of her face and eyes, and seemed to many a much better "fidelity" document. Others noticed that although you couldn't see her face in the video, you could feel from her gait, waving, body language and the way she shook hands _much more_ about her than from the static photo.

Both files are the same size in bytes.

So which one has more "information"? Which one is more "authentic"?

Not to attempt to answer here with a deep dive into phenomenology, but each carries a different kind of information, which can be static, dynamic, or meta-dynamic in higher orders relative to a matrix of assumptions that must be carried forward in parallel by the culture that wants to decode the message later.

I like that Miris tries to explore this by questioning the richness of text. But maybe the question doesn't hold up well under those conditions of investigation - because one might say that a great poet using only a few words might capture a landscape better than a painting, but if our culture drifts toward a visual one where poetry is no longer understood we cannot say that the medium itself degraded.

II2II4y ago

Then there is the value of the written word. While the grainy video may reveal more than the high resolution photograph, the written word may reveal more than a grainy (or high resolution) video. A diary is much better at exposing motivations, emotions, the perceived relevance of events. A newspaper article can articulate the events of a day much better than a film that captures a fragment of time in the framed image of space. It's not that written accounts are necessarily better on these accounts (one could, for example, have a video diary). It is simply that they turn out to be better than the often fragmentary accounts from higher fidelity sources.

(It is also worth noting that these higher fidelity sources are often left to decay or are intentionally destroyed due to the difficulty and expense of maintaining them.)

nicbou4y ago

I agree. My written travel journals are far more interesting than the pictures, because they show how I actually felt while I was there, with all the little stories that go with the trip.

deltarholamda4y ago

I get what the author is trying to say, and I certainly agree. Sometimes you simply need it to look a certain way or use an image that cannot be summarized in text in a meaningful way in order to get your point across.

All those monks slaving away doing doodles in the margins weren't just staving off carpal tunnel. There's meaning there you can't get any other way. Before Man could write, Man made art.

The other HN article about "plaintext only" I also agree with. HTML is the synthesis of the two. Sometimes I forget what a great idea and a blessing HTML is. Even if you don't have a browser that can render it, reading an HTML document isn't difficult if it isn't festooned with auto-generated nonsense.

Moru4y ago

A digital format needs to be as simple as possible. If you need a 1000 page long definition for the word document format to be able to get the information out of it, it's not really good for anything when the format is forgotten.

causi4y ago

> one might say that a great poet using only a few words might capture a landscape better than a painting, but if our culture drifts toward a visual one where poetry is no longer understood we cannot say that the medium itself degraded.

That has little to do with the medium. If the painter takes as much effort as the poet, and the viewer as much time and effort as the reader, just as much information and emotion can be gleaned from the painting as the poem.

jacobr14y ago

The medium can matter in both cases, in the sense that the medium is not just the format, but also the cultural context of interpretation. There can be subtleties in word choice that evoke shared stories, or word connotations or otherwise make reference outside the work itself. You can have the same references in the form of visual symbols, stylistic choices or more. The viewer/reader must make much more effort to gain that cultural context for interpretation, which may very well be lost or degraded over time.

I would posit that while a painting can be very high context, the tendency is for poetry to be even more context dependent. Transplanted outside its native culture, I suspect visual works (again, on the margin) can be grasped with more depth by the viewer than the (on the margin) poem can by a reader.

aasasd4y ago

> very high resolution photo of the aviation pioneer Amelia Earhart

I thought photo quality was rather meh until the 50s or at least the 40s? Even with large films the results are often muddy in olden shots—while 70 mm movie film from the 60s will probably still be redigitized into super-duper-hd formats in the late 21st century (e.g. https://youtu.be/sCv-dIFGcd0).

adzm4y ago

There are a bunch of crisp photographs from the late 30s. The hard part was getting the subject to stay still long enough and not blur the background, since the higher resolution / finer grain required a longer exposure time / wider aperture. You can check out various archives for examples, https://www.shorpy.com/ is one.

aposm4y ago

This is not quite correct... For motion picture film, yes, anything older than the 70mm film you're talking about tends to be low-resolution because of the physical constraints of moving a foot or more of film through a camera each second. However, still photos from that era were much better - the limitations of film stocks were lighting, and with enough light to gather, a large format photo could be extremely sharp and high-resolution (if that was the priority). It sounds like this is referring to a formal portrait setting, with potentially very bright studio lighting and high quality film, so it could easily be sharper than all but very recent digital cameras (as large-format film still is).

kubb4y ago

I think the thought experiment still "works" (as much as philosophers can say something useful about the question it poses) even if the photo was upscaled.

nonrandomstring4y ago

Yeah, it was taken in like 1938 or something. Scanned using modern gear. Of course, it's surprising how much detail comes out of chemical photography of that era. But you're right, the "resolution" is arbitrary and probably oversampled.

nicbou4y ago

Have a look at Ansel Adams's photography. The sharpness is outstanding.

InitialLastName4y ago

It would be interesting to compare the video and the photo to a similarly-sized excerpt of her writing.

gilleain4y ago

I'm curious : what is "meta-dynamic" information?

nonrandomstring4y ago

Good question. So as I see it: something that's about generative behaviours. For example, take the rules that describe the dynamics of generative systems. For Collatz (n%2==0 ? n' = n/2 : n' = 3*n+1) there are specific integers that make it work, while for a Lorenz attractor you'd have three floats that could take different initial values. That would be a conversation about "meta-dynamics".

Or maybe, what changes are behind systems that change?

thomascgalvin4y ago· 15 in thread

This argument feels ... not quite like a strawman, but more pedantic than I think it needs to be.

I don't think anyone really argues that everything should be plain text, even if that's an easy shorthand. The real argument is "use the simplest, most open format possible."

Nobody is suggesting you go through all of your photos, transcribe your emotional reaction to each picture, and then delete the image. But, if you want to view those same photos when you're fifty years old, or seventy-five, you're better off storing them as a JPEG than a PSD, and you're better off storing them on a hard drive you have access to in addition to whatever cloud they're currently occupying.

"Write plain text" is a shorthand for "use open formats." Because so much of what this audience does is text-based, plain text is the most common format we use, from source code to journaling, but that message applies to pretty much anything: if you lock yourself into a proprietary format, or a proprietary editor, you will almost certainly lose data over the long term.

logifail4y ago

> Nobody is suggesting you go through all of your photos, transcribe your emotional reaction to each picture, and then delete the image. But, if you want to view those same photos when you're fifty years old, or seventy-five, you're better off storing them as a JPEG than a PSD, and you're better off storing them on a hard drive you have access to in addition to whatever cloud they're currently occupying.

OTOH there are many photos I have, taken a decade or two ago, where I wish I'd written down my thoughts and reactions at the time, rather than just taken the picture. A picture may be worth a thousand words, but just having lots of pictures and no contemporaneous words, leaves more of a gap the longer ago it was.

spc4764y ago

I have a ton of family photos where I'm lucky if someone scribbled the year on the back; a bit more lucky if it's the month and year. The number of photos where names were written is far smaller ("I remember Bob, but who is standing next to him?" "I don't know.").

Also, the 70s and early 80s were a bit more orange than I remember.

mekoka4y ago

Spot on.

A few years ago, when on a hike, if I came about some beautiful scenery, out came the camera. I'd spend most of the time capturing images with the device, rather than take in the landscape through my own senses.

Later, I'd look at those photos and notice that they failed to convey a great deal of the emotional dimension. Now, I spend more time looking at the landscape, trying to notice all the details, and only take one or two snapshots. The idea of writing down my thoughts and reactions is worthwhile, or for practicality, maybe just audio record them and transcribe later.

Vivtek4y ago

This is an excellent point. I've been trying to go back and fix that, by correlating pictures with journal entries or blog posts and the like. I wish I'd taken better notes about who all these people were and what they meant to me at the time.

Zak4y ago

It's a response to this, which does advocate the use of literal plain text files where possible: https://sive.rs/plaintext

The author mentions converting to other open, text-based formats like HTML and LaTeX for publishing and writes:

> Keep your graphics files alongside your text files. But keep your text as plain text.

pdonis4y ago

> It's a response to this

Seems more like a misunderstanding of it than a response. As you quote explicitly from the Sivers article, he is talking about keeping text as plain text, not about keeping images as plain text. And the Miris article is basically saying the same thing (at the end he even says plain text is still his first choice), yet appears to think he's giving some kind of opposing viewpoint.

freddie_mercury4y ago

I think you've overcorrected in the other direction.

"Write plain text" is definitely not a shorthand for "use open formats".

PDF is an open format.

Approximately nobody who says "write plain text" thinks putting everything in PDF is an acceptable alternative.

They don't even want you writing in HTML, for that matter. They want Markdown.

They really do mean something fairly close to "plain text".

thomascgalvin4y ago

To quote myself:

> The real argument is "use the simplest, most open format possible."

For most collections of words, that means Markdown, not PDF. But if the words you're saving are a mortgage document or power of attorney, PDF is actually a better choice.

selfhoster114y ago

I'm never going to write directly in PDF, but if I want to preserve something? 100% I will save it as a PDF if it's document-like.

raman1624y ago

I got the same feeling as well. Open formats and avoiding proprietary lock-in is what the spirit of "write in plain text" is about.

mark-r4y ago

I took a different conclusion from the plain text article. The argument isn't about open formats, it's really about plain text, simply because there's no need to have a tool to make use of it. Even open formats can get abandoned and become unusable.

necovek4y ago

I took the original article on plain text to mean that you should aim for plain text formats which are human-readable even without specific tools to process them.

Thus HTML, Markdown and LaTeX make sense:

  \begin{document}
  Blah
  ...

Is completely understandable to a reader even 50 years down the line, even if they don't have LaTeX on-hand.

But, it does bring an interesting counter-point: what does $$\frac{1}{n}$$ mean (to not even bring up more complex examples)? It's probably no surprise that LaTeX is the lingua franca of math input because it brings in terseness, simplicity and some readability to plain text. Still, it's a programming language, so literally all bets are off in a document (you can redefine \frac to mean something else entirely).

I guess both articles, as noted elsewhere, attempt to nail down one familiar truth: use the simplest expression possible, but not simpler. One thinks that's always plain-text except for images, but there are just more contexts where this applies.

paxys4y ago

> I don't think anyone really argues that everything should be plain text, even if that's an easy shorthand

Pretty sure this article is a rebuttal to the front page post on HN yesterday which said exactly that

CRConrad4y ago

> Pretty sure this article is a rebuttal to the front page post on HN yesterday which said exactly that

* It may be an attempt at a rebuttal, but in actuality it mostly agrees. But yeah, rather obviously in reaction to that article.

* That article didn't say quite exactly that everything should be plain text; only that most text should be plain text.

Chris20484y ago

> I don't think anyone really argues that everything should be plain text

  Plain text just works, everywhere, all the time.

  -- https://news.ycombinator.com/item?id=30525605

llarsson4y ago· 7 in thread

That "some" proprietary formats from the 80's and 90's are still readable is already causing real problems: because not *all* are. So text, possibly with Markdown or similar hints regarding emphasis and structure, is still vastly better than any alternative I can think of.

dhosek4y ago

I bumped into this recently with a couple Kodak PhotoCDs I uncovered last month. Trying to get the pictures out of the PCD files is turning out to be more challenging than I expected.

js24y ago

I converted mine years ago using iPhoto but according to Wikipedia, there are several programs that can do the conversion:

https://en.wikipedia.org/wiki/Photo_CD#Converting_Photo_CD_i...

mro_name4y ago

what was the issue - hardware access (a CD reader), bit rot or else?

The image format was jpeg if I remember correctly, wasn't it?

deltarholamda4y ago

Is ImageMagick not doing the job? It's been a while, but Photoshop used to be pretty adept at this as well.

c0balt4y ago

Additionally, the option to (relatively) easily transform Markdown or rich text to another format is great when you want to try a new tool and/or format.

eddieroger4y ago

That's the real secret, I think, keeping content up to date. Text isn't the only medium that can be treated that way, too. My family converted lots of analog audio and video tapes to DVD some years ago, and I immediately turned around and ripped them to digital, lossless types and stored them on a few hard drives and eventually a cloud backup. Will FLAC and MP4 last forever? Nope, probably not, but if I check in on these files every few years, update the players that I've saved (VLC), and periodically convert them to newer formats, I feel comfortable that my grandchildren will be able to hear their great-great-grandparents' road trip to California, or see video of my bar mitzvah on whatever screen they're using years from now.

sleepycatgirl4y ago

Yup. Or .org for that matter.

ggm4y ago· 7 in thread

he said.. in courier, monospaced paragraphs format, morally as close to "plaintext" as you can be with a couple of diagrams which could have been ASCII art...

ciphol4y ago

Ironically, the "pro plain text" link posted earlier used lots of formatting.

I don't see why pure plain text is better in any way than plain text with formatting, like a simplified form of HTML (<a>, , , some kind of table formatting, etc). The latter is non-proprietary, easily read and diffed, and communicates better than pure text.

Images have their own value, as do animations and video on occasion. Here matters become more complicated - image formats are generally non-human-readable and non-diffable (though SVG or a similar format could solve those problems for schematic-type images) and image conversions generally involve data loss. For starters, though, one should at least use a non-proprietary format for images and video.

alanbernstein4y ago

As the pro-plain-text post said: HTML, Markdown, JSON, LaTeX, and many other standard formats, are just plain text.

b1124y ago

> like a simplified form of HTML (<a>, , , some kind of table formatting, etc). The latter is non-proprietary, easily read and diffed, and communicates better than pure text.

Yes, but, the problem isn't typically being proprietary, when it comes to future use, but a closed, non standard, unknown format.

Yet you're creating a new standard here, with your own rules, which no one will understand, and which no automated tools can convert to another format.

(Eg some kind of table formatting)

Better to be 100% html than this.

(Maybe you meant that, but regardless, this is a good place for me to comment on standards being more important than anything else.)

lbriner4y ago

Exactly. The OP said what they needed to in plain text because it captures what they wanted to say in the simplest format.

If they had needed to convey an image or contextual information like some rich API spec, they would presumably have used something else.

kleiba4y ago

That is not a contradiction. The OP is just arguing that you should use the best medium + format for the job - and sometimes simple text is sufficient (as it says in the article).

gwern4y ago

And if the diagrams had been ASCII art diagrams (eg https://twitter.com/thorstenball/status/1498541884796542977 ), OP could fix the typo in the diagram with a keystroke.

yumirisOP4y ago

Cheers for the heads-up on the typo. The diagram's an SVG, with the labels being in plain ASCII. One keystroke is all it took indeed! :P

On the serious side, ASCII art diagrams are splendid and I very often use them myself, though they can get quite complex and thus messy to maintain. There comes a certain point where they lose their simplicity, sadly.

aasasd4y ago· 6 in thread

I got quite a lot of use out of metadata over the years, such that now I'll probably get a nervous itch and tremors all over my body if I attempt to use just plain text. Specifically, the creation and modification times for each addition to my notes are rather valuable, especially with the work-from-home lifestyle aka ‘day fades into night into day’—with which more people are gonna be familiarized in these years.

Thankfully I'm using Org-mode these days, which is reasonably ‘plain text’ under practical definitions, but I make dozens of new headings every week, and each of them is stamped with the creation time. But boy do I miss having modification times too; I should probably finally set up automatic commits to Git. Also need to mess with Orgzly so that it marks notes that are created on the phone.

gilleain4y ago

Indeed, digital archives use (I understand) various metadata standards such as:

https://www.dublincore.org/specifications/dublin-core/dcmi-t...

or 'Dublin core' which is RDF.

selfhoster114y ago

I supplement my workflow with some judicious use of text-expander macros. I can type a total of three characters for the current date-stamp, or four for a date + timestamp. This makes it easy to reflexively date literally anything systemwide: from archive filenames, to code comments, to config file tweaks, to actual notes.
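For readers without a text expander, the same habit can be approximated in a few lines. This is only a sketch: the function names and the ISO-style `YYYY-MM-DD` layout are assumptions, since the comment above does not say which format its macros emit.

```python
from datetime import datetime
from typing import Optional


def date_stamp(now: Optional[datetime] = None) -> str:
    """Return a sortable date stamp such as 2022-03-02 (format is an assumption)."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d")


def date_time_stamp(now: Optional[datetime] = None) -> str:
    """Return a date plus 24-hour time stamp such as 2022-03-02 09:05."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d %H:%M")


print(date_stamp())
print(date_time_stamp())
```

A year-first layout is chosen here because such stamps sort chronologically when used as filename prefixes.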

dv35z4y ago

Can you touch on your org-mode journey, setup & current flow? I am just starting the journey - looking to have a fairly coherent notes/todo/planning/contacts/kb system, and have (portions of) it published out to a static website. Emacs is... something else.

aasasd4y ago

Eeeh, I already see from your description that your needs are different from mine, and Emacs and Org-mode tend to be customized by everyone to their smallest wants. You won't find a shortage of articles about Org, including here on HN.

ParetoOptimal4y ago

What are some things you use modification times for?

aasasd4y ago

When I used Evernote and my notes were larger in scale, I mostly used the modification time to figure out how long a particular note was lying around without updates—so abandoned-forgotten projects and such stuff, basically tracking how much I actually use the notes.

(Evernote went to shit over the years, so don't take this as an endorsement.)

Sometimes it's also useful to figure out what I was doing when writing a note, by placing the time among my other activities. This gives some context for the thoughts.

Now that I migrated to outlines and the notes are much more granular, plus I started making more of them—they can often serve as a timestamped log of my day. When did I eat the breakfast—so I can put the dinner in the stomach before it begins an acid-fest? Well, I logged watching an episode of the series during the breakfast, so the creation time tells me the answer.

I'm scatterbrained, okay. Or rather, the notes are part of my ‘brain’ now.

In fact, I do miss granular times in other logs of my activity—ironically, in regard to privacy. I watched a video on a particular topic around last summer, and would like to find it now—but YT's ‘watch history’ is crude and just leafing through all of it is infeasible. (Actually, perhaps I should look into the ‘takeout’ dumps of activity for the timestamps, and make a list of the vids in a better format.)

eatmygodetia4y ago· 5 in thread

I feel like a lot of plain text proponents forget that outside of ASCII and now UTF-8, lots of alleged plain text documents with diacritics or non-Latin characters are at least slightly difficult to open because of their somewhat esoteric encodings. Plain text isn't as universal as it is often claimed, although it is immensely simpler than some other formats.

But maybe we should all use monochrome bitmap files for everything? That would be very simple.

softwarebeware4y ago

Yes, I feel this in my bones as someone who previously worked for a text messaging provider. Plain text has the deceptive appearance of simplicity, but it is actually one of the most maddening things to get right, especially if you intend to support the accurate transmission of said text to any possible text message receiving device in the world.

selfhoster114y ago

If it's 2022 and someone is _still_ saving plaintext in a non-Unicode encoding where going with Unicode is a perfectly viable option, I will personally ensure that (figuratively) they are burnt at the stake.

In addition to UTF-8, my language happens to have ~2 additional code pages/Latin based encodings. Some websites still serve (or very recently used to serve) text files in such broken encodings, so I have to convert such files before use. It's deeply unpleasant. Windows has supported UTF-8 in some fashion for over 15 years, get with the program people.

(I would make an exception for preserving historical non-UTF-8 files in their original byte-exact form, for the same reason that I wouldn't digitise an analogue photograph and then burn the original - but let's be real, all such files have been created by now)

jjav4y ago

That is why I tend to always keep files in plain ASCII, even though two out of my three primary languages need characters not in ASCII.

File longevity wins over grammatical correctness most of the time for me. I have text files going back to the 80s, so I'm glad I didn't use any fancier software to write them as they'd be completely unreadable today.

smasher1644y ago

I think for a plaintext format to be "complete", it needs some mechanism of associating the language with some segment of text. Plaintext formats that don't acknowledge unified characters are just Latin-biased.

eatmygodetia4y ago

that's basically the point - you can open an ascii file now because utf-8 is ascii oriented, but a utf-8 first editor will struggle with an old french text file for example. plain text has inbuilt biases which have changed over time, it's not as pure or as simple as people say.
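A small sketch of the problem described above, under stated assumptions: the sample sentence and the candidate list (`utf-8`, then `cp1252`, then `latin-1`) are illustrative choices, and `decode_with_fallback` is a hypothetical helper, not a standard function. Note that a fallback decoding that succeeds is only a guess; nothing in the bytes themselves records which legacy encoding was used.

```python
from typing import Sequence, Tuple


def decode_with_fallback(
    data: bytes,
    encodings: Sequence[str] = ("utf-8", "cp1252", "latin-1"),
) -> Tuple[str, str]:
    """Try each candidate encoding in order; return (text, encoding_used)."""
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Pure ASCII bytes are valid UTF-8, so old ASCII files open without trouble.
print(decode_with_fallback(b"plain ASCII text"))

# An "old French text file" saved in a legacy single-byte encoding is not
# valid UTF-8, so a UTF-8-first reader has to fall back and guess.
old_french = "élève à l'école".encode("cp1252")
print(decode_with_fallback(old_french))
```

The order of candidates matters: `latin-1` accepts every byte sequence, so it only makes sense as a last resort.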

brians4y ago· 3 in thread

“all the binary formats of the 1990s can be opened today”

Oh, sweet summer child. Scribe/mss. Koalapad. A bunch of Apple 2GS, Apple 3, and Lisa formats. Lotus Improv.

The points about semantics and authenticity are wonderful, but I think the presumption that all formats can be opened is mistaken exactly because those that can’t be opened become effectively invisible and lost.

selfhoster114y ago

Emulation can bring them back in a limited fashion. Though obviously un-marrying them from the original system environment and making them accessible outside of the VM can be a challenge.

photojosh4y ago

This just gave me the completely random idea... (as someone whose parents used to have tons of ClarisWorks docs)... build OCR into the emulator. :)

mark-r4y ago

Survivorship bias it's called.

briandoll4y ago· 3 in thread

I assume this is a response to Derek Sivers post: Write Plain Text Files https://sive.rs/plaintext

I've been using computers daily for about 35 years now and I have a _lot_ of plain text files that I regularly use -- notes, lists, outlines, quotes, links, etc. Does anyone who has been around a while, have a large multi-decade collection of texts that are _not_ plain text? What formats do you use? How do you maintain access to those files over time?

paxys4y ago

My MP3 collection has been going on for at least 25 years and still works perfectly. Same for HTML pages (I have entire websites backed up from the early 90s). I still have Wordstar and Word 1.0 files which I can open and edit. I can't think of too many pieces of software or data formats from the last 30-40 years which achieved some threshold of popularity but have no support today.

briandoll4y ago

MP3 is a good one. Although I had to develop a lot of perl to manage the various incantations of ID3 tags, especially when VBR became popular. MP3 files may still play, but the full experience (properly attributed w/ band, album, song title, song number, album art, etc.) is likely less than perfect over time.

Do Wordstar files open in modern Word applications, even on iOS? That's part of the access aspect over the long term -- files that can be used, everyday, with your daily-driver tools with minimal special software needed.

jjav4y ago

> My MP3 collection has been going on for at least 25 years and still works perfectly.

Mine as well (maybe not quite 25 but close). But music isn't written word, clearly it wouldn't be in an ASCII text file.

The key is universal, non-proprietary formats that are supported by thousands of open source applications. Those are the formats that will last a lifetime and beyond. So, plain text for the written word (HTML counts as plain text, you can read and write it in any plain text editor), JPG for pictures, MP3 for music.

For video there doesn't seem to be an answer that is fully satisfactory, that I feel confident I can still view in 50 years. So I mostly take photos, not much video, since I can't trust the longevity of video.

copperx4y ago· 3 in thread

> but dismissing or abandoning media files is a much more guaranteed potential loss of information – information which plain text cannot capture due to its limitations.

Some examples are sorely needed. How is a Word/InDesign file more authentic than a plain text file? Or is the author talking about media? Is a ProTools session more authentic than Wav files?

coldtea4y ago

>Is a ProTools session more authentic than Wav files?

Dunno about 'authentic', but since the part you've quoted specifically talks about "loss of information", the WAV files indeed incur loss of information compared to a ProTools session.

E.g. if it's a single stereo wav file render, it would miss all the individual channels, for starters.

If it's multiple wav files with all the channels as stems, it will still miss the effect chain settings (and hardcode them in the final result), the MIDI notes (hardcoded as the rendered VST output), session markers, tempo change tracks, and other such things.
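As a rough illustration of how little a plain WAV render describes, the sketch below writes a tiny silent stereo file in memory and reads back what Python's standard-library `wave` reader reports. The helper names and the 44.1 kHz, 16-bit settings are assumptions for the example; the point is only that the reported parameters cover the audio layout, not effect chains, MIDI data or tempo tracks.

```python
import io
import wave


def write_silent_wav(n_frames: int = 4, channels: int = 2,
                     sample_width: int = 2, frame_rate: int = 44100) -> bytes:
    """Build a short silent WAV file in memory and return its bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(sample_width)  # bytes per sample
        writer.setframerate(frame_rate)
        writer.writeframes(b"\x00" * (n_frames * channels * sample_width))
    return buf.getvalue()


def describe_wav(data: bytes) -> dict:
    """Return the audio parameters that the wave reader exposes."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        params = reader.getparams()
    return {
        "channels": params.nchannels,
        "sample_width_bytes": params.sampwidth,
        "frame_rate": params.framerate,
        "frames": params.nframes,
    }


print(describe_wav(write_silent_wav()))
```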

falcolas4y ago

> E.g. if it's a single stereo wav file render, it would miss all the individual channels, for starters.

A DAW session is like notes for writing a book. Not everything is going to make it in, and the choice of what does make it from the notes to the book, and how it's changed, is quite intentional. And I, personally, don't consider a book to be "lossy" or "unauthentic" because it doesn't also come with all the author's notes.

So, if it's not in the final mix, it's because it's not supposed to be in the final mix; it's not that the data is lost because of technical limitations. And like notes from a book, unless you throw them away, they're not going anywhere.

On a more technical note, underneath the hood, the recorded items are all stored as .wav files too...

lallysingh4y ago

I've read some docs with ASCII art diagrams far more complex than the medium really allowed.

I would have preferred PDF/A

dade_4y ago· 3 in thread

MD for all things text and SVG journals for handwritten notes, diagrams, sketches, screenshots. Works great, but haven’t found a way to integrate them beyond using a common set of folders.

mxuribe4y ago

> ...SVG journals for handwritten...

Would you kindly clarify this? Did you mean scan in handwritten material but save it in a scalable image format like SVG? I'm quite interested but maybe i'm not capturing what you mean here, because i have not had my breakfast. :-)

dade_4y ago

I use Write (iPad/Windows/Ubuntu/Android) to take handwritten notes and jot info. I first started using SVG on my Sony eReader, it used this vector format. http://www.styluslabs.com/

necovek4y ago

Not GP and I have no idea if they meant this, but "smart" vectorization/tracing would be ideal when use of touchscreen pens is impossible (still doesn't capture the angle a pen is being held at, but we are getting close).

For the state of the art, look up "image tracing".

amiga12004y ago· 2 in thread

The Epic of Gilgamesh was written in plain text.

wl4y ago

In a complicated, undocumented format that had to be reverse-engineered (Sumerian/Akkadian/Hittite cuneiform).

CRConrad4y ago

And those clay tablets were all craggly-surfaced and far too thick to fit in my DVD player! Shabby archivists, those Babylonians.

titzer4y ago· 1 in thread

> What ultimately matters is that information is captured and preserved as thoroughly as possible. Between a picture that expresses a thousand words, and plain text file that sacrifices its detail and authenticity, why wouldn't we choose the former? Indeed, this question applies even the choice may sacrifice the longevity. What's the point of longevity, when the pursuit of it can compromise our ability to capture the information we may be afraid of possibly losing?

I would contend that capturing a picture is absolutely a massive distortion of reality because reality is three dimensional, exists in many spectra beyond visible light, has sounds, smells, taste, and feeling, and exists in a historical context. The selection of framing, distance, focus, all of these are biases of the photographer. A photo is a lie, too. Just because it's higher resolution doesn't mean it has indeed captured the right information.

Text is a lie too, granted. But in our current digitization zeitgeist, we have forgotten that our media (pictures, video, recordings, not just the TV, cable, and internet) lie to us. Our own bias towards slicing apart the world into computer-digestible bits is just us lying more convincingly to ourselves.

selfhoster114y ago

By that definition, using literally any point of view to capture, measure or describe information is a lie.

I take issue with that. This strips the word "lie" of its time-honoured meaning (~"distorting or fabricating truths to influence decision making or perception"), and dilutes it for when we actually need to call out lies.

jauco4y ago· 1 in thread

Real archivists (as in people that have archivist as a job description and work at places that have “storing data forever” as a mission statement) tend to store the data in multiple formats. The source + a few derivations. They also store a bunch of copies to ward against bitrot. And they periodically compare the copies.

Real archivists use a lot of data :)
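The "periodically compare the copies" step can be sketched as a checksum comparison. This is a minimal Python sketch, not how any particular archive does it: the function names `sha256_of` and `find_divergent_copies` are hypothetical, and treating the most common digest as the intact one is an assumption made for illustration.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_divergent_copies(copies: list[Path]) -> list[Path]:
    """Return the copies whose digest differs from the most common digest.

    Assumption: the majority of copies is intact. On a tie the digest
    seen first wins, so a tie still needs a human to look at it.
    """
    digests = {copy: sha256_of(copy) for copy in copies}
    if not digests:
        return []
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return [copy for copy, digest in digests.items() if digest != majority_digest]
```

Run on a schedule against every stored copy of a file, a non-empty result flags the copies that should be restored from a good one.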

selfhoster114y ago

I think part of the job of an "archivist" archivist (as opposed to an amateur archivist) is making information accessible to others. For that, you need derivations, because nobody will necessarily know how to deploy a Mac OS 7 virtual machine, install Claris Works (or whatever it is), load the original file onto the machine, and then navigate the contemporary UI (with its unusual conventions) to get at the information they wanted. For personal data, I already know how to get an old environment up and running, so I'm happy enough to keep multiple copies of the original and of any software I need to open it.

davbryn14y ago· 1 in thread

"Prioritising the longevity of data can sacrifice the authenticity of what it tries to capture and preserve. When I say authenticity, I refer to how accurate and detailed the data in question preserves a particular state. An original raw image, for example, will capture a landscape much more authentically than written text would. Written text will inevitably comprise of ambiguity and even bias, if not distortion."

Or, you need to become a better writer.

selfhoster114y ago

There's a reason why it's said that a picture is worth a thousand words. There are trade-offs, and at the end of the day some things are more efficiently described in text, and some visually.

jdvh4y ago· 1 in thread

Plain text is so compelling because it's as simple as it gets, you can bring your own editor, you own your own data, and you can use version control.

Text+ is compelling because you can have images and some kind of formatting. You want to store metadata and have backlinks and tags. Ideally with the possibility of collaborative editing.

There should be a way to fuse these two.

Geezus-424y ago

The latter sounds like Obsidian or Logseq or most other markdown editors.

gandalfff4y ago· 1 in thread

Plain text is fine for some things but lacking for others. I like GUIs for formatting. I wouldn't be surprised if my ODTs could be opened a thousand years from now.

CRConrad4y ago

Betcha you'd be surprised to be around to find out, though.

dang4y ago

Related large thread from yesterday:

Write plain text files - https://news.ycombinator.com/item?id=30521545 - March 2022 (345 comments)

yumirisOP4y ago

This was concocted at 5AM -- my apologies for any peculiar sentence structures or odd phrasing.

Will re-re-re-revise it again with fresh eyes after resting 'em!

orzig4y ago

Render to ASCII, everyone wins! (e.g. https://ascii-generator.site/)

Annatar4y ago

"This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

http://catb.org/~esr/writings/taoup/html/ch01s06.html

nicbou4y ago

There doesn't need to be a compromise. You can have both if you keep your data in multiple formats. Storage is cheap and text files are small.

My timeline thing [0] keeps the original archives, stores the timeline entries in a database, and exports them hourly as JSON + files. If the code stops working or the database crashes, the files are still there. The automated backups are there too. No information is lost.

However, the richness is not lost in the process. This timeline has geolocation history, notebook scans and a bunch of other things that don't really translate to plain text.

The most important difference is that I can write to my timeline from my phone. Managing text files across devices is quite troublesome by comparison. If I want plain text out of it, I can write a new Destination that pipes entries to plain text files or to a fax machine.

[0] https://nicolasbouliane.com/projects/timeline
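The "export as JSON + files" idea can be sketched roughly as follows. This is an illustration only: the `Entry` fields and the `export_entries` function are hypothetical, and the linked project's real schema and export code are not shown in this thread.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Entry:
    """Hypothetical timeline entry; not the real project's schema."""
    timestamp: str                     # ISO 8601 string, e.g. "2022-03-03T05:00:00Z"
    kind: str                          # e.g. "note", "location", "scan"
    text: str
    attachment: Optional[str] = None   # relative path of a file kept next to the JSON


def export_entries(entries: list[Entry], destination: Path) -> Path:
    """Write all entries to `destination/entries.json` and return that path."""
    destination.mkdir(parents=True, exist_ok=True)
    out = destination / "entries.json"
    payload = [asdict(entry) for entry in entries]
    out.write_text(json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8")
    return out
```

Because the export is ordinary JSON with attachments stored alongside it, it stays readable even if the database or the application that produced it no longer runs.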

dorfsmay4y ago

Whenever choosing a markup, image format, or other technology, keep the Lindy effect in mind. A boring technology that has been around for a long time will survive a lot longer than a brand new shiny one.

https://en.wikipedia.org/wiki/Lindy_effect

writegit4y ago

Or both?

I have a daemon that watches for binary changes in writing documents.

If changes are identified then it runs:

    $ libreoffice --headless --convert-to txt <CHANGED_FILES>

Then commits the plaintext to a git repo.

Allows for diffs, text search, and "longevity" across "authentic" docs.
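A rough Python sketch of such a watcher, under stated assumptions: the comment does not say how changes are detected, so comparing SHA-256 digests between polls is a guess, the `*.odt` pattern is an assumption, and the names `snapshot`, `changed_files` and `convert_and_commit` are hypothetical. The `--outdir` option of LibreOffice and `git -C` are real options, but this wiring has not been run against a real repository.

```python
import hashlib
import subprocess
from pathlib import Path


def snapshot(directory: Path, pattern: str = "*.odt") -> dict[Path, str]:
    """Map each matching document to the SHA-256 digest of its bytes."""
    return {
        path: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.glob(pattern))
    }


def changed_files(before: dict[Path, str], after: dict[Path, str]) -> list[Path]:
    """Return documents that are new or whose digest changed between snapshots."""
    return [path for path, digest in after.items() if before.get(path) != digest]


def convert_and_commit(paths: list[Path], repo: Path) -> None:
    """Convert the changed documents to text inside `repo`, then commit them."""
    if not paths:
        return
    subprocess.run(
        ["libreoffice", "--headless", "--convert-to", "txt",
         "--outdir", str(repo), *map(str, paths)],
        check=True,
    )
    subprocess.run(["git", "-C", str(repo), "add", "--all"], check=True)
    # git commit exits non-zero when the converted text is unchanged,
    # so its return code is deliberately not checked here.
    subprocess.run(
        ["git", "-C", str(repo), "commit", "-m", "Update plaintext exports"],
        check=False,
    )
```

A polling loop would keep the previous `snapshot`, take a new one every few seconds, and pass `changed_files(previous, current)` to `convert_and_commit`.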

VariableStar4y ago

IMO the question is more about which standards are used than about specifying a particular format. In particular, using open and free standards and formats increases the chance of retrieving and using data after long-term storage. Different formats suit different data types.

highspeedbus4y ago

The Obsidian/Markdown file structure is great for this. It could become a standard "Offline Hypertext" format.

Although text is fully portable, it is limited when you need to link an image or other files. People often forget how useful this concept is.

HTML is not a viable option, as it is awfully verbose for taking a simple note.

Markdown adds just enough semantics while staying perfectly readable in anything from a hex editor to Microsoft Word.

We're in a somewhat critical moment, where markdown can either stay as it is, then dominate and become a godsend format of solid usability for decades, or a harmful feature is added that would slowly drag the whole thing down until the next Just Write Plain Text blog post.

ad404b8a372f2b94y ago

I think longevity is not just an issue of the data format but more so of its organization. It so happens that text files organized using the file system are the most easily producible, maintainable and queryable way to organize data. But other media can have the same properties if they're organized using the file system rather than any complex tools. I have graphs and datasheets that have endured for decades, that I refer to often, and that are easy to find because they are well-named files in well-named folders, even though the formats are comparatively much more complex.

Beldin4y ago

It seems the author overlooked the possibility of writing out the full binary string of whatever format he'd like (i.e., "zero one one ..."), prefaced by instructions on how to parse that.

That would give you great "authenticity" (in his definition) and great longevity.

Not practical for reading back, but that was not the point. With the help of a few simple scripts, writing is easy. So, in the end, not really an argument against storing information exclusively in plaintext.
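One of those "few simple scripts" might look like the Python sketch below. The word choice ("zero"/"one"), the most-significant-bit-first order, and the function names are assumptions made for illustration; the comment only gives "zero one one ..." as the idea.

```python
WORDS = {"0": "zero", "1": "one"}
BITS = {word: bit for bit, word in WORDS.items()}


def bytes_to_words(data: bytes) -> str:
    """Spell out every bit of `data` as a word, most significant bit first."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return " ".join(WORDS[bit] for bit in bits)


def words_to_bytes(text: str) -> bytes:
    """Invert `bytes_to_words`; expects a multiple of eight words."""
    bits = "".join(BITS[word] for word in text.split())
    if len(bits) % 8:
        raise ValueError("bit count is not a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

Each bit becomes a four- or five-character token once the separating space is counted, so the written-out file is roughly 32 to 40 times the size of the original.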

jjice4y ago

We use Google Docs for pretty much all of our docs since they're easy to create, share, and modify, and it works pretty well. I just (selfishly) want a good integrated plain text editor as part of GSuite. Sharing code via Google Docs isn't great, and sometimes I don't want to think about headers and formatting, I just want to use tabs to separate my pieces. That said, I'm definitely in the minority of users and I'll deal with it, not that big of a deal.

thematrixadmin4y ago

What about writing data in Markdown format, stored physically on your own HDD? You can use a bunch of different tools, both online and local, which will probably stay supported in the future. There is also no problem with implementing your own Markdown editor (a nice side or pet project as well). I store everything on a small server that I run on my RPi, accessible from my phone and desktop. If I'd like to show the text to somebody, I can easily copy it as plain text or Word format, or export it to HTML or PDF.

happyglands4y ago

I've struggled with this for quite some time now, and tried almost every tool out there. At the moment, I'm settling with Bear, writing my notes in Markdown. I prefer the ease of using nvAlt but I need the ability to store images and PDFs and I like the fact that it has some very nice export options should I eventually move to another tool, so I don't feel like I'm "locked in".

m348e9124y ago

This might be off topic but in terms of communication such as email, plain text seems the most authentic format to me. For example, if you are one of those sales guys that bolds and highlights the important parts of an email that you send, it's off-putting. The only exception I would give is if you wanted to add an inline image or an emoji -- everything else, plain text.

quasarj4y ago

Wrote a whole article about not using plain text. Used plain text for everything except a useless image. A+++

chaxor4y ago

I like the idea of making a binary file into a plaintext file - but you could store it as the ASCII characters "0000110100111011110001111100101..."

This would be great for many reasons. At the top of that list for example, is getting a lot more use out of those hard drives you paid for.

anotherevan4y ago

Reminds me of the Einstein quote: Make something as simple as possible, but no more so.

Paraphrased: Make your information capture format as simple as possible, but no more so.

a1445c8b4y ago

s/comprise of/comprise/g
