Ask HN: How to learn about text editor architectures and implementations?

195 points · s3arch · 4y ago · 72 comments

I am a self-taught developer with a little over three years of experience. I know JavaScript reasonably well and have full-stack development experience. I recently started admiring text editors. I use VS Code, and I have also used Vim and Emacs. I tried reading their source code, as well as that of Atom, Brackets, Light Table, etc.

Honestly, I don't understand anything. I am not able to make sense of the data flow and the architecture. I want to understand how text editors work under the hood. I also want to understand the pluggable architecture they use to extend the editor's functionality.

Please suggest any articles, videos, conference talks, or blogs where I can pick up the concepts. I have been troubled by this lack of knowledge and by the unclear path to acquiring it.

Edit: Reasons for this quest: I am not here to create yet another text editor. But I do understand that text editors are among the more complex pieces of software and are still under constant improvement and development. Text processing is also one of the core concepts of computer science, and a lot of algorithm and data structure knowledge is hidden inside it. Besides, I feel that one can learn a lot about core computer science foundations through real-world projects.

65 comments · 39 top-level

bakul · 4y ago · 5 in thread

I have a different suggestion. Start with a single line. Define an API for move, insert and delete. Figure out how to display this line, associate the cursor with the current position, and map keystrokes to these simple functions. Add commands to load/save this one-line file. Also add some convenience functions (move to the start/end of a line, etc.). Add simple search, then search and replace. Next, extend this to as many lines as you can display on your screen. Add more commands. Add commands to operate on a sequence of lines. Next, allow an arbitrary number of lines and arbitrary-length lines (now you can see only a rectangular slice of the file). Add commands to move that window, move lines around, etc. Resist the urge to micro-optimize early. Just use the simplest data structures that help you write the clearest code. Later you can profile the code and fix the slow parts.

I suggest this as it will force you to solve problems yourself as opposed to reading about other people’s solutions. Basically, learn by doing. Learn by struggling to come up with solutions and data structures, thereby developing some insight. The stepwise development should help you focus on a small subset of problems at a time. Don’t be afraid of changing data structures as you gain knowledge. In fact, write code so that changing them is easy! Use a language that won’t trip you up on low-level issues such as memory management. The API will make testing easier. Armed with this knowledge, you’ll better appreciate and understand other people’s solutions as well!
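
As a rough illustration of that first step, here is a minimal sketch of a one-line "file" with a move/insert/delete API and a keystroke map. The class name, method names, and key names (`LEFT`, `HOME`, and so on) are all made up for this example; a real editor would get key events from a terminal or GUI library.

```python
class Line:
    """One editable line with a cursor; the whole 'file' is this one string."""

    def __init__(self, text=""):
        self.text = text
        self.cursor = 0  # index between characters, 0..len(text)

    def move(self, delta):
        # Clamp so the cursor can never leave the line.
        self.cursor = max(0, min(len(self.text), self.cursor + delta))

    def move_to_start(self):
        self.cursor = 0

    def move_to_end(self):
        self.cursor = len(self.text)

    def insert(self, s):
        self.text = self.text[:self.cursor] + s + self.text[self.cursor:]
        self.cursor += len(s)

    def delete(self, count=1):
        # Delete `count` characters before the cursor (backspace semantics).
        start = max(0, self.cursor - count)
        self.text = self.text[:start] + self.text[self.cursor:]
        self.cursor = start

    def render(self):
        # Show the cursor as '|' so the state is easy to eyeball and to test.
        return self.text[:self.cursor] + "|" + self.text[self.cursor:]


# Hypothetical key names mapped to API calls.
KEYMAP = {
    "LEFT": lambda line: line.move(-1),
    "RIGHT": lambda line: line.move(1),
    "HOME": Line.move_to_start,
    "END": Line.move_to_end,
    "BACKSPACE": lambda line: line.delete(),
}


def handle_key(line, key):
    action = KEYMAP.get(key)
    if action:
        action(line)
    else:
        line.insert(key)  # anything unmapped is treated as literal text
```

Because the model is separate from any display code, it can be driven from tests by feeding it a list of keys, which is the testing benefit of having an API.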

b3morales · 4y ago

Similar suggestion: if you're trying to learn the architecture of a big established project like Vim or Emacs, don't start at the top and try to follow everything. Start with a small feature you would like to tweak, implement, or understand. How does Emacs render the modeline, for example. Just finding the relevant code in the repo will give you some info. Read through it, follow its calls and the data structures it touches. Getting used to the code style might take a bit. You probably won't understand or retain everything about how this piece works, but in hacking on it you will naturally brush up against other areas and build up your familiarity with the project practices and larger structure. And you can use that as a stepping stone to check out another area of functionality.

TheRealPomax · 4y ago

Except text editors _need_ other people's solutions to learn from, because text editing is one of those cases where "a good implementation for a part of the problem" is a terrible solution for the full problem. Editing implementations for single lines of text do not work well at all for entire text files. Operations become incredibly slow, which is why we invented things like rope data structures and buffered views.

geocar · 4y ago

> Editing implementations for single lines of text do not work well at all for entire text files. Operations become incredibly slow, which is why we invented things like rope data structures and buffered views.

memmove() runs at about 100 gigabytes a second on my computer. That's 100 megabytes every single millisecond. My screen refreshes at about 100 Hz. None of my source code files are even 1 megabyte, let alone 100 megabytes, so at least on my computer, for all of the files I edit, there isn't a single operation where a "rope" or a "buffered view" could offer anything over the naive approach, because I could literally just rewrite the entire file in C 100 times every keystroke, and I think for many operations it is obvious how to do better.

For this reason and others I think it's absolutely plausible for a beginner to teach themselves how to write a good text editor and even improve on things, and I would never want to discourage someone from a discovery they have both the interest and the time for.

s3arch · OP · 4y ago

I really appreciate your thoughts.

The process of building things incrementally: figure out what needs to be built first, implement it, and finally compare your work with others'. To be frank, I used to feel that there must be some deep intellectual concepts involved and that whatever I tried or implemented was just dumb. But after your description, the process and technique seem not so complicated.

hellectronic · 4y ago

Yeah, it's good advice. All architectures are built like this: incrementally.

scandox · 4y ago · 4 in thread

You could start by looking at something super simple like Kilo:

https://github.com/antirez/kilo

Even I could understand this one pretty well and that's no small matter.

dakra · 4y ago

There is also a nice tutorial that guides you through building the kilo editor: https://viewsourcecode.org/snaptoken/kilo/

incanus77 · 4y ago

This. I was going to come here to post this specifically.

fcatalan · 4y ago

I loved following this one a few summers ago and even went beyond the end of the tutorial, adding functionality and trying more advanced data structures. I think I would have ended up with a completely usable hyper-personal editor, but Unicode support didn't look like fun.

sam_lowry_ · 4y ago

The screencast by antirez is a joy to watch.

whartung · 4y ago · 4 in thread

If you're interested in visual editors and if you're looking for something perhaps more accessible (and I can't honestly say how much it really is, I have not looked at it) then consider taking a look at 'less', the pager.

Less does almost everything an "editor" does, at least visually, except change text. It pages through text, forward and backward, line by line, it handles lines that are too long, tab expansion, it searches, even has an extended command set. (Can less do syntax coloring?)

It also handles files too big for memory. These are all editor problems. Mind you, its solutions may not be optimized for an editor, but it's certainly smaller.

Today, modern machines "suffer" from "too much" performance, which actually frees you from having to worry so much about the actual backing store, especially early on. Do you really intend to be editing a 2GB file? Honestly, how big is an average text file? And how many billion cycles per second does a modern CPU handle? Sucking the entire file into RAM and just pushing stuff around with block moves will take you very far on a modern machine. Not that you should not look at the other data structures (there are many), but you don't have to start there, depending on where your interest lies.

Also consider hunting down the book "Software Tools". There are two editions, the original and "Software Tools in Pascal". It's by Kernighan and Plauger. They go through, in detail, writing a version of the 'ed' line editor.

And if you really want to work on an editor, the CP/M world would love a new one. There, it's all about efficiency.

nicoburns · 4y ago

> Do you really intend to be editing a 2GB file?

I don't know about most people, but I deal with such files (usually JSON) on a weekly basis. It's surely one of the things that makes implementing text editors tricky.

whartung · 4y ago

Honest question, do you just open the files, or do you change them and write them back out?

I, too, have dealt with huge files. My primary large file use case for vim is as an interactive grep, opening files, then just performing iterative filters reducing the file to something "manageable". If the file is too big, I might resort to raw streaming filters to temp files before going all vim on it (even vim has issues with huge files -- huge files are issues all their own).

But it's more a matter of pragmatism as to how far one wants to take a pet project like this. There's a curve of diminishing return depending on what someone wants from such a thing.

A single RAM buffer is a bad idea, to be sure, but the raw horsepower of modern systems and CPUs covers up a lot of bad ideas. Simply the idea that you COULD mmap or even malloc a 2G buffer to suck the file in, and it would WORK, is enough to make one's teeth itch. I remember an article from long ago by someone encountering an early workstation with 1G of RAM, and the worlds it opened up. We have come far from those days.

If the OP is interested in buffer mechanics, then starting with an API backed by a simple buffer goes a long way. If they're interested in screen painting, window management, etc, then the RAM buffer may be all they need. Otherwise, they can work on replacing their back end with all of the assorted structures folks have mentioned, see what they like best, as they all have tradeoffs.

hutzlibu · 4y ago

Most people won't; they will just get annoyed if they accidentally open a video and the editor hangs for some time, trying to process it.

And I regularly work with JSON files a few MB in size, which already makes some editors really struggle.

Out of curiosity, why do you have such big files? (A whole DB as a JSON file?) And what editor do you use for them?

cmg · 4y ago

Just this morning I needed to do a global find/replace for a string in a 3GB .sql file. I was in VS Code anyway so tried that - but the app choked on it. Ended up going with sed and it worked perfectly.
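
The reason a stream editor copes where a buffer-everything editor chokes can be sketched in a few lines: it only ever holds one line in memory. This is an illustrative Python sketch, not how sed itself is implemented; the function name is made up, it assumes the search string contains no newline, and it needs disk space for a second copy of the file.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every `old` with `new`, holding one line in memory at a time.

    Returns the number of replacements. Matches that span a line break
    are not found (assumption: `old` contains no newline).
    """
    directory = os.path.dirname(os.path.abspath(path))
    count = 0
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", encoding=encoding, newline="") as dst, \
                open(path, "r", encoding=encoding, newline="") as src:
            for line in src:
                count += line.count(old)
                dst.write(line.replace(old, new))
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return count
```

Memory use depends on the longest line rather than on the file size, so a file with one enormous line (minified JSON, for example) would still be a problem for this approach.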

jesperlang · 4y ago · 3 in thread

The rope data structure is an interesting concept worth checking out!

https://en.wikipedia.org/wiki/Rope_(data_structure)

evanmoran · 4y ago

Glad to see this here. This was my original thought too, as one of the original data structures for text editing. It's used to quickly insert/delete within a very large string.

Has anyone used other mechanisms for large document types? It seems like an array of paragraphs, each containing a simple string, could probably handle most editing tasks and might be easier to lay out and manage. Curious what other data structures people have used!

raphlinus · 4y ago

In my opinion, the rope is easily superior to these other types, but it also depends on whether your language supports abstractions well. The problem with an "array of paragraphs" is that it helps when your problem involves the paragraph boundary, but gets in the way when it doesn't. For example, pressing backspace at the beginning of paragraph 2 causes a merge of paragraphs 1 and 2, which is not trivial. With a rope, it's the same as deleting a character within a simple string, once you have a good rope library under you.

The other big reason to prefer a rope is that the worst case complexity is excellent. Basically all incremental operations are O(log n). With an "array of paragraphs" you get various pathological performance cases such as a huge number of small paragraphs or one very big one.

A good rope implementation is not trivial, but when done right it hides its internal complexity from the layer above. And it's a solved problem. There are at least two or three solid rope crates for Rust (just to pick the language I'm most familiar with), and there will likely be more, as people find it fun to implement.
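
To make the shape of the structure concrete, here is a deliberately small rope sketch in Python. It is not taken from any of the Rust crates mentioned and all names are made up. It also never rebalances the tree or caps leaf sizes, so it does not deliver the O(log n) bound by itself; a production rope does both.

```python
class Leaf:
    def __init__(self, text):
        self.text = text
        self.length = len(text)


class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length  # cached subtree size


def concat(a, b):
    # Drop empty pieces so they do not pile up in the tree.
    if a.length == 0:
        return b
    if b.length == 0:
        return a
    return Node(a, b)


def split(rope, i):
    """Return (left, right), where left holds the first i characters."""
    if isinstance(rope, Leaf):
        return Leaf(rope.text[:i]), Leaf(rope.text[i:])
    if i <= rope.left.length:
        l, r = split(rope.left, i)
        return l, concat(r, rope.right)
    l, r = split(rope.right, i - rope.left.length)
    return concat(rope.left, l), r


def insert(rope, i, text):
    left, right = split(rope, i)
    return concat(concat(left, Leaf(text)), right)


def delete(rope, start, end):
    """Remove the characters in [start, end)."""
    left, rest = split(rope, start)
    _, right = split(rest, end - start)
    return concat(left, right)


def to_string(rope):
    # Iterative in-order walk, so deep trees do not hit the recursion limit here.
    parts, stack = [], [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, Leaf):
            parts.append(node.text)
        else:
            stack.append(node.right)
            stack.append(node.left)
    return "".join(parts)
```

In this model, backspace at the start of paragraph 2 is just `delete(rope, i - 1, i)` where position `i - 1` happens to hold the newline; no paragraph-merging special case is needed. Every operation returns a new rope and leaves the old one intact.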

jhallenworld · 4y ago

I used a doubly linked list of gap buffers (each in its own fixed-length block that can be swapped out to disk) for JOE. It works great on large files, but still I would start with a rope these days.

Gap buffer was appealing on really slow machines, where you are counting cycles on each key-press. If the gap is at the right position, the cycle count is very low. But you can probably do the same even with a tree-structure: you need to keep a pointer to the leaf in the abstract pointer used for the cursor.
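
For readers who have not met the structure, a single gap buffer can be sketched as below. This is a generic textbook version in Python with made-up names, not JOE's code (which, per the comment, chains many fixed-size gap buffers together). The point to notice is that typing at the gap writes into free slots without moving any other text.

```python
class GapBuffer:
    """Text stored in one array with a movable run of free slots (the gap)."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)     # first free slot
        self.gap_end = len(self.buf)   # one past the last free slot

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_gap(self, pos):
        # Shift characters across the gap until it starts at `pos`.
        pos = max(0, min(len(self), pos))
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, extra):
        # Add `extra` free slots at the end of the gap.
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        self.move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(len(text) + 16)
        for ch in text:
            # Cheap case: fill the gap, nothing else moves.
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        # Deleting forward from `pos` just widens the gap.
        self.move_gap(pos)
        self.gap_end = min(len(self.buf), self.gap_end + count)

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

The cost shows up only when the cursor jumps far away and the next edit has to drag the gap across the text in between.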

Also I would tie in the undo system to the data structure if possible. Rope does this with copy-on-write. Every version of the file could be a different top-node, and most middle and leaf nodes would be shared between revisions.
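
The sharing idea can be shown without a full rope. In this simplified sketch each version of the file is an immutable tuple of lines, and an edit builds a new tuple that reuses every unchanged line object. That is a stand-in for copy-on-write tree nodes, not the real thing: copying the tuple is still O(number of lines), whereas a tree would copy only the path to the changed leaf. All names are made up.

```python
class History:
    """Undo/redo where each version is an immutable snapshot sharing lines."""

    def __init__(self, text=""):
        self.versions = [tuple(text.split("\n"))]
        self.current = 0

    @property
    def lines(self):
        return self.versions[self.current]

    def replace_line(self, row, new_text):
        old = self.lines
        # New top-level tuple; untouched line strings are shared, not copied.
        new = old[:row] + (new_text,) + old[row + 1:]
        # A fresh edit discards any versions that were available for redo.
        del self.versions[self.current + 1:]
        self.versions.append(new)
        self.current += 1

    def undo(self):
        if self.current > 0:
            self.current -= 1

    def redo(self):
        if self.current < len(self.versions) - 1:
            self.current += 1
```

Undo and redo become pointer moves between existing versions, which is the property the comment is after.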

teddyh · 4y ago · 3 in thread

You want this:

The Craft of Text Editing

—or—

Emacs for the Modern World

–by–

Craig A. Finseth

https://www.finseth.com/craft/

buescher · 4y ago

And the earlier EMACS: The Extensible, Customizable Display Editor by rms https://www.gnu.org/software/emacs/emacs-paper.html

which is more of a 10,000 foot view.

s3arch · OP · 4y ago

This is awesome.

compressedgas · 4y ago

I like Finseth's 1980 thesis better than the 1991 book

Theory and practice of text editors, or, A cookbook for an Emacs

https://dspace.mit.edu/handle/1721.1/15905

antidnan · 4y ago · 2 in thread

ProseMirror is a good place to start. https://prosemirror.net/docs/guide/ https://marijnhaverbeke.nl/blog/prosemirror.html

It has a well-architected model, including plugins to extend functionality; see https://tiptap.dev/ which is built on top of ProseMirror

bonus: The author talks about collaborative editing here: https://marijnhaverbeke.nl/blog/collaborative-editing.html

codethief · 4y ago

I came here to mention Marijn Haverbeke's blog! His articles (about ProseMirror but also CodeMirror) are full of insights!

evilhackerdude · 4y ago

Seconded! Marijn has thought about this stuff a _lot_.

grisha · 4y ago · 1 in thread

An excellent tutorial for building a terminal-based text editor from scratch: https://viewsourcecode.org/snaptoken/kilo/

timw4mail · 4y ago

There's also a Rust-native version: https://www.philippflenker.com/hecto/

munificent · 4y ago · 1 in thread

I have a little fantasy console project I tinker on that has a built-in text editor. Working on that has been an insightful trip into all of the subtleties of text editing that we intuitively know but don't know we know. Here's one I stumbled onto recently. Say your file looks like this, with the cursor at `|`:

    12345|678
    123
    12345678

Press arrow key down once, and you get:

    12345678
    123|
    12345678

Because the second line's length is too short to preserve the cursor column, the cursor snaps to the end of the line. Note that the cursor is really here. If the user were to press left once, it would take them to:

    12|3

Instead of pressing left, say the user presses down again. The cursor is currently on column 4, so you'd expect:

    12345678
    123
    123|45678

But in the text editors I've tested, you get:

    12345678
    123
    12345|678

So the cursor is snapped to the end of the line on line 2, but if the user keeps cursoring past that, the original cursor column is remembered and restored. The text editor has to display where the cursor currently is, but track where it "wants" to be if the line length weren't getting in the way.

This is definitely useful behavior. If you arrow down through a bunch of lines of various lengths, it's really annoying if the cursor starts drifting left. But implementing it correctly was much more subtle than I expected.

There are all sorts of other edge cases too. If a user presses Command-Up to move the cursor to the beginning of the file, and then presses Down, does the cursor always stay on column one, or does it remember the original column before Command-Up was pressed?
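
One way to get the behavior described above is to store two columns: the one that is drawn and the one the cursor "wants". The sketch below is an assumption about how this could be done, with made-up names, not the code from the fantasy console project.

```python
class Cursor:
    """Tracks where the cursor is drawn and where it 'wants' to be."""

    def __init__(self, lines, row=0, col=0):
        self.lines = lines
        self.row = row
        self.col = col          # column actually displayed
        self.wanted_col = col   # column to restore when a line is long enough

    def move_vertical(self, delta):
        self.row = max(0, min(len(self.lines) - 1, self.row + delta))
        # Snap to the end of a short line, but leave wanted_col untouched.
        self.col = min(self.wanted_col, len(self.lines[self.row]))

    def move_horizontal(self, delta):
        line_len = len(self.lines[self.row])
        self.col = max(0, min(line_len, self.col + delta))
        # Horizontal movement is an explicit choice, so it resets the target.
        self.wanted_col = self.col

    def render(self):
        # Return the lines with '|' marking the cursor, as in the example above.
        out = []
        for r, line in enumerate(self.lines):
            if r == self.row:
                line = line[:self.col] + "|" + line[self.col:]
            out.append(line)
        return out
```

Starting at `12345|678` and calling `move_vertical(1)` twice gives `123|` and then `12345|678`. Whether a jump such as Command-Up should also reset `wanted_col` is exactly the kind of decision raised above; this sketch does not take a position on it.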

dllthomas · 4y ago

This is used as a mechanic in the demo of vim-adventures, incidentally.

Someone · 4y ago · 1 in thread

https://news.ycombinator.com/item?id=20603567, discussing https://viewsourcecode.org/snaptoken/kilo/index.html probably will give you some hints.

Architecture-wise, you can start with an ordered list of lines, with each line stored as a string.

Features that complicate things are:

- supporting large documents and staying speedy (“replace all” is a good test case)

- supporting line wrapping or proportional fonts (makes it harder to translate between screen locations and (line, character) offsets)

- supporting Unicode (makes it harder to translate between screen locations and (line, byte position) offsets)

- syntax-colouring

- plug-ins

- regular expression based search (fairly simple for single-line search _if_ you store each line as a string; harder for custom data structures, as you can’t just use a regexp library)

- supporting larger-than-memory files (especially on systems without virtual memory, but I think that’s somewhat of a lost art)

- safely saving documents even if the disk doesn’t have space for two files (a lost art. Might not even have been fully solved, ever)
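
The "ordered list of lines" starting point might look like the sketch below (names are made up). Columns here are Python string indices, that is, Unicode code points, which is neither a byte offset nor a screen column; that gap is the Unicode complication from the list above.

```python
class Document:
    """A document stored as an ordered list of line strings."""

    def __init__(self, text=""):
        self.lines = text.split("\n")

    def insert(self, row, col, s):
        """Insert `s` (assumed to contain no newline) at (row, col)."""
        line = self.lines[row]
        self.lines[row] = line[:col] + s + line[col:]

    def split_line(self, row, col):
        """What pressing Enter does: one line becomes two."""
        line = self.lines[row]
        self.lines[row:row + 1] = [line[:col], line[col:]]

    def backspace(self, row, col):
        """Delete the character before (row, col); return the new cursor."""
        if col > 0:
            line = self.lines[row]
            self.lines[row] = line[:col - 1] + line[col:]
            return row, col - 1
        if row == 0:
            return row, col  # nothing to delete at the very start
        # At column 0 the "character" removed is the line break: merge lines.
        prev_len = len(self.lines[row - 1])
        self.lines[row - 1] += self.lines.pop(row)
        return row - 1, prev_len

    def replace_all(self, old, new):
        """Single-line replace across the document; returns the match count."""
        count = 0
        for i, line in enumerate(self.lines):
            count += line.count(old)
            self.lines[i] = line.replace(old, new)
        return count

    def text(self):
        return "\n".join(self.lines)
```

Note the special case in `backspace` for column 0: line boundaries need their own code path in this representation, and `replace_all` cannot match across a line break.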

Edit: you also want to look at https://news.ycombinator.com/item?id=11244103, discussing https://ecc-comp.blogspot.com/2015/05/a-brief-glance-at-how-...

mkhnews · 4y ago

Yeah, a doubly linked list of lines, or a gap buffer. And how to display it all is another big part.

danielbarla · 4y ago · 1 in thread

Object oriented programming and design patterns in particular get a bad rap these days, however, the original Design Patterns book [1] has a case study chapter about designing a WYSIWYG document editor. Also, one of the authors, Erich Gamma, joined Microsoft in 2011, and works on the Monaco suite of components that VS Code is built on top of. So, while I am sure there's a fair bit of difference in the years since they wrote that book, as well as the needs of implementation in JS, I'd say it's a fairly good deep dive into some of the topics from one of the actual architects behind it.

Fair warning though, it's a fairly hard book to read. For a lighter, more fun intro to design patterns in particular, I always recommend Game Programming Patterns [2]

[1] https://www.amazon.com/Design-Patterns-Elements-Reusable-Obj...

[2] https://gameprogrammingpatterns.com/contents.html

forinti · 4y ago

I was once tasked with writing a simple text editor. I knew this book inside out so I decided to try putting the text into a tree the way it describes.

This made selecting text quite hard. So I gave up and just put the whole thing into an array. It made everything a lot easier.

bebop · 4y ago · 1 in thread

Ewig is an interesting implementation using immutable data structures. https://github.com/arximboldi/ewig It is very much a proof of concept and tries to be a little Emacs-like. Might be worth checking out.

otikik · 4y ago

+1 to this; also take a look at its author's presentation about it: https://www.youtube.com/watch?v=sPhpelUfu8Q

nurbel · 4y ago

Have you seen The Architecture of Open Source Applications book [1]? There is a chapter on Eclipse and its plugin system in it [2]. There is lots of other interesting content too, though not really related to text editing.

[1]: http://aosabook.org/en/index.html [2]: http://aosabook.org/en/eclipse.html

bmitc · 4y ago

I’m interested in this as well. I know nothing myself, but I know of a few resources.

I know of The Craft of Text Editing.

https://www.finseth.com/craft/

There are also the Racket editors, which include a text editor control, in the Racket Graphical Interface Toolkit. It’s what is used to implement the DrRacket IDE.

https://docs.racket-lang.org/gui/editor-overview.html

marttt · 4y ago

Rob Pike has published several great papers about sam, the Plan 9 text editor he wrote.

1. General overview of the editor; maybe scroll to the "Implementation" section here: http://doc.cat-v.org/plan_9/4th_edition/papers/sam/

Some of the references at the end of that paper may also be relevant or interesting.

2. Tutorial for the command language: http://doc.cat-v.org/bell_labs/sam_lang_tutorial/sam_tut.pdf

3. And explanations on structural regular expressions that sam uses: http://doc.cat-v.org/bell_labs/structural_regexps/se.pdf

4. Pike's paper presenting Acme, the (sort-of) follow-up editor to sam: http://doc.cat-v.org/plan_9/4th_edition/papers/acme/

mattferderer · 4y ago

This article on VS Code has been on my read list for a while. Maybe you'll get to it faster than me.

Bracket pair colorization 10,000x faster - https://code.visualstudio.com/blogs/2021/09/29/bracket-pair-...

caconym_ · 4y ago

I wrote a toy text editor for fun a while back. I was aware of the 'rope' and 'gap buffer' data structures, but beyond that I had no knowledge of how "mainstream" text editors are put together. I still don't, really, but I feel that I did get an understanding of many of the core problems.

My editor had a modular architecture flexible enough to implement different input modes, including a mostly-complete subset of Vim bindings, and was fast enough to open and edit files on the order of a few hundred megabytes without perceptible slowdown. I'm sure my implementation would have looked insane to anybody who's worked on a real text editor, but I was fairly proud of it myself.

Anyway, I guess the point is that it was interesting and rewarding to navigate the core challenges myself. I'm not sure I would have gotten as much out of trying to understand how a massive project like vscode is put together, since the actual text editing functionality is (presumably) a comparatively small part of the software as a whole.

danking00 · 4y ago

I think this might be the kind of content you seek:

https://news.ycombinator.com/item?id=11244103

It’s about data structures for editable text. Surprisingly complex!

atfzl · 4y ago

xi-editor https://xi-editor.io/docs.html

It is written in Rust and has docs about the rope data structure and editor architecture.

jdefelice · 4y ago

Not sure if this is entirely what you are after but Andreas Kling of SerenityOS builds an IDE from scratch[1] for his own operating system. While I don't personally know C++ I found the videos interesting and fun to watch.

[1] DevTools hacking playlist - https://www.youtube.com/playlist?list=PLMOpZvQB55bfeIHSA71J8...

andrewstuart · 4y ago

>> I tried reading their source code, also of atom, brackets, light table etc. Honestly I don't understand anything. I am not able to make sense of the data flow and the architecture.

Even so, this is exactly the right thing to do, except probably you are studying editors that are too big for your purposes - find something smaller in scope.

You won't understand without significant effort - you need to put in the work.

So, here's what I suggest:

1: keep examining the source code of various editors - however, focus on trying to find small editors that are very focused in what they do. Also look for old editors for operating systems like DOS - they might be smaller in scope and therefore easier to understand. Also look at editors in other languages, such as Pascal.

2: The essence of learning is to implement - so start writing the smallest editor you can.

3: continue searching for and reading any sort of written documentation/blog posts/articles as you are doing.

Eventually the light will switch on.

tester34 · 4y ago

Honestly, nothing gives as much experience/knowledge as trying to write it yourself and then reading about how other people do it, e.g. the VS Code blog:

https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r...

coutego · 4y ago

I was in the same boat many years ago, when I co-founded MonoDevelop, a port of SharpDevelop to Mono. I started by porting the text component, which was the core of the text editor itself.

There was a book explaining the implementation of this IDE:

https://www.amazon.com/Dissecting-C-Application-Inside-Sharp...

I think you might find it interesting for what you are trying to do, even though it's a bit old.

RapperWhoMadeIt · 4y ago

As far as I know, some of these terminal text editors use the ncurses library to handle their "frontend", that is, the way they display text in your terminal. As mentioned in another comment, it is essential that you feel comfortable reading C code (with a decent knowledge of syscalls) in order to understand these programs' source code. Once you have reached that level, you can start learning about ncurses with this basic tutorial [0].

[0] https://tldp.org/HOWTO/NCURSES-Programming-HOWTO/

jmiskovic · 4y ago

Lately I've been studying the lite editor, which is mostly Lua code. It's quite easy to read the code and follow along. Before that I wrote my own editor, which kind of works, but mostly it made me appreciate lite's design.

    https://github.com/rxi/lite
    https://rxi.github.io/lite_an_implementation_overview.html
    https://rxi.github.io/a_simple_undo_system.html

cdrini · 4y ago

If you're looking for more CS fundamental things--which I take to mean slightly more theoretical--you might be more interested in programming language design. A lot of the features that make an IDE powerful, things like tracking variable references, resolving/unioning types for better autocomplete, dependency graphs, tree parsing, etc., are based on programming language design/theory. I haven't used this, but this seems like an interesting open book on the subject: https://craftinginterpreters.com/

If you're more interested in complexity management, which I would call more software engineering-y, I'm not sure what the best way might be! I know one cool thing about VS Code is the Language Server Protocol (LSP). It provides all the IDE goodies for the languages VS Code supports through an abstract interface that any plugin can implement. The spec has been so popular that it's now supported in a bunch of editors! So you could develop a language server for, e.g., Python, and then just have a light wrapper for Vim, VS Code, and Emacs.
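
To give a feel for how thin that interface is on the wire: the LSP base protocol sends JSON-RPC messages, each preceded by a `Content-Length` header and a blank line. The sketch below frames and unframes one message. The helper names and the file URI are made up, and real clients also deal with partial reads, the initialize handshake, and much more.

```python
import json


def encode_message(payload):
    """Frame a JSON-RPC payload the way the LSP base protocol expects."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data):
    """Parse one framed message from bytes; return (payload, leftover bytes)."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header")
    length = None
    for field in header.split(b"\r\n"):
        name, _, value = field.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length")
    if len(rest) < length:
        raise ValueError("incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# A hover request: "what is under the cursor at line 0, character 4?"
# The URI is a placeholder for illustration.
hover_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/hover",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.py"},
        "position": {"line": 0, "character": 4},
    },
}
```

The editor side only needs to produce and consume messages like `hover_request`; all language knowledge lives in the server process, which is why one server can back several editors.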

buescher · 4y ago

It's from a completely different time and point of view than "full-stack development", vi, or emacs, but Petter Hesselberg's Programming Industrial Strength Windows: Shrink-Wrap Your App! presents a complete text editor for Windows as an example of a complete "real-world" Windows application.

ketanmaheshwari · 4y ago

I would suggest starting by getting familiar with the C programming language. I suggest using Beej's guide to C. Next, pick up "The Linux Programming Interface" by Michael Kerrisk. As you read the book, you will get better at understanding the source code of at least Vim and Emacs, if not others.

cellularmitosis · 4y ago

Gary Bernhardt of destroyallsoftware did a screencast episode about that: https://www.destroyallsoftware.com/screencasts/catalog/text-...

Note that it’s behind a paywall (his catalog is likely worth it for a curious hacker such as yourself).

Watch his “compiler from scratch” for free to get a sense of what his casts are like: https://www.destroyallsoftware.com/screencasts/catalog/a-com...

cartesius13 · 4y ago

Emacs and vim are way too complicated to just jump directly to the source code. Try something smaller and more manageable. I would personally recommend vis[1]. Also, take a look at https://texteditors.org/ to discover new editors and resources on design and implementation

[1] https://github.com/martanne/vis

epberry · 4y ago

Here are two articles on building a SQL editor.

- https://arctype.com/blog/sql-query-editor-switch/

- https://arctype.com/blog/indexeddb-localstorage/

Disclaimer: I wrote the 2nd one. My general takeaway is that it gets interesting quickly as the amount of text edited increases.

schemathings · 4y ago

'Challenging projects every programmer should try' - first example is a text editor with some pointers to some data structures to consider, some design patterns to use, and a few links to other resources. https://austinhenley.com/blog/challengingprojects.html

jhallenworld · 4y ago

Read technical documentation for existing editors, for example:

https://sourceforge.net/p/joe-editor/mercurial/ci/default/tr...

mxstbr · 4y ago

If you're curious about looking at something novel and ambitious, Zed has a really interesting writeup of their… unusual architecture involving a GPU-powered UI framework and CRDTs written in Rust: https://zed.dev/

wnolens · 4y ago

Off topic/meta: I really really love this question, and appreciate so many good replies. <3 HN

cx0der · 4y ago

Neatpad https://www.catch22.net/tuts/neatpad

This is a series that I used to follow as I too wanted to implement my own text editor to learn the underlying architecture.

mathieubordere · 4y ago

A small `vi` implementation that might be understandable -> https://git.busybox.net/busybox/tree/editors/vi.c

armchairhacker · 4y ago

See RSyntaxTextArea (an open-source code editor component) and the underlying JTextArea / Swing code.

There are a few weird design decisions, especially in Swing, but overall it's very easy to read and understand.

maxk42 · 4y ago

vim is kinda hard to follow and emacs is probably too complex. Start with some simpler open-source editors like joe, nano, or gEdit. neovim is probably a lot easier to follow than vim, as well. You could also peruse these github projects: https://github.com/collections/text-editors

amichail · 4y ago

For WYSIWYG scientific/math editors, see the open source TeXmacs.

j / k navigate · click thread line to collapse

72 comments

65 comments · 39 top-level

bakul4y ago· 5 in thread

b3morales4y ago

TheRealPomax4y ago

geocar4y ago

3 more replies

s3archOP4y ago

I really appreciate your thoughts.

hellectronic4y ago

Yeah its good advice. All Architectures are build like this. Inkrementally.

scandox4y ago· 4 in thread

You could start by looking at something super simple like Kilo:

https://github.com/antirez/kilo

Even I could understand this one pretty well and that's no small matter.

dakra4y ago

There is also a nice tutorial that guides you through building the kilo editor: https://viewsourcecode.org/snaptoken/kilo/

incanus774y ago

This. I was going to come here to post this specifically.

fcatalan4y ago

sam_lowry_4y ago

The screencast by antirez is a joy to watch.

1 more reply

whartung4y ago· 4 in thread

It also handles files too big for memory. These are all editor problems. Mind its solutions may not be optimized for an editor, but it's certainly smaller.

And if you really want to work on an editor, the CP/M world would love a new one. There, it's all about efficiency.

nicoburns4y ago

> Do you really intend to be editing a 2GB file?

I don't know about most people, but I deal with such files (usually JSON) on a weekly basis. It's surely one of the things that makes implementing text editors tricky.

whartung4y ago

Honest question, do you just open the files, or do you change them and write them back out?

But it's more a matter of pragmatism as to how far one wants to take a pet project like this. There's a curve of diminishing return depending on what someone wants from such a thing.

1 more reply

hutzlibu4y ago

Most people won't, they will just get annoyed, if they accidently open a video and the editor hangs up for some time, trying to process it.

And I regulary work with JSON files the size of some MB, which already puts some editors into a real struggle.

Out of curiosity, why do you have so big files? (A whole DB as a json file?) And what editor do you use for them?

1 more reply

cmg4y ago

jesperlang4y ago· 3 in thread

The rope data structure is an interesting concept worth checking out!

https://en.wikipedia.org/wiki/Rope_(data_structure)

evanmoran4y ago

Glad to see this here. This was my original thought too, as one of the original data structures for text editing. It's used to quickly insert/delete within a very large string.
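For anyone who wants to see the idea in code, here is a minimal sketch of a rope, assuming an unbalanced binary tree with string leaves. The names (`Leaf`, `Node`, `split`, `concat`) are made up for illustration and are not taken from any particular editor. A real rope would also rebalance the tree and cap the leaf size.

```python
class Leaf:
    """A leaf holds a short run of text."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)


class Node:
    """An inner node caches the total length of its two subtrees."""
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length


def concat(a, b):
    """Join two ropes without copying their text; empty ropes are dropped."""
    if a.length == 0:
        return b
    if b.length == 0:
        return a
    return Node(a, b)


def split(rope, i):
    """Return (first i characters, the rest) as two ropes."""
    if isinstance(rope, Leaf):
        # Only the one leaf that contains position i gets copied.
        return Leaf(rope.text[:i]), Leaf(rope.text[i:])
    if i <= rope.left.length:
        a, b = split(rope.left, i)
        return a, concat(b, rope.right)
    a, b = split(rope.right, i - rope.left.length)
    return concat(rope.left, a), b


def insert(rope, i, text):
    """Insert text at character offset i, returning a new rope."""
    left, right = split(rope, i)
    return concat(concat(left, Leaf(text)), right)


def delete(rope, start, end):
    """Remove characters in [start, end), returning a new rope."""
    left, rest = split(rope, start)
    _, right = split(rest, end - start)
    return concat(left, right)


def to_string(rope):
    """Flatten the rope back into an ordinary string."""
    if isinstance(rope, Leaf):
        return rope.text
    return to_string(rope.left) + to_string(rope.right)
```

Usage: `insert(Leaf("hello world"), 5, ",")` gives a rope that flattens to `"hello, world"`. An edit only touches the nodes on the path to the edit position, which is why this works well for very large strings as long as the tree stays balanced.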

raphlinus4y ago


jhallenworld4y ago

I used a doubly linked list of gap buffers (each in its own fixed-length block that can be swapped out to disk) for JOE. It works great on large files, but I would still start with a rope these days.
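As a rough illustration of the building block, here is a minimal sketch of a single gap buffer. It is not JOE's code: it leaves out the linked list of blocks and the swapping to disk, and the class and method names are invented for this example. The idea is to keep the unused space (the gap) at the edit position so that typing there is cheap.

```python
class GapBuffer:
    """Text stored in one array with a movable gap of free slots."""

    def __init__(self, text="", capacity=16):
        capacity = max(capacity, len(text) + 1)
        self.buf = list(text) + [None] * (capacity - len(text))
        self.gap_start = len(text)   # first free slot
        self.gap_end = capacity      # one past the last free slot

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_gap(self, pos):
        """Shift characters across the gap until it starts at pos."""
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:          # move gap left
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:          # move gap right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, needed):
        """Widen the gap so it can hold at least `needed` characters."""
        gap = self.gap_end - self.gap_start
        if gap >= needed:
            return
        extra = max(needed - gap, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        self.move_gap(pos)
        self._grow(len(text))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        """Delete `count` characters after pos by swallowing them into the gap."""
        self.move_gap(pos)
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Consecutive edits at the same spot cost almost nothing; the expensive part is moving the gap a long way, which is one reason to split a big file into many small blocks.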

teddyh4y ago· 3 in thread

You want this:

The Craft of Text Editing

—or—

Emacs for the Modern World

–by–

Craig A. Finseth

https://www.finseth.com/craft/

buescher4y ago

And the earlier EMACS: The Extensible, Customizable Display Editor by rms https://www.gnu.org/software/emacs/emacs-paper.html

which is more of a 10,000 foot view.

s3archOP4y ago

This is awesome.

compressedgas4y ago

I like Finseth's 1980 thesis better than the 1991 book

Theory and practice of text editors, or, A cookbook for an Emacs

https://dspace.mit.edu/handle/1721.1/15905

antidnan4y ago· 2 in thread

Prosemirror is a good place to start. https://prosemirror.net/docs/guide/ https://marijnhaverbeke.nl/blog/prosemirror.html

They have a well-architected model, including plugins to extend functionality; see https://tiptap.dev/ which is built on top of ProseMirror.

bonus: The author talks about collaborative editing here: https://marijnhaverbeke.nl/blog/collaborative-editing.html

codethief4y ago

I came here to mention Marijn Haverbeke's blog! His articles (about ProseMirror but also CodeMirror) are full of insights!

evilhackerdude4y ago

Seconded! Marijn has thought about this stuff a _lot_.

grisha4y ago· 1 in thread

An excellent tutorial for terminal based text editor from scratch: https://viewsourcecode.org/snaptoken/kilo/

timw4mail4y ago

There's also a Rust-native version: https://www.philippflenker.com/hecto/

munificent4y ago· 1 in thread

    12345|678
    123
    12345678

Press arrow key down once, and you get:

    12345678
    123|
    12345678

    12|3

Instead of pressing left, say the user presses down again. The cursor is currently on column 4, so you'd expect:

    12345678
    123
    123|45678

But in the text editors I've tested, you get:

    12345678
    123
    12345|678
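One way to get that last behaviour is to remember a separate "goal" column that only horizontal movement resets. This is a minimal sketch under that assumption; the `Cursor` class and its field names are hypothetical and not taken from any of the editors mentioned.

```python
class Cursor:
    """Cursor over a list of lines, with a remembered goal column."""

    def __init__(self, lines, row=0, col=0):
        self.lines = lines
        self.row = row
        self.col = col
        self.goal_col = col  # the column the user last chose explicitly

    def _move_vertical(self, delta):
        new_row = self.row + delta
        if 0 <= new_row < len(self.lines):
            self.row = new_row
            # Clamp to the line length, but keep goal_col untouched.
            self.col = min(self.goal_col, len(self.lines[new_row]))

    def down(self):
        self._move_vertical(1)

    def up(self):
        self._move_vertical(-1)

    def left(self):
        if self.col > 0:
            self.col -= 1
        self.goal_col = self.col  # horizontal movement resets the goal

    def right(self):
        if self.col < len(self.lines[self.row]):
            self.col += 1
        self.goal_col = self.col
```

With the three lines from the example and the cursor starting at `12345|678`, two calls to `down()` land on `123|` and then back on `12345|678`. Calling `left()` in between resets the goal, so the second `down()` lands on `12|345678` instead.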

dllthomas4y ago

This is used as a mechanic in the demo of vim-adventures, incidentally.

Someone4y ago· 1 in thread

https://news.ycombinator.com/item?id=20603567, discussing https://viewsourcecode.org/snaptoken/kilo/index.html probably will give you some hints.

Architecture-wise, you can start with an ordered list of lines, with each line stored as a string.

Features that complicate things are:

- supporting large documents and staying speedy (“replace all” is a good test case)

- supporting line wrapping or proportional fonts (makes it harder to translate between screen locations and (line, character) offsets)

- supporting Unicode (makes it harder to translate between screen locations and (line, byte position) offsets)

- syntax-colouring

- plug-ins

- regular expression based search (fairly simple for single-line search _if_ you store each line as a string; harder for custom data structures, as you can’t just use a regexp library)

- supporting larger-than-memory files (especially on systems without virtual memory, but I think that’s somewhat of a lost art)

- safely saving documents even if the disk doesn’t have space for two files (a lost art. Might not even have been fully solved, ever)

Edit: you also want to look at https://news.ycombinator.com/item?id=11244103, discussing https://ecc-comp.blogspot.com/2015/05/a-brief-glance-at-how-...
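The "ordered list of lines, each stored as a string" starting point can be sketched roughly like this. It is only an illustration: the `LineBuffer` name and its methods are invented here, positions are (row, column) pairs counted in Python characters, and none of the complicating features from the list above are handled.

```python
class LineBuffer:
    """Document stored as a plain list of strings, one per line."""

    def __init__(self, text=""):
        self.lines = text.split("\n")

    def insert(self, row, col, text):
        """Insert text (which may contain newlines) at (row, col)."""
        line = self.lines[row]
        pieces = (line[:col] + text + line[col:]).split("\n")
        self.lines[row:row + 1] = pieces

    def delete(self, row, col, end_row, end_col):
        """Delete everything from (row, col) up to (end_row, end_col)."""
        head = self.lines[row][:col]
        tail = self.lines[end_row][end_col:]
        self.lines[row:end_row + 1] = [head + tail]

    def replace_all(self, old, new):
        """Single-line replace; old and new must not contain newlines."""
        if not old or "\n" in old or "\n" in new:
            raise ValueError("old must be non-empty and neither may contain a newline")
        count = 0
        for i, line in enumerate(self.lines):
            count += line.count(old)
            self.lines[i] = line.replace(old, new)
        return count

    def text(self):
        return "\n".join(self.lines)
```

Inserting or deleting a line shifts every later entry in the list, and `replace_all` visits every line, which is where the "large documents" item starts to hurt and the fancier data structures come in.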

mkhnews4y ago

Yeah, a doubly linked list of lines, or what some call a gap buffer. And how to display it all is another big part.

danielbarla4y ago· 1 in thread

Fair warning though, it's a fairly hard book to read. For a lighter, more fun intro to design patterns in particular, I always recommend Game Programming Patterns [2]

[1] https://www.amazon.com/Design-Patterns-Elements-Reusable-Obj...

[2] https://gameprogrammingpatterns.com/contents.html

forinti4y ago

I was once tasked with writing a simple text editor. I knew this book inside out so I decided to try putting the text into a tree the way it describes.

This made selecting text quite hard. So I gave up and just put the whole thing into an array. It made everything a lot easier.

bebop4y ago· 1 in thread

Ewig is an interesting implementation using immutable data structures: https://github.com/arximboldi/ewig It is very much a proof of concept and tries to be a little Emacs-like. Might be worth checking out.

otikik4y ago

+1 to this. Also take a look at its author's presentation about it: https://www.youtube.com/watch?v=sPhpelUfu8Q

nurbel4y ago

[1]: http://aosabook.org/en/index.html [2]: http://aosabook.org/en/eclipse.html

bmitc4y ago

I’m interested in this as well. I know nothing about it myself, but I know of a few resources.

I know of The Craft of Text Editing.

https://www.finseth.com/craft/

There are also the Racket editors, which include a text editor control, in the Racket Graphical Interface Toolkit. It’s what is used to implement the DrRacket IDE.

https://docs.racket-lang.org/gui/editor-overview.html

marttt4y ago

Rob Pike has published several great papers about sam, the Plan 9 text editor he wrote.

1. General overview of the editor; maybe scroll to the "Implementation" section here: http://doc.cat-v.org/plan_9/4th_edition/papers/sam/

Some of the references at the end of that paper may also be relevant or interesting.

2. Tutorial for the command language: http://doc.cat-v.org/bell_labs/sam_lang_tutorial/sam_tut.pdf

3. And explanations on structural regular expressions that sam uses: http://doc.cat-v.org/bell_labs/structural_regexps/se.pdf

4. Pike's paper presenting Acme, the (sort-of) follow-up editor to sam: http://doc.cat-v.org/plan_9/4th_edition/papers/acme/

mattferderer4y ago

This article on VS Code has been on my read list for a while. Maybe you'll get to it faster than me.

Bracket pair colorization 10,000x faster - https://code.visualstudio.com/blogs/2021/09/29/bracket-pair-...

caconym_4y ago

danking004y ago

I think this might be the kind of content you seek:

https://news.ycombinator.com/item?id=11244103

It’s about data structures for editable text. Surprisingly complex!

atfzl4y ago

xi-editor https://xi-editor.io/docs.html

This is written in rust and has docs about rope data structure and editor architecture.

jdefelice4y ago

[1] DevTools hacking playlist - https://www.youtube.com/playlist?list=PLMOpZvQB55bfeIHSA71J8...

andrewstuart4y ago

>> I tried reading their source code, also of atom, brackets, light table etc. Honestly I don't understand anything. I am not able to make sense of the data flow and the architecture.

Even so, this is exactly the right thing to do, except probably you are studying editors that are too big for your purposes - find something smaller in scope.

You won't understand without significant effort - you need to put in the work.

So, here's what I suggest:

2: The essence of learning is to implement - so start writing the smallest editor you can.

3: continue searching for and reading any sort of written documentation/blog posts/articles as you are doing.

Eventually the light will switch on.

tester344y ago

Honestly nothing gives as much experience/knowledge

as trying to write it yourself

and then reading about how other people do it e.g VS Code blog

https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r...

coutego4y ago

I was in the same boat many years ago, when I co-founded MonoDevelop, a port of #SharpDevelop to Mono. I started by porting the text component, which was the core of the text editor itself.

There was a book explaining the implementation of this IDE:

https://www.amazon.com/Dissecting-C-Application-Inside-Sharp...

I think you might find it interesting for what you are trying to do, even though it's a bit old.

RapperWhoMadeIt4y ago

[0] https://tldp.org/HOWTO/NCURSES-Programming-HOWTO/

jmiskovic4y ago

    https://github.com/rxi/lite
    https://rxi.github.io/lite_an_implementation_overview.html
    https://rxi.github.io/a_simple_undo_system.html

cdrini4y ago

buescher4y ago

ketanmaheshwari4y ago

cellularmitosis4y ago

Gary Bernhardt of destroyallsoftware did a screencast episode about that: https://www.destroyallsoftware.com/screencasts/catalog/text-...

Note that it’s behind a paywall (his catalog is likely worth it for a curious hacker such as yourself).

Watch his “compiler from scratch” for free to get a sense of what his casts are like: https://www.destroyallsoftware.com/screencasts/catalog/a-com...

cartesius134y ago

[1]https://github.com/martanne/vis

epberry4y ago

Here are two articles on building a SQL editor.

- https://arctype.com/blog/sql-query-editor-switch/

- https://arctype.com/blog/indexeddb-localstorage/

Disclaimer: I wrote the 2nd one. My general takeaway is that it gets interesting quickly as the amount of text edited increases.

schemathings4y ago

jhallenworld4y ago

Read technical documentation for existing editors, for example:

https://sourceforge.net/p/joe-editor/mercurial/ci/default/tr...

mxstbr4y ago

wnolens4y ago

Off topic/meta: I really really love this question, and appreciate so many good replies. <3 HN

cx0der4y ago

Neatpad https://www.catch22.net/tuts/neatpad

This is a series that I used to follow as I too wanted to implement my own text editor to learn the underlying architecture.

mathieubordere4y ago

A small `vi` implementation that might be understandable -> https://git.busybox.net/busybox/tree/editors/vi.c

armchairhacker4y ago

See RSyntaxTextArea (open-source code editor) and the underlying JTextArea / Swing code.

There are a few weird design decisions, especially in Swing, but overall it's very easy to read and understand.

maxk424y ago

amichail4y ago

For WYSIWYG scientific/math editors, see the open source TeXmacs.
