- What problem does the software solve, and - roughly - how?
- How does it look? If it has any visual component, even if it's a CLI interface that's meant for human consumption, screenshots are mandatory. Screenshots + videos/ASCIInemas preferable.
- If there are alternatives you know of, mention them - be honest about when an alternative is a better fit than your project.
- What platforms does it work on? What are the requirements?
- How do I install it? In exact steps, for most common workflows. If that gets too long, link to a separate page containing these detailed instructions.
- How do I run it? Examples of common use cases, with exact invocations/procedures to perform.
- Any relevant remarks that could prevent me from installing or using the software.
- Links to further docs, project webpage, communities, etc.
When I evaluate a project - even briefly, skimming the repo README - these are all the points I'm on the lookout for - they're all helpful for deciding whether to look closer at the project, and possibly install and use it.
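A checklist like this could be roughly automated. Below is a minimal sketch, assuming the README is Markdown with `#`-style headings. The keyword lists and the `missing_sections` helper are hypothetical, made up for illustration. A heading match only hints that a topic is covered and says nothing about whether it is covered well.

```python
import re

# Hypothetical checklist: each topic from the list above is mapped to
# keywords whose presence in a heading suggests the README covers it.
CHECKLIST = {
    "what/why": ["about", "overview", "introduction", "what is"],
    "screenshots": ["screenshot", "demo"],
    "alternatives": ["alternative", "comparison", "similar"],
    "requirements": ["requirement", "platform", "prerequisite", "dependencies"],
    "install": ["install", "setup", "getting started"],
    "usage": ["usage", "example", "quick start", "quickstart"],
    "caveats": ["caveat", "limitation", "known issue"],
    "links": ["documentation", "docs", "community", "links"],
}

# Assumption: Markdown ATX headings ("# Title", "## Section", ...).
HEADING = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)


def missing_sections(readme_text):
    """Return the checklist topics that no heading appears to cover."""
    headings = [h.lower() for h in HEADING.findall(readme_text)]
    missing = []
    for topic, keywords in CHECKLIST.items():
        covered = any(k in h for h in headings for k in keywords)
        if not covered:
            missing.append(topic)
    return missing


if __name__ == "__main__":
    sample = "# Foo\n\n## Installation\n\npip install foo\n\n## Usage\n\nfoo --help\n"
    print(missing_sections(sample))
```

This is no substitute for watching a newcomer read the README, but it can catch the case where an entire topic is absent.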
Find someone who might be interested in the project and offer to screenshare with them as they navigate the README and try to get things set up. Note: you just watch and take notes while they go through this and describe their thinking.
I've done similar things, and it never ceases to surprise me how big the gaps are between my view of the project and that of someone fresh to it.
Also include the first bullet point (what it is, why and maybe how) in every blog post, landing page, release note, news article, etc. It is frustrating to see an HN article about Blahblah v2.7 with Blergity and not be able to figure out what the heck Blahblah is without clicking around a dozen times.
Some time ago one of the big data serialization formats of the Apache ecosystem was discussed. I didn't understand the benefit of it, as it looked like all the other ones. But the HN comments discussed the main benefit: it's an in-memory format, so the different (layered) big data frameworks can operate on it directly without constant encoding/decoding for the next layer.
Do you understand what it does?
Good job on putting a code sample early, and adding ASCIInema videos; these are somewhat illuminating even for a person like me, who doesn't know anything about the ecosystem.
Here's what I would improve:
- Fill in the "Introduction" page; right now it's empty.
- Put appropriate links under the first mention of "MinIO", "mc" and "micro" - to make it that much easier for people to fill in the missing context.
- Is that a Python 2 or Python 3 library? Or does it work on both? The ASCIInema below suggests Python 3, but that's too far down. I'd put the version number(s) in the first sentence anyway.
- Where do I get it from and how? What dependencies does it require?
My takeaway is that it's an extremely specialized piece of software, only of any use to people already familiar with whatever mc and minio are, who hopefully find that explanatory enough.
I think this sort of thing is okay if you're targeting very specialized uses. But it is almost definitely too obscure if a newcomer to the field could have a use for it without yet knowing what tools are used in that specific area.
No. But I also do not know what MinIO is, nor mc, minio or iko.
It also does not say why it is called bmc.
One thing to improve it would be to link the MinIO project at the beginning. A simple hyperlink in the intro will do!
When one is stuck on a problem, maybe with a deadline or an angry boss's axe hanging over her head, the last thing she needs is fancy stuff, emojis, GIFs, or having to sort through marketing crap (not your case, but I've seen plenty of those in "fancy projects").
Look at manpages, emergency operation manuals, and medical procedures, and copy the style.
Use the website for the fancy stuff; that's what it's for.
Personally, the link you shared is perfect to me, at a glance.
I seldom find READMEs less helpful than manpages. Technically correct, but hard to read, with too many details and no simple, straightforward example.
Info manuals (mentioned in most manpages, usually triggered by info <command>) are precisely what you're looking for.
I wrote much more deeply about this on my blog: https://skerritt.blog/make-popular-open-source-projects/
It would also be nice to see a comparison (performance, ease of use, etc.) with other frameworks, explaining why I should use it vs whatever else comes up on Google.
the name of the package;
the version number of the package, or refer to where in the package the version can be found;
a general description of what the package does;
a reference to the file INSTALL, which should in turn contain an explanation of the installation procedure;
a brief explanation of any unusual top-level directories or files, or other hints for readers to find their way around the source;
a reference to the file which contains the copying conditions. The GNU GPL, if used, should be in a file called COPYING. If the GNU LGPL is used, it should be in a file called COPYING.LESSER.”
— GNU Coding Standards, https://www.gnu.org/prep/standards/standards.html#Releases (June 12, 2020)
“Good things to have in the README include:
1. A brief description of the project.
2. A pointer to the project website (if it has one)
3. Notes on the developer's build environment and potential portability problems.
4. A roadmap describing important files and subdirectories.
5. Either build/installation instructions or a pointer to a file containing same (usually INSTALL).
6. Either a maintainers/credits list or a pointer to a file containing same (usually CREDITS).
7. Either recent project news or a pointer to a file containing same (usually NEWS).”
— Software Release Practice HOWTO, https://tldp.org/HOWTO/Software-Release-Practice-HOWTO/distp... (Revision 4.1)
The fact is you can't satisfy all users or all use cases, so a README that keeps me, the developer, from wasting time trying to implement something that is not possible with the library makes me really appreciate it, even though I may not end up using it.
Being boring isn't necessarily bad. It's just that under- or over-communicating is a delicate balance that has to be adjusted based on the project and its scope.
If the project maintainers added a GitHub stars counter, then I assume they check their stars, and I'll consider starring their project (it takes one second and could make the maintainers 0.1% happier). If I see a Twitter badge, then I'd follow them.
So I see the first group as a signal about what I need to know before I pull in a dependency, and the second group as something I can do to make the maintainers' day.
To illustrate, let's look at summaries of top popular projects (by number of commits, because that's what Apache shows on stats page):
> Camel is an Open Source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
> Apache Flink® — Stateful Computations over Data Streams [here image of taking events and DB as inputs of process]
> Airflow is a platform created by the community to programmatically author, schedule and monitor workflows.
??? Why would I need them?
> Apache Spark™ is a unified analytics engine for large-scale data processing.
> spark.read.json("logs.json").where("age > 21").select("name.first").show()
> Lucene Core is a Java library providing powerful indexing and search features, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
These tell or show what they do. Good.
My GF is a cell biology girl dipping her toes in bioinformatics because she has to. She does read docs before asking, but the problem is that the docs she finds are for people who would already have figured out how to use the program on their own.
As some examples, she can’t start if:
She doesn’t know that the program has dependencies, and which ones those are.
She doesn’t know the slang of the field. Keep it generic. “Just Bootstrap the gradle config to the mdl Sasquatch compiler” doesn’t mean anything to someone just landing on the page, and it just frustrates them because they feel like they are supposed to know stuff and don’t even know where to start.
The “quick start” guide, if there is one, can be short, but it should show how someone would get to the same thing in their own project. What config files must be set up and how, for example.
- What problem is this repo solving / why would I use it (ideally explained in a way that a person who uses a completely different tech stack could understand; this is a tough requirement though)
- What problems this repo does not solve and when I should not use it
- What are the alternatives and trade-offs
Second this, really can help if you're navigating through a few possible alternatives
- Include a quick getting started (installation instructions, etc.), even if it's meant for those already familiar with the extension ecosystem. Just a few commands. I don't want to install it via docker, hardly ever. I have my own dockerfiles for that.
- Bullet points in the beginning have a LOT of repetitive content (Apache AGE enables|supports|etc) that turns those really important bullet points into a wall of text.
- You have three different links to docs, one of which seems to be a duplicate. Why would I need docs? For cypher? I already know cypher. For development? What about if I want to hack on the source? Where are those docs? Quick links are always appreciated since I'm almost always going to be on a time budget evaluating things like this.
Just to name a few.
If diagrams (like those produced by draw.io) could be used to better communicate how incubator-age functions and why it is important, you could use them, but I don't think you need to.
An explanation of repo architecture could be useful for people wanting to hack on it. [1]
[1] https://matklad.github.io//2021/02/06/ARCHITECTURE.md.html
Edit: Just noticed the Documentation section doesn't link to the website. If I'm getting started with your project, the getting started documentation on your website will be far more useful to me than a very long PDF.
- tell what the software does
- list its requirements
- explain how to set it up
It must do this only relying on local files.
The README of AGE meets some of these. The description could be better. I'd open with: Apache AGE is a PostgreSQL extension that implements the openCypher property graph query language.
I would remove the latest happenings from the README and keep them on the project home page, which I would link to right after the description.
I would also include the source of Apache_AGE_Guide.pdf in the repository. It is fine to have a link to the already formatted PDF in case someone is viewing the README on a device where building the software isn't possible. But I would not want a user to download a snapshot of the repository and then discover that they need further internet access to get the documentation.
- what is this
- how do I pronounce the name (note there's a popular crypto tool called "age", which is "pronounced like the Japanese 上げ (with a hard g)")
- what are the alternatives (regular postgres, dgraph, neo4j, ...), and how do they compare, along whatever dimensions you like (performance, guarantees, scale, ease of use, license, ...)
- what are some use cases that are good for AGE vs the alternatives
- when do I not want to use this
- what does simple example usage look like
- what is the maintenance status, what does "(incubating)" mean, etc
But the github readme is a first-glance place, and once I've made it to the docs I might not go back to the readme. So the docs need to contain all that information.
When describing pronunciation in Japanese, I would think it would be better to use kana alone, rather than kanji or mixed kanji/kana. (When writing actual words/sentences, I think mixed kanji/kana is better, but for writing pronunciation, I think kana alone is better.) (Note: I don't understand Japanese very well, but I can at least pronounce words written in kana.)
> But the github readme is a first-glance place, and once I've made it to the docs I might not go back to the readme. So the docs need to contain all that information.
At least some of the information, yes. Any information which is relevant to the use of the project should be mentioned in the documentation, including the license, although the documentation might not need to contain screenshots, a list of alternatives, etc.
It's utterly ridiculous that a non-phonetic language won (for now) the race to become the global lingua franca. Oh well, I guess we're stuck with it.
Then I honestly think it's pretty good. Maybe two changes would be:
- Maybe embed one or two side-by-side comparisons of some search in plain SQL and the same thing using AGE, to show what you can get. This both shows the wins and gives people who already know Cypher a general idea of how the two languages combine.
- As an incubating project, maybe talk a bit about roadmap. At what point will it no longer be incubating? What's the threshold? What are the risks of using an incubating project?
Some suggestions:
- One thing that obviously can be added is examples, but I wouldn't litter the whole README with examples. Maybe link to an examples page if you have one?
- When mentioning openCypher and AgensGraph, link to the respective project pages. I should not have to use a search engine to understand those references.
In your case, what kind of queries will I be able to make with this extension? For example, getting a list of neighborhoods, the houses in the neighborhoods and the pets in each household. All of that in a single graph query: what does it look like? Boom, selling point.
Provides context and helps see a bigger picture or where a particular project stands.
1. What does this thing do?
2. How to install this thing.
3. Examples of using this thing along with output.
Don't go overboard with images. A logo, some screenshots or whatever is plenty. Avoid fluff like cute pictures, memes, image macros, emojis or anything that clutters up the README file in a terminal.
Or are you bored? Maybe that is the real issue?
But for real, nothing wrong with boring READMEs, as long as they do a good job of explaining the important things using relatively simple language. Images, videos, and demos are always a good idea too, provided they are relevant.
* Provide a graphic, if possible, either of the software in use (if appropriate) or a logo of the project. Presenting people with a wall of text is off-putting, so it's nice to break it up a bit with some pictures.
* Describe what the project is as succinctly as possible. If I can't figure out what the project is through a stream of buzzwords or vague descriptions and have to go to another source online to know what a project is, this is an abject failure of the README.
* Describe how to use it, either with a short example or a 'quick start' section. This should be the third or fourth section and should be how to actually use the software. The simpler the better.
* Give a brief description of the documentation and provide a link to more extensive documentation.
* Give a brief description of how to install it or contribute to it. This is most likely not going to be the portion of the README that is most useful to people, but for those who need it, it provides a nice entry point.
* Describe the license. This should be the last thing in the README but should be there to clearly mark this is a FOSS project (or not, if that's the case).
The README is there basically as a directory to the project. The things I initially look for are, in this order:
* What is this project (why am I here/what is it good for/why should I care)?
* How do I install it?
* How do I use it, once installed (preferably with an example)?
* What is the license?
All the rest is about ushering the person looking at the project to the appropriate, more detailed, portion of the project, be it documentation, issues, usage, tutorials etc.
Also realize that, as a good approximation, there are roughly four types of documentation [0]. My view is that the README should clearly fall in the reference/information oriented section. It's meant to convey information about the project in the most succinct way possible and give pointers to other areas of the project if someone wants more details.
And, if possible, get feedback from people who actually use it.
EDIT: Sorry, I just wanted to add that it's OK to be boring. Excitement is not the purpose of a README. Utility is. It's more like "fundamental infrastructure" than "a fun document to read!" Leave the fun for the tutorials, how-to guides or other resources. The README is there to be the smallest payload for the maximum utility to convey useful information.
;;;;;;;;;;;;;;;;;;;
;;;;; HEADING ;;;;;
;;;;;;;;;;;;;;;;;;;
or: emoji + heading
plus a rare touch of ascii art for diagrams
EDIT all of the above does not work on HN
I wouldn't care about it being boring, but one thing I dislike about a lot of technical documents is that I feel there's always a way to translate them into simpler English.
Throw in some humor. Avoid any unnecessary hoity-toity or pretentious language.
The rule of thumb I learned was "avoid anything that ends in -ly" (there are exceptions, but this has always been a good rule).
I would suggest leading with a full-size artwork incorporating skulls, dragons and gothic letters.
First impressions last. Make a good one.