The ActivityPub specification needs to be read with a goal similar to an email server in mind. It should do one thing: receive JSON-LD objects in an inbox, process them according to the specification, and (maybe) store them on disk.
The idea of "users", "friends", "posts", "feeds" etc, are concepts that belong to the clients on top of this server, not in the server itself.
This separation between clients and server will also allow better interop/graceful degradation of object types that the client/server don't specifically understand.
First and foremost: Saying that something is like an email server translates for me into "this is an under- and over-specified swamp at the same time, full of quirks, and actually not implementable in any reasonable way". Because that's what email is. I almost can't think of a greater horror than writing an email server from scratch…
I don't know enough about ActivityPub to judge whether it's really like email. I would strongly hope it isn't, as otherwise it would be a tech you should probably better never touch as a developer.
The next thing is: If an ActivityPub server only receives and sends some opaque BLOBs what's the whole point of it?
But when it's not about opaque BLOBs you need to map the structures in the spec to proper types in a statically typed language, as you can't manipulate them otherwise in any meaningful way. If that's not possible because the spec is vague and/or there is no coherent data model behind it, that would be just another reason not to touch this tech. Nobody needs the next underspecified, stringly-typed "email".
I really hope I'm reading this wrong!
> If [...] receives and sends some opaque BLOBs what's the whole point of it?
There are some rules about how to have side effects for said blobs. Some of the blobs themselves have side effects. That's mostly what ActivityPub is: rules about how to distribute the blobs in the federated context, and rules for what to do with the blobs when they reach your server (whether coming from other servers, or directly from clients).
The vocabulary that ActivityPub is based upon is another whole specification, called ActivityStreams, which didn't originate in the W3C group. This vocabulary has three (main) types of objects: Activities, which provide the backbone of ActivityPub (Like, Follow, Create, Update); Actors, basically different types of users (these are the entities that perform the activities); and Objects, whatever the Activities operate on.
There's still a difference between "try to black-box the incoming data as much as possible" and "treat the incoming data as fully opaque BLOBs". The data is mostly JSON-LD, which is a far cry from "binary large objects". It is always going to be "semi-transparent" as it will always be JSON. Whether or not you like the "-LD" extensions to JSON (they are heavy, they do have a lot of RDF baggage you may not desire), they give you a bunch of guaranteed "baseline schema" for the JSON objects that you can use for static typing, which might be "good enough" for a lot of "meaningful manipulations" (such as following links to pick up related objects; LD => linking data), and that is all easily transparent.
A lot of the schemas beyond "LD" in ActivityPub are client/application-specific beyond most of the JSON-LD basics and should be easy to treat as a black box unless doing client/application-specific tasks. That's not necessarily "stringly typed", it's kind of a classic "serialization onion": The server at best needs to know that it is JSON and it may have JSON-LD metadata for relevant related linked objects (and a few other metadata fields common to "introspection", similar to "headers"). The client can dig deeper and know it is not just "any" JSON object but a more specific schema for a given class of thing the client cares about.
The next layer up then specifies some rules for how to process those messages: Like on an e-mail server, if a message is sent to your "local" server intended for onwards delivery, the server must forward it on. Otherwise it is added to an OrderedCollection - effectively a mailbox.
The spec then sets out a structure for giving the messages an Activity type that determines further fields, and for some of these activities there are rules specifying how the relevant Actors should act when those activities / messages are processed by them.
You can decide to do that synchronously when receiving the message. Sometimes that may be fine. But you can also strictly layer the implementation and deliver to a collection first, then asynchronously have workers process those messages. What you in either case ought to do for your own sanity is to at least logically separate the low-level message pump (inbox/outbox) from the processing of activities.
For starters, doing this separation cleanly makes writing a scalable implementation far easier.
> you need to map the structures in the spec to proper types in a statically typed languages as you can't manipulate them otherwise in any meaningful way
This is just not true. You can handle dynamic structures in statically typed languages just fine. It is in any case irrelevant, as ActivityStreams (which ActivityPub is based on) defines a typed vocabulary [1]. An implementation can choose to dynamically process extensions, or it can choose to statically type the activities it understands and treat the rest as mostly opaque blobs apart from the envelope/addressing -- this is exactly why it's beneficial to apply the layering suggested by the comparison to e-mail and decouple the message pump from the processing of activities.
https://json-ld.org/ for anybody else not super familiar
I can't find the link but a while back there was a post on the front page about how to get a findable, read only ActivityPub profile by just uploading some static JSON files. Not exactly a Twitter competitor, but you don't need much to start exchanging messages.
But consider that you can write a generic ActivityStreams server without supporting any of the ActivityPub activities. Now you have a generic platform to build on.
Tack on a tiny bit of support for e.g. addressing etc. as found in ActivityPub and you have what you need for federation.
With that generic platform, doing what you're suggesting is a matter of implementing a handful of Activities that mutates Objects and Collections.
What the author did is the equivalent of implementing a mailing-list manager by first writing a mail server from scratch, instead of just writing the bits that manage the list and the sends, because he didn't have that lower-level layer to build on.
There is indeed a lot of missing tooling for working with ActivityStreams/ActivityPub, which makes it painful now, and unfortunately a lot of ActivityPub implementers take the same tack as the author and build one big monolith instead of first building that lower layer.
I get the "scratch your own itch" mentality, but not if you kneecap all efforts that try to build on top of it. :D
To receive JSON-LD messages don't you need to send follow requests? And to do that don't you need to deal with the fact the spec is too complicated and most servers implement inconsistent parts of it?
The point is there are several potentially independent layers and modules there: The message pump itself at least can be implemented separately from the decoding of individual message types, and separate from managing followers and following, the same way e.g. a mail server knows nothing about how to follow mailing lists, or decoding email messages past the header.
https://universeodon.com/@supernovae
He's the admin of universeodon, a Mastodon instance with 13K MAU. He recently shared that in a month's time, 3TB of text was transferred just in ActivityPub events. Images were a multiple of that. I don't know what the bill is, but I was pretty shocked by the stats... for "just" 13K users.
And the cruel thing is that it still doesn't work properly. Likes/boosts and replies do not properly synchronize.
Bandwidth is a function of the number of connected servers for remote followers. A popular person like George Takei makes a post and 15000 servers ask universeodon for the text. IMHO some sort of smarter fan out or message relay capability would help, much like email infrastructure.
Having said that, you can run 10k Mastodon MAU on a $200pm cloud resource budget, though double that adds headroom, a dynamically scalable architecture, a staging instance, elastic search and translation, more admin tooling, etc. Yes, some instances spend multiples of that per 10k but if you budget $400pm you’ll sleep well.
(Thing is, a 1k MAU would cost $100+ if you want to start with a scalable footprint).
Universeodon had 70k MAU after the November surge and AIUI is still easily scalable to 100k+.
I don’t know about you, but a cloud running cost of $0.02-$0.10 per MAU sounds cheap to me. All large instances can cover it with donations. The “real” cost is moderation and administration labour.
> And the cruel thing is that it still doesn't work properly. Likes/boosts and replies do not properly synchronize.
It works exactly as designed. Personally I am fine that on someone else’s post I don’t need to see every reply across the world and fully synced like counts, but I can respect the POV that many users expect just that. This is not an ActivityPub limitation but rather a software choice by Mastodon. Many users, including @supernovae, push for functional changes around this.
How much hardware would you need for 100 million MAUs? (And that's just a fraction of the current social media users.)
If it really "scales" like indicated in the parent post this tech will never provide any alternative to centralized social media sites just for technical reasons no matter what people want or do.
Maybe someone experienced in effective distributed systems should start to design an alternative.
Otherwise there won't be any viable alternative to the commercial silos no matter how bad people would like one.
So the only way to scale up "indefinitely" is by having many small/medium-sized instances, but not really. In a 100M+ network, instances will suffer due to the wasteful nature of federation plus social media being append-only.
Costs will forever go up and this doesn't even mention the burden and liabilities of moderation.
Amidst all our hate for big social, we've forgotten about all the things they do very well. They are (financially) free. They are reliable and scale up without you noticing. You do not have to generally worry about your entire account and content being gone because some mod gave up. If you don't do anything funny, all your content is preserved, forever. Moderation works reasonably well, even if never perfect.
We've taken all that for granted. But it costs billions, an army of engineers, mods, legal, marketing, UX, top notch infrastructure to make it run and work this smoothly.
The idea that a bunch of enthusiasts can replicate this, is misplaced.
Given an informed yardstick of $0.02-0.05 per MAU (derived from a 10k instance), 100 million MAUs would run possibly $2-5M/month, though much depends on the profile of users per instance.
The support, moderation, legal, administration, and corporate costs would dwarf this cloud/hardware cost.
What are the thoughts on OCaml on HN?
I should be more nuanced, though: existing libraries in opam are generally very, very good (I really like cmdliner), but many things may be missing. There is no alternative to Django, for instance. No serious IDE, except emacs. The standard library was so lacking that at least one alternative exists. The situation has improved, but there's still missing stuff compared to Python.
https://aantron.github.io/dream/, which is new and used by ocaml.org as well as OP
> No serious IDE, except emacs
and vim, and visual studio, and whatever else supports the LSP protocol via https://github.com/ocaml/ocaml-lsp
> The standard library was so lacking that there is at least an alternative.
While janestreet does publish their own stdlib, I personally try to stick to the stdlib whenever possible. Not to knock janestreet. I'm glad they're around and have contributed a bunch.
But overall I agree with you. It's been my favorite language to write in for years now. You can't just reach for off-the-shelf libraries for every little thing. Although the ones that do exist tend to be written halfway decently.
https://plugins.jetbrains.com/plugin/9440-reasonml (72k downloads)
https://plugins.jetbrains.com/plugin/18531-ocaml (2k downloads)
I'm not in the ocaml ecosystem enough to evaluate their quality, but anything on top of IJ is for sure a serious IDE
And how about scientific computing (SciPy), deep learning (PyTorch etc.), or computational geometry (Shapely etc.)?
But "no serious IDE, except emacs" is a non-starter imho, if it's true.
They should really invest in this. Otherwise the language won't attract any professional developers in the large.
Pros: type safe, GC, fast, (arguably) a simple and practical language if you have a functional mindset (much simpler and pragmatic than Haskell IMHO).
Cons: it's a niche language, so tooling/libraries/online help aren't on par with more mainstream languages. No canonical standard library (different codebases will use different standard libraries and even disagree on pervasive functions such as List.map). Whenever the code uses monad (e.g. concurrency monad / error handling), I find the language loses its simplicity.
Maybe it's true of every language, but I'm disappointed by some OCaml codebases where two extremes often cohabit:
1. people who don't know the language and don't write idiomatic code (like, refusing to write .mli, abusing imperative features)
2. OCaml experts who over-engineer things and want to use the latest features and make the code hard to read/maintain
In a professional setting, it can be hard to have these two populations coexist, and people tend to be quite opinionated when it comes to such languages (love it or hate it -> it's often a source of struggle).
"Try and load it up in a client app" seems suboptimal.
The "load it up and see" attitude is part of what made parsing and rendering HTML so hairy, and compliance test suites helped.
As for the ActivityPub spec and the currently popular implementations, it doesn't take long exposure to the fediverse to realise there are some rough edges and historical accidents (e.g. Mastodon being the de facto interpretation of the standard). Imho, now that there is substantially more mindshare devoted to decentralized social, it would be opportune to revisit these things and, if needed, revise them before they get baked in.
The "Lightweight" GoLang ActivityPub server is GoToSocial https://github.com/superseriousbusiness/gotosocial
The better-known lightweight servers are Pleroma and fork Akkoma, written in Elixir https://akkoma.dev/AkkomaGang/akkoma/
Some of this info I got via: https://social.treehouse.systems/@ariadne/110226729543740723
A single-user Mastodon instance takes an unreasonable amount of resources. I don't know if it's just because of Ruby (Gitlab has the same problem, so it might be) or because everyone is wasting money on expensive servers, but an RSS feed on steroids shouldn't take this much RAM.
It's an interesting optimization problem reminder that scaling factors are different for different needs and not everything scales cleanly to every use case. A single user instance should be able to use a much smaller vertical stack, but scaling down from a wide horizontal stack is not necessarily the best or cheapest place to start when building something like that.
(There are some interesting projects I've seen to build single user instances with much less overhead, shorter vertical stacks. I'm curious to see where those efforts go. In my own usage of Mastodon my "single user" instance gets the benefits of the horizontal scaling Mastodon was built for because my hosting provider does a bunch of work to make sure that they take advantage of that economy of scale to host many small instances for cheaper than trying to run small instances in one-off VMs.)
It's not completely impossible but you have to be okay with discarding a lot of unknown options or essentially reverse engineering the objects used by the servers you are federated with
That's not to say it's impossible; I was able to crawl the network successfully, but it hints at the reason that Mastodon and Pleroma use dynamic languages.
I'd be very interested to see a flexible/complete AP implementation in any statically typed language
Fwiw WriteFreely is implemented in Go with go-fed but -- correct me if I'm wrong -- that library seemed more limited to me than what Pleroma and Mastodon support
It wasn't easy indeed, and it locked me out of some options to support execution time vocabulary extensions, but hey, it works and it's relatively easy to use.
OCaml is a statically-typed language. It falls somewhere between Go and Haskell on the spectrum of type 'strength'.
Try Honk[1] or GotoSocial[2]?
[1] https://humungus.tedunangst.com/r/honk [2] https://github.com/superseriousbusiness/gotosocial
Just select your desired lang. and review! Now, of course, it might be early days for some languages (e.g. for Rust, etc.)... But, one reason why some languages are used over others is the ease of deploying on VPSes and VPS-like hosts (...historically the land that PHP ruled ;-)
Enjoy, and I hope you find what you're looking for!
It is a dubious endorsement, but I think it shows how much more efficient Pleroma is than other popular, easy-to-use-OOTB AP servers: 9 out of 10 price-conscious griefers use and endorse Pleroma.
Java can be faster than Go, but not by much and I've always seen it to use 5-10x the memory.
Scripting languages like TypeScript, Python, PHP and Ruby can't hold a candle to the speed of Rust and Go while also using significantly more memory. They also don't natively support multiple cores / threads.
Rust and Go represent the most approachable middle ground on all accounts of familiarity, performance (allocs and calculation speed), and large communities with libraries covering whatever you could want.