Runsit – A process manager in Go

(github.com)

57 points · armenb · 11y ago · 26 comments

chubot 11y ago

FWIW I also started writing an init-like server in Go. One thing I ran into was that Go's APIs sort of coerce you into having an extra thread per process. runsit also has this issue. See line 489 of runsit.go, pasted below.

You can do it non-portably in Go by using syscall.ForkExec and syscall.Wait4(-1, ...). The portable os/exec package assumes you will wait on one specific PID (cmd.Wait()) rather than on any child (wait(-1)), which basically implies using a thread per process. Go's runtime isn't magic -- if you call libc/syscall wait(), an entire thread will be blocked, and the runtime can't use it for anything else. In this case the thread is blocked for the lifetime of an entire process, which is forever for server processes.

I'm pretty sure nobody would use a real PID 1 (systemd, Upstart, etc.) that burned a thread per process. But yes, for most use cases, in the grand scheme of things, it's probably not a big deal. I suppose Linux has an O(1) scheduler, although I'm not quite sure how this affects scheduling (interested in any comments).

But this goes to show that portable APIs are awkward and obscure for low level code. Better to use raw Unix APIs for something like an init server. Python and Java have similar problems.

IMO all interesting code nowadays is POSIX-like, so we should drop the pretense of portability and simplify our lives. Unix works.

    // run in its own goroutine
    func (in *TaskInstance) awaitDeath() {
      in.waitErr = in.cmd.Wait()  // ties up an OS thread for the lifetime of a process
      ...  
    }

pedrocr 11y ago

> I suppose Linux has an O(1) scheduler

Actually not anymore. The current CFS scheduler is no longer O(1) but O(log N):

https://en.wikipedia.org/wiki/Completely_Fair_Scheduler

robryk 11y ago

I wonder if it would make sense to make a "child poller" akin to the net poller:

Have one goroutine loop forever:

  * wait for SIGCHLD
  * take the childlock
  * do nonblocking Wait() until no unwaited child remains
  * release the childlock

The childlock would need to be taken for reading by os.Process.Kill and when a syscall that takes a PID of a child is called (after taking it we'd need to verify that the process we intend to touch isn't already dead).

chubot 11y ago

Yes, you can do that. As mentioned, it's not portable (which is fine with me).

However, you don't need to use goroutines (or threads). You do it as you would in C (and how all real PID 1 systems are written) -- with a single thread that starts processes, receives signals, and reaps children in a non-blocking fashion.

This style of program -- a program that needs to simultaneously wait for child processes/signals and fd events -- is quite awkward in Unix, but it definitely works when you get the idea.

To wait on an fd and a signal in a single-threaded program, you would use the "self-pipe trick" in classic Unix. In Linux, you can ask for a signal to be delivered over a file descriptor with signalfd(). But AFAICT there is no real reason to use it, and portability across Unix IS a good thing IMO (but not portability to completely different OSes like Windows; in that case I would write a completely separate program using their native APIs).

node.js actually does a great job making this API easy and efficient. It is probably the only runtime (compared with Python/Ruby/JVM/etc.) that doesn't suffer from this problem when doing "async processes" (i.e. a complement to async networking).

burke 11y ago

We have an init process that runs inside Docker containers, and we took a slightly different route: listening for SIGCHLD and then doing non-blocking Wait4(0,...) until there are no more children to reap:

https://gist.github.com/burke/1c105378ac0629b39485

chubot 11y ago

I think that is basically the same thing. I haven't used Wait4(0), but it looks like it is the same as Wait4(-1), as long as you don't change the process group ID of any of the children?

In any case, you are not calling Wait4(<specific PID>), which is what implies the thread per process.

zemo 11y ago

A goroutine is not a thread. That only ties up that one goroutine.

chubot 11y ago

Your first statement is true; the second isn't.

Re-read what I wrote. If that doesn't convince you, then download and run the code. Run "pstree" on it and observe how many child processes and threads there are. You'll learn something useful about the relationship of the Go runtime to the OS.
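A rough way to check this from inside a Go program, without pstree, is to compare the runtime's count of created OS threads before and after parking goroutines in cmd.Wait(). This is a sketch, assuming a Unix system with /bin/sh and sleep; `threadsCreated` and `startWaiters` are hypothetical helper names, and the exact numbers will depend on the Go version and the kernel.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime/pprof"
	"time"
)

// threadsCreated reports how many OS threads the Go runtime has created so
// far, using the built-in "threadcreate" profile.
func threadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

// startWaiters starts n long-lived children and parks one goroutine in
// cmd.Wait() for each, the same pattern as awaitDeath in runsit.go.
func startWaiters(n int) ([]*exec.Cmd, error) {
	var cmds []*exec.Cmd
	for i := 0; i < n; i++ {
		cmd := exec.Command("/bin/sh", "-c", "sleep 30")
		if err := cmd.Start(); err != nil {
			return cmds, err
		}
		cmds = append(cmds, cmd)
		go cmd.Wait() // blocks until this specific child exits
	}
	return cmds, nil
}

func main() {
	before := threadsCreated()
	cmds, err := startWaiters(20)
	if err != nil {
		fmt.Println("start:", err)
	}
	time.Sleep(time.Second) // give the waiters time to block
	after := threadsCreated()
	fmt.Printf("threads created: %d before, %d after %d waiters\n",
		before, after, len(cmds))
	for _, c := range cmds {
		c.Process.Kill() // clean up the sleeping children
	}
}
```

If each blocked Wait holds an OS thread, the second number should grow roughly with the number of waiters.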

zimbatm 11y ago

A thread is created for blocking syscalls. I don't know if it's a syscall under the hood here but he might be talking about that.

armenb (OP) 11y ago

Just an introduction: Currently I'm using runsit as an alternative to supervisord and I'm happy with it so far. Stdout and stderr of the processes can be queried with a very simple HTTP interface. Runsit watches a config directory for any changes and applies them immediately. Config files are in JSON format.

elithrar 11y ago

Mind sharing a (lightly commented) config? I've been using Supervisor to run my Go services for a long while (the built-in log rotation is one of the big attractions) but I'm keen to try alternatives. I never found mmonit and other alternatives to be as comprehensive as Supervisor.

armenb (OP) 11y ago

For instance, I use the following config (nginx.json) for nginx:

{ "user": ["_env", "${USER}"], "cwd": "/var/www", "standardEnv": true, "numFiles": 1024, "binary": "/usr/sbin/nginx" }

And php-fpm.json:

{ "user": ["_env", "${USER}"], "cwd": "/usr/sbin", "standardEnv": true, "numFiles": 1024, "binary": "php-fpm", "args": [ "-F" ] }

Hope that helps
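For readers who want to experiment with files of this shape, here is a minimal Go sketch that decodes them and reports where a syntax error occurs. It is not runsit's own parser: `taskConfig` and `parseConfig` are hypothetical names, the struct covers only the keys that appear in the two examples above, and the "user" value is kept raw because its meaning is not explained here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// taskConfig mirrors only the keys seen in the posted examples.
// (Hypothetical type; runsit's real config type may differ.)
type taskConfig struct {
	User        json.RawMessage `json:"user"` // e.g. ["_env", "${USER}"], left undecoded
	Cwd         string          `json:"cwd"`
	StandardEnv bool            `json:"standardEnv"`
	NumFiles    int             `json:"numFiles"`
	Binary      string          `json:"binary"`
	Args        []string        `json:"args"`
}

// parseConfig decodes one config file and adds the byte offset to syntax
// errors so the message points at the broken spot.
func parseConfig(data []byte) (*taskConfig, error) {
	var c taskConfig
	if err := json.Unmarshal(data, &c); err != nil {
		var syn *json.SyntaxError
		if errors.As(err, &syn) {
			return nil, fmt.Errorf("syntax error at byte %d: %w", syn.Offset, err)
		}
		return nil, err
	}
	if c.Binary == "" {
		return nil, errors.New(`config has no "binary" entry`)
	}
	return &c, nil
}

func main() {
	good := []byte(`{ "user": ["_env", "${USER}"], "cwd": "/usr/sbin",
		"standardEnv": true, "numFiles": 1024,
		"binary": "php-fpm", "args": [ "-F" ] }`)
	c, err := parseConfig(good)
	fmt.Println(c.Binary, c.Args, err)

	_, err = parseConfig([]byte(`{ "binary": "nginx", }`)) // trailing comma
	fmt.Println(err)
}
```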

fcoury 11y ago

And the JSON parser is awesome: it gives user-friendly error messages when the file has the wrong format. Very cool.

armenb (OP) 11y ago

Yes it is.

gwoo 11y ago

I built https://github.com/gwoo/goforever which has similar goals. I definitely like some of the ideas in runsit like automatic config watching. I still need to handle log rotation with something like https://github.com/natefinch/lumberjack

Thanks for putting this out there.

aktau 11y ago

I'm a big fan of Go and have a few Go binaries running on hundreds of clients (and a few servers) right now. They are being managed by Runit though. Since the names are so similar, did you get inspiration from Runit? And if so, what would be the main differences/advantages besides being portable to more platforms (Windows I presume)?

@chubot mentions the thread-per-process overhead. Runit does process-per-process so in that aspect Runsit should be a bit lighter. Then again, Runit is extremely tiny, its statically compiled binaries taking up next to nothing. The wait(-1) trick sounds good, but there must be a reason why for example Runit doesn't use it, since afaik Runit only runs on POSIX systems.

Keep up the good work!

EDIT: a web interface, that's pretty spiffy! (though I've made something similar work for Runit by querying the status of a service and serving up a dashboard, with a Go webserver of course).

kylered 11y ago

In case anyone is interested, we open sourced a process manager and web interface a few months ago.

https://github.com/VividCortex/pm https://github.com/VividCortex/pm-web

akerl_ 11y ago

Is there a site for this or some nature of docs?

I'm still searching for a process manager that I can love for use with Docker (I've played with runit and am currently playing with s6), but the total lack of readme or docs makes this link fairly unhelpful.

armenb (OP) 11y ago

Unfortunately there isn't much documentation, but it's not that hard to set up. Take a look at the run.sh file and the config directory. Andrew Gerrand has an init script[1] for it in his fork, which might be useful.

[1] https://github.com/nf/runsit/tree/master/doc/initd

stormbrew 11y ago

Huh. I have somehow never heard of s6 but it looks very promising. What did you find when using it for this purpose?

akerl_ 11y ago

My initial thoughts were less than positive, just because building it is a less-than-stellar process. Part of that is because it has a couple of deps that must also be compiled (skalibs and execline), and part is because it follows the slashpackage conventions (http://cr.yp.to/slashpackage.html).

Now that I've got it built, the next hurdle is making init scripts in execline. It's a fairly simple language, and I enjoy the premise behind it, where scripts should be clear and deterministic.

Overall, s6 provides a lot more helper tools for daemon management than runit did, so it looks like it's gonna be great for my use case. I've got an automated build set up to handle making a Docker image with s6 prepared:

https://registry.hub.docker.com/u/dock0/service/

sleepydog 11y ago

I used s6 quite a bit in the past; it's got some pretty good improvements on daemontools. Back then I wrote some RPM specs to build s6 against musl: https://github.com/droyo/rpmbuild. I haven't tried to build them recently.
