Proposal of a new concurrency model for Ruby 3 [pdf]

(atdot.net)

188 points · tenderlove · 9y ago · 87 comments

zeckalpha · 9y ago

How does this compare to the ongoing efforts to remove the GIL in Python? It looks like the Ruby GVL would stay, but be scoped to a Guild, rather than a Process?

the_mitsuhiko · 9y ago

The proposal in the PDF looks like what I tried to implement many years ago in Python, but I gave up in agony due to some stupid design decisions in Python (in particular non-heap types and how type checks at the C level work).

Python's attempts to remove the GIL are not going anywhere really.

sanxiyn · 9y ago

What do you think of PyParallel's approach of "removing" the GIL?

http://pyparallel.org/

claudiug · 9y ago

Maybe you should give them some insight :) So they don't follow the same path of painful mistakes :)

viraptor · 9y ago

Are there any ongoing efforts? There's PyPy trying to use STM, but I don't know of any other attempts, definitely not in CPython.

zeckalpha · 9y ago

Yes. There was an excellent talk at PyCon 2016 on the Gilectomy: https://www.youtube.com/watch?v=P3AyI_u66Bw

artellectual · 9y ago

Someone correct me if I'm wrong here.

Seems like a guild is just a subprocess with its own resources, and you copy objects over as needed. When the guild is done it will get garbage collected, like other objects.

the_mitsuhiko · 9y ago

> Seems like a guild is just a subprocess with its own resources.

In an ideal world the guild is the interpreter state which would be very far from processes. How far down you can go there largely depends on what promises the API made to C extensions and other things in the past.

Someone · 9y ago

I think I would consider implementing it as 1:1 threading where every thread=guild runs its own set of green threads.

That likely would be faster than having OS threads in each guild that use PS locks to prevent running >1 of them concurrently.

_ko1 · 9y ago

Like a sub-process, but guilds share many things, such as bytecode (ISeq in MRI terms), class and module objects (and method tables), and so on. We can also share immutable objects (deeply frozen objects), as threads do.

Roboprog · 9y ago

The Guild concept feels much like a mix of actors and messaging from Erlang and Go. Not as restrictive as Erlang, but not as permissive as Go. (there was a nod to Elixir/Erlang in the slides)

pmontra · 9y ago

Yes, it becomes a guild global lock.

pmontra · 9y ago

TL;DR: The goal is to keep compatibility with Ruby 2. The proposal introduces guilds, plus channels for sending objects between guilds. The bullet points below are quoted from a couple of slides; the other text is mine:

* Guild has at least one thread (and a thread has at least one fiber)

* Threads in different guilds can run in parallel

* Threads in a same guild can not run in parallel because of GVL (or GGL: Giant Guild Lock)

A guild can't access the objects of other guilds.

About channels:

* We have Guild::Channel to communicate each other

* 2 communication methods

1. Copy

2. Transfer membership or Move in short

Copy is a deep copy and the object is duplicated into the destination guild. A transfer removes an object from a guild and makes it available to another.

There are also immutable objects that are available to all guilds. Obvious examples are numbers (which are objects in Ruby), booleans, and symbols. I think that other objects are frozen with Object#freeze (https://ruby-doc.org/core-2.3.1/Object.html#method-i-freeze).

They already did some encouraging benchmarks.
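
To make the copy-versus-move distinction summarized above concrete, here is a toy Ruby model. It is only a sketch: `ToyGuild` and `ToyChannel` are hypothetical names invented for illustration, everything runs in a single thread, and the proposal's real `Guild::Channel` API may look quite different.

```ruby
# Toy model only: ToyGuild and ToyChannel are hypothetical, not the proposed API.
class ToyGuild
  attr_reader :name, :objects

  def initialize(name)
    @name = name
    @objects = {} # object_id => object owned by this guild
  end

  # Record that this guild now owns obj.
  def adopt(obj)
    @objects[obj.object_id] = obj
    obj
  end

  def owns?(obj)
    @objects.key?(obj.object_id)
  end
end

class ToyChannel
  # Copy: the receiver gets a deep duplicate; the sender keeps its original.
  def copy(obj, to:)
    to.adopt(Marshal.load(Marshal.dump(obj)))
  end

  # Move: the sender gives up membership; the same object now belongs to the receiver.
  def move(obj, from:, to:)
    raise ArgumentError, "#{from.name} does not own this object" unless from.owns?(obj)

    from.objects.delete(obj.object_id)
    to.adopt(obj)
  end
end

g1 = ToyGuild.new("g1")
g2 = ToyGuild.new("g2")
ch = ToyChannel.new

request = g1.adopt({ "path" => "/index", "headers" => ["a", "b"] })

copied = ch.copy(request, to: g2)           # equal by value, but a different object
moved  = ch.move(request, from: g1, to: g2) # the very same object, new owner
```

The copy costs a full traversal of the object graph, while the move only updates bookkeeping, which is the trade-off the slides describe.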

catnaroek · 9y ago

> A guild can't access the objects of other guilds.

> 2. Transfer membership or Move in short

How is this enforced? What exactly happens at runtime if a guild tries to manipulate an object that belongs to another? (Absent a compile-time check, this is always a possibility.)

dragonwriter · 9y ago

> How is this enforced? What exactly happens at runtime if a guild tries to manipulate an object that belongs to another? (Absent a compile-time check, this is always a possibility.)

It would seem that:

(1) Guild ownership would have to be tracked in the runtime, obviously.

(2) For any access from Ruby code, the runtime would also know which Guild the request came from as well as which Guild owns the object being accessed.

(3) The runtime would be required to fail in some well-defined way (presumably by raising an exception in the requester) when the rules were violated.

It should be reasonably straightforward to ensure this for all accesses within the runtime, since you can just make sure that there is no way to request access that isn't attached to the Guild the request comes from. It may be possible to break the runtime with poorly behaved extension code that subverts the normal mechanisms, and it may be impossible to fully protect against that, but that's pretty much always a risk with extension code.
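
The three points above can be sketched in plain Ruby. This is a toy model under stated assumptions: `OwnedRef`, `GuildError`, and the `:guild_a`/`:guild_b` symbols are hypothetical, and a real runtime would do the check inside the VM rather than through a wrapper object.

```ruby
# Toy model: GuildError, OwnedRef, and the guild symbols are hypothetical.
class GuildError < StandardError; end

class OwnedRef
  def initialize(target, owner)
    @target = target
    @owner = owner # (1) ownership is tracked alongside the object
  end

  # Transfer membership to another guild.
  def transfer_to(new_owner)
    @owner = new_owner
    self
  end

  # (2) every access names the requesting guild, so the check cannot be skipped.
  def access(requesting_guild)
    unless requesting_guild == @owner
      # (3) fail in a well-defined way: raise in the requester.
      raise GuildError, "#{requesting_guild} cannot access an object owned by #{@owner}"
    end

    yield @target
  end
end

ref = OwnedRef.new([1, 2, 3], :guild_a)
sum = ref.access(:guild_a) { |arr| arr.sum } # allowed: guild_a owns the array

ref.transfer_to(:guild_b)
error = begin
  ref.access(:guild_a) { |arr| arr.sum }     # guild_a gave the object away
  nil
rescue GuildError => e
  e
end
```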

pmontra · 9y ago

It's similar to Rust's transfer of ownership. Rust is compiled, so you get a compile-time error if you try to access something you can't access anymore.

Check the code at http://rustbyexample.com/scope/move.html and run it (inside the page). There is a commented-out println towards the end. Uncomment it, run the code again, and see the compiler error.

More about transfer of ownership at https://doc.rust-lang.org/book/ownership.html

riffraff · 9y ago

You get a runtime error.

rst · 9y ago

Slides also note that treatment of some state in pre-existing Ruby code, e.g., instance variables of class objects, gets messy. (Class variables are listed as per-Guild, but a fair amount of Ruby code uses instance vars on class objects instead.)

_ko1 · 9y ago

So we need to rewrite such code to support multi-guild applications.

bad_user · 9y ago

So guilds are effectively OS-managed processes?

audunw · 9y ago

No, the heap is shared among guilds. See the last slide.

Moving data between guilds is cheap because data does not have to be copied. Referencing frozen (immutable) data is cheap too.

It seems it will track ownership of objects to make sure guilds don't access other guilds' data. But it doesn't seem that it uses OS-level data protection.

pmontra · 9y ago

From the slides I got the idea that there is only one OS process with guilds and possibly threads within guilds. But it could be that the language doesn't care and that's an implementation detail. We'll see.

azr79 · 9y ago

sounds reasonable to me

readittwice · 9y ago

Hmm, I am wondering how moving ownership would work in a GC'ed system. You could have arbitrarily many references to the moved object (or subobjects). The slides say that an exception is thrown if an object of a different guild is accessed, but doesn't that mean that Ruby needs to check the guild at every object access?

Transferring ownership would probably also mean that Ruby not only needs to move one object but probably all subobjects recursively as well. I assume here that "moving" just means updating the guild field for each object.

Is this really feasible, or wouldn't just copying the object be faster? I don't know of any system with GC that uses moving to transfer mutable objects between threads. Do such systems exist? Are there better ways of implementing this?

chrisseaton · 9y ago

> The slides say that an exception is thrown if an object of a different guild is accessed, but doesn't that mean that Ruby needs to check the guild at every object access?

Ruby is already checking the class of the object on every access. You could combine the guild and the class into a tuple and compare against that instead, so it adds no extra overhead.

There is a paper at OOPSLA this year on doing just that http://2016.splashcon.org/event/splash-2016-oopsla-efficient...
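
One way to picture the combined guard described above is a toy inline cache whose key is a (class, guild) pair. This is only an illustrative sketch, not how MRI or any other implementation does it: `CallSite`, `GuildError`, and the guild symbols are hypothetical, and a real VM would compare a single machine word rather than a Ruby array.

```ruby
# Toy inline cache: one comparison covers both the class check and the guild check.
class GuildError < StandardError; end

class CallSite
  attr_reader :slow_paths

  def initialize(method_name, current_guild)
    @method_name = method_name
    @current_guild = current_guild # the guild whose code contains this call site
    @cached_key = nil
    @cached_method = nil
    @slow_paths = 0
  end

  def call(receiver, receiver_guild)
    key = [receiver.class, receiver_guild]
    # Fast path: the same single guard a class-only cache would already perform.
    return @cached_method.bind(receiver).call if key == @cached_key

    # Slow path: work out whether this was a class miss or a guild violation.
    @slow_paths += 1
    unless receiver_guild == @current_guild
      raise GuildError, "object belongs to #{receiver_guild}, not #{@current_guild}"
    end

    @cached_key = key
    @cached_method = receiver.class.instance_method(@method_name)
    @cached_method.bind(receiver).call
  end
end

site = CallSite.new(:upcase, :guild_a)
first  = site.call("abc", :guild_a) # slow path, fills the cache
second = site.call("def", :guild_a) # fast path, guard matches
```

An object owned by another guild simply fails the guard and lands on the slow path, where the violation is reported.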

aardvark179 · 9y ago

I can't comment on that paper since it doesn't appear to be available publicly yet, so I'm just going to talk about guilds as proposed.

Adding a guild word to each object header is certainly a way to check ownership, and should be a cheap check to perform in the interpreter, but will obviously add some extra overhead to standard program execution.

The thing that concerns me is that explicit ownership passing can introduce as many bugs as it solves. If I have two objects A and B, with A holding a reference to B, then I can freeze A and freely pass it between guilds, but if I try and touch B I'll get an error until that too has been frozen or its ownership transferred. The same problem occurs with explicit ownership transfer of a non-frozen A, which leaves you with the slower option of a deep copy or a recursive ownership transfer, which can have equally unexpected consequences.

The "Ruby global data" slide also gives me the screaming heebie-jeebies, as did finding Stack Overflow answers on how to unfreeze objects in MRI. I'm sure nothing will go wrong. :-)

Having said all that, it probably can work nicely for the common use cases of balancing requests between a group of worker guilds where the request is a simple data structure whose ownership can be safely transferred, but it would be hard to do a general work-stealing solution that was always safe.
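
The A-and-B concern above rests on the fact that `Object#freeze` is shallow in current Ruby. A short demonstration, followed by a hypothetical `deep_freeze` helper (not part of Ruby core, and a sketch that handles only Hash and Array graphs without cycles):

```ruby
# Object#freeze is shallow: freezing a container does not freeze what it references.
b = [1, 2]
a = { child: b }.freeze

a_frozen_before = a.frozen? # the container itself is frozen
b_frozen_before = b.frozen? # the referenced array is not
b << 3                      # so this mutation still succeeds

# Hypothetical helper: recursively freeze a Hash/Array graph (no cycle handling).
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

deep_freeze(a)

mutation_blocked = begin
  b << 4
  false
rescue RuntimeError # FrozenError is a RuntimeError subclass on Ruby 2.5+
  true
end
```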

VeejayRampay · 9y ago

Hi Chris, you've mentioned that you didn't like discussing design choices made or attempted to be made in Ruby in the past, preferring to focus on the technical side of implementing the language, so no worries if you don't want to weigh in on this, but how do you feel about this proposal in general, in terms of feasibility and upside/downside of the technique described?

Roboprog · 9y ago

Hmm. I was under the impression that when you transferred the membership of an object, the "guild-local" or local lexical scope variable would be nulled out. Little to check if that's the case.

ergl · 9y ago

That's exactly what Pony (http://www.ponylang.org/) does. It has a GC, with all the actors in the system communicating through shared memory. It uses the concept of 'capabilities' to check the owner of any given reference and disallow read or write permissions to other objects / actors.

rurban · 9y ago

Well, yes. But the Pony capabilities system checks ownership at compile time already and has a much faster GC and smaller objects (actors), while in Ruby 3 you defer the deadlock or race errors to run time.

_ko1 · 9y ago

That's the magic of transferring membership. I omitted the details from the slides.

_ko1 · 9y ago

Could you link to http://www.atdot.net/~ko1/activities/2016_rubykaigi.pdf ? The current one is in temporary file space (it will be removed soon).

tenderlove (OP) · 9y ago

Apparently I can't change the link. I'm sorry! :-(

lake99 · 9y ago

The mods can do it. I once left a comment about changing the link. They saw it and changed it on their own. I don't know how to notify them though.

kent1 · 9y ago

I worked on a similar proposal during my PhD thesis. It is formalized for a Java-like language and implemented in the Jikes RVM. We also carried out a proof of isolation using Coq.

https://tel.archives-ouvertes.fr/tel-00933072

mattnewton · 9y ago

That looks really useful. If you have time, please chime in on the proposal!

kent1 · 9y ago

The ownership check is required for each access to an object. However, it is straightforward to see that successive checks of the same object can be optimized out if the object has not been passed to another owner. In this thesis I describe dynamic and static analyses to remove the unnecessary checks.

jph · 9y ago

The key points IMHO:

1. This Ruby 3 proposal says that Ruby 2 compatibility is mission critical, therefore this proposal rejects concurrency solutions from other languages (e.g. Erlang) and concepts (e.g. functions) and data structures (e.g. immutable collections).

2. Instead the proposal is to create a fast copy-on-write with rules to "deep freeze" some kinds of objects and primitives into an immutable sharable state.

nateberkopec · 9y ago

> This Ruby 3 proposal says that Ruby 2 compatibility is mission critical

Matz has been very public about his fear of a "Python 3" situation occurring in the Ruby community.

awj · 9y ago

And rightly so, I should think. Given the presence of languages like Elixir and Go, creating a situation where you are breaking people's code to introduce multicore programming systems is a pretty bad idea.

I can easily see how people might (rightly or wrongly) say "Ruby 3 broke my code, I'm rewriting in Go".

masterleep · 9y ago

How would you use this to parallelize Rails requests? I guess you would need a pool of guilds, each with its own set of controllers, etc.

Since the requests would not be in the "main" guild, it might be painful to call into gems.

artellectual · 9y ago

I guess you could boot up a pool of guilds in your process or, better yet, generate them on demand as requests come in, process the request, and kill the guild off when processing is done, since the request object shouldn't be shared.

It all really depends on how much overhead there is to create and destroy guilds. If it's easy then ideally you could start 100s of guilds or 1000s should your hardware allow it.

I see guilds as a subprocess with its own isolated resources.

pmontra · 9y ago

Ideally guilds could be equivalent to lightweight processes at application level (not OS), much like in Erlang. Then they could be scheduled to run concurrently using OS threads (multiple guilds per thread) and take advantage of multiple cores. That's part of BEAM, the Erlang VM. I think it's going to take a while.

DougBarth · 9y ago

If I'm reading this proposal correctly, locks will still be needed within multithreaded guilds to guard mutations against complex object graphs.

Here's my reasoning. Since the GVL is insufficient to guard against data races on Ruby 2, under the guild system, locks would be needed to guard against concurrency issues if multiple threads are present.

It would seem like the intention would be to replace usages of Thread with Guild to avoid the concurrency issues inherent with threaded code. Will there be API support to create a Guild that only allows a single thread?
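
For reference, this is the kind of locking that multithreaded Ruby 2 code already needs and that, per the reasoning above, threads sharing a guild would presumably still need. The GVL serializes bytecode execution but does not make a multi-step update atomic, so the read-modify-write is wrapped in a `Mutex`. The counter and thread counts are arbitrary values chosen for the example.

```ruby
# Shared mutable state touched by several threads inside one process (or one guild).
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1000.times do
      # synchronize makes the read-modify-write of counter one critical section.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
```

With the lock in place the final value is always 4 * 1000; without it, correctness depends on where the scheduler happens to switch threads.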

dragonwriter · 9y ago

> locks will still be needed within multithreaded guilds

It seems to me that is the intent; that is, any Ruby code that exists now is single-guild Ruby 3 code -- if it's multithreaded, it needs locks, for the same reason it does now.

> It would seem like the intention would be to replace usages of Thread with Guild to avoid the concurrency issues inherent with threaded code

I think that'll be a common use case, though running what amount to multiple "legacy" Ruby 2 multithreaded systems in separate Guilds in the same Ruby 3 process seems also to be an intended supported use case.

> Will there be API support to create a Guild that only allows a single thread?

It certainly sounds like a good idea.

nateberkopec · 9y ago

> If I'm reading this proposal correctly, locks will still be needed within multithreaded guilds to guard mutations against complex object graphs.

That is correct. You'll still need to use locks if doing multi-threading inside of Guilds.

It looks like Guilds can have 1 to X threads.

DanWaterworth · 9y ago

This is interesting. It doesn't mention GC, but since frozen objects can be shared between guilds, I assume the GC remains global. Perhaps this will trigger interest in immutable datastructures in ruby.

_ko1 · 9y ago

Quoted from slides:

> GC/Heap

> * Share it. Do stop the world parallel marking- and lazy concurrent sweeping.

> * Synchronize only at page acquire timing. No any synchronization at creation time.

DanWaterworth · 9y ago

I stand corrected.

gamesbrainiac · 9y ago

Any idea where the video of the talk is?

steveklabnik · 9y ago

RubyKaigi has just started, so I'm guessing it will be a while.

_ko1 · 9y ago

Yes.

claudiug · 9y ago

Do we have any date for this new way of doing concurrency in Ruby?

pkmiec · 9y ago

The new concurrency model is part of Ruby 3. Matz says he wishes for it to be out by 2020. But who knows :).

cutler · 9y ago

So 4 years to go. Not quite Perl 6 but it could be a bit late in the day considering the rate at which Ruby is losing mindshare.

transfire · 9y ago

Hope they improve the syntax, it looks horrid -- code in strings and all.

_ko1 · 9y ago

That's because of a current limitation. We'll improve it.

ivoras · 9y ago

Ok, "guilds"? Is the principle behind this so much different than everything done before that it requires repurposing a completely new word?

On par with Rust's "crates". Gives the impression some people just want to be remembered as inventing names.

dragonwriter · 9y ago

> "guilds"? Is the principle behind this so much different than everything done before that it requires repurposing a completely new word?

Pretty much. I mean, if there is a standard name for a thing between a process and a thread that is not a thread group, I haven't heard it.

sciurus · 9y ago

This reminds me in some ways of Eric Snow's (rejected, afaik) proposal to extend "subinterpreters" to allow parallelism in Python.

https://lwn.net/Articles/650489/

jellymann · 9y ago

The PDF appears to have been removed. I'm getting a "Not Found" page.

porges · 9y ago

I ♥ COM
