I don't know how to explain that without analogies to Haskell (which, I presume, the author deliberately avoids), though.
The point is that it enables some functions to be pure in the traditional sense even if their implementation relies on mutating data – any idea how I could make this clearer?
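
Maybe a small example would help. Here is a rough sketch in Python, not the language under discussion, so treat it only as an illustration of the idea; the function name `sorted_unique` is made up. The function mutates a set and a list internally, but both are created inside the call and never escape it, so from the caller's side it behaves like a pure function: same input, same output, and the argument is left untouched.

```python
def sorted_unique(items):
    """Return the distinct elements of `items` in ascending order."""
    seen = set()    # local mutable state, never visible to the caller
    result = []     # also local, mutated in place below
    for x in items:
        if x not in seen:
            seen.add(x)        # mutation, but only of local data
            result.append(x)   # same here
    result.sort()              # in-place sort of a list nobody else holds
    return tuple(result)       # hand back an immutable value


data = (3, 1, 3, 2)
print(sorted_unique(data))  # (1, 2, 3)
print(data)                 # (3, 1, 3, 2), the input is unchanged
```

Python does not check any of this, of course; the purity here is only by convention, whereas the feature being discussed would presumably let the compiler guarantee it.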