The paper posits (and provides evidence) that training a model with ternary weights instead of floating-point weights yields equivalent practical/useful information. You can't take an existing model and round its weights to the nearest of `{-1,0,+1}` — that destroys accuracy — but you can (re)train a model with ternary weights and get an equivalent end result (equivalent information/output).
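To make the idea concrete, here's a minimal sketch of the "absmean" ternary quantization scheme the BitNet b1.58 paper describes: scale the weight matrix by its mean absolute value, then round and clip each entry to `{-1,0,+1}`. The function name and setup here are illustrative, not taken from any released codebase.

```python
import numpy as np

def absmean_ternary(w, eps=1e-8):
    """Quantize a weight matrix to {-1, 0, +1} (absmean scheme):
    scale by the mean absolute value, then round and clip."""
    scale = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / scale), -1, 1)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))   # a toy FP weight matrix
q, scale = absmean_ternary(w)

# Every quantized entry is -1, 0, or +1.
assert set(np.unique(q)) <= {-1.0, 0.0, 1.0}

# With ternary weights, a matrix-vector product needs no
# multiplications: each term contributes +x_i, -x_i, or nothing.
x = rng.normal(size=8)
y = q @ x
```

This is also why the hardware story is so compelling: the inner loop of inference collapses from multiply-accumulates into plain additions and subtractions.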
Technically a model trained with FP16 values contains vastly more information than one trained with ternary values — each FP16 weight stores 16 bits, while a ternary weight carries at most log2(3) ≈ 1.58 bits. Practically, though, it seems to make no difference.
My prediction: Floating-point models will still be used extensively by scientists and academics in their AI research, but nearly all real-world, publicly distributed AI models will be ternary. It's just too practical and enticing! Even if the ternary version of a model is only 90% as effective, it will be so much faster and cheaper to run that it wins in practice. We're talking about the difference between requiring a $500 GPU and a $5 microcontroller.