While this research allows us to interpret larger models in an amazing way, it doesn’t mean the models themselves ‘understand’ anything.
You can use this on much smaller-scale models as well, as they showed 8 months ago. Does that research tell us how models understand themselves, or does it just help us understand how the models work?