Mostly. There are also confounding effects from factors like the length of the texts. For example, when compressing A+B with Zstd, a backreference from B to some content in A costs more to encode the farther back that content is, so longer texts will appear less similar to each other than shorter ones.
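A rough way to see the length effect is to compare a text with an exact copy of itself at two different sizes. This is only a sketch under assumptions: it uses Python's built-in `zlib` as a stand-in for Zstd, and the `ncd` helper is one common way to turn compressed sizes into a distance, not necessarily the formula discussed here. zlib's 32 KiB window makes the effect far more extreme than it would be with Zstd, because the copy eventually falls out of reach entirely instead of just getting more expensive to reference.

```python
import random
import zlib


def csize(data: bytes) -> int:
    """Compressed size in bytes (zlib level 9, standing in for Zstd)."""
    return len(zlib.compress(data, 9))


def ncd(a: bytes, b: bytes) -> float:
    """Compression-based distance: near 0 = very similar, near 1 = unrelated.

    Hypothetical helper; assumes the usual normalized compression distance.
    """
    ca, cb = csize(a), csize(b)
    return (csize(a + b) - min(ca, cb)) / max(ca, cb)


def random_text(n: int, seed: int = 0) -> bytes:
    """Incompressible pseudo-random bytes, so only cross-text matches help."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


short = random_text(4_000)    # the copy sits 4,000 bytes back: inside the window
long_ = random_text(100_000)  # the copy sits 100,000 bytes back: outside it

# Each text is compared against an identical copy of itself,
# so the "true" similarity is the same in both cases.
ncd_short = ncd(short, short)
ncd_long = ncd(long_, long_)

print(f"short vs. itself: {ncd_short:.3f}")
print(f"long  vs. itself: {ncd_long:.3f}")
```

Both pairs are identical texts, yet the long pair scores as almost completely dissimilar. With Zstd's much larger window the gap should be smaller and more gradual, but it points in the same direction.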