From the "related work" section of the paper:
The, by today's standards, surprisingly poor performance of the models in Hill et al.
(2016) can be at least partly explained by two factors: 1) they use older, lower-quality word embeddings; and 2)
FastSent sentence representations have the same dimensionality as the input word embeddings,
yet they are compared in the same table to much higher-dimensional representations.
See also figure 1 for the increase in performance across tasks as the embedding dimension is increased.
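The dimensionality point above can be made concrete: FastSent composes a sentence representation by summing the embeddings of its words, so the sentence vector necessarily has the same dimensionality as the word embeddings. A minimal toy sketch (the vocabulary, embedding matrix `W`, and dimension of 100 are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

# Toy vocabulary and randomly initialized word embeddings (hypothetical values).
rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2}
embed_dim = 100  # dimensionality of the input word embeddings
W = rng.standard_normal((len(vocab), embed_dim))

def fastsent_encode(tokens):
    """FastSent-style composition: sum the word embeddings of the sentence."""
    return W[[vocab[t] for t in tokens]].sum(axis=0)

sent = fastsent_encode(["the", "cat", "sat"])
print(sent.shape)  # (100,) -- same dimensionality as the word embeddings
```

Comparing such a 100-dimensional sentence vector against, say, 2400- or 4800-dimensional representations in a single table therefore conflates model quality with representation size.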