
0 points · nine_k · 2y ago · 0 comments

At my previous job, A/B testing was widely used to check which of several small UX tweaks worked better.


3 comments · 1 top-level

exodust · 2y ago · 2 in thread

The problem there is that many uncontrollable factors contribute to the data. If engagement is found to be better with A on the day, does that mean B will never be better? It doesn't.

A/B testing is often conducted unscientifically or with insufficient sample size and timeframe.

Sometimes the cards fall where they fall, and your small UX tweak wasn't involved. It's tempting to conclude that the A/B test delivered a valid result. As an experiment, if you deliberately made A and B identical and then ran the A/B test as normal, you would still get a winner. The two numbers won't come out equal.
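To make the identical A and B experiment concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the function names, the 10% conversion rate, and the 1000 users per variant are made up, and "winner" here just means the two observed rates differ, with no significance test applied.

```python
import random


def observed_rate(rng, n_users, true_rate):
    """Observed conversion rate for n_users who all convert with the same true_rate."""
    conversions = sum(rng.random() < true_rate for _ in range(n_users))
    return conversions / n_users


def run_aa_tests(n_experiments=200, n_users=1000, true_rate=0.10, seed=0):
    """Run repeated A/A tests and count how many show an apparent winner.

    Both variants share the same true_rate (hypothetical value), so any
    difference between them is pure sampling noise.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    apparent_winners = 0
    for _ in range(n_experiments):
        rate_a = observed_rate(rng, n_users, true_rate)
        rate_b = observed_rate(rng, n_users, true_rate)
        if rate_a != rate_b:  # naive "winner": the raw numbers differ
            apparent_winners += 1
    return apparent_winners


if __name__ == "__main__":
    winners = run_aa_tests()
    print(f"{winners} of 200 identical-variant tests showed a 'winner'")
```

Exact ties are rare at this sample size, so most runs report one side ahead even though nothing differs between the variants.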

nine_k (OP) · 2y ago

This is fair. Valid experiments need to run for weeks, not hours, to account for cyclic variations. They should not be confounded by events (holidays, elections, etc.) that severely change the relevant user activity. And they need to collect enough data points for the results to be statistically significant.
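For a sense of what "enough data points" can mean, here is a back-of-the-envelope sketch using the standard normal-approximation formula for comparing two proportions. The function name, the 10% baseline, the 1-point lift, and the 5% significance / 80% power defaults are all assumptions picked for illustration, not numbers from any real experiment.

```python
import math
from statistics import NormalDist


def users_per_variant(baseline, lift, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect an absolute lift.

    Uses the usual two-sided, two-proportion normal approximation:
    n = (z_alpha + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline
    p2 = baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical case: 10% baseline conversion, hoping to detect +1 point.
    print(users_per_variant(baseline=0.10, lift=0.01))
```

With these made-up inputs the answer lands in the low tens of thousands of users per variant, and halving the lift you want to detect roughly quadruples it. That is why a few hours of traffic is rarely enough for a small UX tweak.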

gfodor · 2y ago

Well yeah, you have to actually do it right.
