This is a good touchstone to use for "you've overoptimized your site, tone it back". I am also taken aback every time I'm on a site, I've got something in my shopping cart, I'm headed for the "check out" button, or I'm even on the checkout page, and some stupid interstitial pops up. Dude, I'm trying to enter my credit card information! Back off! Especially stupid for a "sign up for our newsletter" popup; we all know that unclicking the "yes, we can email you every 17 seconds from now until the heat death of the universe with valuable offers from 'our affiliates' which we define as 'anyone we share a species with'!" box on the checkout form is mandatory, and if we don't see it immediately we'd best go hunting for it. You've already default-populated the checkbox to "yes" on this very screen; get out of the way!
Less unbelievably stupid, but related, is when I'm examining product X and just after I scroll down a bit to read more you pop up something related to... well... anything other than product X! I'm signalling interest in product X as hard as I can, and you've AB tested that this is a great time to jangle your keys over there instead? Your AB testing is stupid, and the result is almost certainly a statistical fluke or some other terrible error. What fisherman goes out on his boat, hooks a fish, and then rushes to throw a completely different lure out to the hooked fish to get it on that hook instead? This is another good touchstone for being "overoptimized".
[0] https://i.postimg.cc/HW89hs7r/Screenshot-2022-07-12-145957.p...
I left a screenshot in slack and it ended up causing a couple of teams to have to roll back their widgets, but it always baffled me that we were able to focus so much on the individual trees of metric optimization that we would miss the forest to that extent.
Like I'm sure many users sign up then drop out of your funnel but I'm part of an organization that's a paying customer. I'm already going to use your stuff. What possible business benefit could there be to you spamming me? If anything you're risking the inverse - it made me want to migrate away from the tool.
Still absurd, but I know this is a problem friends of mine have had.
Suppose the site isn't concerned about the sale very much at all?
Suppose the thing that the site uses to reel people in, is a good deal that isn't very profitable to the site but what the site then tries to sell is a very profitable near-scam/ripoff. Scaring off half the ordinary customers becomes worth it to get even 10% of the customers buying the scam.
What seems like "poor optimization" can easily be optimization for something else, and could be seen as "the scammification of the web".
Many here are focusing on a single interaction. While the outcome of that single interaction is negative to the company, the aggregate outcome must be positive somehow, perhaps in the way you said, but it doesn't even have to be a scam or ripoff. Some products just have a higher margin and/or customer LTV.
As an individual, it is annoying, but the company is focusing only on the macro effect when it does something like this.
If I may... I have seen data from a big retailer showing that any user who doesn't immediately purchase an item is actually not that interested in the product on the screen. If a customer is going to buy something, they will do it promptly. Anyone else is just browsing.
YMMV, grain of salt, context dependent, etc, etc.
1. Clicked on page.
2. Took maybe 10 seconds to take in what is "above the fold".
3. Scrolled down to see what else there is.
4. BAM! Popup triggered by scrolling down.
While I understand what you're getting at, they do not yet have the info to know that I'm browsing or whatever. They were so excited about their stupid popup that they didn't even get that far. I will say, generally, when I'm to the point that I'm entering credit card info, I've put up with it, but I have been chased off of sites by this use case before. Especially if that popup also crosses with some other popup and now I'm chasing down the tiny little 6pt light-grey-on-white little "x"s to click away the popups in the right order.
Actually, let me add that to my touchstone list. OF COURSE hiding the dismissal icon for the popup increases "engagement" with the popup. You don't even need to run a test for that, because what other result could it have? "We shrank the close icon, moved it to the lower right corner where nobody expects it, and made sure to kill the contrast even harder, and customers dismissed it 2.5 seconds more quickly on average"? Of course that's not possible. But... that's the wrong question! And AB testing is really good at answering the question you're asking; it has no mechanism in and of itself to see whether you're asking the right question. If you're getting down to this, you've overoptimized.
Fuck that. Unless it's an emergency (in which case I'll go to a shop), anything I purchase online is carefully considered, sometimes over several weeks. My revolving user-agent, vpn etc. may give the illusion I'm not interested... but I am indeed, just browsing...
I hate them so much. It makes it feel like so much more of a chore to try to do research or look for things online. I'd honestly prefer 56k page-load speeds if the pages were free of this garbage.
1) I open a product in a tab. I click "add to cart" and a "related products" sidebar slides in. I close the tab in annoyance.
However, some items exhibit a similar pattern, EXCEPT...
2) I open a product in a tab. I click "add to cart" and a stupid extended warranty sidebar slides in. I close the tab in annoyance.
The difference?
Item #1 gets added to my cart
Item #2 doesn't make it to the shopping cart.
Amazon just silently deleted my purchase.
I actually don't know when it dawned on me that this happened, but amazon lost money on me because I didn't buy certain things.
A couple years ago, after being an Android fan for the better part of a decade, I finally bought myself an iPhone and pried myself away from Google's ecosystem wherever I could. And Apple didn't even need to do any work for me to make this decision. It was the years of abuse from Google that you experience when you decide to use a Google product or service. And a big part of that was the constant A/B/C/D/E/F testing. I never felt like I was using a complete product; everything felt like a constant beta that could be changed or rearranged at any point, and I was just doing free testing work for them while they harvested all my data.
Every app update was a risk of the app rearranging itself, or features appearing/disappearing. Eventually it didn't even come from app updates in the Play Store, and new interfaces would just appear one day when a server somewhere marked your account as being in the group that gets the new UI. This app that you were familiar with could at any point be rearranged when you open it on any given day. Then maybe a week later you open it and it's back to how it was before. A button you thought was here suddenly isn't, and you question whether something actually changed or if you're losing your mind. It's a subtle gaslighting that eventually I couldn't stand any more.
To me, A/B testing means you don't respect your users. You see them as just one factor in your money machine that can be poked and prodded to optimize how much money you can squeeze out of them. That's not to say a company like Apple is creating products out of the goodness of their heart, but at least it feels like it was developed by humans who made an opinionated call as to what they thought was the right design decision, and what they would want to use. And in my 2 years of owning an iPhone, I've never opened my reminders app to find out that it's completely unrecognizable, or my messages app has been renamed or rethemed for the umpteenth time.
Your perspective is extremely short-sighted. A/B testing can result in this type of behaviour but that's just poor A/B testing. Good A/B testing focuses on removing distractions from the experience and helping users derive more value from the product. Bad A/B testing tries to make things more discoverable, where discoverability is often just noise and distractions. Good A/B testing ensures that the money machine, as you put it, pays its dues to users by making the product experience delightful.
I personally have never heard from a product person "Let's A/B test whether this is delightful". And I think that's because delightfulness or satisfaction is impossible to quantify in A/B tests. You only get to measure things like engagement, signup rates, retention etc. - cold hard taps on the screen, and no more.
And I must say that I'm glad that, right now, apps can't just scan my face (or cortisol levels, or pheromones or...) for emotional clues while I read their pesky push notifications that want to coax me back into their daily active user base.
It's the perspective of the normal users.
Every time I'm using a website and it does not behave exactly the same as it does for other people, or I notice some AB testing, in my head it goes "who the fuck do these people think they are?". The computing experience must be consistent and repeatable. If I wanted something that can change depending on the current position of the stars, I'd ask another human, not a computer.
A/B testing can result in this type of behaviour but that's just poor A/B testing.
How many tech startup patterns fit that? That's a sign that either the pattern does not generalize well or it's snake oil.
I've been in a similar situation, where I created a relatively sophisticated A/* testing and control system. My idea of good use of the system ended up being very different from how the team employing the system thought about it.
I believe that is part of the point of the post, that unintended, and even unimagined side effects plague even the best of ideas.
Sometimes they'll be in A, and sometimes in B.
The button moves or disappears or appears.
Your user does not get an experience they can rely on.
No, yours is. If some company wants to do some testing, they SHOULD PAY users for that. A/B testing is just exploiting users to get free testing.
This shit drives my parents insane. Me too, when I have to help them. I've had to spend tens of seconds looking at a major screen in the phone app, of all things, to figure out WTF I'm looking at so I could help them figure out what was up. Re-arranged every update (or new phone) for absolutely no reason, terrible affordances, poor use of their own design language. Ugh.
I'd get them on iOS but they need larger screens and the $400 small iPhones (what I have) are already more expensive than they think a phone "should" cost, so they keep buying $200 Android phones about once a year (hoping the next one will be better) and not being able to use them because the UI is garbage.
Before I could give her the freshly grandparent-proofed device, said video calling app upgraded on my parents' PC first and changed literally every single element of the UI beyond recognition. To someone the age of my grandma, that would be literally like bricking the device remotely, because none of the buttons would look the same, and she would not be able to work out how to use the new interface.
STOP CHANGING THINGS! Even if the new UI is better (debatable), some people just like or rely on a particular layout to operate the device or app. Don't rearrange without giving a ~permanent setting to use the old layout.
At least on iPad/iPhone you can set the apps to access Google mail, etc, which doesn't change as often, but still too often.
It's kind of mind boggling they'd decide to do that - the replacement they direct me to (Google Chat) doesn't even have feature parity so I just dropped them and moved my social circle using Hangouts to a different app (since at this point they all faced the same problem and we decided on a different platform).
I'm really curious how the A/B testing for this went down - Google is willingly throwing customers away because somebody wants to pump numbers for a new app that is objectively worse than the old one.
At this point Google Maps is the only product that is keeping me with them, but even that one is beginning to wear thin.
Which features are you missing?
The point, of course, is to make manually updating apps so annoying that you enable auto update. I have been burned too many times by an auto update, so I refuse.
This wasn't enough; they really want to force me to enable auto updates, to the point of the update section of the app having 50% of the visible space on my big screen covered with a message to enable auto update over WiFi. [0]
Whoever is doing this at Google... Stop. Just stop. It is cringe.
Google seems to be really good at making developer tools like Borg and Blaze - however, I think that as an organization they have some deficit that makes them not responsible enough to develop user-facing software (like, uh, an operating system).
Maybe Google would be better as a B2B company.
In many cases the hardware was so poor it was hard to make a call due to the touchscreen.
Since the primary thing I want the phone to do is make a call I switched to the “it just works” camp and haven’t regretted it.
Except getting photos off the phone. Until I realised the best tool for that is … Ubuntu!
Why can't we have nice things?
If it's something one-and-done (like different permutations of a signup flow to see what is easier for users), then I don't see the harm in it.
You just described doing business in today's world.
Being a bit more generous towards A/B testing, I would make a counterpoint: _not_ doing any kind of user testing, of which large scale automated A/B tests are just a subset, means you don't respect your users. Because it means you just assume you know what their experience is like, or worse: you don't even care about it or bother to learn anything.
Your complaint seems to be more about the scale and aspect of automation honestly, and continuity of the services, which is a valid complaint against Google but not about A/B testing in general.
A/B tests are not the only, or even the best, way of collecting user feedback.
I was just pontif- er, talking about this to someone, a couple of days ago.
I love the users of my products. Most of my products are free, and are carefully-crafted, highly-polished, complete deliverables, and I fret over how they are used -even if by a tiny number of end users-, like a nervous hen. I do what I do, out of love for the craft, and out of a genuine desire to make people's lives easier, through the technology I have at my disposal.
It is my belief that most tech companies despise their user base. Users are little more than cattle, to be fattened and slaughtered. "Caring about the user" means optimizing for "engagement," or keeping them trapped within their own ecosystem. John Oliver did a rant about this, recently[0]. It has nothing to do with actually caring about the user, or solving their problem. It is about harvesting users.
In fact, my discussion about this, came about, because someone wanted to keep users inside the app I'm writing, as opposed to linking them to a more familiar app, on their phone (for the record, it was for videoconferencing). Linking is a "no-brainer," as I can link out to dozens of installed apps, using the simple URL scheme method, built into iOS[1], and "keeping them in the app," would have required several months of extra work, polluting the app with megabytes of junk code, because I'd need to use SDKs, and also kill the ability to easily scale to add new clients (contrary to popular belief, Zoom is not the only videoconferencing option). It would also have possibly put us on the hook, legally, for what happened in those videoconferences.
[0] https://youtu.be/jXf04bhcjbg?t=638
[1] https://developer.apple.com/documentation/xcode/defining-a-c...
Data can "lie". What is observed is not always reality, simply what we can see of it.
Consider auctions. You never actually "see" the bidder's demand or utility. Yes, there are some ways to structure auctions that in theory show willingness to pay and such (ignoring confounding factors and irrationality), but you don't actually observe anything beyond the bid.
Similarly, on websites, you don't always know the causal reasons people click here or there. You know perhaps enough to predict a step-wise behavior, but don't (usually) understand the full behavioral lifecycle -- especially if a metric improves but at the hidden cost of decrements to conversion and similar.
I'd add a bit of nuance here. They are very good at driving traffic, but very bad at building an audience. You do this long enough and your news site is now optimized for attracting hot-take appreciators who engage with the news like a tabloid. This drives away everyone who doesn't want to be reading a tabloid and makes you more dependent on keeping up with traffic-gaming strategies to continuously drive traffic. You've basically shifted your business from being a place that produces journalism to being a place that figures out ways to game social media trends and SEO.
If the test is "do more ad placements increase revenue?" and there's a 20% jump, what are you as an engineer going to do? Tell management that it's bad?
The main issue is that people mix conversion with customer obsession! Whenever you work on a product or feature you should be asking yourself "Is this really good for my customer" - if the answer is no, then no matter what the A/B tests/conversion rates show you don't do it.
Unfortunately we mostly hire the wrong people as PMs, who then hire clones of themselves. They are not truly customer obsessed and use A/B tests incorrectly, which results in products that trick or force customers to do things they don't understand/want to do. Long term this is bad for the product and company.
It's about observing the users fumble through your UX when you know their motivation.
An actual, sincere customer obsession (and btw I think we both completely agree here) means that you are willing to lose out on some conversion and revenue in order to make sure your customers are top priority.
Real customer obsession isn't just an ethical principle either, it makes business sense. The problem is that the value of customer obsession is realized over the span of years or decades. Companies that have a sincere customer obsession are the kinds of places that survive economic ups and downs, where people's children grow up and are loyal to the product because they remember the time their parents were treated well by the company.
If your only company focus is Q4 KPIs then you really can't have "customer obsession".
The logic is: If they hate your app, they won't spend money. If they love your app, they will. Which is what would make you think A/B testing and UX work are the same thing.
There's really nothing new about this issue at all. Playing towards the average creates a lot of shitty stuff, in apps/websites as well as politics and wherever else there are metrics to track.
The genius of a good product is that it will make a stand and not give in to the whims of over-optimization in order to maintain its original intent. This is what made Apple unique.
It requires leadership with guts who aren't chasing the latest shiny object.
Running experiments and A/B tests is popular because it is _guaranteed_ to give you signal. If you have a large engineering team and you're not sure how to filter the quality of results, gating everything through A/B tests is a well-understood, methodical way to ensure only positive work makes its way through.
Early stage startups should never A/B test. When you're searching for product market fit, you're doing global optimization within the search space. Your product will change drastically as you learn more. Premature optimization (A/B tests) will only be detrimental.
It's almost guaranteed to ensure only false positive work makes its way through. If you're picking 0.05 as your p-value threshold, and you're running dozens to hundreds of tests, your false positives are almost certain to exceed your actual positives.
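To make the point concrete, here's a toy simulation (my own illustration, not data from any real pipeline): run a batch of two-proportion z-tests where the treatment truly does nothing, and count how many still come back "significant" at p < 0.05. Sample sizes and the baseline conversion rate are invented.

```python
import random
from math import sqrt, erf

def z_test_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = abs(conv_a / n_a - conv_b / n_b) / se
    # Normal CDF via erf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

random.seed(42)
N, TESTS, TRUE_RATE = 5000, 100, 0.05  # users per arm, tests run, baseline conversion

false_positives = 0
for _ in range(TESTS):
    # Both arms draw from the SAME conversion rate: any "win" is noise.
    a = sum(random.random() < TRUE_RATE for _ in range(N))
    b = sum(random.random() < TRUE_RATE for _ in range(N))
    if z_test_p(a, N, b, N) < 0.05:
        false_positives += 1

print(f"{false_positives} of {TESTS} no-effect tests came back 'significant'")
```

By construction you should expect roughly 5% of these null tests to "win", which is exactly the parent's point: at 100+ tests, noise alone hands you a steady stream of launchable-looking results.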
When I'm working for clients that do a lot of A/B testing, I suggest that they should always run A/A tests to ensure that they're not incorrectly rejecting the null hypothesis. If your A/A tests are showing significant differences, you have a problem in your testing pipeline that by definition can't be cured by more testing. You need holdout groups and selectivity about what to test, instead of just throwing everything at the proverbial wall.
For example, imagine a costume shop that ran a couple dozen A/B tests over the summer. Those results may look statistically significant. They may even stand up against the A/A test. But people that buy costumes in the summer are very, very different than people that buy them in October, and if 90% of the store's business is in the run up to halloween, then all these micro optimizations could actually make your total business performance worse.
I'm a A/B testing skeptic too, though I admit they have a time and a place. My favourite are ones that can be reasoned about as actual hypotheses. This usually involves some degree of data analysis or segmentation. For example, increasing font sizes may boost conversion, and a later analysis shows that this was almost solely a lift in conversion rates amongst the 45+ cohort. The data in this case isn't just blindly driving design decisions, it's helping inform the staff on how to better design in the future for the audience we have.
A/A tests do test your methodology as you said. But they do not fix a p-value one order of magnitude higher than it should be. (And yeah, I'm aware you know that, but your comment places them on the same context, so it got misleading.)
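For what it's worth, the standard patch for that order-of-magnitude problem is a multiple-comparison correction applied across the batch of tests. A minimal sketch of Holm-Bonferroni (the p-values below are invented for illustration):

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return indices of hypotheses rejected under Holm's step-down procedure."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    rejected = []
    for rank, i in enumerate(order):
        # Compare the k-th smallest p-value against alpha / (m - k)
        if p_values[i] <= alpha / (m - rank):
            rejected.append(i)
        else:
            break  # once one fails, all larger p-values fail too
    return rejected

# Five tests; only the first survives once the batch size is accounted for.
ps = [0.001, 0.04, 0.03, 0.20, 0.049]
print(holm_bonferroni(ps))  # -> [0]
```

Note how three of those p-values sit below 0.05 individually, but only one survives correction; the rest are exactly the kind of "wins" the parent is warning about.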
Predictably, whatever metric we were watching on it (probably conversion) swung wildly to either side over the first few days. The look on some of the product managers' faces was pretty great. After about 2 weeks, it settled into a steady state where each "version" performed equally (measured cumulatively, so just large numbers in action).
The conclusion from this exercise was...
"It takes 2 weeks."
¯\_(ツ)_/¯
The beauty of AB testing is that you don't have to give up your opinion. You can just change irrelevant things until the result you desire gets proven by chance and now you've got data to base your opinion on!
This is an interesting contrast to Amazon, which also makes checkout easy but bombards the user with thousands of listings, mostly mildly fraudulent and consisting of absolute crap, and still somehow gets repeat business.
The Amazon or Google way of throwing everything into a bin and spewing it out at the users is BS. We say we live in an information age, but I firmly believe stuff was way better catalogued back when it was done manually by paid gatekeepers.
Hey, would you like Prime with that? Do you know we provide free two-day shipping with Prime? If you sign up for Prime today you can get a $100 discount!
My second biggest issue is: it's rare that companies offer actual, live-human support these days anyway. When marketing adds A/B testing, shit becomes really annoying if something breaks as a result - usually the phone lines are suddenly flooded, the agents have no idea what has happened either and try to reproduce and figure out what's going on (and sometimes can't because they aren't part of the test group!), and so even people who haven't been in the testing group are going to be very pissed off.
IMHO, A/B testing without explicitly notifying the customers in advance should be banned by law, and that ban be harshly enforced. Customers are not guinea pigs, and with the rise of elderly people on the Internet this becomes an actual public safety issue (as ever-changing stuff makes it easier for scammers!).
You're describing adversarial UI changes to small populations of then-unsupported customers. This can have outsized impacts on vulnerable populations, e.g., especially the elderly.
This is one of my most intense frustrations in the modern age. Complete and utter disrespect for your customers' time and knowledge.
I can sort of understand wanting to hide stuff on mobile, but the discovery of controls to unhide things should be better. I often help people that are stuck trying to figure out how to do something in an app and not realizing they can click on something that gives no indication it's clickable is a common thing.
Desktop is another world. I often have 20+ inches of horizontal space and a hamburger menu. It's infuriating, especially when the hamburger menu is hiding one button.
As Product manager/owner I've only found A/B testing useful when trying to narrow in on a specific demographic and you are trying to find some optimization.
The marketing/sales funnel use of it is kind of gross and has ruined, imo, something that has utility in a very narrow scope.
Cheers, also very much agree customers should be informed and allowed to opt out.
"Hey, we have a new UX to try... would you like to switch?" The data from people that opt in is way better.
The three key outcomes I observed from the relentless A/B testing were UI antipatterns, team burnout, and a well-attended conference talk about "how we ran 105 A/B tests in a year, and what we learned".
We always ran >=3 variants and surveyed the dozen team members on which one they thought would win. Over the years, there was no clear pattern in who could make that prediction.
I.e., it's not possible to predict which is the most effective treatment, even when you include a really bad idea in the treatments!
Every single time I warn them about how the bill of goods they've been sold with A/B testing is almost completely unattainable, especially in the way that they want to go about it. They won't magically start getting more conversions by changing a button color. Even if they start getting more clicks, they rarely start getting more complete conversions, because the increased numbers usually come from people who weren't good leads in the first place.
On top of that every company I've worked with has no idea what the real methodology for good tests is, no matter how many times I explain it or put it in a slide deck. I would constantly get requests to use A/B testing for feature rollout.
Them: "Hey, could you do an A/B test of our existing site design and our upcoming redesign?"
Me: "if the old design performs better are you going to toss out the redesign?"
Them: "No we're going with the redesign but we want metrics on how it'll affect traffic"
Me: "Those metrics are useless if you aren't going to listen to them, and if the results come back and the old design performs better, you're not even going to put it in a presentation because it's counter to your planned actions. There's literally no point in running this test"
Them: "Run it anyway"
The problem is it's not subtle at all; there's a handful of those features that, when combined, end up being overbearing and noisy: "3 people looked at this listing within the past 3 days! 12 rooms left at this rate!" I don't care. I'm looking to book business or vacation travel. If a spot fills up I'll just go somewhere else. It'll be fine either way.
I don't use them anymore for that reason. Old soul (me) is old. (I'm probably in a minority, judging by their advertisement budget.)
But unfortunately, it works.
I've seen friends that I consider intelligent panic buy tickets/hotels, "because prices are going up since the last time I checked!"
Next time you want to book anything, browse around, ignore any of the fake urgency notifications, ignore the price (while staying broadly in your price range, of course). Then when you've found a destination you like, open the page in a private browsing window (or clear your cookies), and you'll see that prices and availability are back to normal.
OTA make comparisons a bit easier, but everything is negotiated and contractually controlled to keep people from just going to the hotel directly. Secret hotel prices (like HotWire if that still exists (Expedia) or Travelocity's Top Secret hotels if that still exists (also Expedia)) are an even more crazy negotiation. Hotel Tonight at least used to contact the hotel chains every day for that day's options, though since they were bought by AirBnB who knows what they do.
These days I just find a nice hotel and book with them/their system directly. Airlines too, since airlines fail to give all their options to the OTAs.
In some ways its sad that aggregators don't work all that well in the main travel industry (Flight/Hotel/Car) but travel is extremely complicated, highly competitive and still very fractured except for airlines. Pricing comparisons are not very useful since they are so mangled and obfuscated that you may as well just go to several sites and do it yourself by hand. For example Spirit Airlines used to give us prices for their tickets at $X and were always cheaper than everyone else; yet once you booked at that price they hit you for everything extra (bags, res, for all I know oxygen) then our customers complained we were fooling them and the real cost was higher.
However, entering e.g. a client's information takes a lot of steps; you are constantly clicking "Next" throughout these beautiful wizards and pages. After some time everybody starts to feel that there must be a better way.
What is the solution?
Spreadsheet import! Where you can just do everything in this "complicated" UI of Microsoft Excel, with formulas, and hundred buttons at once on the screen. Fill in hundreds of rows of information and just import it to the "beautiful business system".
And the funny thing is, I agree with this article. Both the content and the heading of this hackernews article:
1. Notification/scare spam can have long term retention ramifications. The previous generation of experiment platforms made long term metrics literally impossible to read. But now companies can use holdout groups and long term metrics like retention to give more clarity.
2. Even if you can read long term metrics that include retention, the scare/notification spam could lead to less word of mouth growth. For travel, I am guessing that you will be swayed to drive WoM growth more by differentiated inventory, reliable service, and cost, so maybe it's just merely annoying but not a risk to the business.
3. Notice that Airbnb's UI is very, very different from Booking, Expedia. We made a conscious choice to always make sure Airbnb came across as a "sincerely helpful friend" as a booking platform. An AB experiment showing that metrics improve doesn't mean that you have to launch it. You can look at those results and say, "this metric lift isn't worth how ugly it's making my site", and that's a completely valid choice. (a choice we made often at Airbnb)
The author's idea is that this short term gain damages longer term metrics. That sounds logical and agreeable, but that doesn't make it true. Not in my experience anyway.
Probably the people complaining the most about annoying UI patterns weren't going to convert anyway. Whilst those coming with a specific conversion goal to your site will convert even if annoyed in the process.
Anyway, the true root cause goes all the way to the top. When you give a team a 20% sales increase target and "deliver by next quarter or be fired"...this is what you get. If the executive level dismisses a healthier, more sustainable long term growth model, then there's pretty much no way to stop this.
It's so hard to stop because it actually works. It works short term, and evidence that it harms long term is typically lacking, or it simply isn't true.
Netflix auto play? Is that you? You were a hateful idea, no one liked you, yet you stubbornly hung on for far too long.
I'd pay twice as much to watch half as much quality programming, but that would tank what they think is a positive metric.
It did! People started watching stuff sooner!
Mostly because it was so incredibly irritating that I'd start a show just to make the random autoplay cease. Of course, what it really meant was that I'd go to Amazon and agonize over what to watch, but at least I was doing it at my own speed and not being harassed by noisy autoplay.
Having left FB years ago, I now watch people "navigate" their site/apps with disbelief.
You, of course, need to ensure the granular tweaks can be rolled up into something usable as they prove successful. You can't just keep bolting on UI changes while losing sight of the larger experience. Each incremental A/B test runs against a previously successful variant, so eventually the control is radically different from where it started and you're only concerned with beating the control. Using a longer-term holdout group or resetting the control experience during incremental testing can help mitigate this and get you zoomed out a bit from the local maxima.
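A long-term holdout like this is typically implemented with a stable, deterministic bucketing function, so the same users stay outside every incremental experiment for months. A minimal sketch in Python (the salt and holdout percentage here are illustrative, not from any particular platform):

```python
import hashlib

def assign_bucket(user_id: str, salt: str = "longterm-holdout", holdout_pct: float = 0.05) -> str:
    """Deterministically assign a user to 'holdout' or 'experiment'.

    Hashing (salt + user_id) keeps the assignment stable across
    sessions and devices, so the holdout group can be preserved for
    months while incremental tests churn inside the 'experiment'
    population.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    # Map the first 8 hex chars to a uniform value in [0, 1].
    fraction = int(digest[:8], 16) / 0xFFFFFFFF
    return "holdout" if fraction < holdout_pct else "experiment"
```

Because the function is pure, re-running it for any user at any time recovers their group, which is what makes a months-later comparison against the holdout possible.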
It's why I saw it as my moral duty to leave (as well as the other FB properties), so that at least in a small way, I "produce content" that is only available by interacting with me as a person.
No, that doesn't mean A/B testing is inherently short-sighted. It's entirely possible to measure long-term secondary effects of an A/B test. Just save a record of treatment groups, and remember to come back and compare long-term metrics like LTV down the road. We do this all the time at my startup, and of the dark patterns that we've tested, we rarely see a long-term negative impact on LTV that outweighs the positive conversion rate impact.
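Mechanically, that long-term readout is just a join between the treatment log saved at experiment time and a metric computed much later. A hedged sketch (the user IDs and LTV figures are made up for illustration):

```python
from statistics import mean

# Hypothetical records: treatment arm logged when the test ran,
# LTV computed months later from billing data.
assignments = {"u1": "A", "u2": "B", "u3": "A", "u4": "B", "u5": "A"}
ltv = {"u1": 120.0, "u2": 45.0, "u3": 80.0, "u4": 60.0, "u5": 100.0}

def ltv_by_group(assignments, ltv):
    """Average long-term value per treatment arm."""
    groups = {}
    for user, arm in assignments.items():
        if user in ltv:  # users may churn before any revenue is recorded
            groups.setdefault(arm, []).append(ltv[user])
    return {arm: mean(values) for arm, values in groups.items()}
```

The only real requirement is that the treatment log is never thrown away once the short-term test concludes.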
If you want to make a valid argument against dark patterns (which is basically what 90% of this thread is trying to do), it's unlikely to be grounded in efficacy. This is coming from a business owner who spends seven figures monthly on advertising, constantly split tests, and is heavily invested in only making decisions that are in the long-term interest of the business.
Instead try to improve the customer experience, make better products, improve customer service.
(edited for clarification)
AB testing can be (although isn't always) used to improve the customer experience. Assuming you know exactly what will make the customer experience best without actually testing it can also lead to a worse experience.
For that you usually hire a market research company or do what they would do: take an interviewer and two cameras (one on the face, one on the hands), and hire an as-diverse-as-possible pool of test candidates whom you then put through whatever workflow you want to optimize. Afterwards, you interview them. Side benefit: you can get really interesting general knowledge that you'd never gain from a dumbass A/B scheme: is your font style/color scheme legible, can the site be used by colorblind people, are there stock photo choices that give off stereotypical vibes...
It's real fun and a worthwhile experience for everyone involved.
That said, an A/B test does not tell you why something didn't work. You can make further assumptions based on the results and develop new hypotheses, but it never tells you why. Typically you would do some kind of qualitative UX research on a prototype or even static concepts beforehand to identify these kinds of issues before you even expend the effort to do a live A/B test. Far cheaper to do a study with 6-12 people and a prototype than to build out a full, functioning A/B test experience.
It's possible the flow they created was generally better but perhaps it had one fatal flaw. Perhaps that flaw could easily be remedied once identified.
A/B testing is just one small part of a good UX process.
Without a metric to say what is "better" and a method to measure it this is empty advice.
With AB testing you are optimising for a specific outcome, usually higher conversion. As pointed out in the article, eventually you'll end up with a bunch of colourful buttons and scary texts that persuade the user to click. A lot of the "only 2 seats/rooms available" messages are lies to scare the user into a conversion.
What was once ads everywhere is now psychological gaming.
I hope someone comes up with a browser extension, and maybe Apple with a new "Access Website" mode.
These messages are boring, to be honest. Once you notice them everywhere, it's game over for me. Time to move on.
One-time offers, limited-time offers, mailing list signups, up-sells, and cross-sells are time-tested ways to increase sales, dating as far back as radio-era telephone and catalog sales.
Steve Madden is a perfect example of this. They sell undifferentiated popular shoe styles less expensive than high fashion but more expensive than knockoffs. They have to hustle you to get you on their mailing list (for 10% off your order) in the hopes that you'll make another impulse purchase later when you get a text or email from them. If they weren't as aggressive you might never make another impulse purchase with them again as there are tons of brands selling nearly identical products.
Some companies are just horrible at hustling so they actually get in the way of you completing your purchase. In a competitive market this is a self correcting problem.
They spent more time viewing the items and... didn't like the pics, so conversion went down. In the end we reverted to the crap gallery we had before; they don't click it anymore and conversion went back up again.
* Users know you have a nice gallery
* They are more likely to shop at your store
* In the end, you get more sales despite the lower conversion rate
You have to click through at least 10 pages of additional offers (and many extra price things that are added by default!) before you get to the actual checkout page.
Site owners: please stop doing this. You’re turning the web into a cesspit. You’re part of the problem.
AB testing is and always has been fish oil for management. The only things it can actually prove, are more easily identifiable by common sense. So wherever it actually works, it was probably a waste of time / overkill for evidence.
- sincerely, a business analyst
I will always use AB testing for uncertain code in the future. I was skeptical when I first started writing AB tests, but they have proven their worth over and over again.
A/B testing right now is done on a cohort basis, and tests are run for weeks to a couple of months. This means that where the lifetime of a customer extends beyond a few weeks or months, it's really not possible to tell if a global maximum was missed.
E.g., you increase the number of promotional emails customers get per week. You do it for 3 weeks and see that customers who got those emails had higher conversion. But you didn't get to see that customers who kept getting the higher number of emails completely unsubscribed after 3 months of pain. And by that time all customers are in the higher-frequency group, so it's hard to tell what is driving the unsubscriptions.
I'm no expert but here are some solutions:
1. You should have really delayed, long-running control groups, preferably extending well beyond the average duration your customer sticks around. These groups should get onto new things a year later. But even then it'd not be possible to tell WHICH feature is affecting them, because in a year the main group would have accumulated a lot of features. But still, it's something...
2. You should really have lots of secondary KPIs that measure things that affect long-term KPIs. Sure, conversion is better, but is time spent reading newsletters increasing? Are buyers feeling good about their experience with the brand? Some of these KPIs are more qualitative and can't just be automated.
what else?
I currently work in a game publishing company, here are 2 anecdotes from it
1. We run A/B tests for game performance, but we keep changing the bids for our games and thus get varied quality of users; A/B tests don't really help in such a case.
2. Once, by mistake, we ran the same creative on FB for 2 different ads... both ended up having totally different metrics.
It's also worth noting that there's no way in hell they actually know that with any sort of precision. No GDS has proper up-to-date knowledge of bookings from all the various sources that hotel reservations actually go through (which is why they overbook, just like airlines do with flights). What they're really saying is that the small inventory of rooms that are reserved for them to book exclusively are almost gone.
So if you do testing and it gives you some kind of result, the crucial step is trying to understand what it really means, is there something we can learn from it.
Unfortunately, this is also the hard part that requires actual effort and intelligence and is difficult to scale -- and so is frequently skipped.
I used to use that app all the time, then kids happened and spontaneous hotel reservation became rare. Fast forward a few years and a circumstance came up that made me think "Hotel Tonight". I discovered it wasn't installed on my new phone so I grabbed it. It was unrecognizable. Maybe the prices were as good, maybe it could still be used the way I used it previously, but it looked like it turned into a hotel booking app when what I wanted to see was a small selection of good hotels nearby with unusually low prices. One of the features was the lack of choice.
I finally opened the inspector and deleted it, so that I could use the menu to select "order online", which took me to a page ... with the same modal.
I agree with the sentiment on AB testing but I think the bigger insight is that we need to be reminded to see the forest for the trees with any process, tool, or goal.
Sometimes these intangibles are hard to measure and almost need to be sensed.
It reminds me of how you can see the exact same development methodology used at two different companies, where at one company it works beautifully and at the other it becomes a bureaucratic albatross.
For example, you can see if Group A or Group B from a test are more likely to still use the site 1 year later.
You hypothesize that those ways to 'juice the metrics in the short term' hurt the user experience in the long term... Well if your hypothesis is right, these long term AB results should show it.
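Checking that hypothesis is a standard two-sample comparison, e.g. a two-proportion z-test on 1-year retention between the arms. A sketch under made-up numbers, not any particular product's methodology:

```python
from math import erf, sqrt

def retention_z_test(retained_a, n_a, retained_b, n_b):
    """Two-proportion z-test: did arm A retain users at a different
    rate than arm B one year after the experiment?"""
    p_a, p_b = retained_a / n_a, retained_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (retained_a + retained_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical readout: 300/1000 users retained in A vs 250/1000 in B.
z, p = retention_z_test(300, 1000, 250, 1000)
```

The catch, as noted elsewhere in the thread, is having enough users still identifiable a year later for the test to be powered at all.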
This isn't very feasible on most products and certainly limited by the amount of data collected.
No one EVER tests for mean-reversion over time.
For 17 years I've seen companies do A/B tests. I doubt I've seen a single convincing, durable result the whole time.
Saying you made a 10% purchase rate improvement in a month is an easy pay rise.
It happens when people change perspectives from building and sustaining businesses to exploiting and squeezing every employee, supplier, and customer for the last drop.
https://news.ycombinator.com/newsguidelines.html
Edit: you broke the site guidelines particularly badly later in the thread. We ban accounts that do that, so please don't do it again. More here: https://news.ycombinator.com/item?id=32072856.
If you know its inside anatomy, you know what I mean.
Milk is in the back of the store because that's where it makes sense to have a refrigerated wall.
Milk is increasingly available in smaller quantities in compact refrigeration units at the front of the store.