People who actually understand statistics are rare. I can probably weed out 1/3 to 1/2 of candidates simply by asking what a p-value is, or what precision/recall are (this includes people who said they worked in search).
Of the ones who know basic stats, most are neither good at nor interested in programming. They just want to use existing libraries to crunch numbers in a Jupyter notebook, then hand that off to the developers.
Finding a person who can come up with a predictive model, understand what they did, optimize it without breaking its statistical validity, and deploy it to production is very hard.
(If you can do this, I'm hiring in Pune and Delhi. Email in my profile.)
(Not sure I can defend somebody who does not know what precision/recall are.)
https://en.wikipedia.org/wiki/Confusion_matrix
You can see from that that sensitivity and recall are the same thing, but specificity and precision are not.
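For anyone who wants to check this against the definitions, here is a minimal sketch. The counts are made up for illustration (a hypothetical search task with 100 relevant documents out of 1000), and `confusion_metrics` is just a helper name I picked, not something from a library.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute precision, recall (= sensitivity), and specificity
    from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)      # of everything flagged positive, how much was right
    recall = tp / (tp + fn)         # of all true positives, how many were found
    specificity = tn / (tn + fp)    # of all true negatives, how many were left alone
    return {"precision": precision, "recall": recall, "specificity": specificity}


# Hypothetical counts: 100 relevant documents, 900 irrelevant ones.
m = confusion_metrics(tp=80, fp=120, fn=20, tn=780)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts recall is 0.8 and specificity is about 0.867, yet precision is only 0.4, because the negatives vastly outnumber the positives. That is why specificity and precision cannot be swapped for each other.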
Edit: note that I'm not saying you need this to add ROI as an analyst for a business!
My only quibble would be that precision and recall are one set of evaluation metrics, applicable to classification tasks. Modelers can absolutely use other metrics or loss functions.
Additionally, precision/recall do not map nicely to regression problems, so people use other metrics (RMSE, MAE, etc.).
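As a rough sketch of the two regression metrics named above, written in plain Python so the definitions are visible. The data is invented, and the function names `mae` and `rmse` are my own.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical targets and predictions; errors are 1, 0, 2, 0.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

print("MAE: ", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Here MAE is 0.75 and RMSE is about 1.118. The gap comes from the single error of 2, which RMSE weights more heavily.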
I'd happily take a Bayesian answer if they preferred that, but that hasn't happened very often.
Bayesian stats tend to use likelihood ratios or Bayes factors instead of p-values for hypothesis testing.
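To make the contrast concrete, here is a small sketch on a coin-flip example of my own choosing: 15 heads in 20 flips, testing H0: p = 0.5. The Bayes factor assumes a uniform prior on p under the alternative, which is only one possible choice, and the function names are made up for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k, n, p0=0.5):
    """Exact two-sided p-value: total probability under H0 of every outcome
    that is no more likely than the observed one."""
    observed = binom_pmf(k, n, p0)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        binom_pmf(i, n, p0)
        for i in range(n + 1)
        if binom_pmf(i, n, p0) <= observed * (1 + 1e-9)
    )


def bayes_factor_10(k, n, p0=0.5):
    """Bayes factor for H1 (p ~ Uniform(0, 1), an assumed prior) over H0 (p = p0).
    Under a uniform prior the marginal likelihood of k successes is 1 / (n + 1)."""
    marginal_h1 = 1 / (n + 1)
    return marginal_h1 / binom_pmf(k, n, p0)


n, k = 20, 15  # hypothetical data: 15 heads in 20 flips
p_value = two_sided_p_value(k, n)
bf10 = bayes_factor_10(k, n)
print(f"p-value: {p_value:.4f}")
print(f"BF10:    {bf10:.2f}")
```

On this data the p-value comes out around 0.041 while the Bayes factor is only about 3.2 in favour of H1, so the two approaches can leave rather different impressions of the same evidence.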
The trick in all cases is that you're comparing to expected results given some prior distribution. Most people use a dumb prior (e.g. Gaussian) and are then confused when the numbers make no sense, because the data is multimodal or heavy-tailed and therefore mismodelled.
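A toy illustration of the mismodelling point, under my own assumptions: the data below is invented and deliberately bimodal, and the "model" is nothing more than a Gaussian fitted by mean and standard deviation, which is a simplification of what a real prior or likelihood choice would look like.

```python
import statistics

# Hypothetical bimodal data: two tight clusters, one near 0 and one near 10.
data = [-0.5, 0.0, 0.5] * 10 + [9.5, 10.0, 10.5] * 10

# Fit a single Gaussian by matching the mean and (population) standard deviation.
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)
fitted = statistics.NormalDist(mu, sigma)

# Probability mass the fitted Gaussian places within one unit of its mean ...
expected_near_mean = fitted.cdf(mu + 1.0) - fitted.cdf(mu - 1.0)
# ... versus the share of actual observations that fall there.
observed_near_mean = sum(abs(x - mu) <= 1.0 for x in data) / len(data)

print(f"fitted mean={mu:.2f}, std={sigma:.2f}")
print(f"Gaussian expects {expected_near_mean:.1%} of data within 1 of the mean")
print(f"observed share: {observed_near_mean:.1%}")
```

The fitted Gaussian puts its peak at 5, where there are no observations at all, so anything computed relative to that model (tail probabilities, z-scores, and so on) is going to look strange.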
"just compose a team" sounds easy, doesn't it? Unfortunately there are lots of failure modes involving different parts of the team not really understanding what each other are trying to do, let alone what they are doing, and subtle errors getting by people who don't know what to look for. So, you can find such teams and some of them work well but a lot of them don't.
So an alternative is to try to find or create domain experts who mix all the appropriate skills, but this is hard and in the extreme case involves chasing down unicorns.
Companies and industries flop back and forth between preferring different approaches - right now a lot of people are talking about "data scientists" as one of the latter, but it will likely change over time as it always does.
It's a hard problem, and it shows.
So why don't you just use the phrase "Why don't you just ...?"?
> There are so many more statisticians who can at least communicate and work effectively with developers and vice versa.
Not in my experience. You need to design your data infrastructure to promote easy analysis, and you need to design your models to scale well according to the amount of data you're working with. There are also many cases where a project will require mostly engineering work for a while, and then mostly analysis/statistics work–there are ways to handle this with specialists of course, but there's generally a significant switching cost.
Also people with a combination of statistics & programming aren't that rare–IMO it's more that employers tend to search for both degrees, when instead you should be trying to evaluate the skills directly.
That said, most companies should probably be hiring data engineers rather than data scientists–for most "data science" jobs I've seen, almost no statistics is actually necessary/useful.