Replace peer review with “peer replication” (2021)

(blog.everydayscientist.com)

583 points · dongping · 2y ago · 337 comments

203 comments · 60 top-level

fabian2k2y ago· 67 in thread

I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

This of course depends a lot on the specific field, but it can easily be months of effort to replicate a paper. You save some time compared to the original as you don't have to repeat the dead ends and you might receive some samples and can skip parts of the preparation that way. But properly replicating a paper will still be a lot of effort, especially when there are any issues and it doesn't work on the first try. Then you have to troubleshoot your experiments and make sure that no mistakes were made. That can add a lot of time to the process.

This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

If someone cares enough about the work to build on it, they will replicate it anyway. And in that case they have a good incentive to spend the effort. If that works this will indirectly support the original paper even if the following papers don't specifically replicate the original results. Though this part is much more problematic if the following experiments fail, then this will likely remain entirely unpublished. But the solution here unfortunately isn't as simple as just publishing negative results: it takes far more work to create a solid negative result than just trying the experiments and abandoning them if they're not promising.

kergonath2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

They also tend to over-estimate the effect of peer review (often equating peer review with validity).

> If someone cares enough about the work to build on it, they will replicate it anyway. And in that case they have a good incentive to spend the effort. If that works this will indirectly support the original paper even if the following papers don't specifically replicate the original results. Though this part is much more problematic if the following experiments fail, then this will likely remain entirely unpublished.

It can also remain unpublished if other things did not work out, even if the results could be replicated. A half-fictional example: a team is working on a revolutionary new material to solve complicated engineering problems. They found a material that was synthesised by someone in the 1980s, published once and never reproduced, which they think could have the specific property they are after. So they synthesise it, and it turns out that the material exists, with the expected structure but not with the property they hoped. They aren’t going to write it up and publish it; they’re just going to scrap it and move on to the next candidate. Different teams might be doing the same thing at the same time, and nobody coming after them will have a clue.

techdragon2y ago

This waste of effort by way of duplicating unpublished negative results is a big factor in why replicated results deserve to be rated more highly than results that have not been replicated, regardless of the prestige of the researchers or the institutions involved… if no one can prove your work was correct… how much can anyone trust your work…

I have gone down the rabbit hole of engineering research before, and 90% of the time I’ve managed to find an anecdote, subsequent research footnotes, or actual subsequent research publications that substantially invalidated the lofty claims of the engineers in the 70s or 80s. (Despite this, it is still amazing: a genuine treasure trove of unused and sometimes useful aerospace engineering research and development.) Unfortunately, outside the few proper publications, a lot of the invalidations do not properly cite the work they refute. I could spend a week cross-referencing before I spot the link and realise that the unnamed work they say they are proving wrong is actually some footnotes containing the only published data (before their new paper) on some old work that has a bad scan copy on the NASA NTRS server, under some obscure title and with no keywords related to the topic the research is notionally about…

Academic research can genuinely suck sometimes… particularly when you want to actually apply it.

vibrio2y ago

“They also tend to over-estimate the effect of peer review (often equating peer review with validity).”

In my experience, scientists are comfortably cynical about peer review, even those who serve as reviewers and editors, except maybe junior scientists who haven’t gotten burned yet.

sebzim45002y ago

>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I think it would be fine to halve the productivity of these fields, if it means that you can reasonably expect papers to be accurate.

dmarchand902y ago

I believe that, contrary to popular belief, the implementation of this system would lead to a substantial increase in productivity in the long run. Here's why:

Currently, a significant proportion of research results in various fields cannot be reproduced. This essentially means that a lot of work turns out to be flawed, leading to wasted efforts (you can refer to the 'reproducibility crisis' for more context). Moreover, future research often builds upon this erroneous information, wasting even more resources. As a result, academic journals get cluttered with substandard work, making them increasingly difficult to monitor and comprehend. Additionally, the overall quality of written communication deteriorates as emphasis shifts from the accurate transfer and reproduction of knowledge to the inflated portrayal of novelty.

Now consider a scenario where 50% of all research is dedicated to reproduction. Although this may seem to decelerate progress in the short term, it ensures a more consistent and reliable advancement in the long term. The quality of writing would likely improve to facilitate replication. Furthermore, research methodology would be disseminated more quickly, enhancing overall research effectiveness.

advisedwang2y ago

It would be more than just half productivity. Not only do you have to do the work twice, but you add the delay of someone else replicating before something can be published and built upon by others. If you are a developer, imagine how much your productivity would drop going from a 3 minute build to a 1 day build.

crote2y ago

Where are you going to get the budget to build a second LHC solely for replication? How are you going to replicate a long-term medical cohort study which has been running for thirty years? What about a paper describing a one-off astronomical event, like the "Wow!" signal? What if you research the long-term impact of high-dose radiation exposure during Chernobyl?

There is plenty of science out there which financially, practically, or ethically simply by definition cannot be replicated. That doesn't mean their results should not be published. If peer review shows that their methods and analysis are sound, there is no reason to doubt the results.

mapt2y ago

We could easily 10x the funding and 5x the manpower we throw at STEM research if we actually cared what they produced.

NSF grants distribute 8.5 billion dollars a year, which is less than Major League Baseball (and its Congressionally granted monopoly) makes. The US Congress has directed 75 billion dollars in aid to Ukraine to date.

harimau7772y ago

The issue that I see is: even if halving productivity is acceptable to the field as a whole; how do you incentivize a given scientist to put in the effort?

This seems particularly problematic because it is already notoriously hard to get tenure and academia is already notoriously unrewarding to researchers who don't have tenure.

ImPostingOnHN2y ago

half would only be possible if, for every single paper published by a given team, there exists a second team just as talented as the original team, skilled in that specific package of techniques, just waiting to replicate that paper

hoosieree2y ago

Half is wildly optimistic.

sqrt_12y ago

FYI there is at least one science journal that only publishes reproduced research:

Organic Syntheses "A unique feature of the review process is that all of the data and experiments reported in an article must be successfully repeated in the laboratory of a member of the editorial board as a check for reproducibility prior to publication"

https://en.wikipedia.org/wiki/Organic_Syntheses

jamesash2y ago

Started in 1924 and still going strong 100 years later. The gold standard for organic chemistry procedures.

"If you can't reproduce a procedure in Org Syn, it's YOUR fault" - my PhD supervisor

ebiester2y ago

It's simple but not easy: You create another path to tenure which is based on replication, or on equal terms as a part of a tenure package. (For example, x fewer papers but x number of replications, and you are expected to have x replications in your specialty.) You also create a grant funding section for replication which is then passed on to these independent systems. (You would have to have some sort of randomization handled as well.) Replication has to be considered at the same value as original research.

And maybe smaller faculties at R2s pivot to replication hubs. And maybe this is easier for some sections of biology, chemistry and psychology than it is for particle physics. We could start where cost of replication is relatively low and work out the details.

It's completely doable in some cases. (It may never be doable in some areas either.)

tnecniv2y ago

Your proposal has a whole slew of issues.

First, people that want to be professors normally do so because they want to steer their research agenda, not repeat what other people are doing without contribution. Second, who works in their lab? Most of the people doing the leg work in a lab are PhD students, and, to graduate, they need to do something novel to write up in their dissertation. Thus, they can’t just replicate three experiments and get a doctorate. Third, you underestimate how specialized lab groups are — both in terms of the incredibly expensive equipment it is equipped with and the expertise within the lab. Even folks in the same subfield (or even in the same research group!) often don’t have much in common when it comes to interests, experience, and practical skills.

For every lab doing new work, you’d basically need a clone of that lab to replicate their work.

rapjr92y ago

Another approach I've seen actually used in Computer Science and Physics is to make replication a part of teaching to undergrads and masters candidates. The students learn how to do the science, and they get a paper out of replicating the work (which may or may not support the original results), and the field benefits from the replication.

harimau7772y ago

I think that there are also a lot of psychological/cultural/political issues that would need to be worked out:

If someone wins the Nobel Prize, do the people who replicated their work also win it? When the history books are written do the replicators get equal billing to the people who made the discovery?

When selecting candidates for prestigious positions, are they really going to consider a replicator equal to an original researcher?

Eddy_Viscosity22y ago

It's not easy because it isn't simple. How do you get all of the universities to change their incentives to back this?

SkyMarshal2y ago

> x fewer papers but x number of replications, and you are expected to have x replications in your specialty.

Could it be simplified even further to say x number of papers, but they only count if they’re replicated by others in the field?

RugnirViking2y ago

Let's be brutally honest with ourselves.

99% of all papers mean nothing. They add nothing to the collective knowledge of humanity. In my field of robotics there are SOOO many papers that are basically taking three or four established algorithms/machine learning models, and applying them to off-the-shelf hardware. The kind of thing where any person educated in the field could almost guess the results exactly. Hundreds of such iterations appear every month for any reasonably popular problem space (prosthetics, drones for wildfires, museum guide robots, etc.). Far more than could possibly be useful to anyone.

There should probably be some sort of separate process for things that actually claim to make important discoveries. I don't know what or how that should work. In all honesty maybe there should just be fewer papers, however that could be achieved.

indymike2y ago

> 99% of all papers mean nothing. They add nothing to the collective knowledge of humanity.

A lot of papers are done as part of the process of getting a degree, or keeping or getting a job. The value is mostly the candidate showing they have the acumen to produce a paper of such quality that it meets the publisher and peer review requirements. In some cases, it is to show a future employer some level of accomplishment or renown. The knowledge for humanity is mostly the author's ability to get published.

staunton2y ago

99% of science is a waste of time, not just the papers. We just don't know which 1% will turn out not to be. The point is that this is making progress. As such, these 99% definitely are adding to the collective knowledge. Maybe they add very little and maybe it's not worth the effort but it's not nothing. I think one of the effects of AI progress will be allowing to extract much more of the little value such publications have (the 99% of papers might not be worth reading but are good enough for feeding the AI).

LBTables2y ago

> In my field of robotics there are SOOO many papers that are basically taking three or four established algorithms/machine learning models, and applying them to off-the-shelf hardware.

This is a direct result of the aggressive "publish or perish" system. I worked as an aide in an autonomous vehicles lab for a year and a half during my undergrad, and while the actual work we were doing was really cool cutting edge stuff, it was absolutely maddening the amount of time we wasted blatantly pulling bullshit nothing papers exactly like you describe out of our asses to satisfy the constant chewing out we got that "your lab has only published X papers this month".

awesomeMilou2y ago

Thank you. Not even saying this to shit on academia, but modern scientific publishing follows the same governing rules as publishing a YouTube video (in principle).

> There should probably be some sort of separate process for things that actually claim to make important discoveries.

This used to be Springer Nature and the likes, but they've had so many retractions in the past years + they broke their integrity in the Schoen scandal, allowing lenience in the review process to secure a prestigious publication in their journal.

In reality, I mean you're probably my academic senior: How does true advancement get publicized these days? You post a YouTube video somewhere. See LK99. No peer review, no fancy stuff, a YouTube video was enough to get Argonne National lab on the case.

PaulHoule2y ago

99% is bombastic. What I would say is that the median scientific paper is wrong and back that up with a very long list of things that could make a paper "wrong" or "not even wrong". In the case of physics, everything about string theory may be one day considered "wrong". In the case of medicine all the studies where N < 1/10 the number it would take to draw a reliable conclusion are wrong.

justinpombrio2y ago

> If someone cares enough about the work to build on it, they will replicate it anyway.

Well, the trouble is that hasn't been the case in practice. A lot of the replication crisis was attempting for the first time to replicate a foundational paper that dozens of other papers took as true and built on top of, and then seeing said foundational paper fail to replicate. The incentives point toward doing new research instead of replication, and that needs to change.

p1esk2y ago

It is the case in my field (ML): if I care enough about a published result I try to replicate it.

johnnyworker2y ago

> If someone cares enough about the work to build on it, they will replicate it anyway.

Does it really deserve to be called work if it doesn't include a full, working set of instructions that, if followed to a T, allow it to be replicated? To me that's more like pollution, making it someone else's problem. I certainly don't see how "we did this, just trust us" can even be considered science, and that's not because I don't understand the scientific method, that's because I don't make a living with it, and have no incentive to not rock the boat.

MrJohz2y ago

I work with code, which is about as reproducible as it is possible to get - the artifacts I produce are literally just instructions on how to reproduce the work I've done again, and again, and again. And still people come to me with some bug that they've experienced on their machine, that I cannot reproduce on my machine, despite the two environments being as identical as I can possibly make them.

I agree that reproduction in scientific work is important, but it is also apparently impossible in the best possible circumstances. When dealing with physical materials, inexact measurements, margins of error, etc, I think we have to accept that there is no set of instructions that, if followed to a T, will ever ensure perfect replication.

davidktr2y ago

You just described the majority of scientific papers. A "working set of instructions" is not really feasible in most cases. You can't include every piece of hard- and software required to replicate your own setup.

jofer2y ago

Also, don't forget that a lot of replication would fundamentally involve going and collecting additional samples / observations / etc in the field area, which is often expensive, time consuming, and logistically difficult.

It's not just "can we replicate the analysis on sample X", but also "can we collect a sample similar to X and do we observe similar things in the vicinity" in many cases. That alone may require multiple seasons of rather expensive fieldwork.

Then you have tens to hundreds of thousands of dollars in instrument time to pay to run the various analyses which are needed in parallel with the field observations.

It's rarely the simple data analysis that's flawed and far more frequently subtle issues with everything else.

In most cases, rather than try to replicate, it's best to test something slightly different to build confidence in a given hypothesis about what's going on overall. That merits a separate paper and also serves a similar purpose.

E.g. don't test "can we observe the same thing at the same place?", and instead test "can we observe something similar/analogous at a different place / under different conditions?". That's the basis of a lot of replication work in geosciences. It's not considered replication, as it's a completely independent body of work, but it serves a similar purpose (and unlike replication studies, it's actually publishable).

throwaway4aday2y ago

What's the value in publishing something that is never replicated? If no one ever reproduces the experiment and gets the same results then you don't know if any interpretations based on that experiment are valid. It would also mean that whatever practical applications could have come from the experiment are never realized. It makes the entire pursuit seem completely useless.

wizofaus2y ago

> What's the value in publishing something that is never replicated?

Because it presents an experimental result to other scientists that they may consider worth trying to replicate?

geysersam2y ago

It still has value if we assume the experiment was done by competent, honest people who are unlikely to try to fool us on purpose and unlikely to have made errors.

It would be even better if it was replicated of course.

Depending on what certainty you need you might have to wait for the result of one or several replications, but that is application dependent.

mattkrause2y ago

Longer, even!

Some experiments that study biological development or trained animals can take a year or more of fairly intense effort to start generating data.

Maxion2y ago

A year? Some data sets take decades to build up before significant papers can be published on their data. Replication of the dataset is just not feasible.

This whole thread just shows how little the average HNer knows about the academic sciences.

tnecniv2y ago

I know people that had to take a 6+ month trip to Antarctica for part of their work and others that had to share time on a piece of experimental equipment with a whole department — they got a few weeks per year to run their experiment and had to milk that for all it’s worth. Even if they had funding, that machine required large amounts of space and staff to keep it running and they aren’t off the shelf products — only a few exist at large research centers.

coldtea2y ago

>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

Then perhaps those papers shouldn't be published? Or held in any higher esteem than a blog post by the same authors?

gus_massa2y ago

An arXiv preprint is like a blog post.

A paper in a peer review journal is like posting a request for reproduction in a heavily moderated mailing list.

A paper in a predatory journal is like the "You are the best ___" prize that you get if you pay to go to the "congress" invitation in spam.

None of them guarantees that the result is true. Publication in some peer-reviewed journals gives a minimal guarantee that the paper is not horribly bad, but I've seen too much crap there too.

I know a few journals and authors in my area that are serious and I can guess the result will hold, but I find it very difficult to evaluate journals and authors in other areas.

kshahkshah2y ago

When I looked into this, more than 15 years ago, I thought the difficult portion wasn't sharing the recipe, but the ingredients, if you will - granted I was in a molecular biology lab. Effectively the Material Transfer Agreements between Universities all trying to protect their IP made working with each other unbelievably inefficient.

You'd have no idea if you were going down a well trodden path which would yield no success because you have no idea it was well trod. No one publishes negative results, etc.

majormajor2y ago

I think the current system is just measuring entirely the wrong thing. Yes, fewer papers would be published. But today's goal is "publish papers" not "learn and disseminate truly useful and novel things", and while this doesn't solve it entirely, it pushes incentives further away from "publish whatever pure crap you can get away with." You get what you measure -> sometimes you need to change what/how you measure.

> If someone cares enough about the work to build on it, they will replicate it anyway.

That's duplicative at the "oh maybe this will be useful to me" stage, with N different people trying to replicate. And with replication not a first-class part of the system, the effort of replication (e_R) is high. For appealing things, N is probably > 2. So the total effort is N × e_R.

If you move the burden to the "replicate to publish" stage, you can fix the number of replications needed at N = 2 (or whatever) and you incentivize the original researchers to make e_R lower (which will improve the quality of their research even before the submit-for-publication stage).
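
A rough way to see this comparison, as a sketch only: the values chosen for N and e_R below are made-up placeholders, and `total_effort` is a hypothetical helper rather than anything measured.

```python
def total_effort(n_replications: int, effort_per_replication: float) -> float:
    """Total effort spent on replication: N x e_R."""
    return n_replications * effort_per_replication


# Status quo (assumed numbers): 5 groups each replicate on their own,
# and each attempt is expensive because replication is not first-class.
ad_hoc = total_effort(n_replications=5, effort_per_replication=6.0)

# Replicate-to-publish (assumed numbers): N is fixed at 2, and the authors
# have an incentive to lower e_R.
mandated = total_effort(n_replications=2, effort_per_replication=4.0)

print(f"ad hoc: {ad_hoc} person-months, mandated: {mandated} person-months")
```

Whether the mandated case really comes out cheaper depends entirely on the numbers plugged in; the sketch only shows where the two knobs (N and e_R) enter.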

I've been in the system, I spent a year or two chasing the tail of rewrites, submissions, etc, for something that was detectable as low-effect-size in the first place but I was told would still be publishable. I found out as part of that that it would only sometimes yield a good p-value! And everything in the system incentivized me to hide that for as long as possible, instead of incentivizing me to look for something else or make it easy for others to replicate and judge for themselves.
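
To illustrate why a low effect size "only sometimes" yields a good p-value, here is a minimal sketch of a textbook power calculation. It assumes a two-sided, two-sample z-test with unit variance; the effect sizes and group size are illustrative assumptions, not figures from the study described above, and `two_sample_power` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test (known unit variance)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


small = two_sample_power(effect_size=0.2, n_per_group=30)
large = two_sample_power(effect_size=0.8, n_per_group=30)
print(f"power at d=0.2: {small:.2f}, at d=0.8: {large:.2f}")
```

Under these assumptions the small effect reaches significance in only about one run in eight, while the large effect does so most of the time, so rerunning an underpowered experiment until it "works" is exactly the temptation described here.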

Hell, do something like "give undergrads the opportunity to earn Master's on top of their BSes, say, by replicating (or blowing holes in) other people's submissions." I would've eaten up an opportunity like that to go really, really deep in some specialized area in exchange for a masters degree in a less-structured way than "just take a bunch more courses."

oldgradstudent2y ago

> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time

If you build upon a result, you almost have to replicate it.

An acquaintance spent years building upon a result that turned out to be fraudulent/p-hacked.

dongpingOP2y ago

While it is a lot of work, I tend to think that one can then always publish preprints if they can't wait for the replication. I don't understand why a published paper should count as an achievement (against tenure or funding) at all before the work is replicated. The current model just creates perverse incentives to encourage lying, P-hacking, and cherry-picking. This would at least work for fields like machine learning.

This is, of course, a naive proposal without too much thought into it. But I was wondering what I would have missed here.

i_no_can_eat2y ago

and in this proposal, who will be tasked with replicating the work?

boxed2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I don't see how the current system works really either. Fraud is rampant, and replication crisis is the most common state of most fields.

Basically the current system is failing at finding out what is true. Which is the entire point. That's pretty damn bad.

tptacek2y ago

Fraud seems rampant because you hear about cases of fraud, but not about the tens of thousands of research labs plugging away day after day.

coding1232y ago

Maybe doing an experiment twice, even with a cost that is double, makes more sense so that we don't all throw away our coffee when coffee is bad, or throw away our gluten when gluten is bad, etc... (those are trivial examples) basically the cost to perform the science in many cases is so minuscule in scale to how it could affect society.

pvaldes2y ago

One. Doing experiments is already difficult and painful enough.

Two. This drain of resources can't be done for free. Somebody will need to pay twice for half of the research [1], and faster. Peers will need to be hired and paid, maybe from the writer's grants. Researchers can't justify giving their own funds to other teams without a profound change in regulation, and even in that case they would be harming their own projects.

[1] as the valuable experts are now stuck validating things instead of doing their own job

It would also open a door for foul play: bogging competitors' teams down in molasses by throwing them silly secondary problems that are known to be dead ends, while the other team works on the real deal and takes the advantage to win the patent.

faeriechangling2y ago

In some fields research can’t be replicated later. Much of all autism research will NEVER be replicated because the population of those considered autistic is not stable over time.

Other research proves impossible to replicate because whatever experiment was not described in enough detail to actually replicate it, which should be grounds to immediately dismiss the research before publishing, but which can’t truly be caught if you don’t actually try to reproduce.

Finally these practical concerns don’t even touch on the biggest benefit of reproduction as standard which is that almost nobody wants to reproduce research as they are not rewarded for doing so. This would give somebody, namely those who want to publish something, a strong impetus to get that reproduction done which wouldn’t otherwise exist.

DoctorOetker2y ago

> [...] non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper

Either "peer reviewed" articles describe progress of promising results, or they don't. If they don't the research is effectively ignored (at least until someone finds it promising). So let's consider specifically output that described promising results.

After "peer review" any apparently promising results prompt other groups to build on them by utilizing it as a step or building block.

It can take many failed attempts by independent groups before anyone dares publish the absence of the proclaimed observations, since they may try it over multiple times thinking they must have botched it somewhere.

On paper it sounds more expensive to require independent replication, but only because the costs of replication attempts are hidden until it's typically rather late.

Is it really more expensive if the replication attempts are in some sense mandatory?

Or is it perhaps more expensive to pretend science has found a one-shot "peer reviewed" method, resulting in uncoordinated independent reproduction attempts that may go unannounced before, or even after failed replications?

The pseudo-final word, end of line?

What about the "in some sense mandatory" replication? Perhaps roll provable dice for each article, and in-domain sortition to randomly assign replicators. So every scientist would be spending a certain fraction of their time replicating the research of others. The types of acceptable excuses to derelict these duties should be scrutinized and controlled. But some excuses should be very valid, for example conscientious objection. If you are tasked to reproduce some of Dr. Mengele's works, you can cop out on condition that you thoroughly motivate your ethical concerns and objections. This could also bring a lot of healthy criticism to a lot of practices, which is otherwise just ignored and glossed over for fear of future career opportunities.
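
One way the "provable dice" plus in-domain sortition could look, purely as a sketch: every name here (`pick_replicators`, the lab names, the seed string) is hypothetical, and using a SHA-256 hash over a publicly announced seed is my assumption about what "provable" might mean, not something the proposal specifies.

```python
import hashlib


def pick_replicators(article_id: str, public_seed: str, pool: list[str],
                     authors: set[str], k: int = 2) -> list[str]:
    """Deterministically pick k replicators from an in-domain pool, excluding authors."""
    eligible = sorted(name for name in pool if name not in authors)
    if k > len(eligible):
        raise ValueError("not enough eligible replicators in this domain")

    def ticket(name: str) -> str:
        # Each candidate's "lottery ticket" is a hash anyone can recompute.
        payload = f"{public_seed}|{article_id}|{name}".encode()
        return hashlib.sha256(payload).hexdigest()

    # The k candidates with the smallest tickets are selected.
    return sorted(eligible, key=ticket)[:k]


pool = ["lab_a", "lab_b", "lab_c", "lab_d", "lab_e"]  # hypothetical in-domain labs
chosen = pick_replicators("paper-001", "seed-published-in-advance", pool, authors={"lab_a"})
print(chosen)
```

Because the selection depends only on public inputs, anyone can rerun it and check that the assignment was not hand-picked. Handling excuses such as conscientious objection would need a separate, human process.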

brightball2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

The alternative is a bunch of stuff being published which people believe is "science" that doesn't hold up under scrutiny, which undermines the reliability of science itself. The current approach simply gives people reason to be skeptical.

ImPostingOnHN2y ago

I'm not convinced this proposed alternative is better than the status quo. It's simply not feasible, no matter how many benefits one might imagine.

the concern about skepticism is not irrelevant, but many of these skeptics also are skeptical of the earth being round, or older than a few thousand years, or not created by an omnipotent skylord, and I'm not sure it's actually a significant concern given the current number and expertise of those who are skeptical

so, we can hear their arguments for their skepticism, but that doesn't mean the arguments are valid to warrant the skepticism exhibited. And in the end, that's what matters: skepticism warranted by valid arguments, not just any Cletus McCletus's skepticism of heliocentrism, as if his opinion is equal to that of an astrophysicist (it isn't). And you know what? It isn't necessary to convince a ditch digger that the earth goes around the sun, if they feel like arguing about it.

backtoyoujim2y ago

Yes it would indeed mean slowing down and having more scientists.

It would mean disruption is no longer a useful tool for human development.

ebiester2y ago

I don't necessarily think it would mean more scientists, but it would mean more expense. You have a moderate number of low impact papers that people are doing for tenure today - papers for the purpose of cranking out papers. We are talking about redirecting efforts but increasing quality of what you have.

brnaftr3612y ago

It may not be. I would be willing to argue that there was a tipping point and we've long exceeded its boundary - progress and disruption now is just making finding an equilibrium in the future increasingly difficult.

So entering into a paradigm where we test the known space - especially presently - would 1) help reduce cruft; 2) abate undesirable forward progress; 3) train the next generation(s) of scientists to be more diligent and better custodians of the domain.

throwawaymaths2y ago

> I don't see how this could ever work,

http://www.orgsyn.org/

> All procedures and characterization data in OrgSyn are peer-reviewed and checked for reproducibility in the laboratory of a member of the Board of Editors

Never is a strong word.

omgwtfbyobbq2y ago

What about a system where peer replication is required if the number of citations exceeds some threshold?

p1esk2y ago

Who will be replicating it? Why would I want to set aside my own research to replicate some claim someone made? How would this help my career?

iamthemonster2y ago

My Master's thesis was basically taking a purely theoretical paper and "replicating" it, by which I mean taking the formulae and just writing the software to run them. It sounds trivial to an outsider but even that was I guess 300 hours of work.

In general I think undergraduate projects are a great space to attempt to replicate findings, but it heavily depends on the field. Fundamental physics experiments can be expensive and require equipment that's outside the reach of undergrads. But one thing I love about engineering as an academic field, by comparison, is that anything you research tends to be more achievable for others to replicate because as your end goal you are aiming for something that's practical in the field.

ljf2y ago

Indeed, a friend of mine was running experiments on quantum computing - the first 18 months of each run of his tests, as he moved between jobs and universities, were spent just setting up the necessary refrigeration systems (and generally building them from scratch each time).

Only then could he even start building the experiment - total time to run it all seems to run across years.

techas2y ago

Well, you could put incentives to make replication attractive. Give credit for replication. Give money to the researchers doing the replication/review. Today we pay an average of 2000€ per article, reviewers get 0€ and the editorial keeps all for putting a pdf online. I would say there is margin there to invest in improving the review process.

mandmandam2y ago

It's wild to me that although we know that it was Ghislaine Maxwell's daddy who started this incredibly corrupt system, people hardly mention this fact.

The US system, and others, even attack people who dare to try and make science more open. RIP Aaron Swartz, and long live Alexandra Elbakyan.

indymike2y ago

> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

Maybe this is what needs to change. If we only reward discovery and success, then the incentive is to only produce discovery and success.

chmod6002y ago

Please excuse my ignorance, but I'm not convinced.

What are we supposed to do in a hundred years when the scientists of today are dead and we have a bunch of results with important implications that aren't documented well enough to replicate?

jononomo2y ago

If it is not replicated it shouldn't be published, other than as a provisional draft. I don't care if it hurts your feelings.

wilde2y ago

The pressure to replicate would make folks publish things in forms that are easier to replicate. This cost would go down over time.

picadores2y ago

Yes, the amount of work done that could go into more paper churn?

miga2y ago· 17 in thread

Peer review does not serve to assure replication, but assure readability and comprehensibility of the paper.

Given that some experiments cost billions to conduct, it is impossible to implement "Peer Replication" for all papers.

What could be done is to add metadata about papers that were replicated.

NalNezumi2y ago

Isn't readability and comprehensibility the job of the editor/journal to check? (After all, they're actually paid.) Maybe not for conferences, but peer review is more for checking if the methodology, scope, claim, direction, conclusion and relevances is sound&trustable.

At least that's my understanding

kergonath2y ago

The editor is often not the right person to decide based on technical details. Most often, articles they receive are outside their field of expertise and they don’t really have a way of deciding if a section is comprehensible or not. It’s very difficult for an outsider to know what bit of jargon is redundant and what bit is actually important to make sense of the results. So this bit of readability check falls to the referees.

In theory editors (or rather copyeditors, the editors themselves have to handle too many papers to do this sort of thing) should help with things like style, grammar, and spelling. In practice, quality varies but it is often subpar.

kkylin2y ago

Highly dependent on journal / field. In mine (mathematics) most associate editors work for free, same as reviewers. The reviewers do all the things you say, and in addition try to ensure readability & novelty. Most journals do have professional copy editing, but that's separate from the content review.

I don't know how refereed conference proceedings work (we don't really use these). The only journals I know of that have professional editors (i.e., editors who are not active researchers themselves) are Nature and affiliated journals, but someone more knowledgeable should correct me here.

snowwrestler2y ago

> Isn’t readability and comprehensibility the job of the editor/journal to check

Yes, who do you think asks the reviewers to perform their reviews?

> peer review is more for checking if the methodology, scope, claim, direction, conclusion and relevances is sound&trustable.

No, the parent comment has it right. The only thing being reviewed is the paper, and the point is to make sure it communicates clearly, not that it’s “sound and trustable.”

kjkjadksj2y ago

The editor is basically deferring to people with expertise who can put the paper into context better than they could. The editor might be an expert in the field, but they can’t speak for every aspect of it like someone working day to day in that specific aspect of the field could. Sometimes the authors themselves even recommend potentially relevant reviewers for the editor to contact for peer reviewing.

hedora2y ago

In CS, the editor / journal don’t do those things. Instead, the reviewers do. (Sometimes reviewers “shepherd” papers to help fix readability after acceptance).

Also, most work goes to conferences; journals typically publish longer versions of published works.

jxramos2y ago

Yes a metadata relationship link would be outstanding. Reproduced in some paper xyz, or by some institution, named individuals, etc. some kind of structured information would be very useful.
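
To make the "structured information" idea concrete, here is a minimal sketch of what such a replication link might look like as data. The field names, outcome labels, and DOIs are hypothetical and do not follow any existing metadata standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replication:
    replicating_doi: str        # paper that reports the replication attempt
    institution: str            # who performed it
    outcome: str                # assumed labels: "replicated", "failed", "partial"


@dataclass
class PaperRecord:
    doi: str
    title: str
    replications: List[Replication] = field(default_factory=list)

    def summary(self) -> Dict[str, int]:
        """Count replication attempts by outcome."""
        counts: Dict[str, int] = {}
        for r in self.replications:
            counts[r.outcome] = counts.get(r.outcome, 0) + 1
        return counts


if __name__ == "__main__":
    paper = PaperRecord(doi="10.0000/example.1", title="Example finding")
    paper.replications.append(
        Replication("10.0000/example.2", "Hypothetical University", "replicated"))
    paper.replications.append(
        Replication("10.0000/example.3", "Another Institute", "failed"))
    print(paper.summary())
```

A reader or an indexing service could then show at a glance whether a paper has been independently reproduced, by whom, and with what result.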

kergonath2y ago

Barriers to publication should be lower for replication studies, I think that’s the main problem.

If someone wants to spend some time replicating something that’s only been described in a paper or two, that is valuable work for the community and should be encouraged. If the person is a PhD student using that as an opportunity to hone their skills, it’s even better. It’s not glamorous, it’s not something entirely new, but it is useful and important. And this work needs to go to normal journals, otherwise there’ll just be journals dedicated to replication and their impact factor will be terrible and nobody will care.

s1artibartfast2y ago

There are basically no barriers to publication. There are a number of normal journals that publish everything submitted if it appears to be honest research.

jxramos2y ago

I wonder if undergrads could be harnessed to enter into this kind of work, maybe under the supervision of doctoral students and a well meaning and interested PI.

strangattractor2y ago

Maybe add people as special authors/contributors to the original work.

There always seems to be a contingent of people that think that anything less than a 100% solution is inadequate so nothing is done. Peer review has proven itself inadequate and people hang on to it tooth and nail. Some disciplines should require replication on everything - I won't name Psychology or Social Sciences in general but the failure to replicate rate for some is unacceptable.

ebiester2y ago

Let's not make perfect be the enemy of good. We may never be able to replicate every field, but we could start many fields today. It means changing our values to make replication as a valid path to tenure and promotion and a required element of Ph.D studies.

mathisfun1232y ago

>Peer review does not serve to assure replication, but assure readability and comprehensibility of the paper.

I have had a paper rejected twice in a row over the last year. Both times the comments include something like "paper was very well-written; well-written enough that an undergrad could read it".

Peer review ensures the gates are kept.

julienreszka2y ago

>Experiments that cost billions to conduct

If you can't replicate them it's like they didn't happen anyways

kergonath2y ago

It’s a bit more subtle than that. Not all papers are equal and I’d trust an article from a large team where error and uncertainty analysis has been done properly (think the Higgs boson paper) over a handful of dodgy experiments that are barely documented properly.

But yeah, in the grand scheme of things if it hasn’t been replicated, then it hasn’t been proven, but some works are credible on their own.

thfuran2y ago

So no experiments have happened because I don't have a lab, and CERN is just an elaborate ruse?

tnecniv2y ago

Ah yes, if I can’t run the LHC at home, none of the work there happened

matthewdgreen2y ago· 9 in thread

The purpose of science publications is to share new results with other scientists, so others can build on or verify the correctness of the work. There has always been an element of “receiving credit” to this, but the communication aspect is what actually matters from the perspective of maximizing scientific progress.

In the distant past, publication was an informal process that mostly involved mailing around letters, or for a major result, self-publishing a book. Eventually publishers began to devise formal journals for this purpose, and some of those journals began to receive more submissions than it was feasible to publish or verify just by reputation. Some of the more popular journals hit upon the idea of applying basic editorial standards to reject badly-written papers and obvious spam. Since the journal editors weren’t experts in all fields of science, they asked for volunteers to help with this process. That’s what peer review is.

Eventually bureaucrats (inside and largely outside of the scientific community) demanded a technique for measuring the productivity of a scientist, so they could allocate budgets or promotions. They hit on the idea of using publications in a few prestigious journals as a metric, which turned a useful process (sharing results with other scientists) into [from an outsider perspective] a process of receiving “academic points”, where the publication of a result appears to be the end-goal and not just an intermediate point in the validation of a result.

Still other outsiders, who misunderstand the entire process, are upset that intermediate results are sometimes incorrect. This confuses them, and they’re angry that the process sometimes assigns “points” to people who they perceive as undeserving. So instead of simply accepting that sharing results widely to maximize the chance of verification is the whole point of the publication process, or coming up with a better set of promotion metrics, they want to gum up the essential sharing process to make it much less efficient and reduce the fan-out degree and rate of publication. This whole mess seems like it could be handled a lot more intelligently.

sebastos2y ago

Very well put. This is the clearest way of looking at it in my view.

I’ll pile on to say that you also have the variable of how the non-scientist public gleans information from the academics. Academia used to be a more insular cadre of people seeking knowledge for its own sake, so this was less relevant. What’s new here is that our society has fixated on the idea that matters of state and administration should be significantly guided by the results and opinions of academia. Our enthusiasm for science-guided policy is a triple whammy, because 1. Knowing that the results of your study have the potential to affect policy creates incentives that may change how the underlying science is performed 2. Knowing that results of academia have outside influence may change WHICH science is performed, and draw in less-than-impartial actors to perform it 3. The outsized potential impact invites the uninformed public to peer into the world of academia and draw half-baked conclusions from results that are still preliminary or unreplicated. Relatively narrow or specious studies can gain a lot of undue traction if their conclusions appear, to the untrained eye, to provide a good bat to hit your opponent with.

Maxion2y ago

A significant problem we face today is the way research, especially in academia, gets spotlighted in the media. They often hyper-focus on single studies, which can give a skewed representation of scientific progress.

The reality is that science isn't about isolated findings; it's a cumulative effort. One paper might suggest a conclusion, but it's the collective weight of multiple studies that provides a more rounded understanding. Media's tendency to cherry-pick results often distorts this nuanced process.

It's also worth noting the trend of prioritizing certain studies, like large RCTs or systematic reviews, while overlooking smaller ones, especially pilot studies. Pilot studies are foundational—they often act as the preliminary research needed before larger studies can even be considered or funded. By sidelining or dismissing these smaller, exploratory studies, we risk undermining the very foundation that bigger, more definitive research efforts are built on. If we consistently ignore or undervalue pilot studies, the bigger and often more impactful studies may never even see the light of day.

casualscience2y ago

Most of this is very legit, but this

> Still other outsiders, who misunderstand the entire process, are upset that intermediate results are sometimes incorrect. This confuses them, and they’re angry that the process sometimes assigns “points” to people who they perceive as undeserving. So instead of simply accepting that sharing results widely to maximize the chance of verification is the whole point of the publication process, or coming up with a better set of promotion metrics, they want to gum up the essential sharing process to make it much less efficient and reduce the fan-out degree and rate of publication.

Does not represent my experience in the academy at all. There is a ton of gamesmanship in publishing. That is ultimately the yardstick academics are measured against, whether we like it or not. No one misunderstands that IMO, the issue is that it's a poor incentive. I think creating a new class of publication, one that requires replication, could be workable in some fields (e.g. optics/photonics), but probably is totally impossible in others (e.g. experimental particle physics).

For purely intellectual fields like mathematics, theoretical physics, philosophy, you probably don't need this at all. Then there are 'in the middle fields' like machine learning which in theory would be easy to replicate, but also would be prohibitively expensive for, e.g. baseline training of LLMs.

Maxion2y ago

And on the extreme end you have the multi-decade longitudinal studies in epidemiology / biomedicine that would be more-or-less impossible to replicate.

oneshtein2y ago

IMHO, physicists, especially theoretical physicists, must be able to create a physical model of something, to confirm that their mathematical models are somewhat connected to reality. WTF is «wave of probability»? WTF is «bending of space-time»? These things are possible in dream-land physics only.

nine_k2y ago

For sharing results widely, there's arxiv. The problem is that the fanout is now overwhelming.

The public perception of a publication in a prestigious journal as the established truth does not help, too.

isaacremuant2y ago

> The public perception of a publication in a prestigious journal as the established truth does not help, too.

it's not so much the public perception but what govs/media/tech and other institutions have pushed down so that the public doesn't question whatever resulting policy they're trying to put forth.

"Trust the science" means "Thou shalt not question us, simply obey".

Anyone with eyes who has worked in institutions knows that bureaucracy, careerism and corruption are intrinsic to them.

dmbche2y ago

Your analysis seems to portray all scientists as pure hearted. May I remind you of the latest Stanford scandal where the president of Stanford was found to have manipulated data?

Today, publications do not serve the same purpose as they did before the internet. It is trivial today to write a convincing paper without research and get it published (www.theatlantic.com/ideas/archive/2018/10/new-sokal-hoax/572212/).

matthewdgreen2y ago

No subset of humanity is “pure hearted.” Fraud and malice will exist in everything people do. Fortunately these fraudulent incidents seem relatively rare, when one compares the number of reported incidents to the number of publications and scientists. But this doesn’t change anything. The benefit of scientific publication is to make it easier to detect and verify incorrect results, which is exactly what happened in this case.

I understand that it’s frustrating it didn’t happen instantly. And I also understand that it’s deeply frustrating that some undeserving person accumulated status points with non-scientists based on fraud, and that let them take a high-status position outside of their field. (I think maybe you should assign some blame to the Stanford Trustees for this, but that’s up to you.) None of this means we’d be better off making publication more difficult: it means the metrics are bad.

PS When a TFA raises something like “the replication crisis” and then entangles it with accusations of deliberate fraud (high profile but exceedingly rare) it’s like trying to have a serious conversation about automobile accidents, but spending half the conversation on a handful of rare incidents of intentional vehicular homicide. You’re not going to get useful solutions out of this conversation, because it’s (perhaps deliberately) misunderstanding the impact and causes of the problem.

eesmith2y ago· 7 in thread

> the real test of a paper should be the ability to reproduce its findings in the real world. ...

> What if all the experiments in the paper are too complicated to replicate? Then you can submit to [the Journal of Irreproducible Results].

Observational science is still a branch of science even if it's difficult or impossible to replicate.

Consider the first photographs of a live giant squid in its natural habitat, published in 2005 at https://royalsocietypublishing.org/doi/10.1098/rspb.2005.315... .

Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

Who thinks the results of a drug trial can't be published until they are replicated?

How does one replicate "A stellar occultation by (486958) 2014 MU69: results from the 2017 July 17 portable telescope campaign" at https://ui.adsabs.harvard.edu/abs/2017DPS....4950403Z/abstra... which required the precise alignment of a star, the trans-Neptunian object 486958 Arrokoth, and a region in Argentina?

Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

Here's a paper I learned about from "In The Pipeline"; "Insights from a laboratory fire" at https://www.nature.com/articles/s41557-023-01254-6 .

"""Fires are relatively common yet underreported occurrences in chemical laboratories, but their consequences can be devastating. Here we describe our first-hand experience of a savage laboratory fire, highlighting the detrimental effects that it had on the research group and the lessons learned."""

How would peer replication be relevant?

msla2y ago

With some of the things, but admittedly not most of the things you mentioned, there's a dataset (somewhere) and some code run on that dataset (somewhere) and replication would mean someone else being able to run that code on that dataset and get the same results.

Would this require labs to improve their software environments and learn some new tools? Would this require labs to give up whatever used to be secret sauce? That's. The. Point.

counters2y ago

In practice this is happening in many disciplines, for most research, on a daily basis. What _isn't_ happening is that the results of these replications are being independently peer reviewed, because that isn't incentivized. However, when replication fails for whatever reason, it usually leads to insights that themselves lead to stronger scientific work and better publications later on.

eesmith2y ago

> someone else being able to run that code on that dataset and get the same results.

I think when people talk about "replicate" they mean something more than that.

The dataset could contain coding errors, and the analysis could contain incorrect formulas and bad modeling. Reproducing a bad analysis, successfully, provides no corrective feedback.

I know for one paper I could replicate the paper's results using the paper's own analysis, but I couldn't replicate the paper's results using my analysis.

> Would this require labs to give up whatever used to be secret sauce? That's. The. Point.

That seems to be a very different Point.

Newton famously published results made from using his secret sauce - calculus - by recasting them using more traditional methods.

In the extreme case, I could publish the factors for RSA-1024 without publishing my factorization method. "I prayed to God for the answer and He gave them to me." You can verify that result without the secret sauce.
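
A minimal sketch of that asymmetry: checking a claimed factorization takes one multiplication, no matter how the factors were found. The function name is hypothetical, the small number stands in for a real RSA modulus, and the check deliberately does not test whether the factors are prime.

```python
def verify_factorization(n: int, factors: list) -> bool:
    """Return True if the claimed nontrivial factors multiply back to n."""
    product = 1
    for f in factors:
        if f <= 1 or f >= n:      # reject trivial "factors" such as 1 or n
            return False
        product *= f
    return product == n


if __name__ == "__main__":
    # 3233 = 61 * 53, a toy stand-in for a large RSA modulus
    print(verify_factorization(3233, [61, 53]))   # a correct claim
    print(verify_factorization(3233, [61, 52]))   # an incorrect claim
```

The verifier never needs to know the method, which is the point: the result is checkable even though the process is not reproducible.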

I mean, people use all sorts of methods to predict a protein structure, including manual tweaking guided by intuition and insight gained during a reverie or day-dream (à la Kekulé) which is clearly not reproducible. Yet that final model may be publishable, because it may provide new insight and testable predictions.

kergonath2y ago

> Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

Nobody, obviously. You cannot reproduce a result that hasn’t been published, so no new phenomenon is replicated the moment it is first published. The problem is not the publication of new discoveries, it’s the lack of incentives to confirm them once they’ve been published.

In your example, new observations of giant squids are still massively valuable even if not that novel anymore. So new observations should be encouraged (as I am sure they are).

> Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

Well, we should launch another probe anyway. And I am fairly confident we’ll have many instances of aircraft in Mars’ atmosphere and more data than we’ll know what to do with. We can also simulate the hell out of it. We’ll point spectrometers and a whole bunch of instruments towards Pluto. These are not really good examples of unreproducible observations.

Besides, in such cases robustness can be improved by different teams performing their own analyses separately, even if the data comes from the same experimental setup. It’s not all black or white. Observations are on a spectrum, some of them being much more reliable than others and replication is one aspect of it.

> How would peer replication be relevant?

How would you know which aspects of the observed phenomena come from particularities of this specific lab? You need more than one instance. You need some kind of statistical and factor analyses. Replication in this instance would not mean setting actual labs on fire on purpose.

It’s exactly like studying car crashes: nobody is going to kill people on purpose, but it is still important to study them so we regularly have new papers on the subject based on events that happened anyway, each one confirming or disproving previous observations.

eesmith2y ago

> Nobody, obviously. You cannot reproduce a result that hasn’t been published, .. The problem is not the publication of new discoveries, it’s the lack of incentives to confirm them once they’ve been published.

Your comment concerns post-publication peer-replication, yes?

If so, it's a different topic. The linked-to essay specifically proposes:

""Instead of sending out a manuscript to anonymous referees to read and review, preprints should be sent to other labs to actually replicate the findings. Once the key findings are replicated, the manuscript would be accepted and published.""

That's pre-publication peer-replication, and my comment was only meant to be interpreted in that light.

phpisthebest2y ago

I think in some of those cases you have conclusions drawn from raw data that could be replicated or reviewed. For example many teams use the same raw data from Large Colliders, or JWST, or other large science projects to reach competing conclusions.

Yes in a perfect world we would also replicate the data collection but we do not live in a perfect world

The same is true for drug trials: there is always a battle over getting the raw data from drug trials, as the companies claim that data is a trade secret, so independent verification of drug trials is very expensive. But if the FDA required not just the release of redacted conclusions and supporting redacted data but 100% of all data gathered, it would be a lot better IMO.

For example the FDA says it will take decades to release the raw data from the COVID vaccine trials. Why? And that is after being forced to do so via a lawsuit.

eesmith2y ago

> For example many teams use the same raw data from Large Colliders, or JWST, or other large science projects to reach competing conclusions.

Yes, but why must the first team wait until the second is finished before publishing?

What if you are the only person in the world with expertise in the fossil record of an obscure branch of snails? You spend 10 years developing a paper knowing that the next person with the right training to replicate the work might not even be born yet.

Other paleontologists might not be able to replicate the work, but still tell if it's publishable - that's what they do now, yes?

> but we do not live in a perfect world

Alternatively, we don't live in a perfect world which is why we have the current system instead of requiring replication first.

Since the same logic works for both cases, I don't think it's persuasive logic.

> the FDA says it will take decades

Well, that's a tangent. The FDA is charged with protecting and promoting public health, not improving the state of scholarly literature.

And the FDA is only one of many public health organizations which carried out COVID vaccine trials.

abnry2y ago· 7 in thread

If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.

snitty2y ago

>If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.

1 mg of anti-rabbit antibody (a common thing to use in a lot of biology experiments) is $225 [1]. Outside of things like standard buffers and growth medium for prokaryotes, this is going to be the cheapest thing you use in an experiment.

1/10th of that amount for anti-flagellin antibody is $372. [2]

A kit to prep a cell for RNA sequencing is $6-10 per use. That's JUST isolation of the RNA. Not including reverse transcribing it to cDNA for sequencing, or the sequencing itself. [3]

Let's not even reach things like materials science where you may be working on an epitaxial growth paper, and there are only a handful of labs where they could even feasibly repeat the experiment.

Or say something with a BSL-3 lab where there are literally only 15 labs in the US that could feasibly do the work, assuming they aren't working on their own stuff. [4]

[1] - https://www.thermofisher.com/antibody/product/Goat-anti-Rabb... [2] https://www.invivogen.com/anti-flagellin [3] https://www.thermofisher.com/order/catalog/product/12183018A [4] https://www.niaid.nih.gov/research/tufts-regional-biocontain...

alsodumb2y ago

Nah, it doesn't. It just shows that it's time consuming and expensive to replicate their studies.

abnry2y ago

If that's the case, then don't claim confidence in the work or make policy decisions based off of it. If there is no epistemological humility, then yes, it is still BS.

fodkodrasz2y ago

I guess if software developers will complain that it's too hard or too expensive to thoroughly test their code to ensure exactly zero bugs at release[1], then that just shows their work is BS.

[1]: if you have delivered telco code to Softbank you may have heard this sentence

abnry2y ago

Replication is not the same thing as zero bugs in software.

azan_2y ago

What if it REALLY is too expensive? You do realize that there are studies which literally cost millions of dollars? Getting funding for original studies is hard enough, good luck securing additional funds for replication.

Maxion2y ago

Something in Switzerland called the Large Hadron Collider comes to mind.

I guess we should not talk about the Higgs before someone else builds a second one and replicates the papers.

titzer2y ago· 5 in thread

In the PL field, conferences have started to allow authors to submit packaged artifacts (typically, source code, input data, training data, etc) that are evaluated separately, typically post-review. The artifacts are evaluated by a separate committee, usually graduate students. As usual, everything is volunteer. Even with explicit instructions, it is hard enough to even get the same code to run in a different environment and give the same results. Would "replication" of a software technique require another team to reimplement something from scratch? That seems unworkable.

I can't even imagine how hard it would be to write instructions for another lab to successfully replicate an experiment at the forefront of physics or chemistry, or biology. Not just the specialized equipment, but we're talking about the frontiers of Science with people doing cutting-edge research.

I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.

mike_hearn2y ago

Is PL theory actually science? Although we call it computer science, I don't personally think CS is actually a science in the sense of studying nature to understand it. Computers are artificial constructs. CS is a lot closer to engineering than science. Indeed it's kind of nonsensical to talk about replicating an experiment in programming language theory.

For the "hard" sciences, replication often isn't so difficult it seems. LK-99 being an interesting study in this, where people are apparently successfully replicating an experiment described in a rushed paper that is widely agreed to lack sufficient details. It's cutting edge science but replication still isn't a problem. Most science isn't the LHC.

The real problems with replication are found in the softer fields. There it's not just an issue of randomness or difficulty of doing the experiments. If that's all there was to it, no problem. In these fields it's common to find papers or entire fields where none of the work is replicable even in principle. As in, the people doing it don't think other people being able to replicate their work is even important at all, and they may go out of their way to stop people being able to replicate their work (most frequently by gathering data in non-replicable ways and then withholding it deliberately, but sometimes it's just due to the design of the study). The most obvious inference when you see this is that maybe they don't want replication attempts because they know their claims probably aren't true.

So even if peer reviewers or journals were just checking really basic things like, is this claim even replicable in principle, that would be a good start. You would still be left with a lot of papers that replicate fine but their conclusions are still wrong because their methodology is illogical, or papers that replicate because their findings are obvious. But there's so much low hanging fruit.

lou13062y ago

Well, there are entire areas of CS research tangential to PLs (say, SAT/SMT, software verification, program synthesis) where the fundamental problems are known to be NP-complete/exponential/undecidable. So it is pretty hard to declare a new approach "superior" on purely theoretical grounds: you usually have to run some benchmarks and see how your new approach compares to existing alternatives. And you want these benchmarks to be replicable across different machines and platforms.
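
As a rough illustration of what a replicable benchmark might involve, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `make_instances` and `run_benchmark` are hypothetical, and Python's built-in `sorted` stands in for a real solver. The two ideas it shows are generating inputs from a fixed seed, so that every machine gets the same instances, and reporting a median over repeated runs rather than a single timing.

```python
import random
import statistics
import time


def make_instances(count, size, seed=0):
    """Generate the same pseudo-random instances for a given seed.

    A dedicated random.Random object is used so that no other code
    can disturb the sequence by touching the global generator.
    """
    rng = random.Random(seed)
    return [[rng.randint(0, 10**6) for _ in range(size)] for _ in range(count)]


def run_benchmark(solver, instances, repeats=5):
    """Return the median wall-clock time (in seconds) of `solver` per instance."""
    medians = []
    for instance in instances:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            solver(instance)
            timings.append(time.perf_counter() - start)
        medians.append(statistics.median(timings))
    return medians


if __name__ == "__main__":
    # `sorted` is only a stand-in for the approach being evaluated.
    instances = make_instances(count=3, size=10_000, seed=42)
    for index, seconds in enumerate(run_benchmark(sorted, instances)):
        print(f"instance {index}: median {seconds:.6f} s")
```

The inputs are identical everywhere, but the timings will still differ from machine to machine, so a sketch like this only makes the comparison repeatable, not the absolute numbers.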

Maxion2y ago

> I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.

Not to mention that the cutting edge in many sciences are perhaps two-three research groups of 5-30 individuals each in varying research institutions around the world.

ramesh312y ago

> Even with explicit instructions, it is hard enough to even get the same code to run in a different environment and give the same results.

Is it really that hard for researchers to standardize around providing Dockerfiles? Environment replication is a solved problem.

titzer2y ago

> Environment replication is a solved problem.

Unfortunately, no. Dockerfiles aren't as portable as you think, and not architecture-independent. VMs are better, but even then, performance isn't portable either.

The last artifact I produced included builds of 3 web browsers from source--it was over 10GB. One doesn't just "build Chrome in a dockerfile".
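
For what it's worth, here is a minimal Python sketch of the least an artifact could do: record the environment in which it was evaluated. This is an assumption-laden illustration, not a fix for the portability problem described above. The function name `environment_manifest` is hypothetical, and the sketch only documents the interpreter, OS, CPU architecture and installed Python packages. It says nothing about performance.

```python
import json
import platform
import sys
from importlib import metadata


def environment_manifest():
    """Collect basic facts about the interpreter, OS, architecture and packages."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version,
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),  # CPU architecture, e.g. x86_64 or arm64
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    print(json.dumps(environment_manifest(), indent=2))
```

Shipping that JSON next to the results at least tells an evaluator which architecture and package versions produced them.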

jonnycomputer2y ago· 4 in thread

If they are, in fact, implying that another lab should produce a matching data-set to try to replicate results, well, I'm sorry, but that won't work, at least in a whole lot of fields. Data collection can be very expensive, and take a lot of time. It certainly is in my field.

If, on the other hand, they just want the raw data, and let others go to town on it in their own way, that's fine, probably. Results that don't depend on very particular details of the processing pipeline are probably more robust anyway.

peteradio2y ago

In what field is it too expensive or difficult to reproduce the data?

jonnycomputer2y ago

Lots?

Human subjects research, for one. Very often it involves clinical populations that are very hard to recruit. You can spend tens of thousands in advertising, and multiples more in labor, to get a hundred participants in, over the course of an entire year of effort, and that's not even counting the money spent on a clinician doing a diagnosis. And then, when you do, you may, say, pay $1000 for the MRI per subject, plus the $100 you pay directly to the participants themselves.
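
To put a number on just the per-subject part, here is a back-of-the-envelope Python sketch using only the figures above. The function name `scanning_cost` is made up for illustration, and advertising, labor and clinician time are deliberately left out because no exact amounts were given for them.

```python
def scanning_cost(participants, mri_cost=1000, participant_payment=100):
    """Per-subject costs only: MRI scan plus the payment to each participant (USD)."""
    return participants * (mri_cost + participant_payment)


if __name__ == "__main__":
    # 100 participants at $1000 per scan plus $100 paid to each participant.
    print(scanning_cost(100))  # 110000
```

So the scans and participant payments alone come to $110,000 for a hundred participants, before any recruiting costs.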

5424582y ago

As reviewers are paid nothing and get no substantial credit for their work, I’m going to say “Every field”. Why would you do a significant chunk of a paper’s work for no reward? Replication studies are typically a bad deal even when you get to pick a notable study to reproduce and you get a paper to your name out of it - replicating what will probably be an obscure paper for no credit is not something most academics (let alone commercial research labs) would entertain.

TrackerFF2y ago

Off the top of my head: Say you want to research biological data from the Mariana Trench.

Or even worse: Some passing comet, or planet/moon/whatever in the solar system. And just to make things EVEN worse, you need to analyze the data in some destructive way.

Certainly very plausible scenarios, but also ones which could be prohibitively expensive to do multiple times.

infogulch2y ago· 3 in thread

I like the idea of splitting "peer review" into two, and then having a citation threshold standard where a field agrees that a paper should be replicated after a certain number of citations. And journals should have a dedicated section for attempted replications.

1. Rebrand peer review as a "readability review" which is what reviewers tend to focus on today.

2. A "replicability statement", a separately published document where reviewers push authors to go into detail about the methodology and strategy used to perform the experiments, including specifics that someone outside of their specialty may not know. Credit NalNezumi ITT

analog312y ago

Every experimental paper I've ever read has contained an "Experimental" section, where they provide the details on how they did it. Those sections tend to be general enough, albeit concise.

In some fields, aside from specialized knowledge, good experimental work requires what we call "hands." For instance, handling air sensitive compounds, or anything in a condensed or crystalline state. In my thesis experiment, some of the equipment was hand made, by me.

Sometimes specialized facilities are needed. My doctoral thesis project used roughly 1/2 million dollars of gear, and some of the equipment that I used was obsolete and unavailable by the time I finished.

janalsncm2y ago

“Concise” isn’t good enough. If other scientists are trying to read through the tea leaves at what you’re trying to say you did, that defeats the entire point of a paper. The purpose of science is to create knowledge that other people can use and if people can’t replicate your work that’s not science.

ahmadmijot2y ago

> My doctoral thesis project used roughly 1/2 million dollars of gear,

Wow, I envy you. My doctoral thesis project spent like... USD 2.5k directly on gear (half of it just to buy Lego bricks to build our own instrument, exactly because we couldn't afford to buy a commercial one lol)

waynecochran2y ago· 3 in thread

I spent a lot of my graduate years in CS implementing the details of papers only to learn that, time and time again, the paper failed to mention all the shortcomings and failure cases of the techniques. There are great exceptions to this.

Due to the pressure of "publish or die" there is very little honesty in research. Fortunately there are some who are transparent with their work. But for the most part, science is drowning in a sea of research that lacks transparency and falls short on replication.

cptskippy2y ago

You'll quickly discover when you enter the workforce that the reasons we have CI/CD, Docker, and virtualization are because of a similar problem. The dreaded "it works on my machine" response.

CI/CD forces people to codify exactly how to build and deploy something in order for it to get into a production environment. Docker and VMs are ways around this by giving people a "my machine" that can be copied and shared easily.

janalsncm2y ago

I had a very similar experience in my masters. Really made me think, what exactly are the peers “reviewing” if they don’t even know whether the technique works in the first place.

waynecochran2y ago

I have reviewed many papers and there is never the time to recreate the work and test. That is why I love the "papers w code" site. I think every published CS paper should require a git repo with all their code and experimental data.

NalNezumi2y ago· 2 in thread

Imo, a more realistic thing to do is a "replicability review" and/or a requirement to submit a "methodology map" with each paper.

The former would be a back and forth with a reviewer who asks questions (based on the paper) with the goal of reproducing the result, but who doesn't have to actually reproduce it. This is usually good for finding missing details in the paper that the writer just took for granted everyone in the field knows (I've met Bio PhDs who have wasted months of their lives tracking down experimental details not mentioned in a paper).

The latter would be the result of the former. Instead of having a pages-long "appendix" section in the main paper, you produce another document with meticulous details of the experiment/methodology, with every stone turned, together with a peer reviewer. Stamp it with the peer reviewer's name so they can't get away with a hand-wavy review.

I've read too many papers where information important for reproducing the result is omitted. (For ML/RL) if the code is included, I've countless times found implementation details that are not mentioned in the paper. As a matter of fact, there are even results suggesting that those details are the make or break of certain algorithms. [1] I've also seen breaking details only mentioned in code comments...

Another atrocious thing I've witnessed is a paper claiming they evaluated their method on a benchmark, and if you check the benchmark, the task they evaluated on doesn't exist! They forked the benchmark and made their own task without being clear about it! [2]

Shit like this makes me lose faith in certain science directions. And I've seen a couple of junior researchers give it all up because they concluded it's all just a house of cards.

[1] https://arxiv.org/abs/2005.12729

[2] https://arxiv.org/abs/2202.02465

Edit: also if you think that's too tedious/costly, reminder that publishers rake in record profits so the resources are already there https://youtu.be/ukAkG6c_N4M

kergonath2y ago

> I've met Bio PhDs who have wasted months of their lives tracking down experimental details not mentioned in a paper

Same. Now, when I review manuscripts, I pay much more attention to whether there is enough information to replicate the experiment or simulation. We can put out a paper with wrong interpretations and that’s fine because other people will realise that when doing their own work. We cannot let papers get published if their results cannot be replicated.

> The latter would be the result of the former. Instead of having a pages-long "appendix" section in the main paper, you produce another document with meticulous details of the experiment/methodology, with every stone turned, together with a peer reviewer. Stamp it with the peer reviewer's name so they can't get away with a hand-wavy review.

Things that take too much space to go in the experimental section should go to an electronic supplementary information document. But then it would be nice if the ESI were appended to the article when we download a PDF because tracking them is a pain in the backside. Some fields are better than others about this, for example in materials characterisation studies it’s very common to have ESI with a whole bunch of data and details.

Large datasets should go to a repository or a dataset journal, that way the method is still peer reviewed and the dataset has a DOI and is much easier to re-use. It’s also a nice way of doubling a student’s paper count by the end of their PhD.

> Another atrocious thing I've witnessed is a paper claiming they evaluated their method on a benchmark, and if you check the benchmark, the task they evaluated on doesn't exist! They forked the benchmark and made their own task without being clear about it! [2]

That’s just evil!

Maxion2y ago

> Large datasets should go to a repository or a dataset journal, that way the method is still peer reviewed and the dataset has a DOI and is much easier to re-use.

This may be possible in some sciences, but not in epidemiology or biomed. Often the study is based on tissue samples owned by some entity, with permission granted only to some certain entity.

Datasets in epidemiology are often full of PII, and cannot be shared publicly for many reasons.

SubiculumCode2y ago· 2 in thread

Scientist publishes paper based on ABCD data.

Replicator: Do you know how much data I'll need to collect? 11,000 participants followed across multiple timepoints of MRI scanning. Show me the money.

petesergeant2y ago

Definitely something that needs large charitable investment, but charities like that do exist, eg Wellcome Trust

SubiculumCode2y ago

Like 290+ million, just to get started.

freeopinion2y ago· 2 in thread

My mind automatically swapped out the word "peer" for "code". It took my brain to interesting places. When I came back to the actual topic, I had accidentally built a great way to contrast some of the discussion offered in this thread.

dongpingOP2y ago

In the sense of replicating the results, we do have CI servers and even fuzzers running for our "code replication".

freeopinion2y ago

I don't want to derail the science discussion too much, but what if you actually had to reproduce the code by hand? Would that process produce anything of value? Would your habit of writing i+=1 instead of i++ matter? Or iteration instead of recursion?

Would code replication result in fewer use-after-free or off-by-one bugs than code review? Or would it mostly be a waste of resources, including time?

moelf2y ago· 2 in thread

I wish we could replicate the LHC

Maxion2y ago

No talking about the Higgs before that happens, apparently.

kergonath2y ago

We will, don’t worry.

janalsncm2y ago· 1 in thread

For a while Reddit had the mantra “pics or it didn’t happen”.

At least in CS/ML there needs to be a “code or it didn’t happen”. Why? Papers are ambiguous. Even if they have mathematical formulas, not all components are defined.

Peer replication in these fields is an easy low hanging fruit that could set an example for other fields of science.

bradley132y ago

CS and ML are my field, although I'm no longer active in research. I always made a code archive available. Want to replicate? Download and run.

This should be standard now, in the age of GitHub, GitLab, et al. If a paper discusses an implementation, but doesn't provide code, it is probably BS.

leedrake52y ago· 1 in thread

Peer Review is the right solution to the wrong problem: https://open.substack.com/pub/experimentalhistory/p/science-...

On replication, it is a worthwhile goal but the career incentives need to be there. I think replicating studies should be a part of the curriculum in most programs - a step toward getting a PhD in lieu of one of the papers.

vinnyvichy2y ago

Fear of the frontier.. that's why instead of people getting excited to look for new rtsp superconductor candidates, we get a lot of talk downplaying the only known one. Strong link vs weak link reminds me of how some cultures frown on stimulants while other cultures frown on relaxants.

hedora2y ago· 1 in thread

The website dies if I try to figure out who the author (“sam”) is, but it sounds like they are used to some awful backwater of academia.

They have this idea that a single editor screens papers to decide if they are uninteresting or fundamentally flawed, then they want a bunch of professors to do grunt work litigating the correctness of the experiments.

In modern (post industrial revolution) branches of science, the work of determining what is worthy of publication is distributed amongst a program committee, which is comprised of reviewers. The editor / conference organizers pick the program committee. There are typically dozens of program committee members, and authors and reviewers both disclose conflicts. Also, papers are anonymized, so the people that see the author list are not involved in accept/reject decisions.

This mostly eliminates the problem where work is suppressed for political reasons, etc.

It is increasingly common for paper PDFs to be annotated with badges showing the level of reproducibility of the work, and papers can win awards for being highly reproducible. The people that check reproducibility simply execute directions from a separate reproducibility submission that is produced after the paper is accepted.

I argue the above approach is about 100 years ahead of what the blog post is suggesting.

Ideally, we would tie federal funding to double blind review and venues with program committees, and papers selected by editors would not count toward tenure at universities that receive public funding.

jltsiren2y ago

The computer science practice you describe is the exception, not the norm. It causes a lot of trouble when evaluating the merits of researchers, because most people in academia are not familiar with it. In many places, conference papers don't even count as real publications, putting CS researchers at a disadvantage.

From my point of view, the biggest issue is accepting/rejecting papers based on first impressions. Because there is often only one round of reviews, you can't ask the authors for clarifications, and they can't try to fix the issues you have identified. Conferences tend to follow fashionable topics, and they are often narrower in scope than what they claim to be, because it's easier to evaluate papers on topics the program committee is familiar with.

The work done by the program committee was not even supposed to be proper peer review but only the first filter. Old conference papers often call themselves extended abstracts, and they don't contain all the details you would expect in the full paper. For example, a theoretical paper may omit key proofs. Once the program committee has determined that the results look interesting and plausible and the authors have presented them in a conference, the authors are supposed to write the full paper and submit it to a journal for peer review. Of course, this doesn't always happen, for a number of reasons.

fastneutron2y ago· 1 in thread

As much as I agree with the sentiment, we have to admit it isn't always practical. There's only one LIGO, LHC or JWST, for example. Similarly, not every lab has the resources or know-how to host multi-TB datasets for the general public to pick through, even if they wanted to. I sure didn't when I was a grad student.

That said, it infuriates me to no end when I read a Phys. Rev. paper that consists of a computational study of a particular physical system, and the only replicability information provided is the governing equation and a vague description of the numerical technique. No discretized example, no algorithm, and sure as hell no code repository. I'm sure other fields have this too. The only motivation I see for this behavior is the desire for a monopoly on the research topic on the part of authors, or embarrassment by poor code quality (real or perceived).

oneshtein2y ago

Gravitational waves are confirmed by watching distant quasars:

https://scitechdaily.com/gravitational-waves-detected-using-...

Hydrodynamic quantum analogs can be used to study quantum particles at macro-scale:

https://en.wikipedia.org/wiki/Hydrodynamic_quantum_analogs

ESA's Euclid near-infrared telescope launched a few weeks ago:

https://www.esa.int/Science_Exploration/Space_Science/Euclid...

JR14272y ago· 1 in thread

One thing I think people are missing, is that labs replicate other experiments all the time as part of doing their own research. It's just that the results are not always published, or not published in a like-for-like way.

But the information gets around. In my former field, everyone knew which were the dodgy papers, with results no-one could replicate.

m-watson2y ago

That is something that I have struggled to convey. Working in a scientific field looks a lot different than just reading things from that scientific community. Sub-fields end up really small and you know what is going on, what the problems are, and who the different kinds of players in your field are.

nomilk2y ago· 1 in thread

https://web.archive.org/web/20230130143126/https://blog.ever...

the_arun2y ago

Thank you. Currently the original article is throttled.

Seems like the article is not about software code.

jimmar2y ago· 1 in thread

How do you replicate a literature review? Theoretical physics? A neuro case? Research that relies upon natural experiments? There are many types of research. Not all of them lend themselves to replication, but they can still contribute to our body of knowledge. Peer review is helpful in each of these instances.

Science is a process. Peer review isn't perfect. Replication is important. But it doesn't seem like the author understands what it would take to simply replace peer review with replication.

janalsncm2y ago

I don’t think the existence of papers that are difficult to replicate undermines the value of replicating those that are easier.

okaleniuk2y ago· 1 in thread

Back when I was active in academia, our publishers were reluctant to print source code or even repository links (that was largely before GitHub) but they could still share a paper source on demand. If you reference someone else's paper and want to quote some formula, it is easier and less error prone to copy rather than retype.

At that point I thought about making a TeX interpreter so one could easily "run a paper" on their own data to see if the paper's claims hold. As it turned out, people often write the same formula in multiple ways, and to make a TeX interpreter you'd have to specify a "runnable" subset and convince everyone to use that subset instead of what they got used to. So the idea stalled.

In a few years, publishing a GitHub link along with the paper became the norm, and the problem disappeared. At least in applied geometry, people do replicate each other's results all the time.

kjkjadksj2y ago

“Running a paper” is honestly challenging these days because of the resource requirements of a lot of scientific code, or the size of certain datasets. One group might have access to a beefy cluster and there’s no pressure for very performant code when they can parallelize the work across a few dozen Xeons or have access to TBs of memory. Another group might be running their code on a laptop. Maybe if your data is much larger than the authors’ data, their code doesn’t even work since it was designed for much smaller datasets.

Tools like nextflow or snakemake help with respect to having a one liner to generate all data in a paper potentially, handle dependencies, list resource expectations, use your own profile to handle your environment specific job scheduling commands and parameters. However, this still doesn’t do anything for whether you have access to the resources needed.

hooby2y ago· 1 in thread

Currently BOTH are being used - peer review is the first pass, reproduction the second.

Peer review might (or might not) weed out a few papers before they ever get to being reproduced - and that a paper "passed" peer review often means very little. (In some journals more, in some less).

You can't replace peer review with peer replication. Reviewers often do volunteer work - supporting their field and the journal by checking submissions just for any grave errors/mistakes. They often spend just 10 to 15 minutes per submission - for hundreds of submissions. It's not realistic to ask those reviewers to do a full replication attempt for hundreds of submissions.

So any attempt to "replace" review with replication, would end up basically removing review altogether, without increasing the amount of replication attempts being made.

hooby2y ago

I did some work for online journals where papers got published even IF the peer review was bad and rejections were exceptionally rare.

The review score of the abstract was only used to decide on the best topics to invite for a presentation or talk - and the review score of the paper was used to hand out awards, decide "highlighted" papers, and it also influenced how high up in the search results a given paper might appear.

elashri2y ago· 1 in thread

Great, but who is going to fund the peer replication? The economics of research now doesn't even provide compensation for the time spent on peer review.

nine_k2y ago

Maybe the numerous complaints about the crisis of science are somehow related to the fact that scientific work is severely underpaid.

The pay difference between research and industry in many areas is not even funny.

fodkodrasz2y ago· 1 in thread

How would you peer-replicate observation of a rare, or unique event, for example in astronomy?

lordnacho2y ago

Either get your own telescope and gather your own data, or if only one telescope captured a fleeting event, take that data and see if the analysis turns out the same.

tines2y ago· 1 in thread

"Replace peer code review with 'peer code testing.'"

Probably not gonna catch on.

dongpingOP2y ago

"peer code testing" is already the job of the CI server. As it is nothing new, it probably is not going to catch on.

j452y ago· 1 in thread

Can everything be replicated in every field?

User232y ago

That’s the defining characteristic of engineering. If you can’t reliably replicate everything in an engineering discipline then it’s not an engineering discipline.

jxramos2y ago

You know what I would love to see is metadata attributes surrounding a paper such as [retracted], [reproduced], [rejected], etc. We already have the preprint thing down. Some of these would be implied by being published, i.e. not a preprint. Maybe even a quick symbol for what method of proof was relied upon: video evidence, randomized controlled trial, observational study, sample count of n>1000 (predefined inequality brackets), etc. I think having this quick digest of information would help an individual wade through a lot of studies quickly.
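
A quick Python sketch of what such tags could look like as a data structure. All of it is hypothetical: the names `Status`, `Evidence`, `PaperTags` and `sample_bracket` are invented for illustration, and the bracket thresholds are assumptions, not an existing standard.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Status(Enum):
    PREPRINT = "preprint"
    PUBLISHED = "published"
    REPRODUCED = "reproduced"
    RETRACTED = "retracted"
    REJECTED = "rejected"


class Evidence(Enum):
    VIDEO = "video evidence"
    RCT = "randomized controlled trial"
    OBSERVATIONAL = "observational study"


# Hypothetical predefined sample-size brackets, largest first.
SAMPLE_BRACKETS = (1000, 100, 10)


def sample_bracket(n):
    """Map a sample count to the largest predefined bracket it exceeds."""
    for threshold in SAMPLE_BRACKETS:
        if n > threshold:
            return f"n>{threshold}"
    return f"n<={SAMPLE_BRACKETS[-1]}"


@dataclass
class PaperTags:
    title: str
    statuses: Set[Status] = field(default_factory=set)
    evidence: Optional[Evidence] = None
    sample_size: Optional[int] = None

    def digest(self):
        """One-line summary: status tags first, then evidence type and sample bracket."""
        parts = [f"[{s.value}]" for s in sorted(self.statuses, key=lambda s: s.value)]
        if self.evidence is not None:
            parts.append(self.evidence.value)
        if self.sample_size is not None:
            parts.append(sample_bracket(self.sample_size))
        return " ".join(parts)


if __name__ == "__main__":
    tags = PaperTags(
        "Example study",
        {Status.PUBLISHED, Status.REPRODUCED},
        Evidence.RCT,
        1500,
    )
    print(tags.digest())  # [published] [reproduced] randomized controlled trial n>1000
```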

whatever12y ago

We can have tiers. Tier 1 peer reviewed. Tier 2 peer replicated. We can have it as a stamp on the papers.

All PhD programs have requirement for a minimum number of novel publications. We could add to the requirements a minimum number of replications.

But truth be told, a PhD in science/engineering will probably spend their first two years trying to replicate the SOTA anyway. It’s just that today you cannot publish this effort; nobody cares, except yourself and your advisor.

hgsgm2y ago

The problem is equating publication with truth.

Publication is a starting point, not a conclusion.

Publication is submitting your code. It still needs to be tested, rolled out, evaluated, and time-tested.

amai2y ago

It would already be a step in the right direction if papers would also publish a VM with all their code, data and dependencies. It is nice to have the code (https://blog.arxiv.org/2020/10/08/new-arxivlabs-feature-prov...), but without the necessary dependencies, the correct OS, compiler version, etc., replication is often impossible even with the code.
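
Short of a full VM, one small thing that helps is a checksum manifest, so a replicator can at least confirm they have exactly the same code and data files. A minimal Python sketch follows. The helper names `sha256_of`, `build_manifest` and `changed_files` are hypothetical, and the idea is an illustration rather than something the arXiv features above provide.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file under `root` (by relative path) to its SHA-256 checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """List files that were added, removed or modified relative to `manifest`."""
    current = build_manifest(root)
    names = set(current) | set(manifest)
    return sorted(name for name in names if current.get(name) != manifest.get(name))
```

The authors would publish the output of `build_manifest` with the paper, and a replicator would call `changed_files` on their own copy. An empty list means the files match. It says nothing about the OS or compiler, which is the harder part.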

Having running demos is another step in the right direction (see https://blog.arxiv.org/2022/11/17/discover-state-of-the-art-...).

But outside of computer science replication is even more difficult. Maybe if people would use standardized laboratories and robots, one could replicate findings by rerunning the robot's code on another standard robot lab (basically the idea here is to virtualize laboratory work).

But even then for the biggest most complex experiments this will not work: Replicate CERN anyone?

10g1k2y ago

Peer review is also not part of the scientific method. It's nice, but it's not strictly part of the method.

It may be more accurate to suggest that repeatability is part of the scientific method. But even that is not strictly true.

Consider, the single longest running scientific work was not repeatable, and was not shared with anyone outside the cadre of people doing it. Around 3000 years ago, a secretive caste of astrologers/scribes watched the heavens, and recorded their observations for several centuries. They did not publish their findings, thus making them anecdotal (yes, that's what anecdotal means, just that it wasn't published). The exact circumstances and variables were never repeatable, due to the movements of the celestial bodies, precession, etc.

Similarly, the UQ pitch drop experiment, having not yet completed, has not been repeated. But it's still an entirely valid scientific experiment.

geysersam2y ago

Both review and replication have their place. The mistake is treating researchers and the scientific community as a machine: "pull here, fill these forms, comment this research, have a gold star"

Let people review what they want, where they want, how they want. Let people replicate what they find interesting and motivating to work on.

bartwr2y ago

I review 10-20 papers a year.

It's a ton of unpaid, volunteer work. If I want to be a high quality reviewer then it's at least a day per paper (at least 3 thorough reads, taking notes, writing the review, reviewer discussions, post rebuttal, back-and-forth for journals). I am lucky and privileged that my employer counts this towards work time. Only 20% of papers get accepted in my domain.

Now if I had to spend a week on replicating a paper - and this is CS/graphics, where it's easy and "free" - I'd never volunteer to be a reviewer.

You'd need professional "replicators", but who will pay for them? And who would they be? You need experts, and if you are an expert, you don't want to merely replicate other people's work full time instead of working on your own innovation.

jhart992y ago

Replication in many fields comes with substantial costs. We are unlikely to see this strategy employed on many/most papers. I agree with other commenters that materials and methodology should be provided in sufficient detail so that others could replicate if desired.

cycomanic2y ago

While I agree with the general sentiment of the paper and creating incentives for more replication is definitely a good idea, I do think the approach is flawed in several ways.

The main point is that the paper seriously underestimates the difficulty and time it requires to replicate experiments in many experimental fields. Who will decide which work needs to be replicated? Should capable labs somehow become bogged down with just doing replication work? Even if they don't find the results interesting?

In reality if labs find results interesting enough to replicate they will try to do so. The current LK-99 hurrah is a perfect example of that, but it happens on a much smaller scale all the time. Researchers do replicate and build on other work all the time, they just use that replication to create new results (and acknowledge the previous work) instead of publishing a "we replicated" paper.

Where things usually fail is in publication of "failed replication" studies, and those are tricky. It is not always clear if the original research was flawed or the people trying to reproduce it made an error (again just have a look at what's happening with LK-99 at the moment). Moreover, it can be politically difficult to try to publish a "fail to reproduce" result if you are a small unknown lab and the original result came from a big known group. Most people will believe that you are the one who made the error (and unfortunately big egos might get in the way, and the small lab will have a hard time).

More generally, in my opinion the lack of replication of results is just one symptom of a bigger problem in science today. We (as in society) have essentially made the scientific environment increasingly competitive, under the guise of "value for tax payer money". Academic scientists now have to constantly compete for grant funding and publish to keep the funding going. It's incredibly competitive to even get in ... At the same time they are supposed to constantly provide big headlines for university press releases, communicate their results to the general public and investigate (and patent) the potential for commercial exploitation. No wonder we see less cooperation.

TrackerFF2y ago

Seems to have been hugged to death.

But - a quick counterexample - as far as replication goes: What if the experiments were run on custom-made or exceedingly expensive equipment? How are the replicators supposed to access that equipment? Even in fields which are "easy" to replicate - like machine learning - we are seeing barriers to entry due to expensive computing power. Or data collection. Or both.

But then you move over to physics, and suddenly you're also dealing with these one-off custom setups, doing experiments which could be close to impossible to replicate (say you want to conduct experiments on some physical event that only occurs every xxxx years or whatever)

gordian-not2y ago

The incentive should be to clear the way for tenure track

The junior faculty will clear the rotten apples at the top by finding flaws in their research and then will win the tenure that was lost in return

This will create a nice political atmosphere and improve science

throwawaymaths2y ago

How about we create a Nobel prize for replication? One impressive replication or refutation from the last decade (that holds up) gets the prize, split up to three ways among the most important authors.

staunton2y ago

Let's get people to publish their data and code first, shall we? That's sooo much easier than demanding that whole studies be replicated... and people still don't do it!

JR14272y ago

I think this wouldn't work, because many experiments need such specific equipment and expertise that it would be hard to find labs that already have said equipment.

pajushi2y ago

Why shouldn't we hold science more accountable?

"Science needs accounting" is a search I had saved for months which really resonates with the idea of "peer replication."

In accounting, you always have checks and balances; you are never counting money alone. In many cases, accountants duplicate their work to make sure that it is accurate.

Auditors are the counterpart to the peer review process. They're not there to redo your work, but to verify that your methods and processes are sound.

tonmoy2y ago

We just need a second LHC with double the number of particle physicists in the world to replicate observation of the Higgs Boson, no big deal

SonOfLilit2y ago

My first thought was "this would never work, there is so much science being published and not enough resources to replicate it all".

Then I remembered that my main issue with modern academia is that everyone is incentivized to publish a huge amount of research that nobody cares about, and how I wish we would put much more work into each of much fewer research directions.

gxt2y ago

So why haven't "science modules" been developed yet? I see a library sized piece of equipment to physically perform the lab work that can be configured akin to CNC machining. Papers would then be submitted with the module program and be easily replicated by other labs.

ayakang314152y ago

One of the Nobel Prizes in Physics followed the discovery of the Higgs boson at the LHC. It cost billions of dollars just to build the facility, and required hundreds of physicists working on it just to conduct the experiment. You can't replicate this. Although I fully agree that replication must come first when it is reasonably doable.

dongpingOP2y ago

https://web.archive.org/web/20230130143126/https://blog.ever...

ahmadmijot2y ago

Quite related: nowadays there is a movement within scientific research, i.e. Open Science, where the (raw) data from one's research is open. Even the methods for in-house fabrication and development, together with their source code, are open source (open hardware and open software).

ugh1232y ago

Why not just develop a standard "replication instructions" format that papers would need to adhere to? All methods, source code, ingredients, processes, etc are documented in a standard way. This could help tease out a lot of bullshit just by reading this section.
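
A minimal sketch of what such a format check might look like, in Python. The section names are taken from the list in the comment; the `ReplicationInstructions` class and `REQUIRED_SECTIONS` are hypothetical names for illustration, not an existing standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical required sections, taken from the list in the comment above.
REQUIRED_SECTIONS = ("methods", "source_code", "ingredients", "processes")


@dataclass
class ReplicationInstructions:
    """Hypothetical 'replication instructions' record attached to a paper."""

    paper_title: str
    sections: dict[str, str] = field(default_factory=dict)

    def missing_sections(self) -> list[str]:
        """Return the required sections that are absent or blank."""
        return [
            name
            for name in REQUIRED_SECTIONS
            if not self.sections.get(name, "").strip()
        ]


example = ReplicationInstructions(
    paper_title="Example paper",
    sections={"methods": "Step-by-step protocol ...", "source_code": ""},
)
print(example.missing_sections())
```

A journal could reject, or at least flag, any submission for which `missing_sections()` is not empty, which is the "just by reading this section" filter in its crudest form.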

user67232y ago

I remember showing someone raw video of a Safire plasma chamber keeping the ball of plasma lit for several minutes. They said they would need to see a peer reviewed paper. The presumption brought about by the enlightenment era that everyone should get a vote was a mistake.

hinkley2y ago

Is there space in the world for a few publications that only publish replicated work? Seems like that would be a reasonable compromise. Yes you were published, but were you published in Really Real Magazine? Get back to us when you have and we’ll discuss.

User232y ago

One thing that everyone needs to remember about “peer review” is that it isn’t part of the scientific method, but rather that it was imposed on the scientific enterprise by government funding authorities. It’s basically JIRA for scientists.

65102y ago

Seems like a great way for "inferior" journals to gain reputation. Counting citations seems a pretty silly formula/hack. How often you say something doesn't affect how true it is.

husamia2y ago

I review articles all the time. I look for things that tell me about their real work. There are nuances to some experiments that can't be known without replication.

GuB-422y ago

Peer review is not the end. When replication is particularly complex or expensive, peer review may just be a way to see if the study is worth replicating.

andsoitis2y ago

Why would you bother replicating someone else’s work (thereby validating it), when you could use that time and resources to do something novel?

seventytwo2y ago

There would need to be an incentive structure where the first replications get (nearly) the same credit as the original publisher.

wcerfgba2y ago

What do we recommend for qualitative research, where replicability is not a quality criterion?

paulpauper2y ago

This would not apply to math or something subjective such as literature. Only experimental results need to be replicated.

hospadar2y ago

I assume that the goal here is to reduce the number of not-actually-valid results that get published. Not-actually-valid results happen for lots of reasons (whoops did experiment wrong, mystery impurity, cherry picked data, not enough subjects, straight-up lie, full verification expensive and time consuming but this looks promising) but often there's a common set of incentives: you must publish to get tenure/keep your job, you often need to publish in journals with high impact factor [1].

High impact journals [6] tend to prefer exciting, novel, and positive results (we tried new thing and it worked so well!) vs negative results (we mixed up a bunch of crystals and absolutely none of them are room-temp superconductors! we're sure of it!).

The result is that cherry picking data pays, leaning into confirmation bias pays, publishing replication studies and rigorous but negative results is not a good use of your academic inertia.

I think that creating a new category of rigor (i.e. journals that only publish independently replicated results) is not a bad idea, but: who's gonna pay for that? If the incentive is you get your name on the paper, doesn't that incentivize coming up with a positive result? How do you incentivize negative replications? What if there is only one gigantic machine anywhere that can find those results (LHC, icecube, etc, a very expensive spaceship)?

There might be easier and cheaper pathways to reducing bad papers - incentivizing the publishing of negative results and replication studies separately, paying reviewers for their time, coming up with new metrics for researchers that prioritize different kinds of activity (currently "how much you're cited" and "number of papers*journal impact" things are common, maybe a "how many results got replicated" score would be cool to roll into "do you get tenure"? See [3] for more details). PLoS publish.

I really like OP's other article about a hypothetical "Journal of One Try" (JOOT) [2] to enable publishing of not-very-rigorous-but-maybe-useful-to-somebody results. If you go back and read OLD OLD editions of Philosophical Transactions (which goes back to the 1600's!! great time, highly recommend [4], in many ways the archetype for all academic journals), there are a ton of wacky submissions that are just little observations, small experiments, and I think something like that (JOOT let's say) tuned up for the modern era would, if nothing else, make science more fun. Here's a great one about reports of "Shining Beef" (literally beef that is glowing I guess?) enjoy [5]

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668985/

[2] https://web.archive.org/web/20220924222624/https://blog.ever...

[3] https://www.altmetric.com/

[4] https://www.jstor.org/journal/philtran1665167

[5] https://www.jstor.org/stable/101710

[6] https://en.wikipedia.org/wiki/Impact_factor, see also https://clarivate.com/

Hiromy2y ago

Hello, I love you

fabian2k2y ago· 67 in thread

I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

kergonath2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

They also tend to over-estimate the effect of peer review (often equating peer review with validity).

techdragon2y ago

Academic research can genuinely suck sometimes… particularly when you want to actually apply it.

vibrio2y ago

“They also tend to over-estimate the effect of peer review (often equating peer review with validity).”

In my experience, scientists are comfortably cynical about peer review, even those who serve as reviewers and editors, except maybe junior scientists who haven’t gotten burned yet.

sebzim45002y ago

>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I think it would be fine to halve the productivity of these fields, if it means that you can reasonably expect papers to be accurate.

dmarchand902y ago

I believe that, contrary to popular belief, the implementation of this system would lead to a substantial increase in productivity in the long run. Here's why:

advisedwang2y ago

crote2y ago

mapt2y ago

We could easily 10x the funding and 5x the manpower we throw at STEM research if we actually cared what they produced.

harimau7772y ago

The issue that I see is: even if halving productivity is acceptable to the field as a whole, how do you incentivize a given scientist to put in the effort?

This seems particularly problematic because it is already notoriously hard to get tenure and academia is already notoriously unrewarding to researchers who don't have tenure.

ImPostingOnHN2y ago

hoosieree2y ago

Half is wildly optimistic.

sqrt_12y ago

FYI there is at least one science journal that only publishes reproduced research:

https://en.wikipedia.org/wiki/Organic_Syntheses

jamesash2y ago

Started in 1921 and still going strong 100 years later. The gold standard for organic chemistry procedures.

"If you can't reproduce a procedure in Org Syn, it's YOUR fault" - my PhD supervisor

ebiester2y ago

It's completely doable in some cases. (It may never be doable in some areas either.)

tnecniv2y ago

Your proposal has a whole slew of issues.

For every lab doing new work, you’d basically need a clone of that lab to replicate their work.

rapjr92y ago

harimau7772y ago

I think that there are also a lot of psychological/cultural/political issues that would need to be worked out:

If someone wins the Nobel Prize, do the people who replicated their work also win it? When the history books are written do the replicators get equal billing to the people who made the discovery?

When selecting candidates for prestigious positions, are they really going to consider a replicator equal to an original researcher?

Eddy_Viscosity22y ago

It's not easy because it isn't simple. How do you get all of the universities to change their incentives to back this?

SkyMarshal2y ago

> x fewer papers but x number of replications, and you are expected to have x replications in your specialty.

Could it be simplified even further to say x number of papers, but they only count if they’re replicated by others in the field?

RugnirViking2y ago

Let's be brutally honest with ourselves.

indymike2y ago

> 99% of all papers mean nothing. They add nothing to the collective knowledge of humanity.

staunton2y ago

LBTables2y ago

> In my field of robotics there are SOOO many papers that are basically taking three or four established algorithms/machine learning models, and applying them to off-the-shelf hardware.

awesomeMilou2y ago

Thank you. Not even saying this to shit on academia, but modern scientific publishing follows the same governing rules as publishing a YouTube video (in principle).

> There should probably be some sort of separate process for things that actually claim to make important discoveries.

PaulHoule2y ago

justinpombrio2y ago

> If someone cares enough about the work to build on it, they will replicate it anyway.

p1esk2y ago

It is the case in my field (ML): if I care enough about a published result I try to replicate it.

johnnyworker2y ago

> If someone cares enough about the work to build on it, they will replicate it anyway.

MrJohz2y ago

davidktr2y ago

jofer2y ago

Then you have tens to hundreds of thousands of dollars in instrument time to pay for running the various analyses that are needed in parallel with the field observations.

It's rarely the simple data analysis that's flawed and far more frequently subtle issues with everything else.

throwaway4aday2y ago

wizofaus2y ago

> What's the value in publishing something that is never replicated?

Because it presents an experimental result to other scientists that they may consider worth trying to replicate?

geysersam2y ago

It still has value if we assume the experiment was done by competent, honest people who are unlikely to try to fool us on purpose and unlikely to have made errors.

It would be even better if it was replicated of course.

Depending on what certainty you need you might have to wait for the result of one or several replications, but that is application dependent.

mattkrause2y ago

Longer, even!

Some experiments that study biological development or trained animals can take a year or more of fairly intense effort to start generating data.

Maxion2y ago

A year? Some data sets take decades to build up before significant papers can be published on their data. Replication of the dataset is just not feasible.

This whole thread just shows how little the average HNer knows about the academic sciences.

tnecniv2y ago

coldtea2y ago

>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

Then perhaps those papers shouldn't be published? Or held in any higher esteem than a blog post by the same authors?

gus_massa2y ago

An arXiv preprint is like a blog post.

A paper in a peer-reviewed journal is like posting a request for reproduction in a heavily moderated mailing list.

A paper in a predatory journal is like the "You are the best ___" prize that you get if you pay to go to the "congress" invitation in spam.

None of them guarantees that the result is true. Publication in some peer-reviewed journals gives a minimal guarantee that the paper is not horribly bad, but I've seen too much crap there too.

I know a few journals and authors in my area that are serious, and I can guess the result will hold, but I find it very difficult to evaluate journals and authors in other areas.

kshahkshah2y ago

You'd have no idea if you were going down a well trodden path which would yield no success because you have no idea it was well trod. No one publishes negative results, etc.

majormajor2y ago

> If someone cares enough about the work to build on it, they will replicate it anyway.

oldgradstudent2y ago

> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time

If you build upon a result, you almost have to replicate it.

An acquaintance spent years building upon a result that turned out to be fraudulent/p-hacked.

dongpingOP2y ago

This is, of course, a naive proposal without too much thought into it. But I was wondering what I would have missed here.

i_no_can_eat2y ago

and in this proposal, who will be tasked with replicating the work?

boxed2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I don't see how the current system works really either. Fraud is rampant, and replication crisis is the most common state of most fields.

Basically the current system is failing at finding out what is true. Which is the entire point. That's pretty damn bad.

tptacek2y ago

Fraud seems rampant because you hear about cases of fraud, but not about the tens of thousands of research labs plugging away day after day.

coding1232y ago

pvaldes2y ago

One. Doing experiments is already difficult and painful enough.

[1] as the valuable experts are now stuck validating things instead of doing their own job

faeriechangling2y ago

In some fields research can’t be replicated later. Much of all autism research will NEVER be replicated because the population of those considered autistic is not stable over time.

DoctorOetker2y ago

> [...] non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper

After "peer review" any apparently promising results prompt other groups to build on them by utilizing it as a step or building block.

On paper it sounds more expensive to require independent replication, but only because the costs of replication attempts are hidden until it's typically rather late.

Is it really more expensive if the replication attempts are in some sense mandatory?

The pseudo-final word, end of line?

brightball2y ago

> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

ImPostingOnHN2y ago

I'm not convinced this proposed alternative is better than the status quo. It's simply not feasible, no matter how many benefits one might imagine.

backtoyoujim2y ago

Yes it would indeed mean slowing down and having more scientists.

It would mean disruption is no longer a useful tool for human development.

ebiester2y ago

brnaftr3612y ago

throwawaymaths2y ago

> I don't see how this could ever work,

http://www.orgsyn.org/

> All procedures and characterization data in OrgSyn are peer-reviewed and checked for reproducibility in the laboratory of a member of the Board of Editors

Never is a strong word.

omgwtfbyobbq2y ago

What about a system where peer replication is required if the number of citations exceeds some threshold?

p1esk2y ago

Who will be replicating it? Why would I want to set aside my own research to replicate some claim someone made? How would this help my career?

iamthemonster2y ago

ljf2y ago

Only then could he even start building the experiment - total time to run it all seems to run across years.

techas2y ago

mandmandam2y ago

It's wild to me that although we know that it was Ghislaine Maxwell's daddy who started this incredibly corrupt system, people hardly mention this fact.

The US system, and others, even attack people who dare to try and make science more open. RIP Aaron Swartz, and long live Alexandra Elbakyan.

indymike2y ago

> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

Maybe this is what needs to change. If we only reward discovery and success, then the incentive is to only produce discovery and success.

chmod6002y ago

Please excuse my ignorance, but I'm not convinced.

What are we supposed to do in a hundred years when the scientists of today are dead and we have a bunch of results with important implications that aren't documented well enough to replicate?

jononomo2y ago

If it is not replicated it shouldn't be published, other than as a provisional draft. I don't care if it hurts your feelings.

wilde2y ago

The pressure to replicate would make folks publish things in forms that are easier to replicate. This cost would go down over time.

picadores2y ago

Yes, the amount of work done that could go into more paper churn?

miga2y ago· 17 in thread

Peer review does not serve to assure replication, but to assure readability and comprehensibility of the paper.

Given that some experiments cost billions to conduct, it is impossible to implement "Peer Replication" for all papers.

What could be done is to add metadata about papers that were replicated.

NalNezumi2y ago

At least that's my understanding

kergonath2y ago

kkylin2y ago

snowwrestler2y ago

> Isn’t readability and comprehensibility the job of the editor/journal to check

Yes, who do you think asks the reviewers to perform their reviews?

> peer review is more for checking if the methodology, scope, claim, direction, conclusion and relevances is sound&trustable.

No, the parent comment has it right. The only thing being reviewed is the paper, and the point is to make sure it communicates clearly, not that it’s “sound and trustable.”

kjkjadksj2y ago

hedora2y ago

In CS, the editor / journal don’t do those things. Instead, the reviewers do. (Sometimes reviewers “shepherd” papers to help fix readability after acceptance).

Also, most work goes to conferences; journals typically publish longer versions of published works.

jxramos2y ago

Yes a metadata relationship link would be outstanding. Reproduced in some paper xyz, or by some institution, named individuals, etc. some kind of structured information would be very useful.
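
A rough sketch of what that structured information might look like, in Python. The fields follow the items listed above (replicating paper, institution, named individuals); the class names, the `succeeded` flag, and all identifiers are hypothetical and not part of any existing citation standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReplicationLink:
    """One hypothetical 'reproduced by' record attached to a paper."""

    replicating_paper: str        # identifier of the paper reporting the attempt
    institution: str
    individuals: tuple[str, ...]
    succeeded: bool               # False records a failed replication attempt


@dataclass
class PaperRecord:
    """A paper plus the replication attempts that point back at it."""

    identifier: str
    replications: list[ReplicationLink] = field(default_factory=list)

    def replication_counts(self) -> tuple[int, int]:
        """Return (successful, failed) replication attempts."""
        ok = sum(1 for link in self.replications if link.succeeded)
        return ok, len(self.replications) - ok


paper = PaperRecord("paper-xyz")
paper.replications.append(
    ReplicationLink("paper-abc", "Example University", ("A. Researcher",), True)
)
paper.replications.append(
    ReplicationLink("paper-def", "Example Institute", ("B. Researcher",), False)
)
print(paper.replication_counts())  # (1, 1)
```

Recording failed attempts alongside successful ones is a deliberate choice in this sketch, since a link that only ever says "reproduced" would hide exactly the negative results the thread is worried about.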

kergonath2y ago

Barriers to publication should be lower for replication studies, I think that’s the main problem.

s1artibartfast2y ago

There are basically no barriers to publication. There are a number of normal journals that publish everything submitted if it appears to be honest research.

jxramos2y ago

I wonder if undergrads could be harnessed to enter into this kind of work, maybe under the supervision of doctoral students and a well meaning and interested PI.

strangattractor2y ago

Maybe add people as special authors/contributors to the original work.

ebiester2y ago

mathisfun1232y ago

>Peer review does not serve to assure replication, but assure readability and comprehensibility of the paper.

I have had a paper rejected twice in a row over the last year. Both times the comments included something like "paper was very well-written; well-written enough that an undergrad could read it".

Peer review ensures the gates are kept.

julienreszka2y ago

>Experiments that cost billions to conduct

If you can't replicate them it's like they didn't happen anyways

kergonath2y ago

But yeah, in the grand scheme of things if it hasn’t been replicated, then it hasn’t been proven, but some works are credible on their own.

thfuran2y ago

So no experiments have happened because I don't have a lab, and CERN is just an elaborate ruse?

tnecniv2y ago

Ah yes, if I can’t run the LHC at home, none of the work there happened

matthewdgreen2y ago· 9 in thread

sebastos2y ago

Very well put. This is the clearest way of looking at it in my view.

Maxion2y ago

casualscience2y ago

Most of this is very legit, but this

Maxion2y ago

And on the extreme end you have the multi-decade longitudinal studies in epidemiology / biomedicine that would be more-or-less impossible to replicate.

oneshtein2y ago

nine_k2y ago

For sharing results widely, there's arxiv. The problem is that the fanout is now overwhelming.

The public perception of a publication in a prestigious journal as the established truth does not help, too.

isaacremuant2y ago

> The public perception of a publication in a prestigious journal as the established truth does not help, too.

it's not so much the public perception but what govs/media/tech and other institutions have pushed down so that the public doesn't question whatever resulting policy they're trying to put forth.

"Trust the science" means "Thou shalt not question us, simply obey".

Anyone with eyes who has worked in institutions knows that bureaucracy, careerism and corruption are intrinsic to them.

dmbche2y ago

Your analysis seems to portray all scientists as pure hearted. May I remind you of the latest Stanford scandal where the president of Stanford was found to have manipulated data?

matthewdgreen2y ago

eesmith2y ago· 7 in thread

> the real test of a paper should be the ability to reproduce its findings in the real world. ...

> What if all the experiments in the paper are too complicated to replicate? Then you can submit to [the Journal of Irreproducible Results].

Observational science is still a branch of science even if it's difficult or impossible to replicate.

Consider the first photographs of a live giant squid in its natural habitat, published in 2005 at https://royalsocietypublishing.org/doi/10.1098/rspb.2005.315... .

Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

Who thinks the results of a drug trial can't be published until they are replicated?

Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

Here's a paper I learned about from "In The Pipeline"; "Insights from a laboratory fire" at https://www.nature.com/articles/s41557-023-01254-6 .

How would peer replication be relevant?

msla2y ago

Would this require labs to improve their software environments and learn some new tools? Would this require labs to give up whatever used to be secret sauce? That's. The. Point.

counters2y ago

eesmith2y ago

> someone else being able to run that code on that dataset and get the same results.

I think when people talk about "replicate" they mean something more than that.

The dataset could contain coding errors, and the analysis could contain incorrect formulas and bad modeling. Successfully reproducing a bad analysis provides no corrective feedback.

I know for one paper I could replicate the paper's results using the paper's own analysis, but I couldn't replicate the paper's results using my analysis.

> Would this require labs to give up whatever used to be secret sauce? That's. The. Point.

That seems to be a very different Point.

Newton famously published results made from using his secret sauce - calculus - by recasting them using more traditional methods.

kergonath2y ago

> Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

In your example, new observations of giant squids are still massively valuable even if not that novel anymore. So new observations should be encouraged (as I am sure they are).

> Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

> How would peer replication be relevant?

eesmith2y ago

Your comment concerns post-publication peer-replication, yes?

If so, it's a different topic. The linked-to essay specifically proposes:

That's pre-publication peer-replication, and my comment was only meant to be interpreted in that light.

phpisthebest2y ago

Yes, in a perfect world we would also replicate the data collection, but we do not live in a perfect world.

For example, the FDA says it will take decades to release the raw data from the COVID vaccine trials. Why? And that is after being forced to do so via a lawsuit.

eesmith2y ago

> For example many teams use the same raw data from Large Colliders, or JWST, or other large science projects to reach competing conclusions.

Yes, but why must the first team wait until the second is finished before publishing?

Other paleontologists might not be able to replicate the work, but still tell if it's publishable - that's what they do now, yes?

> but we do not live in a perfect world

Alternatively, we don't live in a perfect world which is why we have the current system instead of requiring replication first.

Since the same logic works for both cases, I don't think it's persuasive logic.

> the FDA says it will take decades

Well, that's a tangent. The FDA is charged with protecting and promoting public health, not improving the state of scholarly literature.

And the FDA is only one of many public health organizations which carried out COVID vaccine trials.

abnry2y ago· 7 in thread

If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.

snitty2y ago

>If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.

1/10th of that amount for anti-flagellin antibody is $372. [2]

A kit to prep a cell for RNA sequencing is $6-10 per use. That's JUST isolation of the RNA. Not including reverse transcribing it to cDNA for sequencing, or the sequencing itself. [3]

Let's not even reach things like materials science where you may be working on an epitaxial growth paper, and there are only a handful of labs where they could even feasibly repeat the experiment.

Or say something with a BSL-3 lab where there are literally only 15 labs in the US that could feasibly do the work, assuming they aren't working on their own stuff. [4]

alsodumb2y ago

Nah, it doesn't. It just shows that it's time consuming and expensive to replicate their studies.

abnry2y ago

If that's the case, then don't claim confidence in the work or make policy decisions based off of it. If there is no epistemological humility, then yes, it is still BS.

fodkodrasz2y ago

I guess if software developers will complain that it's too hard or too expensive to thoroughly test their code to ensure exactly zero bugs at release[1], then that just shows their work is BS.

[1]: if you have delivered telco code to Softbank you may have heard this sentence

abnry2y ago

Replication is not the same thing as zero bugs in software.

azan_2y ago

Maxion2y ago

Something in Switzerland called the Large Hadron Collider comes to mind.

I guess we should not talk about the Higgs before someone else builds a second one and replicates the papers.

titzer2y ago· 5 in thread

I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.

mike_hearn2y ago

lou13062y ago

Maxion2y ago

> I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.

Not to mention that the cutting edge in many sciences is perhaps two or three research groups of 5-30 individuals each, in various research institutions around the world.

ramesh312y ago

> Even with explicit instructions, it is hard enough to even get the same code to run in a different environment and give the same results.

Is it really that hard for researchers to standardize around providing Dockerfiles? Environment replication is a solved problem.

titzer2y ago

> Environment replication is a solved problem.

Unfortunately, no. Dockerfiles aren't as portable as you think, and not architecture-independent. VMs are better, but even then, performance isn't portable either.

The last artifact I produced included builds of 3 web browsers from source--it was over 10GB. One doesn't just "build Chrome in a dockerfile".
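A minimal sketch of the architecture point, assuming an artifact ships a small record of the machine it was built on. The function names `environment_fingerprint` and `environment_diff` are hypothetical, not part of any real artifact-evaluation tooling; the idea is only that a replicator can see up front which parts of the environment differ before blaming the code.

```python
import platform
import sys


def environment_fingerprint() -> dict:
    """Collect a few facts about the host that a container image does not hide."""
    return {
        "machine": platform.machine(),        # CPU architecture, e.g. x86_64 or arm64
        "system": platform.system(),          # OS family, e.g. Linux or Darwin
        "python": platform.python_version(),  # interpreter version
        "byteorder": sys.byteorder,           # "little" or "big"
    }


def environment_diff(recorded: dict, current: dict) -> dict:
    """Return {key: (recorded, current)} for every field that differs."""
    keys = sorted(set(recorded) | set(current))
    return {
        key: (recorded.get(key), current.get(key))
        for key in keys
        if recorded.get(key) != current.get(key)
    }


if __name__ == "__main__":
    # Hypothetical record shipped with an artifact built on an x86_64 machine.
    shipped = dict(environment_fingerprint(), machine="x86_64")
    print(environment_diff(shipped, environment_fingerprint()))
```

An empty diff does not prove the results will match (performance in particular is not captured), but a non-empty one is a cheap early warning.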

jonnycomputer2y ago· 4 in thread

peteradio2y ago

What field is it too expensive or difficult to reproduce the data?

jonnycomputer2y ago

Lots?

5424582y ago

TrackerFF2y ago

Off the top of my head: Say you want to research biological data from the Mariana Trench.

Or even worse: Some passing comet, or planet/moon/whatever in the solar system. And just to make things EVEN worse, you need to analyze the data in some destructive way.

Certainly very plausible scenarios, but also ones which could be prohibitively expensive to do multiple times.

infogulch2y ago· 3 in thread

1. Rebrand peer review as a "readability review" which is what reviewers tend to focus on today.

analog312y ago

Every experimental paper I've ever read has contained an "Experimental" section, where they provide the details on how they did it. Those sections tend to be general enough, albeit concise.

janalsncm2y ago

ahmadmijot2y ago

> My doctoral thesis project used roughly 1/2 million dollars of gear,

waynecochran2y ago· 3 in thread

cptskippy2y ago

You'll quickly discover when you enter the workforce that the reason we have CI/CD, Docker, and virtualization is a similar problem: the dreaded "it works on my machine" response.

janalsncm2y ago

I had a very similar experience in my masters. Really made me think, what exactly are the peers “reviewing” if they don’t even know whether the technique works in the first place.

waynecochran2y ago

NalNezumi2y ago· 2 in thread

IMO, a more realistic thing to do is a "replicability review" and/or a requirement to submit a "methodology map" with each paper.

Shit like this makes me lose faith in certain science directions. And I've seen a couple of junior researchers give it all up because they concluded it's all just a house of cards.

[1] https://arxiv.org/abs/2005.12729

[2] https://arxiv.org/abs/2202.02465

Edit: also if you think that's too tedious/costly, reminder that publishers rake in record profits so the resources are already there https://youtu.be/ukAkG6c_N4M

kergonath2y ago

> I've met Bio PHD that have wasted Months of their life tracking up experimental details not mentioned in a paper

That’s just evil!

Maxion2y ago

> Large dataset should go to a repository or a dataset journal, that way the method is still peer reviewed and the dataset has a doi and is much easier to re-use.

This may be possible in some sciences, but not in epidemiology or biomed. Often the study is based on tissue samples owned by some entity, with permission granted only to one specific entity.

Datasets in epidemiology are often full of PII, and cannot be shared publicly for many reasons.

SubiculumCode2y ago· 2 in thread

Scientist publishes paper based on ABCD data.

Replicator: Do you know how much data I'll need to collect? 11,000 participants followed across multiple timepoints of MRI scanning. Show me the money.

petesergeant2y ago

Definitely something that needs large charitable investment, but charities like that do exist, e.g. the Wellcome Trust.

SubiculumCode2y ago

Like 290+ million, just to get started.

freeopinion2y ago· 2 in thread

dongpingOP2y ago

In the sense of replicating the results, we do have CI servers and even fuzzers running for our "code replication".

freeopinion2y ago

Would code replication result in fewer use-after-free or off-by-one bugs than code review? Or would it mostly be a waste of resources, including time?

moelf2y ago· 2 in thread

I wish we could replicate the LHC.

Maxion2y ago

No talking about the Higgs before that happens, apparently.

kergonath2y ago

We will, don’t worry.

janalsncm2y ago· 1 in thread

For a while Reddit had the mantra “pics or it didn’t happen”.

At least in CS/ML there needs to be a “code or it didn’t happen”. Why? Papers are ambiguous. Even if they have mathematical formulas, not all components are defined.

Peer replication in these fields is an easy low hanging fruit that could set an example for other fields of science.

bradley132y ago

CS and ML are my field, although I'm no longer active in research. I always made a code archive available. Want to replicate? Download and run.

This should be standard now, in the age of GitHub, GitLab, et al. If a paper discusses an implementation, but doesn't provide code, it is probably BS.
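A minimal sketch of what "download and run" can look like, assuming the experiment is deterministic given a seed. Everything here is hypothetical: `run_experiment` is a toy stand-in for a paper's real pipeline, and publishing a SHA-256 digest of the results is just one possible convention, not an established standard.

```python
import hashlib
import json
import random


def run_experiment(seed: int) -> dict:
    """Toy stand-in for a paper's experiment: a seeded simulation."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    mean = sum(samples) / len(samples)
    return {"seed": seed, "n": len(samples), "mean": round(mean, 6)}


def fingerprint(result: dict) -> str:
    """Hash a canonical JSON form of the result so two runs can be compared."""
    canonical = json.dumps(result, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replicates(published_digest: str, seed: int) -> bool:
    """Re-run the experiment and check it against the published digest."""
    return fingerprint(run_experiment(seed)) == published_digest


if __name__ == "__main__":
    published = fingerprint(run_experiment(seed=42))  # what the author would ship
    print(replicates(published, seed=42))  # same seed, same code: True
    print(replicates(published, seed=43))  # different seed: False
```

Real pipelines are rarely this clean (floating-point differences across hardware, nondeterministic GPU kernels), so in practice the comparison would need a tolerance rather than an exact hash.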

leedrake52y ago· 1 in thread

Peer Review is the right solution to the wrong problem: https://open.substack.com/pub/experimentalhistory/p/science-...

vinnyvichy2y ago

hedora2y ago· 1 in thread

The website dies if I try to figure out who the author (“sam”) is, but it sounds like they are used to some awful backwater of academia.

This mostly eliminates the problem where work is suppressed for political reasons, etc.

I argue the above approach is about 100 years ahead of what the blog post is suggesting.

jltsiren2y ago

fastneutron2y ago· 1 in thread

oneshtein2y ago

Gravitational waves have been confirmed by observing distant quasars:

https://scitechdaily.com/gravitational-waves-detected-using-...

Hydrodynamic quantum analogs can be used to study quantum particles at macro-scale:

https://en.wikipedia.org/wiki/Hydrodynamic_quantum_analogs

The ESA Euclid near-infrared telescope launched a few weeks ago:

https://www.esa.int/Science_Exploration/Space_Science/Euclid...

JR14272y ago· 1 in thread

But the information gets around. In my former field, everyone knew which were the dodgy papers, with results no-one could replicate.

m-watson2y ago

nomilk2y ago· 1 in thread

https://web.archive.org/web/20230130143126/https://blog.ever...

the_arun2y ago

Thank you. Currently the original article is throttled.

Seems like the article is not about software code.

jimmar2y ago· 1 in thread

Science is a process. Peer review isn't perfect. Replication is important. But it doesn't seem like the author understands what it would take to simply replace peer review with replication.

janalsncm2y ago

I don’t think the existence of papers that are difficult to replicate undermines the value of replicating those that are easier.

okaleniuk2y ago· 1 in thread

In a few years, publishing a GitHub link alongside the paper became the norm, and the problem disappeared. At least in applied geometry, people do replicate each other's results all the time.

kjkjadksj2y ago

hooby2y ago· 1 in thread

Currently BOTH are being used - peer review is the first pass, reproduction the second.

Peer review might (or might not) weed out a few papers before they ever get to being reproduced - and that a paper "passed" peer review often means very little. (In some journals more, in some less).

So any attempt to "replace" review with replication would end up basically removing review altogether, without increasing the number of replication attempts being made.

hooby2y ago

I did some work for online journals where papers got published even IF the peer review was bad and rejections were exceptionally rare.

elashri2y ago· 1 in thread

Great, but who is going to fund the peer replication? The economics of research today don't even provide compensation for the time spent on peer review.

nine_k2y ago

Maybe the numerous complaints about the crisis of science are somehow related to the fact that scientific work is severely underpaid.

The pay difference between research and industry in many areas is not even funny.

fodkodrasz2y ago· 1 in thread

How would you peer-replicate observation of a rare, or unique event, for example in astronomy?

lordnacho2y ago

Either get your own telescope and gather your own data, or if only one telescope captured a fleeting event, take that data and see if the analysis turns out the same.

tines2y ago· 1 in thread

"Replace peer code review with 'peer code testing.'"

Probably not gonna catch on.

dongpingOP2y ago

"peer code testing" is already the job of the CI server. As it is nothing new, it probably is not going to catch on.

j452y ago· 1 in thread

Can everything be replicated in every field?

User232y ago

That’s the defining characteristic of engineering. If you can’t reliably replicate everything in an engineering discipline then it’s not an engineering discipline.

jxramos2y ago

whatever12y ago

We can have tiers. Tier 1 peer reviewed. Tier 2 peer replicated. We can have it as a stamp on the papers.

All PhD programs have a requirement for a minimum number of novel publications. We could add to the requirements a minimum number of replications.

hgsgm2y ago

The problem is equating publication with truth.

Publication is a starting point, not a conclusion.

Publication is submitting your code. It still needs to be tested, rolled out, evaluated, and time-tested.

amai2y ago

Having running demos is another step in the right direction (see https://blog.arxiv.org/2022/11/17/discover-state-of-the-art-...).

But even then for the biggest most complex experiments this will not work: Replicate CERN anyone?

10g1k2y ago

Peer review is also not part of the scientific method. It's nice, but it's not strictly part of the method.

It may be more accurate to suggest that repeatability is part of the scientific method. But even that is not strictly true.

Similarly, the UQ pitch drop experiment, having not yet completed, has not been repeated. But it's still an entirely valid scientific experiment.

geysersam2y ago

Both review and replication have their place. The mistake is treating researchers and the scientific community as a machine: "pull here, fill these forms, comment this research, have a gold star"

Let people review what they want, where they want, how they want. Let people replicate what they find interesting and motivating to work on.

bartwr2y ago

I review 10-20 papers a year.

Now if I had to spend a week on replicating a paper - and this is CS/graphics, where it's easy and "free" - I'd never volunteer to be a reviewer.

jhart992y ago

cycomanic2y ago

While I agree with the general sentiment of the paper and creating incentives for more replication is definitely a good idea, I do think the approach is flawed in several ways.

TrackerFF2y ago

Seems to have been hugged to death.

gordian-not2y ago

The incentive should be to clear the way for tenure track

The junior faculty will clear the rotten apples at the top by finding flaws in their research and then will win the tenure that was lost in return

This will create a nice political atmosphere and improve science

throwawaymaths2y ago

How about we create a Nobel prize for replication? One impressive replication or refutation from the last decade (that holds up) gets the prize, split up to three ways among the most important authors.

staunton2y ago

Let's get people to publish their data and code first, shall we? That's sooo much easier than demanding whole studies to be replicated... and people still don't do it!

JR14272y ago

I think this wouldn't work, because many experiments need such specific equipment and expertise that it would be hard to find labs that already have said equipment.

pajushi2y ago

Why shouldn't we hold science more accountable?

"Science needs accounting" is a search I had saved for months which really resonates with the idea of "peer replication."

In accounting, you always have checks and balances, you never are counting money alone. In many cases, accountants duplicate their work to make sure that it is accurate.

Auditors are the analogue of the peer review process. They're not there to redo your work, but to verify that your methods and processes are sound.

tonmoy2y ago

We just need a second LHC with double the number of particle physicists in the world to replicate observation of the Higgs Boson, no big deal

SonOfLilit2y ago

My first thought was "this would never work, there is so much science being published and not enough resources to replicate it all".

gxt2y ago

ayakang314152y ago

dongpingOP2y ago

https://web.archive.org/web/20230130143126/https://blog.ever...

ahmadmijot2y ago

ugh1232y ago

user67232y ago

hinkley2y ago

User232y ago

65102y ago

Seems like a great way for "inferior" journals to gain reputation. Counting citations seems a pretty silly formula/hack. How often you say something doesn't affect how true it is.

husamia2y ago

I review articles all the time. I look for things that tell me about their real work. There are nuances to some experiments that can't be known without replication.

GuB-422y ago

Peer review is not the end. When replication is particularly complex or expensive, peer review may just be a way to see if the study is worth replicating.

andsoitis2y ago

Why would you bother replicating someone else’s work (thereby validating it), when you could use that time and resources to do something novel?

seventytwo2y ago

There would need to be an incentive structure where the first replications get (nearly) the same credit as the original publisher.

wcerfgba2y ago

What do we recommend for qualitative research, where replicability is not a quality criterion?

paulpauper2y ago

This would not apply to math or something subjective such as literature. Only experimental results need to be replicated.

hospadar2y ago

The result is that cherry picking data pays, leaning into confirmation bias pays, publishing replication studies and rigorous but negative results is not a good use of your academic inertia.

Hiromy2y ago

Hello, I love you
