p < 0.05 is outdated at this point. For anything truly groundbreaking, p < 0.005 or even p < 0.0005 is probably a better threshold, and even then I would ask: "Where did you get your dataset, and did you combine (!!!) datasets from multiple orgs?"
The methodology is deeply flawed, given that you haven't stopped accidental p-hacking (publication bias) or deliberate p-hacking (researcher data mining, e.g. adding endpoints after looking at the results), which compound to create fake results with astronomically tiny p-values.
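To make the "adding endpoints" problem concrete, here's a minimal Monte Carlo sketch (Python/NumPy; the 20-endpoint count and the 0.05 threshold are illustrative, not from any specific study). Under a true null, every endpoint's p-value is Uniform(0, 1), so a researcher who tests 20 endpoints and reports whichever one "worked" will find a hit most of the time:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 10_000    # simulated studies
n_endpoints = 20     # endpoints checked per study

# Under the null hypothesis, each endpoint's p-value is Uniform(0, 1).
p = rng.uniform(size=(n_trials, n_endpoints))

# Fraction of studies where at least one endpoint clears p < 0.05 by chance.
frac_fake_hit = np.mean(p.min(axis=1) < 0.05)
print(f"Studies with >=1 'significant' endpoint: {frac_fake_hit:.0%}")
# Theory: 1 - 0.95**20, roughly 64%.
```

So with 20 post-hoc endpoints, a "significant" result on pure noise is the expected outcome, not a fluke, which is why undisclosed endpoint-shopping is so corrosive.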
You can only trust single, extremely large, canonical RCTs that were announced in advance, and for which you are confident there is no survivorship bias, i.e. no chance the trial would have been cancelled halfway through had the interim results looked negative.
Epidemiology (a victim of researcher p-hacking and confounders that are impossible to deal with), meta-analyses of RCTs, single small RCTs, and RCTs that weren't announced in advance (victims of publication bias) should all be taken with a grain of salt. If you accept conclusions drawn from these at face value, be prepared to accept anything, because the methodology you've accepted has proven itself easily capable of demonstrating any fake phenomenon as true.
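Publication bias alone is enough to manufacture an effect out of nothing. A rough sketch (Python/NumPy; the study count, arm size, and "only positive significant results get written up" filter are my illustrative assumptions): simulate many small two-arm studies where the true effect is zero, "publish" only the ones that cross z > 1.96 in the favorable direction, and naively pool the published estimates:

```python
import numpy as np

rng = np.random.default_rng(1)
n_studies, n_per_arm = 500, 30

# True effect is zero: treatment and control come from the same N(0, 1).
treat = rng.normal(size=(n_studies, n_per_arm))
ctrl = rng.normal(size=(n_studies, n_per_arm))

diff = treat.mean(axis=1) - ctrl.mean(axis=1)
se = np.sqrt(treat.var(axis=1, ddof=1) / n_per_arm
             + ctrl.var(axis=1, ddof=1) / n_per_arm)
z = diff / se

# File drawer: only positive, "significant" studies get published.
published = z > 1.96

# Naive meta-analytic average over the published literature.
pooled = diff[published].mean()
print(f"{published.sum()} of {n_studies} studies published; "
      f"pooled 'effect' = {pooled:.2f}")
```

The pooled estimate comes out strongly positive even though every study sampled from the same distribution; a meta-analysis that can't see the unpublished studies is averaging a truncated sample.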
There are so many ways that studies can be flawed, either accidentally or purposefully. I truly assume most studies are not purposefully fraudulent - people, by and large, are honest - but I do believe there are enough problems with our current methods that most studies are not truly accurate.
Having a published clinical trial protocol - created before your study starts - helps with a lot of this. It specifies how you segment patients, match patients, and handle patient dropout, and which outcomes you are trying to compare.
Is it the end-all be-all? Obviously not. I don't know what the true answer is.
It's actually my job.
What are you designing new experiments for? Are you taking results from one cell line and condition and trying to transfer them to new cell lines and conditions?
Or are you trying to reproduce the same data in the same cell lines/organisms?
There's such a huge array of things you could be talking about, but based on my experience in the life sciences, my first interpretation of what you're saying doesn't seem the least bit realistic.
Even then, I would worry that the results were caused by some confounder in the original dataset/design rather than something you can trust.