If you were starting a diagnostic radiology residency, including intern year and fellowship, you'd just be finishing now. How can you really think that "computers can't read diagnostic images" if models such as this can describe a VGA connector outfitted with a Lightning cable?
1. Radiology =/= interpreting pixels and applying a class label.
2. Risk and consequences of misclassifying T-staging of a cancer =/= risk of misclassifying a VGA connector.
3. Imaging appearance overlap of radiological findings >>>>>>>>>> imaging appearance overlap of different types of connectors (e.g. infection and cancer can look the same; we make educated guesses on a lot of things, weighing many patient variables, clinical data, and prior imaging). You would need a multi-modal model enriched with a patient knowledge graph to try to replicate this, and while problems like this are being worked on, we are nowhere close enough for this to be a near-term threat. We haven't even solved NLP in medicine, let alone imaging interpretation!
4. Radiologists do far more than interpret images, unless you're in a tele-radiology eat-what-you-kill sweatshop. This includes things like procedures (e.g. biopsies and drainages for diagnostic rads) and multidisciplinary rounds/tumor boards.
But, at the end of the day, diagnostic radiology is about taking an input set of bytes and transforming that to an output set of bytes - that is absolutely what generative AI does excellently. When you said "I'm not sure how you can say this with a straight face?", I couldn't understand if you were talking about now, or what the world will look like in 40 years. Because someone finishing med school now will want to have a career that lasts about 40 years. If anything, I think the present day shortage of radiologists is due to the fact that AI is not there yet, but smart med students can easily see the writing on the wall and see there is a very, very good chance AI will start killing radiology jobs in about 10 years, let alone 40.
First, AI will make our lives much easier, as it will in other industries, but saying the AI problem for most of diagnostic radiology will be solved in 10 years is laughable. There are many reasons why radiology AI is currently terrible and we don't need to get into them, but let's pretend that current DL models can do it today.
The studies you would need to run to validate this across multiple institutions, while making sure population drift doesn't creep in (see the Epic sepsis AI's prediction failures in 2022), and to validate long-term benefits (assuming all of this goes right), will take 5-10 years. It'll be another 5-10 years even if you aggressively lobby to get this through legislation and deal with the insurance/liability problem.
Separately, we have to figure out how to set up the infrastructure for this presumably very large model in the context of HIPAA.
I find it hard to believe that all of this will happen in 10 years when, once again, we still don't have models that come close to being good enough today. What will likely happen is that it will flag nodules for me so I don't have to look as carefully at the lungs, and we will still need radiologists, just like we still need cardiologists to read a voltage graph.
Radiology is a lot about recognizing what is normal, what is 'normal for this patient', and what we should care about, while staying up to date on the literature and weighing the risks/benefits of calling an abnormality vs. not calling one. MRI (other than neuro) is not that old a field; we're discovering new things every year, and pathology is also evolving. Saying it's a solved problem of bits and bytes is like saying ChatGPT will replace software engineers in 10 years because it's just copy-pasting code from SO or GH and importing libraries. Sure, it'll replace the crappy coders and boilerplate, but you still need engineers to put the pieces together. It will also replace crap radiologists who just report every pixel they see without carefully interrogating the images and the patient chart as relevant.
This tendency to simplify is everywhere in radiology: when looking for a radial head fracture, we're taught to examine the cortex for discontinuities, look for an elbow joint effusion, evaluate the anterior humeral line, etc. But what if there's some feature (or combination of features) that is beyond human perception? Maybe the proximal radioulnar joint space is a millimeter wider than it should be? Maybe the soft tissues are just a bit too dense near the elbow? Just how far does the fat pad have to be displaced to indicate an effusion? Probably the best "decision function" is a non-linear combination of all these findings. Oh, but we only have 1 minute to read the radiograph before moving on to the next one.
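The "non-linear combination of findings" idea can be sketched as a toy logistic model. To be clear, every feature name, threshold, and weight below is hypothetical, invented purely for illustration; a real decision function would have to be learned from labelled radiographs.

```python
import math

def fracture_score(joint_space_mm, soft_tissue_hu, fat_pad_mm):
    """Toy 'decision function': a weighted combination of subtle findings
    squashed through a sigmoid. All weights and thresholds are made up."""
    z = (
        1.8 * max(0.0, joint_space_mm - 2.0)    # radioulnar joint space beyond ~2 mm
        + 0.04 * max(0.0, soft_tissue_hu + 60)  # soft tissue denser than expected fat
        + 0.9 * fat_pad_mm                      # anterior fat pad displacement
    )
    return 1.0 / (1.0 + math.exp(-(z - 3.0)))   # pseudo-probability of fracture

# Each finding is individually borderline, but together they push the score up:
subtle = fracture_score(joint_space_mm=3.0, soft_tissue_hu=-20, fat_pad_mm=2.5)
normal = fracture_score(joint_space_mm=2.0, soft_tissue_hu=-100, fat_pad_mm=0.0)
```

The point is that no single finding crosses a human-taught cutoff in the "subtle" case, yet the combined score is high; that kind of integration over many weak signals is exactly what a model can do that a 1-minute human read cannot.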
Unfortunately, as someone noted below, advances in medicine are glacially slow. I think change is only going to come in the form of lawsuits. Imagine a future where a patient and her lawyer can get a second opinion from an online model: "Why did you miss my client's proximal scaphoid fracture? We uploaded her radiographs and GPT-4 found it in 2 seconds." If and when these types of lawsuits occur, malpractice insurers are going to push for radiologists to use AI.
Regarding other tasks performed by radiologists, some radiologists do more than dictate images, but those are generally the minority. The vast majority of radiologists read images for big money without ever meeting the patient or the provider who ordered the study. In the most extreme case, radiologists read studies after the acute intervention has been performed. This happens a lot in IR - we get called about a bleed, review the imaging, take the patient to angiography, and then get paged by diagnostic radiology in the middle of the case.
Orthopedists have already wised up to the disconnect between radiology reimbursement and the work involved in MR interpretation versus surgery. At least two groups, including the "best orthopedic hospital in the country", employ their own in-house radiologists so they can capture part of the imaging revenue. If GPT-4 can offer summative reads without feature simplification, and prior to intervention, why not have the IR or the orthopedist sign off on the GPT-4 report?
1b. With respect to the simplicity of LI-RADS: if you are strictly following the major criteria only, it's absolutely simple. It was designed to assist the general radiologist so they do not have to hedge (LR-5 = cancer). If you are practicing in a tertiary care cancer center (i.e. one where you would be providing locoregional therapy and transplant, where accurate diagnosis matters), it is borderline negligent not to apply ancillary features (while optional, LR-4 triggers treatment, as you would know from your own practice). Ancillary features, and accurate lesion segmentation over multiple sequences that are not accurately linked on the Z-axis, remain unsolved problems, and they are non-trivial to solve and integrate from a CS perspective (I too have a CS background, and while my interest is in language models, my colleagues working on multi-sequence segmentation have had less than impressive results even using the latest diffusion-model techniques, although better than U-Net; refer to Junde Wu et al. from Baidu for their results). As you know, in medicine it is irrefutable that increased/early diagnosis does not necessarily lead to improved patient outcomes; several biases result from this, and in fact we have routinely demonstrated that overdiagnosis harms patients and that early diagnosis does not necessarily benefit overall survival or mortality.
2a. Again, a fundamental misunderstanding of how radiology and AI work, and in fact the reason the two clinical decision algorithms you mentioned were developed. First off, we generally have an overdiagnosis problem rather than an underdiagnosis one. You bring up a specifically challenging radiographic diagnosis (scaphoid fracture); if there is clinical suspicion for scaphoid injury, it would be negligent not to pursue advanced imaging. Furthermore, even if we assume for your hypothetical that GPT-4 or any ViLM has enough sensitivity (in reality they don't; see Stanford AIMI's and Microsoft's separate work on chest x-rays for more detail), you are ignoring specificity. Overdiagnosis HARMS patients.
2b. Sensitivity and specificity are always a tradeoff, by strict definition. For your second example of radial head fracture: every radiologist should be looking at the soft tissues; it takes 5 seconds to window if the bone looks normal, and I am still reporting these within 1-2 minutes. Fortunately, this can also be clinically correlated, and a non-displaced radial head fracture that is 'missed' or 'occult' can be followed up in 1 week if there is persistent pain, with ZERO (or almost zero) adverse outcomes, as management is conservative anyway. We do not have to 'get it right' for every diagnosis on every study the first time; that's not how any field of medicine works, and trying to is detrimental to patient outcomes. All of the current attempts at AI readers have demonstrably terrible specificity, hence why they are not heavily used even in research settings; it's not just inertia. As an aside, the anterior humeral line is not a sign of radial head fracture.
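The specificity complaint can be made concrete with a back-of-the-envelope Bayes calculation. The sensitivity, specificity, and prevalence numbers below are hypothetical, chosen only to illustrate the effect:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(disease | positive flag), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# A hypothetical AI reader: very sensitive, mediocre specificity,
# reading a worklist where only 5% of studies actually have the finding.
value = ppv(sensitivity=0.95, specificity=0.80, prevalence=0.05)
# -> 0.20: only ~1 in 5 flags is real; the other 4 are false alarms.
```

This is why a reader with impressive sensitivity can still be clinically useless: at realistic prevalence, mediocre specificity means most positive flags are wrong, and each one potentially drives further imaging or intervention.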
2c. Additionally, if you were attempting to build such a system, a ViLM is hardly the best approach. It's just sexy to say GPT-4, but 'conventional' DL/ML is still the way to go if you have a labelled dataset, and it has higher accuracy than some abstract zero-shot model not trained on medical images.
3. Regarding lawsuits: we've had breast computer-aided diagnosis for a decade now, and there have been no lawsuits, at least none major enough to garner attention. It is easy to explain why: 'I discounted the AI finding because I reviewed it myself and disagreed.' In fact, that is the American College of Radiology guidance on using breast CAD. A radiologist should NOT change their interpretation solely based on a CAD finding they find discordant, due to the aforementioned specificity issues and the harms of overdiagnosis. What you should do (and those of us practicing in these environments do) is give a second look to the areas identified by CAD.
4. Regarding other tasks, this is unequivocally changing. In most large centres you don't have IR performing biopsies. I interviewed at 8 IR fellowships and 4 body imaging fellowships, and in all of them this workload was handled by diagnostic radiologists. We also provide fluoroscopic services; I think you are referring to a dying trend where IR does a lot of them. Cleveland Clinic actually has nurses/advanced practice providers doing this. Biopsies are a core component of diagnostic training per ACGME guidelines. It is dismissive to say the vast majority of radiologists read images for big money without ever reviewing the clinical chart; I don't know any radiologist who would read a complex oncology case without reviewing treatment history. How else are you assessing for complications without knowing what's been done? I don't need to review the chart on easy cases, but that's also not what you want a radiologist for. You can sign a normal template for 90% of reports, or 98% of CT pulmonary embolism studies, without looking at the images and be correct. That's not why we're trained and do fellowships in advanced imaging; it's for the small fraction of cases that require competent interpretation.
5. Regarding orthopedists, the challenge here is that it is hard for a radiologist to provide an accurate enough interpretation, without the clinical history, for the single or few pathologies a specific orthopedist deals with. For example, a shoulder specialist looks at the MRI for every one of their patients in clinic; as a general radiologist, my case volumes are far lower than theirs. My job on these reports is to triage patients to the appropriate specialty (i.e. flag the case as abnormal for referral to ortho), who can then correlate with physical exam maneuvers and adjust their ROC curves based on arthroscopic findings. I don't have that luxury. Fortunately, that is also not why you employ an MSK radiologist; our biggest role is contributing to soft tissue and malignancy characterization. I've worked with some very renowned orthopedists in the US, and as soon as you get out of their wheelhouse of the 5 ligaments they care about, they rely heavily on our interpretations.
Additionally, imaging findings in MSK do not equal disease. In a recent study of asymptomatic individuals, >80% had hip labral tears. This is why the clinical context is so important. I don't have numbers on soft tissue thickening as an isolated sign of radial head fracture, but it would be very low yield; in the very infrequent case of a radial head fracture without a joint effusion, I mention the soft tissues and, as above, follow up in 1 week to see evolution of the fracture line if it was occult. That's a far better situation than immobilizing every child because of a possible fracture suggested by soft tissue swelling.
With respect to the best orthopaedic hospital in the country, presumably HSS: they employ radiologists because that is the BEST practice for the BEST patient outcomes/care. It's not solely/mostly about the money. EVERY academic/cancer center employs MSK radiologists.
6. Respectfully, the reason not to have IR sign off on the GPT-4 report is that you are not trained in advanced imaging of every modality. See point 1b: if you aren't investing your time staying up to date on liver imaging because you are mastering your interventional craft, you may be unaware of several important advances over the past few years.
7. With respect to hidden features, there are better candidates to talk about than soft tissue swelling. There is an entire field devoted to this (radiomics and texture analysis), and the results so far have been underwhelming, with questionable benefit shown only in very select, small studies that sit low on the evidence tree.
To summarize: radiology can be very, very hard. We do not train solely to diagnose simple things that a junior resident can pick up (a liver lesion with APHE and washout); we train for the nuanced and hard cases. We also do not optimize for 'accurate' detection on every indication and every study type; there are limitations to each imaging modality, and the consequences of missed/delayed diagnosis vary depending on the disease process being discussed, as do those of overdiagnosis and overtreatment. 'Hidden features' have so far been underwhelming in radiology, or we would use them.
A scattered history of labs probably provides an opportunity to notice something early, even if you don't know what you are looking for. But humans are categorically bad at detecting complex patterns in tabular numbers. Could routinely feeding people's lab history into a model serve as a viable early warning system for problems no one thought to look for yet?
We have established and validated reference ranges for bloodwork; there is also inherent lab error and natural variability in people's bloodwork (hence a reference range rather than a single value).
People < 50 should not be having routine bloodwork, and routine bloodwork on annual check-ups in older patients is very easy to interpret and trend.
Early warning systems need to be proven to improve patient outcomes. We have a lot of hard-learned experience in medicine where early diagnosis = bad outcomes for patients or no improved outcomes (lead-time bias).
If an algorithm somehow suspected pancreatic cancer based on routine labs, what am I supposed to do with that information? Do I schedule every patient for an endoscopic ultrasound with its associated complication rates? Do I biopsy something? What are the complication rates of those procedures versus how many patients am I helping with this early warning system?
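That tradeoff can be put in rough numbers. Suppose, hypothetically, a lab-based alert were 90% sensitive and 99% specific for pancreatic cancer, and prevalence in the screened population were 1 in 10,000 (an assumed figure for illustration). Nearly all alerts would then be false, and every one would trigger a workup:

```python
def workups_per_cancer_found(sensitivity, specificity, prevalence):
    """If every alert triggers a workup (e.g. endoscopic ultrasound), how many
    workups are performed per cancer actually detected? Illustrative only."""
    true_alerts = sensitivity * prevalence
    false_alerts = (1 - specificity) * (1 - prevalence)
    return (true_alerts + false_alerts) / true_alerts

n = workups_per_cancer_found(sensitivity=0.90, specificity=0.99, prevalence=1e-4)
# ~112 invasive workups, each with its own complication rate, per cancer found
```

Under these assumed numbers, even a 99%-specific alert buys one detected cancer per hundred-odd invasive procedures, which is exactly the harm-vs-benefit arithmetic the questions above are pointing at.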
In some cases (screening mammography, colonoscopy), screening demonstrably improved patient outcomes, but it took years to decades to gather that evidence. In other cases (ovarian ultrasound screening), it led to unnecessary ovary removal and harmed patients. We have to be careful about what outcomes we are measuring and not treat 'increased diagnosis' as the end goal.
Radiology is not the lowest hanging fruit when you talk about AI taking over jobs.
What do you think is going to happen to tech hiring when an LLM is putting out production-ready code (or refactoring legacy)? I would be far more worried (in reality, learning new/advanced skills) if I were a software engineer right now, where there isn't a data or regulatory hurdle to cross.
As with every other major advancement in human history, people's job descriptions may change, but that won't eliminate the need for them.
With that said people are also dramatically overstating the power of LLMs which appear very knowledgeable at face value but aren’t that powerful in practice.
You might have images, but not the diagnoses to train the AI with.
In addition, there are compliance reasons: just because you manage that data doesn't mean you can train an AI on it and sell it, unless of course you get explicit permission from every individual patient (good luck).
I do believe that with enough effort we could create AI specialist doctors, and allow the generalist family doctor to make a comeback, augmented with the ability to tap into specialist knowledge.
Technology in the medical industry is extremely far behind modern progress, though; CT images are still largely 512 by 512 pixels. It's too easy to get bogged down with legacy support to make significant advancements and stay on the cutting edge.
A chest x-ray isn't going to do the model much good to interpret a prostate MRI.
Add in heterogeneity in image acquisition, sequence labelling, regional and site-specific disease prevalence, changes in imaging interpretation over time, and most importantly class imbalance (something like >90% of imaging studies are normal), and it is really, really hard to come up with a reasonably high-quality dataset with enough cases (speaking from personal experience trying).
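The class-imbalance point is also why raw accuracy is a misleading metric here: a degenerate model that calls every study normal already looks excellent on paper. The 92/8 split below is made up, loosely matching the ">90% normal" figure:

```python
labels = ["normal"] * 92 + ["abnormal"] * 8   # heavily imbalanced toy 'dataset'
preds = ["normal"] * len(labels)              # degenerate model: calls everything normal

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
recall_abnormal = sum(
    p == "abnormal" for p, y in zip(preds, labels) if y == "abnormal"
) / 8

# 92% accuracy while detecting zero abnormal studies -- per-class sensitivity
# and specificity matter far more than headline accuracy on imaging data.
```

This is one reason benchmark claims about medical imaging models need careful reading: on a worklist that is mostly normal, high accuracy is compatible with finding nothing at all.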
With respect to training a model, IRB/REB (ethics) boards can grant approval for this kind of work without needing individual patient consent.
That's what unsupervised learning is for. GPT doesn't have labels either, just raw data.
Also, even within the US framework, there's pressure. A radiologist can rubberstamp 10x as many reports with AI-assistance. That doesn't eliminate radiology, but it eliminates 90% of the radiologists we're training.
Not if it's an emergency.
> but it eliminates 90% of the radiologists we're training.
Billing isn't going to change. Billing is a legal thing, not a supply/demand thing.
But yes, I fully plan to utilize travel medicine and potentially black-market prescription drugs in my lifetime if there isn't meaningful reform for the middle/upper class.
https://www.opensecrets.org/federal-lobbying/top-spenders?cy...
I've worked at places with AI/CAD for lung nodules, mammo and stroke and there isn't even a whisper at cutting fee codes because of AI efficiency gains at the moment.
N.B. I say this as a radiologist who elected not to pursue an interventional fellowship because I see reimbursement for diagnostic work skyrocketing with AI due to increases in efficiency and stagnant fee codes.