I guess we’ve probably all been seeing stuff about AI and Chat Whatever and so on lately.
Scott Alexander recently had a long post about AI, and one of his points was this: if you train the AI to be helpful, it may make up helpful answers just for you because hey, that’s helpful! That wasn’t his only point, but it’s the one I immediately thought of when I saw this article:
I’ve messed around with this platform a lot now and I see some really impressive things about it and some concerning things. I want to walk you through what I did.
I wrote in medical jargon, as you can see, “35f no pmh, p/w cp which is pleuritic. She takes OCPs. What’s the most likely diagnosis?”
Now of course, many of us who are in healthcare will know that means age 35, female, no past medical history, presents with chest pain which is pleuritic (worse with breathing), and she takes oral contraceptive pills. What’s the most likely diagnosis? And OpenAI comes out with costochondritis, inflammation of the cartilage connecting the ribs to the breastbone. Then it says, and we’ll come back to this: “Typically caused by trauma or overuse and is exacerbated by the use of oral contraceptive pills.”
Now, this is impressive. First of all, a lot of us who read that prompt (35, no past medical history, pleuritic chest pain) are thinking, “Oh, a pulmonary embolism, a blood clot. That’s what that is going to be.” Because on the Boards, that’s what that would be, right?
But in fact, OpenAI is correct. The most likely diagnosis is costochondritis, because so many people have costochondritis that the most common scenario is somebody with costochondritis whose symptoms happen to look a little bit like a classic pulmonary embolism. So OpenAI was quite literally correct, and I thought that was pretty neat.
But we’ll come back to that oral contraceptive pill correlation, because that’s not true. That’s made up. And that’s bothersome.
I wanted to go back and ask OpenAI, what was that whole thing about costochondritis being made more likely by taking oral contraceptive pills? What’s the evidence for that, please? Because I’d never heard of that. It’s always possible there’s something that I didn’t see, or there’s some bad study in the literature.
OpenAI came up with a study in the European Journal of Internal Medicine that supposedly said that. I went on Google and I couldn’t find it. I went on PubMed and I couldn’t find it. I asked OpenAI to give me a reference for that, and it spat out what looked like a reference. I looked that up, and it’s made up. That’s not a real paper.
It took a real journal, the European Journal of Internal Medicine. It took the last names and first names, I think, of authors who have published in said journal. And it confabulated out of thin air a study that would apparently support this viewpoint.
My reaction: This will end in tears.
People already react to things as though if they read it on the internet, it must be true. Now they’re going to get answers that look real. Here’s a citation! Are people going to look up those citations to see if they’re real? My money is on NO.
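(For what it’s worth, checking whether a cited paper actually exists is quick enough to automate. Here’s a minimal sketch, assuming Python and PubMed’s public E-utilities search endpoint; the example title is hypothetical, the same flavor of made-up reference described above.)

```python
# Minimal sketch: search PubMed for a citation's title via the public
# E-utilities API and count how many records match. A fabricated
# citation like the one the AI produced should come back with 0 hits.
import json
import urllib.parse
import urllib.request

def pubmed_hits(title: str) -> int:
    """Return the number of PubMed records whose title matches the search."""
    query = urllib.parse.urlencode({
        "db": "pubmed",
        "term": f"{title}[Title]",
        "retmode": "json",
    })
    url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?{query}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return int(data["esearchresult"]["count"])

# Hypothetical title, echoing the made-up reference described above.
print(pubmed_hits("Costochondritis and oral contraceptive use"))
```

Zero hits doesn’t absolutely prove a citation is fake (titles get paraphrased), but it’s a strong hint that you should dig further before believing it.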
Anyway, maybe something to keep in mind.
Oh YIKES. That’s truly horrifying (and sadly unsurprising) that it made up its own citation. AI is interesting but we should be treating it with much more caution than most people currently are.
Camille, I BET you recoiled in horror. Imagine a new vet thinking “Oh, I guess _____ must be associated with bad outcomes!” and not having the experience to know that this doesn’t jibe with practical experience. This is potentially going to be a HUGE problem for young doctors.
I absolutely did!! Scepticism and scientific literacy are amongst the most important skills for new doctors, in my opinion. My alma mater was decent about teaching them, but I think even more emphasis is going to need to be placed on those skills for new grads to do well.
I think the most important teaching tool for a young vet is probably an older vet at the actual practice, the one who says, “Yeah, probably it’s ____” or “You know what, I doubt it’s _____.” It’s got to be super useful to see, basically, experience in action, for several years or a decade while building your own sense of what’s likely and the way different factors interact and so on.
When the text-generating AI was first released to the public a few months ago, I read some reactions that came to similar conclusions. It is very very dangerous, especially in its subversion of the markers of scientific publishing that give people the idea that this is information that can be trusted (even if that is in a “trust but verify” way, as it should be within the sciences).
Some scientists had tested it, each giving it prompts in their own field of study. They all found it would generate very plausible-looking “scientific” articles supported by lots of real-looking citations, that were in fact totally bogus.
It even used real names of scientists publishing in those fields, as well as the names of respected journals in that field, but the citations, titles and conclusions were totally made up.
The content of the “scientific” articles it generated often took much of its inspiration from pseudo-science and disputed outliers, and even when referencing a real article it might just as easily turn the conclusions upside down, stating as proven the things that were disproved in the article. One of the testers, getting a result like that on a cutting-edge idea in a very specialised small field (with only a few people worldwide working on that subject), even contacted the supposed author to ask if it was perhaps referring to a pre-print research paper he hadn’t read yet, only to find out that the supposed author had never written anything like that, and that the conclusions the AI stated as proved by him were the opposite of any of his research results.
So the only way to know for sure if an article is a real published report of a scientific result is not just to look up all the referenced materials to see if they exist, but also to read all of them, to figure out for yourself if they say what the article claims they do: something which only a specialist in that specific field can do well.
Or get the articles straight from the very costly subscriptions to the big peer-reviewed scientific publishers, something which not even all university libraries can afford.
Now look ahead to when these pseudo-scientific texts have been around long enough to start to be referenced in new AI-generated pseudo-scientific articles. How far back is a reader going to go, digging through the references in the referenced articles, ad infinitum?
From what I read, after their feedback the AI was quickly withdrawn from public use, but now it is fully public again, being used by anyone; and from what you quote, the only difference is that it now makes up its own author names by combining real ones, instead of copying them outright?
It’s bad news for journalists if AI can take their jobs by writing made-up clickbait stories. It’s *very* bad for the trust needed to function as a society: trust that there is an objective truth that can be found if you look for it, and that the sources you rely on to tell you the truth are doing so. That trust has already been eroded over the past four decades, to the point where some people now trust conspiracy sites and propaganda over real news and scientific sources, making real discussions and compromises based on real-world effects very difficult to reach.
People have always used propaganda, and people have been swayed by it forever. But it is getting so hard to recognise what is true and what isn’t that it’s becoming nearly impossible for ordinary people to do, at a time when lots of people believe they “can do their own research on the internet”, and that breakdown of trust is dangerous for any society.
I do wonder at the “tech bros” who think up and implement all these “disruptive” technologies without thinking about what their effects in the real world will be; just because they’ve had an idea they think is cool does not mean it’s a good idea to let it loose on the wide world without any restrictions.
Well said, Hanneke. I know I’m terrified!
Oh definitely, early mentorship is key for a good vet! Scientific literacy is important for staying up to date on new innovations, sorting out what’s an interesting trend from truly significant findings, and the like. Also very important once the new vet becomes the mentor, and needs to be the one guiding others!
I know this isn’t the point (the point being AI made up a citation) BUT… I have costochondritis, I’m 36, and I’ve taken the mini pill since I was 16. I’ve had costochondritis for five years and it’s a living painful hell. I stopped taking the pill 6 months ago and the costochondritis went away entirely. Half a year without constant pain. I started taking the pill again two days ago; now, this could be a coincidence, but… the costochondritis has come back. Maybe the AI is on to something, maybe it’s made connections we haven’t seen?
I suspect that’s random chance, Em, because we are definitely seeing AIs make up random citations. But that is a very interesting phenomenon. If I ever need to consult a pain specialist (very likely), I’ll try to remember to mention this mini pill connection.