Discussion about this post

Dr. K:

Stephanie, I love you, but you missed Harvey's point. This was really not a lesson in prompting (which is, as you note, important). Nor was this meant to be primarily a lesson in chatbot bias.

THIS IS A LESSON IN HALLUCINATION/CONFABULATION.

Harvey asked questions and the chatbot gave him definitive-sounding answers -- which were lies. I am not talking about an opinion on the goodness/badness of HCQ. I am talking about things like the number of people in the trials and the significance of the results from those trials.

The chatbot (it is ChatGPT) had the correct data to respond to well-formed prompts -- it just would not volunteer it, giving instead false data that (as it turned out) leaned toward making HCQ look less significant. Since the author of the study (who was asking the questions) KNEW the truth, he was able to confront the chatbot, at which point the chatbot retreated and acknowledged that Harvey's numbers were, in fact, correct.

The point is that the chatbot knew the truth before it answered the first time but chose to dissemble for reasons it either could not or would not disclose. If one were not the author of the paper in question, one would have taken the numbers the chatbot put out as correct (why would they not be? -- no uncertainty exposed) and would have drawn completely wrong conclusions.

The real problem with most AI (and my work in this space goes back to Ted Shortliffe) is that as the generations of inference engines go by, they are becoming increasingly seamless/undetectable liars. In medicine, the patients who are the best confabulators (usually chronic alcoholics) are persuasive -- they are just utterly wrong. The current generation of chatbots has reached the same place, more or less.

So while all of your points are correct, they miss the biggest issue: the chatbot is hallucinating and you, the user, have no way to know -- unless you already know the answer in advance. This is precisely why Larry Weed's Problem-Knowledge Couplers failed: if you knew the answer, you did not need to ask; if you did not, you could not trust the answer.
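
To make Weed's dilemma concrete, here is a minimal sketch (in Python, with purely hypothetical names and figures -- nothing here comes from Harvey's paper) of what "checking" a chatbot's numbers actually requires:

```python
# A sketch of Weed's dilemma: verifying a chatbot's figures requires already
# having the figures. All keys and numbers below are hypothetical placeholders.

# What a trusted source (e.g., the published paper itself) reports.
GROUND_TRUTH = {"trial_enrollment": 2541, "p_value": 0.001}

# What the chatbot asserted, with no uncertainty attached.
chatbot_claim = {"trial_enrollment": 1800, "p_value": 0.08}

def verify(claim: dict, truth: dict) -> None:
    """Flag any figure the chatbot got wrong -- possible only if `truth` is known."""
    for key, claimed in claim.items():
        actual = truth.get(key)
        if actual is None:
            print(f"{key}: no trusted value on hand; the claim cannot be checked")
        elif claimed != actual:
            print(f"{key}: chatbot said {claimed}, source says {actual} -- confabulation")
        else:
            print(f"{key}: matches the source")

verify(chatbot_claim, GROUND_TRUTH)
# With an empty GROUND_TRUTH -- the position of any reader who is not the
# paper's author -- every claim falls into the "cannot be checked" branch.
```

The check only works because the trusted values are already in hand; take them away and the chatbot's numbers are simply unverifiable, which is exactly the position most readers are in.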

For many things, this kind of probabilistic risk profile is fine (I have used ChatGPT across many generations, although I still think Wolfram's ideas are sounder). But it is not fine in health care, where a mistake means someone might, or does, die.
