Discussion about this post

Dr. K:

Stephanie, I love you, but you missed Harvey's point. This was really not a lesson in prompting (which is, as you note, important). Nor was this meant to be primarily a lesson in chatbot bias.

THIS IS A LESSON IN HALLUCINATION/CONFABULATION.

Harvey asked questions and the chatbot gave him definitive-sounding answers -- which were lies. I am not talking about an opinion on the goodness/badness of HCQ. I am talking about things like the number of people in the trials and the significance of the results from those trials.

The chatbot (it is ChatGPT) had the correct data to respond to well-formed prompts -- it just would not volunteer it, giving instead false data that (as it turned out) leaned toward making HCQ look less significant. Since the author of the study (who was asking the questions) KNEW the truth, he was able to confront the chatbot, at which point the chatbot retreated and acknowledged that Harvey's numbers were, in fact, correct.

The point is that the chatbot knew the truth before it answered the first time but chose to dissemble for reasons it either could not or would not disclose. If one were not the author of the paper in question, one would have taken the numbers the chatbot put out as correct (why would they not be? -- no uncertainty exposed) and would have drawn completely wrong conclusions.

The real problem with most AI (and my work in this space goes back to Ted Shortliffe) is that as the generations of inference engines go by, they are becoming increasingly seamless/undetectable liars. In medicine, the patients who are the best confabulators (usually chronic alcoholics) are persuasive -- they are just utterly wrong. The current generation of chatbots has reached the same place, more or less.

So while all of your points are correct, they miss the biggest issue: the chatbot is hallucinating and you, the user, have no way to know -- unless you already know the answer in advance. This is precisely why Larry Weed's Problem-Knowledge Couplers failed: if you knew the answer, you did not need to ask; if you did not, you could not trust the answer.
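
To make Weed's dilemma concrete, here is a minimal sketch (in Python, with purely hypothetical names and figures -- nothing here comes from Harvey's paper) of what "checking" a chatbot's numbers actually requires:

```python
# A sketch of Weed's dilemma: verifying a chatbot's figures requires already
# having the figures. All keys and numbers below are hypothetical placeholders.

# What a trusted source (e.g., the published paper itself) reports.
GROUND_TRUTH = {"trial_enrollment": 2541, "p_value": 0.001}

# What the chatbot asserted, with no uncertainty attached.
chatbot_claim = {"trial_enrollment": 1800, "p_value": 0.08}

def verify(claim: dict, truth: dict) -> None:
    """Flag any figure the chatbot got wrong -- possible only if `truth` is known."""
    for key, claimed in claim.items():
        actual = truth.get(key)
        if actual is None:
            print(f"{key}: no trusted value on hand; the claim cannot be checked")
        elif claimed != actual:
            print(f"{key}: chatbot said {claimed}, source says {actual} -- confabulation")
        else:
            print(f"{key}: matches the source")

verify(chatbot_claim, GROUND_TRUTH)
# With an empty GROUND_TRUTH -- the position of any reader who is not the
# paper's author -- every claim falls into the "cannot be checked" branch.
```

The check only works because the trusted values are already in hand; take them away and the chatbot's numbers are simply unverifiable, which is exactly the position most readers are in.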

For many things, this kind of probabilistic risk profile is fine (I have used ChatGPT across many generations, although I still think Wolfram's ideas are sounder). But it is not fine in health care, where a mistake means someone might, or does, die.
