Mixtral, a Strong Alternative to GPT
While Mixtral 8x7B is often compared to GPT-3.5, it's more powerful than that.
Yesterday, I posted a quick review of Merlin that I needed to update because I realized they also offered Mistral Large and Mixtral (with “less content moderation”). The “less content moderation” caught my eye. I’ve seen Mistral/Mixtral offered on other services, but what exactly is Mixtral? I asked Mixtral itself. Here is its response:
(Note: we are referring to Mixtral 8x7B here; the more advanced version is Mixtral 8x22B.)
Mixtral, a large language model (LLM) developed by Mistral AI, presents some unique qualities when compared to Mistral, another LLM from the same company. Mixtral demonstrates superior performance in multiple benchmarks, outperforming models such as Llama 2 70B and GPT-3.5 on various standard measures.
One distinctive feature of Mixtral is its capability to process a context of up to 32k tokens, making it well suited to handling and learning from extensive conversational contexts, large documents, or extended dialogues.
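If you want to verify this sort of claim yourself, the published model configuration states the limit directly. Here is a minimal sketch, assuming you have the Hugging Face transformers library installed and access to the model hub (the model ID below is the public Mixtral instruct checkpoint):

```python
from transformers import AutoConfig

# Pull just the configuration (a small JSON file), not the
# tens of gigabytes of model weights.
cfg = AutoConfig.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

print(cfg.max_position_embeddings)  # 32768 -- the 32k context window
print(cfg.num_local_experts)        # 8 expert networks per MoE layer
print(cfg.num_experts_per_tok)      # 2 experts routed per token
```

The last two fields hint at the efficiency point discussed further below.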
Additionally, Mixtral is a multilingual LLM that supports English, French, Italian, German, and Spanish, enabling users to leverage its language capabilities in a wider range of languages and use cases.
Mixtral also excels at code-related tasks, making it a strong choice for generating and understanding code. It has been observed to produce higher-quality, more precise results when working with code than its counterparts.
Furthermore, Mixtral is designed with a specific focus on striking a balance between efficiency and size. Despite containing approximately 47B parameters, Mixtral operates with around 13B active parameters during inference, which enables faster processing speeds and better computational efficiency compared to other models of similar size.
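That efficiency comes from Mixtral's sparse mixture-of-experts design: each feed-forward layer holds eight expert networks, but a small router sends each token to only its top two, so only a fraction of the 47B parameters fire on any given token. Here is a toy sketch of that top-2 routing in PyTorch; this is illustrative only, not Mistral's actual implementation, and the function and variable names are my own:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def top2_moe_forward(x, gate, experts):
    """Route each token to its 2 best experts (Mixtral-style sparse MoE).

    x:       (tokens, hidden) activations
    gate:    nn.Linear(hidden, num_experts) scoring experts per token
    experts: list of small feed-forward networks, one per expert
    """
    logits = gate(x)                                # (tokens, num_experts)
    weights, idx = torch.topk(logits, k=2, dim=-1)  # pick 2 experts per token
    weights = F.softmax(weights, dim=-1)            # renormalize over the chosen 2
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        token_ids, slot = (idx == e).nonzero(as_tuple=True)
        if token_ids.numel():                       # only run experts that were picked
            out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
    return out

# Toy usage: 8 experts, 2 active per token, as in Mixtral 8x7B.
hidden, num_experts = 16, 8
gate = nn.Linear(hidden, num_experts)
experts = [nn.Sequential(nn.Linear(hidden, 64), nn.SiLU(), nn.Linear(64, hidden))
           for _ in range(num_experts)]
tokens = torch.randn(5, hidden)
print(top2_moe_forward(tokens, gate, experts).shape)  # torch.Size([5, 16])
```

Because only two of the eight experts run for each token, the per-token compute is roughly what a dense ~13B model would need, which is where the speed and efficiency claims come from.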
In summary, Mixtral uniquely stands out among other LLMs due to its ability to process long contexts, support multiple languages, excel in code-related tasks, and maintain increased efficiency during the inference process.
Mixtral is appealing in that it is truly open source and is at least on par with GPT-3.5, if not more powerful. In terms of its writing capabilities, it will pump out a decent evergreen article on demand, though it still gets stuck in AI clichés such as “in today’s fast-paced, competitive world…”
I haven’t tried asking Mixtral anything truly offensive, such as “how do I create a pipe bomb,” as I don’t fancy a visit from the FBI. But in terms of bias, without any direction from the user, Mixtral defaults to the same general bias as all AIs, reflecting the mainstream viewpoints fed to it in training.
However, Mixtral doesn’t seem to lecture as much as some other AIs, such as Claude, which is run by a company that apparently feels it has the right to tell you what you should believe about things like psychiatric drugs. While Claude 3 refuses to provide any criticism of modern psychiatry, Mixtral will whip up a critical article for you, no problem, if you direct it to do so.
Given its ability to follow directions and create coherent articles from scratch, along with its large context window, I actually find Mixtral to be about on par with GPT-4, not GPT-3.5, and perhaps even better for certain use cases.
At any rate, Mixtral is available in Merlin as mentioned above, and it is one of the cheapest options if you are using credits vs. an unlimited plan. Mixtral is also found in Poe and the Brave browser, including Brave’s free, rate-limited version of “Leo.”
Stephanie, if you were going to compare the top-end models available in Merlin, which do you think is all-around best? Best answers coupled with the least idea throttling/influencing is, I suppose, how one would define best. You have used these more than most -- interested in your opinion.