Wind of Change: Mistral AI’s Open-Source Models and New Mixtral 8x22B

Mistral AI is blowing through the AI scene just like its namesake, aiming to become Europe’s flagship LLM provider. Its latest push is the cutting-edge Mixtral 8×22B model, which is allegedly on par with ChatGPT and was quietly launched to the excitement of netizens, especially on Reddit.

Reactions to Mixtral 8×22B have been overwhelmingly positive. Still, the model isn’t flawless: in a test video on YouTube, it was tripped up by some of the logical test questions given to it.

Video: Matthew Berman / YouTube

Is Mistral Europe’s Gen-AI Champion?

Everyone is talking about U.S. companies like OpenAI, but Europe is not asleep at the wheel. Mistral is a French outfit, and it is giving European LLM research a good name.

At the forefront of open-source artificial intelligence and machine learning solutions, Mistral AI was founded just last year, in April 2023, with a vision “to make frontier AI ubiquitous”, and it has rapidly established itself as a key player in open LLM tech.

Its core goal is to develop advanced AI algorithms and systems tailored to its clients’ needs. The company offers a range of AI models to companies building applications in areas such as machine learning, data analysis, robotics, computer vision, text-to-image generation, speech recognition, gaming, and chatbots.

Mistral AI is currently valued at $2 billion, and Mistral’s founders are reportedly in serious talks with investors to more than double that amount. Though significant, that valuation is a tiny fraction of OpenAI’s reported $80 billion. Even so, it’s quite impressive for a company that started with initial funding of $260 million!

Robots taking off in a gust of wind. Image: Gina Gin / DALLE-3

Mistral AI’s Founders

The success of Mistral can be attributed to its founders: Arthur Mensch, Timothée Lacroix, and Guillaume Lample.

Arthur Mensch made significant contributions to the fields of neural networks and deep learning while working at Google DeepMind. His research focuses on the development and understanding of neural network architectures, optimization methods, and the interpretability of deep learning models. He studied at École Polytechnique, one of France’s leading science and technology institutions, and holds a PhD in machine learning from Université Paris-Saclay.

Mensch’s co-founders, Timothée Lacroix and Guillaume Lample, have similarly prestigious backgrounds, with studies at École Polytechnique and research posts at Meta, where they became known for their work in machine learning, particularly reinforcement learning (RL).

Their team consists of 63 skilled engineers and data scientists who work closely with clients to develop customized AI solutions.

Other key players at Mistral include Pierre Stock, Christopher Bamford (AI scientist), William El Sayed (founder associate), Thomas W. (research engineer), Guillaume Bour (head of enterprise), Marie-Anne Lachaux (AI research engineer), Théophile Gervet, Devendra Chaplot, Teven Le Scao, Louis Ternon, Lucile Saulnier, Alexis Tacnet, and a host of others.

Mistral AI’s decision to join the league of AI startups opening up their models has been a buzzing conversation for a while now, with OpenAI and Google calling it the wrong move: they believe openly accessible model weights enable dangerous misuse, the dark side of LLMs. Mistral has largely stayed out of that debate and remains committed to making its LLMs open source.

What I admire most about Mistral is its commitment to responsible AI development and dedication to ensuring that its technologies are used ethically. They adhere to strict data privacy and security standards and work closely with clients to implement AI solutions that comply with ethical guidelines.

Mistral AI Models

Since its founding in 2023, Mistral has released six powerful models: three are open source (Mistral 7B, Mixtral 8×7B, and Mixtral 8×22B), while the other three, Mistral Small, Mistral Medium, and Mistral Large, are closed source. Our focus here is on the open-source models.

Mistral 7B

This is the first Mistral AI model in the open-source category. Mistral 7B is a dense model, meaning that every layer of its neural network is fully connected to the next and all of its parameters are used for every input (in contrast to the sparse models below). This enables the model to capture complex patterns and relationships within the data.
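
As a rough illustration, here is a minimal PyTorch sketch of a dense, fully connected stack; the layer sizes are toy values chosen for the example, not Mistral 7B’s actual dimensions:

```python
import torch
import torch.nn as nn

# A dense (fully connected) block: every unit in one layer feeds
# every unit in the next, and all weights are used on every pass.
dense_stack = nn.Sequential(
    nn.Linear(512, 2048),  # each of 512 inputs connects to all 2048 hidden units
    nn.GELU(),
    nn.Linear(2048, 512),  # and back down to the model dimension
)

x = torch.randn(1, 512)      # one toy input vector
print(dense_stack(x).shape)  # torch.Size([1, 512])
```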

Mistral 7B’s primary advantage is its flexibility, allowing developers and researchers to modify and adapt the model to suit various tasks and applications. With its seven billion parameters (the “7B” in its name), Mistral 7B punches above its weight, comparable in capabilities to models with up to 30 billion parameters.

In AI, “parameters” are the internal variables that a model learns from its training data. During compute-intensive training, the model adjusts these variables to make more accurate predictions.

A model with more parameters generally has a higher capacity to learn intricate patterns from vast amounts of data, leading to improved performance. Therefore, Mistral 7B’s ability to match the capabilities of models with up to 30 billion parameters indicates its proficiency in handling complex tasks and datasets, making it a significant advancement in open-source AI models.
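
To make “parameters” concrete, here is a tiny PyTorch sketch that counts the learnable values in a single fully connected layer; the sizes are arbitrary and unrelated to Mistral 7B’s real architecture:

```python
import torch.nn as nn

# One fully connected layer: its 1000x1000 weight matrix and
# 1000-element bias vector are the "parameters" learned in training.
layer = nn.Linear(1000, 1000)

n_params = sum(p.numel() for p in layer.parameters())
print(f"{n_params:,}")  # 1,001,000 = 1000*1000 weights + 1000 biases
```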

Mixtral 8×7B

This is the second model in the open-source category. Mixtral 8×7B is a sparse mixture-of-experts (MoE) model with up to 45 billion parameters. A mixture of experts is a specialized neural network architecture in which different subsets of the model, known as “experts,” handle specific parts of the input data.

Despite its vast parameter capacity, the model activates only around 12 billion parameters per token during inference, which lets it process tasks more quickly. That efficiency comes with a catch, though: all of the weights must still be loaded, so the model demands a large amount of video RAM (VRAM), the memory on a GPU. Unlike Mistral 7B, Mixtral is not something your average home computer will be running anytime soon.
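
To show the routing idea, here is a minimal, self-contained PyTorch sketch of top-2 expert routing in the spirit of Mixtral’s design; the dimensions and the per-token loop are toy simplifications of mine, not Mixtral’s actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    """Sparse mixture of experts: 8 experts exist, but each token uses only 2."""

    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)  # scores every expert per token
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep the 2 best experts
        weights = F.softmax(weights, dim=-1)            # blend their outputs
        out = torch.zeros_like(x)
        for t in range(x.size(0)):                      # route each token
            for k in range(self.top_k):
                expert = self.experts[int(idx[t, k])]   # only 2 of 8 ever run
                out[t] += weights[t, k] * expert(x[t])
        return out

tokens = torch.randn(4, 64)    # four toy "tokens"
print(ToyMoE()(tokens).shape)  # torch.Size([4, 64])
```

Only the chosen experts’ weights do any work for a given token, which is exactly why inference touches only about 12 of the 45 billion parameters while all eight experts still have to sit in memory.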

Mixtral 8×22B

This is the newest open-source model. Prior to its release, there was speculation that Mistral wouldn’t launch another open-source model because it was being bought by Microsoft, but the skeptics were proved wrong. Mistral AI launched Mixtral 8×22B, an expanded sparse mixture-of-experts model with a capacity of up to 141 billion parameters, larger and better than the previous Mixtral 8×7B and Mistral 7B. For any given token it activates only about 39 billion of those parameters, which keeps inference fast, quite an exciting upgrade from the former models! The trade-off is the same as with Mixtral 8×7B: all 141 billion weights must be held in memory, so it uses even more VRAM.
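
If you want to try it, the open weights can be pulled from Hugging Face. Below is a minimal sketch using the transformers library, assuming the Hub id mistralai/Mixtral-8x22B-v0.1 from Mistral’s public release and, realistically, a multi-GPU machine with hundreds of gigabytes of memory:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub id assumed from Mistral's public release; verify before running.
model_id = "mistralai/Mixtral-8x22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (requires the accelerate package) spreads the
# ~141B weights across whatever GPU/CPU memory is available.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The mistral is a wind that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```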

Mistral AI has become a favorite among open-source LLM fans and fine-tuning hobbyists, and its popularity and usefulness aren’t waning anytime soon. We look forward to more powerful open-source models from Mistral!