JARVIS: An LLM with Minions

Not unlike Auto-GPT, JARVIS is an attempt to give LLMs more reach. It is a hybrid system in which an LLM controller delegates work to multiple expert models. I call them minions, but the designers prefer “collaborative executors.”

As you can see in their flow design image, they break the process down into four stages:

Task Planning: ChatGPT analyzes the user's request to understand its intent and breaks it down into solvable tasks.
Model Selection: ChatGPT selects expert models hosted on Hugging Face, based on their descriptions, to solve the planned tasks.
Task Execution: Each selected model is invoked and executed, and its results are returned to ChatGPT.
Response Generation: ChatGPT integrates the predictions of all models and generates the final response.
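
To make the four stages concrete, here is a toy sketch of the control flow in Python. It is not JARVIS's actual code: every name in it (Task, Expert, EXPERTS, plan_tasks, select_model, execute, generate_response, run_pipeline) is hypothetical, the "expert models" are plain functions, and keyword matching stands in for the calls the real system makes to ChatGPT.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; this only illustrates the four-stage flow.

@dataclass
class Task:
    kind: str     # what needs doing, e.g. "image-captioning"
    payload: str  # the input the expert model should work on

@dataclass
class Expert:
    description: str            # what the model claims to do
    run: Callable[[str], str]   # stand-in for invoking a hosted model

# Stand-ins for expert models hosted on Hugging Face.
EXPERTS: Dict[str, Expert] = {
    "caption-model": Expert("image-captioning", lambda x: f"caption for {x}"),
    "translate-model": Expert("translation", lambda x: f"translation of {x}"),
}

def plan_tasks(request: str) -> List[Task]:
    """Stage 1, Task Planning: an LLM would do this; keyword rules are an assumption."""
    tasks: List[Task] = []
    if "describe" in request:
        tasks.append(Task("image-captioning", "photo.jpg"))
    if "translate" in request:
        tasks.append(Task("translation", "the caption"))
    return tasks

def select_model(task: Task) -> str:
    """Stage 2, Model Selection: pick an expert whose description matches the task."""
    for name, expert in EXPERTS.items():
        if expert.description == task.kind:
            return name
    raise LookupError(f"no expert model for task kind {task.kind!r}")

def execute(task: Task, model_name: str) -> str:
    """Stage 3, Task Execution: invoke the selected expert and return its result."""
    return EXPERTS[model_name].run(task.payload)

def generate_response(results: List[str]) -> str:
    """Stage 4, Response Generation: an LLM would summarize; here we just join."""
    return "; ".join(results)

def run_pipeline(request: str) -> str:
    tasks = plan_tasks(request)
    results = [execute(task, select_model(task)) for task in tasks]
    return generate_response(results)

if __name__ == "__main__":
    print(run_pipeline("describe this photo and translate it"))
```

The point of the sketch is the shape: the controller appears in stages one, two, and four, while the minions only show up in stage three.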

You can read the full design paper here. It’s good stuff.