Not unlike Auto-GPT, JARVIS is an attempt to give LLMs more reach. It is a hybrid system: an LLM controller that hands work off to multiple expert models, which carry out the actual tasks. I called them minions, but the designers prefer “collaborative executors.”
As you can see in their flow design image, they break the process down into four stages:
- Task Planning: ChatGPT analyzes the user's request to understand the intent, then breaks it down into solvable tasks.
- Model Selection: For each planned task, ChatGPT selects an expert model hosted on Hugging Face based on the model descriptions.
- Task Execution: Each selected model is invoked and run, and its results are returned to ChatGPT.
- Response Generation: Finally, ChatGPT integrates the predictions of all the models and generates a response.
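To make the four stages concrete, here is a toy sketch of that control flow in Python. It is not JARVIS's code and it calls neither ChatGPT nor Hugging Face. Every name in it (`Expert`, `EXPERTS`, `plan_tasks`, `select_model`, `execute`, `generate_response`, `handle`) is made up for illustration, the "experts" are stub functions, and simple string splitting and word overlap stand in for the decisions the LLM controller would make.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Expert:
    """Hypothetical stand-in for an expert model and its description."""
    name: str
    description: str
    run: Callable[[str], str]


# Stub experts. In the real system these would be hosted models.
EXPERTS: List[Expert] = [
    Expert("captioner", "describe an image in words",
           lambda task: f"caption for '{task}'"),
    Expert("translator", "translate text into french",
           lambda task: f"translation for '{task}'"),
]


def plan_tasks(request: str) -> List[str]:
    """Stage 1, Task Planning: split the request into sub-tasks.
    Assumption: splitting on ' and ' stands in for the LLM's planning."""
    return [part.strip() for part in request.split(" and ") if part.strip()]


def select_model(task: str, experts: List[Expert]) -> Expert:
    """Stage 2, Model Selection: pick the expert whose description best
    matches the task. Assumption: word overlap stands in for the LLM's choice."""
    words = set(task.lower().split())
    return max(experts,
               key=lambda e: len(words & set(e.description.lower().split())))


def execute(expert: Expert, task: str) -> str:
    """Stage 3, Task Execution: run the selected expert on the task."""
    return expert.run(task)


def generate_response(results: List[Tuple[str, str]]) -> str:
    """Stage 4, Response Generation: merge all expert outputs into one reply."""
    return "; ".join(f"{name}: {output}" for name, output in results)


def handle(request: str) -> str:
    """Run the four stages in order for a single user request."""
    results = []
    for task in plan_tasks(request):
        expert = select_model(task, EXPERTS)
        results.append((expert.name, execute(expert, task)))
    return generate_response(results)


if __name__ == "__main__":
    print(handle("describe this image and translate the text into french"))
```

The point of the sketch is the shape of the loop: the controller plans, picks a specialist per task, collects the results, and only then writes the answer. In the real thing, stages 1, 2 and 4 are all prompts to the same LLM.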
You can read the full design paper here. It’s good stuff.