The underlying issue stems from how ChatGPT processes text. Its tokenizer splits text into subword chunks and maps each chunk to a numerical token ID, and it is these IDs, not individual letters, that are fed into the deep neural network. As a result, the model never directly sees the arrangement of letters within a word, which is exactly the structure needed to solve Wordle puzzles.
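To make the point concrete, here is a toy sketch of subword tokenization. The vocabulary and token IDs are invented for illustration; real tokenizers use far larger learned vocabularies, but the principle is the same: the model receives opaque integers, and the letters inside each chunk are invisible to it.

```python
# Hypothetical toy vocabulary (invented IDs, not a real tokenizer's).
toy_vocab = {"word": 1017, "le": 402, "cr": 88, "ane": 913}

def toy_tokenize(text, vocab):
    """Greedy longest-match tokenization over the toy vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i:]!r}")
    return tokens

print(toy_tokenize("wordle", toy_vocab))  # → [1017, 402]
print(toy_tokenize("crane", toy_vocab))   # → [88, 913]
```

Notice that "wordle" becomes two integers; nothing in `[1017, 402]` tells the network that the word contains an "o" in position two.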
Interestingly, ChatGPT performed better when asked to write a poem with a specific acrostic or to generate computer programs related to Wordle. This suggests the AI has learned associations between words and their first letters but struggles with more fine-grained word structure. ChatGPT may have picked up the knack from the abundance of textbooks and programming websites in its training data, chock full of acronyms and camel-case capitalization.
So, how can future LLMs overcome these limitations? One possible solution is to augment the training data with explicit mappings from every word in the dictionary to the letter at each position within it. This would let the AI learn the internal structure of words and improve its performance on tasks like Wordle.
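A minimal sketch of what that augmentation could look like: for each dictionary word, generate training text spelling out which letter sits at which position. The template wording here is an assumption for illustration, not a published recipe.

```python
def letter_position_facts(word):
    """Generate one hypothetical training sentence per letter position."""
    return [f'letter {i + 1} of "{word}" is "{ch}"' for i, ch in enumerate(word)]

for fact in letter_position_facts("crane"):
    print(fact)
# prints, e.g.: letter 1 of "crane" is "c"
```

Run over an entire dictionary, this would expose the model to the letter-level structure that tokenization otherwise hides.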
Another exciting solution under discussion is an idea called Toolformer. In this approach, LLMs use external tools to carry out tasks where they normally struggle, such as arithmetic calculations or word puzzles. By handing off specific jobs to outside resources, the LLM becomes more capable and versatile.
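To illustrate the Toolformer idea in the Wordle setting, the model could delegate letter-level reasoning to an external helper instead of doing it internally. The helper below is hypothetical; it is a simplified scorer that ignores Wordle's duplicate-letter rules, but it shows the kind of tool a model might call.

```python
def wordle_feedback(guess, answer):
    """Simplified per-letter feedback: 'green', 'yellow', or 'gray'.
    (Real Wordle handles repeated letters more carefully.)"""
    feedback = []
    for i, ch in enumerate(guess):
        if answer[i] == ch:
            feedback.append("green")
        elif ch in answer:
            feedback.append("yellow")
        else:
            feedback.append("gray")
    return feedback

# In a Toolformer-style setup, the model's output might embed a call such as
#   [TOOL: wordle_feedback("crane", answer)]
# and splice the tool's result back into its response.
print(wordle_feedback("crane", "cable"))
# → ['green', 'gray', 'yellow', 'gray', 'green']
```

The division of labor is the point: the neural network handles language, while exact character-by-character checks run in ordinary code that cannot get them wrong.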