This is very interesting. I had a similar feeling about Claude performing much worse than GPT-4. Granted, I didn't put much work into optimizing the prompts, but then again, the prompts weren't GPT-optimized or GPT-specific either. The problems were severe: it chose the wrong side of the conversation, hallucinated weird stuff, and repeated part of the prompt, all in the same message.