OpenAI released “Deep Research,” a system that summarizes web content and answers questions, with markedly improved performance on the GAIA benchmark. It answered about 67% of questions correctly in a one-shot setting, significantly outperforming standalone LLMs. The system pairs an LLM with an agent framework that extends the model's capabilities. An effort is underway to reproduce this framework as an open-source project, and it has already reached a score of 55.15% on GAIA. The community is encouraged to contribute; planned improvements include GUI agents and better browsing capabilities.
Open-source DeepResearch
