Description
The ATLAS Collaboration is composed of around 6,000 scientists, engineers, developers, students and administrators, with decades of institutional documentation spread across wikis, code documentation, meeting agendas, recommendations, publications, tutorials, and project management systems. With the advent of retrieval-augmented generation (RAG) and sophisticated large language models (LLMs) such as GPT-4, there is now an opportunity to produce a “front door” to this intimidatingly large corpus. ChATLAS is an attempt to provide this entry point, as ATLAS’ official AI assistant and search system. In this contribution, we present the infrastructure and technologies explored in the ChATLAS prototype, as well as lessons learnt and best practices across data collection, vector database construction, LLM prompt templating, and user interface design. We will also sketch out a roadmap for improving the ChATLAS system that includes the use of knowledge graphs, fine-tuning, and multi-modal retrieval.