Automatically interpreting millions of features in large language models.
Gonçalo Paulo, Alex Mallen, Caden Juang, Nora Belrose. EleutherAI, 2024. https://arxiv.org/abs/2410.13928 https://github.com/EleutherAI/sae-auto-interp