[Thanks to Logan Riggs and Hoagy for their help writing this post.]

In this post, I’m going to translate the post “[Interim research report] Taking features out of superposition with sparse autoencoders” by Lee Sharkey, Dan Braun, and beren (henceforth ‘the authors’) into language that makes sense to me, and hopefully to you too!
I really enjoyed a technical paper brought down to earth. 👍