From AI to ZI
Comments on Anthropic's Scaling Monosemanticity
Anthropic recently released a research report on sparse autoencoders, Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet…
Jun 3, 2024 • Robert Huben
April 2024
The Hidden Scratchpad
[Epistemic Status: Short, off-the-cuff post to build intuition.
Apr 1, 2024 • Robert Huben
March 2024
Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT
Abstract
Mar 5, 2024 • Robert Huben
January 2024
Why Transformers Are Good
200 words on why the Transformer architecture is good
Jan 19, 2024 • Robert Huben
December 2023
Rating my AI Predictions
9 months ago I predicted trends I expected to see in AI over the course of 2023. With the year coming to a close, let’s rate how I did!
Dec 21, 2023 • Robert Huben
The Future of From AI To ZI
For the past few months, the blog’s regular readers may have noticed its irregular writing. This is a short post to explain what I’ve been up to instead…
Dec 11, 2023 • Robert Huben
November 2023
Twelve Months in AI Safety
The AI safety landscape has been totally transformed since I started this blog twelve months ago.
Nov 1, 2023 • Robert Huben
July 2023
Unsafe AI as Dynamical Systems
[Thanks to Valerie Morris for help editing this post.]
Jul 14, 2023 • Robert Huben
AI teams will probably be more superintelligent than individual AIs
Summary
Jul 4, 2023 • Robert Huben
June 2023
[Research Update] Sparse Autoencoder features are bimodal
Overview
Jun 22, 2023 • Robert Huben
Explaining "Taking features out of superposition with sparse autoencoders"
[Thanks to Logan Riggs and Hoagy for their help writing this post.]
Jun 16, 2023 • Robert Huben
May 2023
Is behavioral safety "solved" in non-adversarial conditions?
I’m trying to crystalize something I said to a friend recently: I think that techniques like RLHF and Constitutional AI seem to be sufficient for…
May 25, 2023 • Robert Huben