From AI to ZI
Comments on Anthropic's Scaling Monosemanticity
Anthropic recently released a research report on sparse autoencoders, Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet…
Jun 3, 2024 • Robert Huben
April 2024
The Hidden Scratchpad
[Epistemic Status: Short, off-the-cuff post to build intuition.
Apr 1, 2024 • Robert Huben
March 2024
Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT
Abstract
Mar 5, 2024 • Robert Huben
January 2024
Why Transformers Are Good
200 words on why the Transformer architecture is good
Jan 19, 2024 • Robert Huben
December 2023
Rating my AI Predictions
9 months ago I predicted trends I expected to see in AI over the course of 2023. With the year coming to a close, let’s rate how I did!
Dec 21, 2023 • Robert Huben
The Future of From AI To ZI
For the past few months, the blog’s regular readers may have noticed its irregular writing. This is a short post to explain what I’ve been up to instead…
Dec 11, 2023 • Robert Huben
November 2023
Twelve Months in AI Safety
The AI safety landscape has been totally transformed since I started this blog twelve months ago.
Nov 1, 2023 • Robert Huben
July 2023
Unsafe AI as Dynamical Systems
[Thanks to Valerie Morris for help editing this post.]
Jul 14, 2023 • Robert Huben
AI teams will probably be more superintelligent than individual AIs
Summary
Jul 4, 2023 • Robert Huben
June 2023
[Research Update] Sparse Autoencoder features are bimodal
Overview
Jun 22, 2023 • Robert Huben
Explaining "Taking features out of superposition with sparse autoencoders"
[Thanks to Logan Riggs and Hoagy for their help writing this post.]
Jun 16, 2023 • Robert Huben
May 2023
Is behavioral safety "solved" in non-adversarial conditions?
I’m trying to crystallize something I said to a friend recently: I think that techniques like RLHF and Constitutional AI seem to be sufficient for…
May 25, 2023 • Robert Huben