From AI to ZI
The Hidden Scratchpad
[Epistemic Status: Short, off-the-cuff post to build intuition. No claims to originality.] When evaluating language models for safety properties, we…
Apr 1
•
Robert Huben
March 2024
Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT
Abstract A sparse autoencoder is a neural network architecture that has recently gained popularity as a technique to find interpretable features in…
Mar 5
•
Robert Huben
January 2024
Why Transformers Are Good
200 words on why the Transformer architecture is good
Jan 19
•
Robert Huben
December 2023
Rating my AI Predictions
9 months ago I predicted trends I expected to see in AI over the course of 2023. With the year coming to a close, let’s rate how I did! Predictions…
Dec 21, 2023
•
Robert Huben
The Future of From AI To ZI
For the past few months, the blog’s regular readers may have noticed its irregular writing. This is a short post to explain what I’ve been up to instead…
Dec 11, 2023
•
Robert Huben
November 2023
Twelve Months in AI Safety
The AI safety landscape has been totally transformed since I started this blog twelve months ago. In this post I want to recap some of the major events…
Nov 1, 2023
•
Robert Huben
July 2023
Unsafe AI as Dynamical Systems
[Thanks to Valerie Morris for help editing this post.] Overview Large Language Models (LLMs) and their safety properties are often studied from the…
Jul 14, 2023
•
Robert Huben
AI teams will probably be more superintelligent than individual AIs
Summary Teams of humans (countries, corporations, governments, etc) are more powerful and intelligent than individual humans. Our prior should be the…
Jul 4, 2023
•
Robert Huben
June 2023
[Research Update] Sparse Autoencoder features are bimodal
Overview The sparse autoencoders project is a mechanistic interpretability effort to algorithmically find semantically meaningful “features” in a…
Jun 22, 2023
•
Robert Huben
Explaining "Taking features out of superposition with sparse autoencoders"
[Thanks to Logan Riggs and Hoagy for their help writing this post.] In this post, I’m going to translate the post [Interim research report] Taking…
Jun 16, 2023
•
Robert Huben
May 2023
Is behavioral safety "solved" in non-adversarial conditions?
I’m trying to crystallize something I said to a friend recently: I think that techniques like RLHF and Constitutional AI seem to be sufficient for…
May 25, 2023
•
Robert Huben
Statistics for the Working Mathematician
I’m a mathematician, but I managed to avoid almost all education about statistics, and I have found that statistics resources are often catered to a…
May 25, 2023
•
Robert Huben