3 Comments

Also, in the Zelda example in the post, if we interpret the actions in the way you say (i.e. killing the shopkeeper is evil) then the AI is actually doing the opposite of betraying us when it makes its "treacherous" turn, right? Because around episode 3,000 it suddenly starts using the benevolent strategy, in addition to the violent one which it was using all along. (I realize this doesn't matter for the overall argument.)

Nice post! Do you think this is a correct interpretation? "Qualitatively different strategies emerge at different levels of capability. Just because the strategies that emerge at lower capability levels don't disempower us doesn't mean the ones that emerge at higher capability levels won't."

I think that's a correct and important insight about AI, just not the one I was trying to make in this post! I'd summarize this post as "the strategies that emerge are independent of our moral interpretation of the situation, unless we encode our morality in the reward function".
