Better HN
0 points
Zee2
2mo ago
0 comments
Alignment “appearing” better as model capabilities increase scares the shit out of me, tbh.
2 comments · 2 top-level
arcanus
2mo ago
Conversely: in humans, intelligence is inversely correlated with crime.
It doesn't go to zero, however!
mik09
2mo ago
Yeah, Anthropic tries to address this through mechanistic interpretability, but I'm not sure they're progressing as fast in that domain as in their model development.