[At decision squares, the 5×5 rand-region cheese maze network will put max cumulative probability on the maximal-advantage action at least] 50% of the time
Created by TurnTrout on 2023-02-09; known on 2023-02-16
- TurnTrout estimated 60% on 2023-02-09
- TurnTrout on 2023-02-09
- peligrietzer estimated 70% on 2023-02-09
- uli estimated 80% and said “Base rate is 1:2, needs evidence of 2:1 to get 50%, which I think is provided by model capabilities. Only reason I don’t put higher: the advantage function could be totally fucked for some reason, I haven’t seen obvious patterns in it yet.” on 2023-02-12
- rhaps0dy estimated 85% on 2023-03-02
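
The arithmetic in uli's comment can be checked with Bayes' rule in odds form. This is a minimal sketch under an assumed reading of that comment: the "1 : 2" base rate is taken as odds in favor (probability 1/3), and "evidence of 2 : 1" is taken as a likelihood ratio. The function names `update_odds` and `odds_to_probability` are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def update_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favor a:b (stored as a/b) to the probability a/(a+b)."""
    return odds / (1 + odds)


# Assumed reading of the comment: 1:2 odds in favor, 2:1 likelihood ratio.
prior = Fraction(1, 2)
evidence = Fraction(2, 1)
posterior = update_odds(prior, evidence)

print(odds_to_probability(prior))      # 1/3
print(odds_to_probability(posterior))  # 1/2
```

Under this reading, a 2:1 likelihood ratio is exactly what moves a 1/3 base rate to even odds, which matches the 50% figure in the comment.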