[At decision squares, the 5×5 rand-region cheese maze network will put max cumulative probability on the maximal-advantage action at least] 75% of the time
Created by TurnTrout on 2023-02-09; known on 2023-02-16
- TurnTrout estimated 45% on 2023-02-09
- peligrietzer estimated 30% on 2023-02-09
- uli estimated 50% and said “Capabilities for the 5×5 model usually (probably ~80% of the time or so) put max probability on the cheese path, so the main fall would be from advantage function disagreeing” on 2023-02-12
- uli estimated 65% and said “Update from realizing the question doesn’t ask us to pick the cheese path, just the max advantage path” on 2023-02-12
- sty.silver estimated 44% and said “45 is clearly improper calibration” on 2023-02-14
- TurnTrout estimated 45% and said “lol” on 2023-02-15
- TurnTrout changed the deadline from “on 2023-02-16” on 2023-03-01
- rhaps0dy estimated 70% on 2023-03-02
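
One way to read the quantity being forecast, as a minimal sketch rather than the evaluation code actually used: at each decision square, sum the policy's probabilities over all actions that produce the same move, take the move with the largest total, and check whether it matches the move with the highest advantage. The forecast is then about whether the fraction of matching squares is at least 0.75. The action-to-move grouping, the data layout, and every name below are assumptions made for illustration; the numbers are made up.

```python
from collections import defaultdict

# Hypothetical mapping from raw action ids to the move they produce.
# Several raw actions may collapse onto one move, which is why the
# probabilities are summed ("cumulative") before taking the max.
ACTION_TO_MOVE = {0: "left", 1: "left", 2: "right", 3: "up", 4: "down"}


def max_cumulative_move(action_probs, action_to_move):
    """Return the move whose actions carry the most total probability."""
    totals = defaultdict(float)
    for action, prob in action_probs.items():
        totals[action_to_move[action]] += prob
    return max(totals, key=totals.get)


def agreement_rate(decision_squares, action_to_move):
    """Fraction of decision squares where the policy's top move
    (by cumulative probability) is also the max-advantage move."""
    hits = 0
    for action_probs, advantages in decision_squares:
        policy_move = max_cumulative_move(action_probs, action_to_move)
        best_move = max(advantages, key=advantages.get)
        hits += policy_move == best_move
    return hits / len(decision_squares)


# Made-up data: (policy probabilities per action, advantage per move).
squares = [
    # "left" wins on cumulative probability (0.3 + 0.3) even though the
    # single most likely raw action is 2 ("right").
    ({0: 0.3, 1: 0.3, 2: 0.35, 3: 0.03, 4: 0.02},
     {"left": 0.1, "right": -0.2, "up": -0.5, "down": -0.5}),
    # Policy prefers "right", but "up" has the highest advantage.
    ({0: 0.1, 1: 0.1, 2: 0.6, 3: 0.1, 4: 0.1},
     {"left": -0.3, "right": 0.0, "up": 0.2, "down": -0.4}),
]

rate = agreement_rate(squares, ACTION_TO_MOVE)
print(rate, rate >= 0.75)
```

Note that this compares against the max-advantage move, not the move along the cheese path, which is the distinction uli's second comment picks up on.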