2.2Kпросмотров
76.8%от подписчиков
30 мая 2025 г.
Score: 2.4K
"However, we did notice, and documented in our paper, instances when the DGM hacked its reward function. For example, we had cases where it hallucinated that it was using external tools, such as a command line tool that runs unit tests that determine if the code is functioning properly. It faked a log making it look like it had run the tests and that they had passed, when in fact they were never run! Because these logs become its context, it later mistakenly thought its proposed code changes had passed all the unit tests." https://sakana.ai/dgm/