Hacker News

This is one of those things that is a feature of Claude, not a bug. Sonnet and Opus 4.5 can absolutely detect prompt attacks; however, they are post-trained to ignore them in, let's say, certain scenarios. At least if you are using the API.

