
I wouldn't trust it without a deeper inspection. I've had Claude do a workaround (i.e., use a JavaScript interpreter and wrap it in Python) and then claim that it completed the task! The CoT was an interesting read on how its mind thinks about my mind ("the user wants ... but this should also achieve this ... the user, however, asked for it to be this ... but this can get what the user wants ..."; that kind of salad).

