Also worth checking out was codestral... I think that had a 256k context and use... | Hacker News

anshumankmr 73 days ago | on: 'Western Qwen': IBM Wows with Granite 4 LLM Launch...

Also worth checking out is Codestral. I think it had a 256k context and used Mamba, even if it is a slightly older model now. It worked great for a Text2SQL use case we worked on.

incomingpain 73 days ago

Magistral 2509 just came out. It slows down a lot once you go over 40,000 tokens of context. It's quite a fantastic model.
