DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from the Qwen-2.5 series, which is originally licensed under the Apache 2.0 License, and are now fine-tuned with 800k samples curated with DeepSeek-R1.
DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under the llama3.1 license.
DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under the llama3.3 license.
Agreed - my ex has spent her whole career in the public sector and the benefits are indeed insane. IIRC the pension was especially lucrative (from my POV), it was something like min ~20% employer contribution but I could be remembering incorrectly.
I wonder why they don't advertise all the benefits more explicitly in public sector job listings.
It's disheartening to see that people have grown increasingly paranoid about the authenticity of online content, suspecting that everything might be written by AI. This is just how I write!
It's not that people suspect everything might be written by an AI, but that your comment pattern matches many of the qualities of AI-generated writing. It's overly verbose, reads like the introduction of a low-quality blog article, includes information a human would assume people reading this thread already know, and uses the same sentence transitions and structures we see frequently in last generation's AI output. If you really do write like that, which I doubt, you should maybe get GPT-4 to help; you'd sound more human.
impressive stuff. reminds me of when ppl started using image classifier networks on spectrograms in order to classify audio. i would not have thought to apply a similar concept for generative models, but it seems obvious in hindsight.