Apple was always going to fail at this, and even more so going forward.

LLMs are built on data, and copious amounts of it. Apple has been on a decade-long marketing campaign to make data radioactive. That stance has permeated the culture so thoroughly that Apple CANNOT build a proprietary, world-class AI product without compromising on its outspoken positions.

It is a losing battle: the more Apple tries to do it, the more users will punish them, and meanwhile other companies (OpenAI, Anthropic) are gonna extract maximum value.



LLMs are mostly trained on public data while Apple's privacy stance applies to private data. There's no conflict between them.

(Meta can probably train on private data but OpenAI and Anthropic seem to be doing OK without it as far as we know.)


Most of the LLM advances these days are from synthetic or explicitly created data too. You need public data mostly because it contains facts about the world, or because it's easier to talk about a book when the model has "read" the book. But for a known topic area (as opposed to open Q&A) it's not critical, since you can go and create or license the data.
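
To make "explicitly created data" concrete, here's a minimal sketch of curating facts for a known topic into chat-style training pairs. Everything here (the seed facts, the make_example helper, the record format) is hypothetical and just illustrates the idea, not any lab's actual pipeline:

    # Hypothetical sketch: building "explicitly created" training data for a
    # known topic area, instead of scraping public posts. The seed facts and
    # the helper below are illustrative only.

    seed_facts = [
        ("When was Swift open-sourced?",
         "Swift was open-sourced in December 2015."),
        ("What is the Secure Enclave?",
         "A hardware security coprocessor in Apple devices that isolates key material."),
    ]

    def make_example(question: str, answer: str) -> dict:
        """Format one curated (question, answer) fact as a chat-style training example."""
        return {
            "messages": [
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }

    training_set = [make_example(q, a) for q, a in seed_facts]
    print(f"{len(training_set)} examples, e.g. {training_set[0]}")

In practice you'd generate or license pairs like these at scale for the topic area you care about, which is the "create or license it" path described above.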


No, Apple’s privacy stance is about giving users control over their data in ways they understand. Posting on Reddit or arXiv is not a blank check to have your words reused for LLM training, even if they're technically public.


Apple’s slogan is “what happens on your iPhone stays on your iPhone”. I think “I published a paper” or “I posted on Reddit” are clearly out of scope - those things are happening in public.



