thefroh's comments | Hacker News


presumably it's the reverse-engineered server that has most of the work put into it, and one would hope that's what is going to be released if the developer decides to release anything


which is important to bear in mind if people are introducing a "drop earliest messages" sliding window for context management in a "chat-like" experience. once you're at that context limit and start dropping the earliest messages, you're guaranteeing every message afterwards will be a cache miss.

a simple alternative approach is to introduce hysteresis by having both a high and low context limit. if you hit the higher limit, trim to the lower. this batches together the cache misses.

if users are able to edit, remove or re-generate earlier messages, you can further improve on that by keeping track of cache prefixes and their TTLs, so rather than blindly trimming to the lower limit, you instead trim to the longest active cache prefix, and only if there are none do you trim to the lower limit.
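
a minimal sketch of the high/low watermark idea (the limits and the count_tokens helper here are hypothetical, and the cache-prefix/TTL refinement from the previous paragraph isn't shown):

    # only trim once the history crosses HIGH_LIMIT, then trim all the way down
    # to LOW_LIMIT in one go, so the following requests share a stable prefix
    # that can stay cached until the high watermark is hit again.
    HIGH_LIMIT = 150_000  # assumed token budget that triggers trimming
    LOW_LIMIT = 100_000   # assumed target to trim down to

    def maybe_trim(messages: list[dict], count_tokens) -> list[dict]:
        if count_tokens(messages) <= HIGH_LIMIT:
            return messages          # leave the prefix untouched; cache stays warm
        trimmed = list(messages)
        while trimmed and count_tokens(trimmed) > LOW_LIMIT:
            trimmed.pop(0)           # drop the earliest messages in one batch
        return trimmed               # one cache miss now, then hits until the next trim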


because you can place multiple explicit breakpoints with Anthropic's approach, whereas with OpenAI the cache boundaries only come from what was already sent.

for example, if a user sends a large number of tokens, like a file, plus a question, and then they change the question.
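
a rough sketch of that with the Anthropic python SDK (the model name and file are placeholders, and note their docs require a minimum prompt size for caching and cap the number of breakpoints): the explicit breakpoint goes at the end of the file block, so a second call with a different question can still reuse the cached file.

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    file_text = open("big_file.txt").read()  # hypothetical large document

    def ask(question: str):
        return client.messages.create(
            model="claude-sonnet-4-20250514",  # assumption: any cache-capable model
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": [
                    # breakpoint: everything up to and including this block is cached
                    {"type": "text", "text": file_text,
                     "cache_control": {"type": "ephemeral"}},
                    {"type": "text", "text": question},
                ],
            }],
        )

    ask("summarise this file")      # writes the cache for the file block
    ask("list the open questions")  # re-reads the cached file prefix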


I thought OpenAI would still handle this case? Their cache would work up to the end of the file and you would then pay for uncached tokens for the user's question. Have I misunderstood how their caching works?


not if call #1 is the file + the question, call #2 is the file + a different question, no.

if call #1 is the file, call #2 is the file + the question, call #3 is the file + a different question, then yes.

and consider that "the file" can equally be a lengthy chat history, especially after the cache TTL has elapsed.


I vibe-coded up a quick UI for exploring this: https://tools.simonwillison.net/prompt-caching

As far as I can tell it will indeed reuse the cache up to the point where the prompts diverge, so this works (toy model of the matching below the list):

Prompt A + B + C - uncached

Prompt A + B + D - uses cache for A + B

Prompt A + E - uses cache for A
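
A toy model of that matching (not OpenAI's real implementation - it ignores the minimum prompt length, block granularity and cache TTLs): each call "caches" its token sequence, and a later call counts as cached up to its longest common prefix with any earlier call.

    def cached_split(tokens, previous_calls):
        best = 0
        for prev in previous_calls:
            n = 0
            while n < min(len(tokens), len(prev)) and tokens[n] == prev[n]:
                n += 1
            best = max(best, n)
        return tokens[:best], tokens[best:]

    A, B, C, D, E = (["A"] * 500, ["B"] * 300, ["C"] * 50, ["D"] * 50, ["E"] * 50)
    history = []
    for name, call in [("A+B+C", A + B + C), ("A+B+D", A + B + D), ("A+E", A + E)]:
        cached, uncached = cached_split(call, history)
        history.append(call)
        print(f"{name}: {len(cached)} cached, {len(uncached)} uncached")
    # A+B+C: 0 cached, 850 uncached
    # A+B+D: 800 cached, 50 uncached  (reuses A + B)
    # A+E: 500 cached, 50 uncached    (reuses A)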


here's the associated pull request, I believe.

https://github.com/syncthing/syncthing/pull/10005


does this allow running applications that need FUSE? I know that for Termux that's only possible with root.


it's also the effect that lets you kinda know if you're near a wall (for example when you're fumbling around in the dark)


i've heard good things about using the 1 euro filter for user input related tasks, where you're trying to effectively remove noise, but also keep latency down.

see https://gery.casiez.net/1euro/ with plenty of existing implementations to pick from
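
for reference, a small self-contained sketch (parameter values here are illustrative, not tuned): it's an exponential low-pass whose cutoff rises with the signal's speed, so slow movements get heavy smoothing (less jitter) and fast movements get light smoothing (less lag).

    import math

    class OneEuroFilter:
        def __init__(self, min_cutoff=1.0, beta=0.01, d_cutoff=1.0):
            self.min_cutoff = min_cutoff  # baseline cutoff (Hz) at low speeds
            self.beta = beta              # how quickly the cutoff grows with speed
            self.d_cutoff = d_cutoff      # cutoff used for the derivative estimate
            self.x_prev = None
            self.dx_prev = 0.0
            self.t_prev = None

        @staticmethod
        def _alpha(cutoff, dt):
            tau = 1.0 / (2.0 * math.pi * cutoff)
            return 1.0 / (1.0 + tau / dt)

        def __call__(self, t, x):
            if self.t_prev is None:
                self.t_prev, self.x_prev = t, x
                return x
            dt = t - self.t_prev
            # smoothed estimate of how fast the signal is changing
            dx = (x - self.x_prev) / dt
            a_d = self._alpha(self.d_cutoff, dt)
            dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
            # adapt the cutoff: slow -> smooth more, fast -> follow the input
            cutoff = self.min_cutoff + self.beta * abs(dx_hat)
            a = self._alpha(cutoff, dt)
            x_hat = a * x + (1.0 - a) * self.x_prev
            self.t_prev, self.x_prev, self.dx_prev = t, x_hat, dx_hat
            return x_hat

    # e.g. smoothing a noisy reading sampled with a timestamp in seconds:
    # f = OneEuroFilter(min_cutoff=1.0, beta=0.01)
    # smoothed = f(timestamp, raw_value)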


That sounds very interesting. I've been needing a filter to deal with noisy A/D conversions for pots in an audio project. Noise on a volume control turns into noise on the output, and sounds horrible, but excessive filtering causes unpleasant latency when using the dials.


while I'm a fan of TypeScript and using type hints in Python from an autocomplete and linting perspective, I am curious...

... has either language leveraged these to better tell the CPU what to do? presumably for perf.


PHP does, but there the types actually mean something. If your types can be stripped out and the program still runs, I have a hard time believing that there is any optimization occurring there.


python ignores type hints at runtime
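
easy to see in stock CPython: the hints are stored on the function but never checked or acted on by the interpreter (separate tools like mypy, or the mypyc compiler, are what make use of them).

    def double(x: int) -> int:
        return x * 2

    print(double("ha"))            # prints "haha" -- nothing enforces the int hint
    print(double.__annotations__)  # {'x': <class 'int'>, 'return': <class 'int'>}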


for me, my desktop and laptop are the main go-tos. the mobile is an extra device with different, more specific use cases

and so I've been a little disappointed with how these devices keep getting bigger and bigger. I was pretty happy with the size of the Pixel 3

I think I like to be able to access the whole screen comfortably with one hand, not fumbling it about. easy to manipulate, easy to pocket. the Pixel 8 shrunk a bit over its predecessors so I nabbed that, and it's probably at or just over the limit for me, size wise

