
This is on the cluster level, while the article is talking about the database level, I believe.

Seeing the Gemini 3 capabilities, I can imagine a near future where file formats are effectively irrelevant.


I have family members with health conditions that require periodic monitoring. For some tests, a phlebotomist comes home. For some tests, we go to a hospital. For some other tests, we go to a specialized testing center. They all give us PDFs in their own formats. I manually enter the data into my spreadsheet, for easy tracking. I use LLMs for some extraction, but they still miss a lot. At least for the foreseeable future, no LLM will ever guarantee that all the data has been extracted correctly. By "guarantee", I mean someone's life may depend on it. For now, doctors take up the responsibility of ensuring the data is correct and complete. But not having to deal with PDFs would make at least a part of their job (and our shared responsibilities) easier.


Can you elaborate? Are you never reading papers directly but only using Gemini to reformat or combine/summarize?


I mean that when a computer can visually understand a document and reformat and reinterpret it in any imaginable way, who cares how it’s stored? When a png or a pdf or a markdown doc can all be read and reinterpreted into an infographic or a database or an audiobook or an interactive infographic, the original format won’t matter.


Files.

Truth in general, if we aren't careful.


Tell that to publishing companies.


Seriously. More people need to wake up to this. Older generations can keep arguing over display formats if they want. Meanwhile younger undergrad and grad students are getting more and more accustomed to LLMs forming the front end for any knowledge they consume. Why would research papers be any different?


> Meanwhile younger undergrad and grad students are getting more and more accustomed to LLMs forming the front end for any knowledge they consume.

Well, that's terrifying. I mean, I knew it about undergrads, but I sure hoped people going into grad school would be aware of the dangers of having your main contact with research, where subtle details are important, go through a known-distorting filter.

(I mean, I'd still be kinda terrified if you said that grad students first encounter papers through LLMs. But if it is the front end for all knowledge they consume? Absolutely dystopian.)


I admit it has dystopian elements. It’s worth deciding what specifically is scary though. The potential fallibility or mistakes of the models? Check back in a few months. The fact they’re run by giant corps which will steal and train on your data? Then run local models. Their potential to incorporate bias or persuade via misalignment with the reader’s goals? Trickier to resolve, but various labs and nonprofits are working on it.

In some ways I’m scared too. But that’s the way things are going because younger people far prefer the interface of chat and question answering to flipping through a textbook.

Even if AI makes more mistakes or is more misaligned with the reader’s intentions than a random human reviewer (which is debatable in certain fields since the latest models came out), the behavior of young people requires us to improve the reputability of these systems (make sure they use citations, make sure they don’t hallucinate, etc.). I think the technology is so much more user-friendly that fixing the engineering bugs will be easier than forcing new generations to use the older systems.


This made me do a double-take. Surely you would never do this, right? It seems to be directly counter to the idea of being able to audit changes:

“Event replay: if we want to adjust a past event, for example because it was incorrect, we can just do that and rebuild the app state.”


No, that definitely happens.

There are two kinds of adjustments: an adjustment transaction (a one-off correction), or re-interpreting what happened (systemic). The event sourcing pattern is useful in both situations.

Sometimes you need to replay events to get a correct report, because your interpretation at the time was incorrect, or because it needs to change for some external reason.

Auditing isn't about not changing anything, but being able to trace back and explain how you arrived at the result. You can have as many "versions" as you want of the final state, though.



The argument I've always heard for this was issues with code, not the event. If for a period of time you have a bug in your code, with event sourcing, you can fix the bug and replay all the events to correct current projections of state.


What if your correction renders subsequent events nonsensical?


There is a very real chance of this happening, and two choices.

One - bake whatever happens into your system permanently, like 99% of all apps, and disallow corrections.

Two - keep the events around so that you can check and re-check your corrections before you check in new code or data.


Instead of modifying the original (and incorrect) event, you can add a manual correction event with the info of who did it and why, and replay the events. This is how we dealt with such corrections with event sourcing.
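
A minimal sketch of that approach (the event shapes and names here are made up for illustration, not from any particular framework): the original, incorrect event stays in the log, a correction event is appended with the audit info, and current state is rebuilt by replaying everything.

    from dataclasses import dataclass

    @dataclass
    class Event:
        kind: str       # e.g. "item_added", "quantity_corrected"
        payload: dict
        actor: str = "system"
        reason: str = ""

    def apply(state: dict, event: Event) -> dict:
        # Both the original event and the correction simply set the quantity;
        # the difference is that the correction carries who/why for auditing.
        if event.kind in ("item_added", "quantity_corrected"):
            state[event.payload["sku"]] = event.payload["qty"]
        return state

    def replay(events: list[Event]) -> dict:
        state: dict = {}
        for e in events:
            state = apply(state, e)
        return state

    log = [
        Event("item_added", {"sku": "ABC", "qty": 5}),   # wrong: should have been 3
        Event("quantity_corrected", {"sku": "ABC", "qty": 3},
              actor="alice", reason="data entry error"),
    ]
    print(replay(log))  # {'ABC': 3}, with the full history still in the log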


But you don't need to replay in that case. You just fire the correction event and the rest is taken care of.


It would be outside of the normal exceptional cases, yes.

Like buggy data that crashes the system.

If you have the old events there, you can "measure twice, cut once", in the sense that you can keep re-running your old events and compare them to the new events under unit-test conditions, and be absolutely sure that your history re-writing won't break anything else.

It's not for just doing a refund or something.


Yeah, that's a big NO. Events are immutable. If an event is wrong, you post an event with an amendment. Then yes, rebuild the app state.


Not speaking about their case, but I think in some cases a "versioned mutable data store" with an event log of updates/inserts makes more sense than an "immutable event log" approach like Kafka's.

Consider the update_order_item_quantity event in a classic event-sourced system. It's not possible to guarantee that two waiters dispatching two such events at the same time, when the current quantity is 1, won't cause the quantity to become negative/invalid.

If the data store allowed mutability and produced an event log, it would be easy:

Instead of dispatching update_order_item_quantity, you would update the order document, specifying the current version. In the previous example, the second request would fail since it specified a stale version_id. And you get the auditability benefits of a classic event sourcing system as well, because you have versions and an event log.

This kind of architecture is trivial to implement with CouchDB and easier to maintain than Kafka. Pity it's impossible to find managed hosting for CouchDB outside of IBM.
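
A rough sketch of that flow against CouchDB's HTTP API (the server URL and document shape are made up; the point is that the _rev travels with the document, and a stale _rev gets a 409 Conflict instead of silently clobbering the order):

    import requests

    BASE = "http://localhost:5984/orders"   # hypothetical database

    def update_quantity(order_id: str, sku: str, new_qty: int) -> bool:
        doc = requests.get(f"{BASE}/{order_id}").json()      # includes current _rev
        doc["items"][sku] = new_qty
        resp = requests.put(f"{BASE}/{order_id}", json=doc)  # _rev sent back with the doc
        if resp.status_code == 409:
            # Someone else updated the order first; our revision is stale.
            return False
        resp.raise_for_status()
        return True

    # Two waiters racing to change the same order: the second PUT carries a
    # stale _rev and is rejected, rather than producing an invalid quantity.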


Any modern DB with a WAL (write ahead log) is an immutable event system, where the events are the DB primitives (insert, update, delete...).

When you construct your own event system you are constructing a DB with your own primitives (deposit, withdraw, transfer, apply monthly interest...).

You have to figure out your transaction semantics. For example, how to reject invalid events.
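
For example, a hand-rolled event system has to decide where invalid events get rejected. A hypothetical sketch (not from any particular framework) where the aggregate validates against current state before anything is appended to the log:

    class InsufficientFunds(Exception):
        pass

    class Account:
        def __init__(self, events=None):
            self.balance = 0
            self.events = []
            for e in (events or []):
                self._apply(e)

        def _apply(self, event):
            kind, amount = event
            if kind == "deposit":
                self.balance += amount
            elif kind == "withdraw":
                self.balance -= amount
            self.events.append(event)

        def withdraw(self, amount):
            # Transaction semantics live here: an invalid event is rejected,
            # never recorded, so replay can't reproduce a bad state.
            if amount > self.balance:
                raise InsufficientFunds(f"balance {self.balance}, requested {amount}")
            self._apply(("withdraw", amount))

    acct = Account([("deposit", 100)])
    acct.withdraw(40)      # ok, balance is now 60
    # acct.withdraw(1000)  # raises InsufficientFunds; nothing is appended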


> Any modern DB with a WAL (write ahead log) is an immutable event system, where the events are the DB primitives (insert, update, delete...).

Agreed. I just wish that, apart from the WAL, they also had versioning as a first-class concept, and that their update API required clients to pass the version they have "last seen", to prevent inconsistencies.


On most SQL databases, you can put CHECK constraints on columns so that the database rejects events. But this is controversial, as people don't like putting logic on the DB.
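
For example (SQLite via Python's standard library, just to keep the sketch self-contained; Postgres and friends spell CHECK constraints essentially the same way):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE order_items (
            order_id INTEGER NOT NULL,
            sku      TEXT    NOT NULL,
            quantity INTEGER NOT NULL CHECK (quantity >= 0)
        )
    """)
    conn.execute("INSERT INTO order_items VALUES (1, 'ABC', 3)")   # accepted

    try:
        conn.execute("INSERT INTO order_items VALUES (1, 'ABC', -2)")
    except sqlite3.IntegrityError as e:
        print("rejected by the database:", e)   # CHECK constraint failed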


CosmosDB has ETags on every document.


DBs only work because the events are artificial and nobody cares about what's written in them.

And DBs are not really CQRS because the events are artificial and don't have business data that people are interested in keeping.


The big caveat here is GDPR and other privacy laws. In some cases you need the ability to scrub the event store completely of PII for legal reasons, even if only to null the relevant fields in the events.

Without preemptive defensive coding in your aggregates (whatever you call them) this can quickly blow up in your face.


What I have read about it is: encrypt PII with a per-client key, and do not post the key to the event system. When an erasure request comes in, delete the corresponding key. Now the data for that client cannot be decrypted anymore.
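
A minimal sketch of that "crypto-shredding" idea (the key store here is just a dict; in practice it would be a separate, mutable store kept outside the immutable event log):

    from cryptography.fernet import Fernet

    key_store = {}   # subject_id -> key, stored OUTSIDE the event log

    def key_for(subject_id: str) -> Fernet:
        if subject_id not in key_store:
            key_store[subject_id] = Fernet.generate_key()
        return Fernet(key_store[subject_id])

    def encrypt_pii(subject_id: str, value: str) -> bytes:
        return key_for(subject_id).encrypt(value.encode())

    def decrypt_pii(subject_id: str, token: bytes) -> str:
        return Fernet(key_store[subject_id]).decrypt(token).decode()

    event = {"type": "customer_registered",
             "customer_id": "c42",
             "email": encrypt_pii("c42", "alice@example.com")}

    print(decrypt_pii("c42", event["email"]))   # readable while the key exists

    del key_store["c42"]                        # erasure request: shred the key
    # decrypt_pii("c42", event["email"])        # now fails; the stored event is unreadable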


That's what I said too, and the answer was "No, just because it cannot be decrypted today does not mean it cannot be decrypted in the future. The data must be deleted"


That is some serious architecture to put in place before you can even start using event sourcing.


For finance, recordkeeping requirements take precedence over privacy requirements. Audit trail data must be on WORM storage and must not be scrubbable.


It's poorly phrased but I'm not sure they meant "mutate the past". The keyword is "adjust" which could mean "append a correction".


But then you wouldn't need a replay. So the author really means mutate the past.


Can you write a prompt to optimize prompts?

Seems like an LLM should be able to judge a prompt, and collaboratively work with the user to improve it if necessary.


100% yes! There've been some other writers who've been doing parallel work around that in the last couple weeks.

https://www.dbreunig.com/2025/06/10/let-the-model-write-the-... is an example.

You can see the hands-on results in this Hugging Face branch I was messing around in:

Here is where I tell the LLM to generate prompts for me, based on the research so far:

https://github.com/AlexChesser/transformers/blob/personal/vi...

Here are the prompts it produced:

https://github.com/AlexChesser/transformers/tree/personal/vi...

And here is the result of those prompts:

https://github.com/AlexChesser/transformers/tree/personal/vi.... (also look at the diagram folders, etc.)



What’s the benefit of this optimizer? Why wouldn't the GPT-5 model just optimize the prompt it gets directly, instead of relying on a separate optimizer?


I use Grok to write the prompts. It's excellent. I think human-created prompts are insufficient in almost all cases.

Write your prompt in some shape and ask Grok:

Please rewrite this prompt for higher accuracy

-- Your prompt


Wouldn't you be better doing it with almost anything other than Grok?

How do you know it won't introduce misinformation about white genocide into your prompt?


Any LLM is fine. Grok had the latest model available for free use. That's all. GPT-5 produces similar results too.

We have to read the result, of course.


If someone keeps choosing to use the "mechahitler chatbot" at this point, I don't think they care about what misinformation goes into their prompt.


The LLM is basically a runtime that needs optimized input because the output is compute-bottlenecked. Input quality scales with domain knowledge, specificity, and therefore human time invested. You can absolutely navigate an LLM's attention piecemeal around a spec until you build an optimized input.


This is pretty much DSPy.


Yes, just prepend your request to the LLM with "Please give me a well-structured LLM prompt that will solve this problem..."


This, but for many things.

Paint is ready at the hardware store. Table is ready at the restaurant. Construction is done on a bridge.

All kinds of things that we need a one-time notification for.


Marketing has ruined, and will continue to ruin, notifications. A one-time notification that your paint is ready? Surely, they think, they can upsell you on other related and tangential products. You know, to ~~keep the cash flowing~~ help you be more successful at your DIYing.


> Marketing has ruined, and will continue to ruin, notifications.

This is one of my personal "Laws", except it's not just notifications. Any communications medium will eventually be ruined by spam. What makes a medium useful to legitimate consumers/users is what makes it a target for "marketing", i.e., spam.

I've seen multiple media ruined by marketing in my lifetime. This isn't a technological problem.


Improvements are still improvements.

How does a good actor do this in good faith right now?

Email? Costs money. SMS? Costs money. RSS? Wildly unpopular. ActivityPub? Can't be statically hosted and fairly unpopular.

Right now they basically use fucking Facebook and fucking Twitter, and even then you're subscribing to an entire stream.


It is technically trivial to make a forwarding email address that works once and only once.


That was the first thought that struck me when I saw the headline and tried to guess what the posting was about before clicking the link. At least 2-3 times per week I find myself wanting something like this. Often it involves leaving my email address or my phone number to have someone contact me. Or having to check back.

In its simplest form, this isn't all that hard to build. The tricky bit is to get people to use it. And perhaps even to explain what it does and possibly how it works.

If someone knows how to sell it, I'd be willing to build it.


Looking at the trees in the background of the first photo, it’s clear he’s using a longer focal length on the non-iPhone.

He has some good points, maybe, but in general it’s a pretty naive comparison.


I don’t think control center actually uses the liquid glass elements. They don’t respond to accessibility options like reduce transparency, for one thing.


Fine-grained reactivity (e.g. Knockout) was a thing well before React. If anything, React was a response to the deficiencies of fine-grained reactivity.


I’ve done the same with SQLAlchemy in Python and SQLKata in C#.

Sadly the whole idea of composable query builders seems to have fallen out of fashion.


If composable query builders have fallen out of fashion, that's news to me. I've used SQLAlchemy in Python fairly heavily over the years, it's gotta be one of the best out there, I plan to continue using it for many more years. And I've recently gotten my feet wet with Ecto in Elixir, I'm quite impressed with it too.
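
For anyone who hasn't used it, a small sketch of what that composability looks like in SQLAlchemy Core (2.x style; the table and filters are made up): queries are values you can pass around and refine, not strings you concatenate.

    from sqlalchemy import (
        Boolean, Column, Integer, MetaData, String, Table, select,
    )

    metadata = MetaData()
    users = Table(
        "users", metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String),
        Column("age", Integer),
        Column("active", Boolean),
    )

    def only_active(stmt):
        return stmt.where(users.c.active.is_(True))

    def adults(stmt):
        return stmt.where(users.c.age >= 18)

    stmt = adults(only_active(select(users))).order_by(users.c.name)
    print(stmt)   # the composed SELECT, ready to hand to a connection.execute(...)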


I had the same thought, but it sounds like this operates at a much lower level than that kind of thing:

> Then, a physics-based neural network was used to process the images captured by the meta-optics camera. Because the neural network was trained on metasurface physics, it can remove aberrations produced by the camera.


I'd like to see some examples showing how it does when taking a picture of completely random fractal noise. That should show it's not just trained to reconstruct known image patterns.

Generally it's probably wise to be skeptical of anything that appears to get around the diffraction limit.


I believe the claim is that the NN is trained to reconstruct pixels, not images. As in so many areas, the diffraction limit is probabilistic, so combining information from multiple overlapping samples and NNs trained on known diffracted -> accurate pairs may well recover information.

You’re right that it might fail on noise with resolution fine enough to break assumptions from the NN training set. But that’s not a super common application for cameras, and traditional cameras have their own limitations.

Not saying we shouldn’t be skeptical, just that there is a plausible mechanism here.


My concern would be that if it can't produce accurate results on a random noise test, then how do we trust that it actually produces accurate results (as opposed to merely plausible results) on normal images?

Multilevel fractal noise specifically would give an indication of how fine you can go.


"Accurate results" gets you into the "what even is a photo" territory. Do today's cameras, with their huge technology stack, produce accurate results? With sharpening and color correction and all of that, probably not.

I agree that measuring against such a test would be interesting, but I'm not sure it's possible or desirable for any camera tech to produce an objectively "true" pixel by pixel value. This new approach may fail/cheat in different ways, which is interesting but not disqualifying to me.


We've had very good chromatic aberration correction since I got a degree in imaging technology, and that was over 20 years ago, so I'd imagine it's not particularly difficult for [name your flavour of] ML.

