
I build coding agents for a living, and I'm struggling to map this onto the set of things I do at work.

In general, interoperability and user choice are really important for the community of people building AI platforms to get right...

Have others reading this document been able to map it onto their work?

As a specific example:

> ai://bank/service/payments?amount=10&currency=USD

I'm not sure what this is representing here. Is it a way to encode a clickable link to chat with `bank` about `service/payments` with a few additional args attached?
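
If my reading is right, the URI at least decomposes into ordinary parts; here's how I'd pull it apart in a client (purely my guess at the semantics, not anything the document spells out):

    // Hypothetical reading: "bank" is the agent/host, the path names a
    // capability, and the query string carries structured arguments.
    const uri = new URL("ai://bank/service/payments?amount=10&currency=USD");

    console.log(uri.host);                          // "bank"
    console.log(uri.pathname);                      // "/service/payments"
    console.log(uri.searchParams.get("amount"));    // "10"
    console.log(uri.searchParams.get("currency"));  // "USD"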


Third-party apps can’t use the network, though. IIRC there’s an async message queue with eventual delivery that each app gets, which it can use to send messages back and forth with a paired phone app.


That was once the case, but no longer. Third-party watchOS apps can work without a phone present, and can even be installed directly from the watch's app store. They can definitely do independent networking, but there are still some restrictions, e.g. they can't network while backgrounded, and websockets are pretty locked down (audio streaming only, as per Apple policy).

I reckon the lack of general-purpose websockets is probably the issue for a system based on Phoenix LiveView.


Bravo -- this is fantastic.

I've been waiting for this ever since reading some interview with Orson Scott Card ages ago. It turns out he thinks of his novels as radio theater, not books, which is a very different way to experience the audio.


Thanks for the kind words :)))


Or vice versa - perhaps some subset of the "thought chains" of Cyc's inference system could be useful training data for LLMs.


When I first learned about LLMs, what came to mind is some sort of "meeting of the minds" with Cyc. 'Twas not to be, apparently.


I view Cyc's role there as a RAG for common sense reasoning. It might prevent models from advising glue on pizza.

    ;; Facts the knowledge base already holds:
    (is-a 'pizza 'food)
    (not (is-a 'glue 'food))
    ;; Constraint a generated recipe would have to satisfy:
    (for-all i ingredients
      (assert-is-a i 'food))
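
Spelled out in a more conventional language, the guardrail amounts to checking every generated ingredient against the knowledge base before the answer ships. A toy sketch (the fact encoding and names here are invented, not anything from Cyc):

    // Toy stand-in for a Cyc-style knowledge base; the facts are hard-coded.
    const foods = new Set(["pizza", "cheese", "tomato"]);
    const isA = (thing: string, kind: string): boolean =>
      kind === "food" && foods.has(thing);

    // Reject any generated ingredient list that contains a non-food.
    function badIngredients(ingredients: string[]): string[] {
      return ingredients.filter((i) => !isA(i, "food"));
    }

    console.log(badIngredients(["cheese", "glue"])); // ["glue"]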



Sure, but the bigger models don’t make these trivial mistakes, and I’m not sure translating the LLM's English sentences into Lisp and trying to check them is going to be more accurate than just training the models better.


The bigger models avoid those mistakes by being, well, bigger. Offloading to a structured knowledge base would achieve the same thing without the model needing to grow. Indeed, the model could be a lot smaller (and a lot less resource-intensive) if it only needed to worry about converting $LANGUAGE queries into Lisp queries and converting Lisp results back into $LANGUAGE results (where $LANGUAGE is the user's natural language, whatever that might be), rather than having to store some approximation of that knowledge base within itself on top of understanding $LANGUAGE and whatever ad-hoc query/result language it has unconsciously invented for itself.
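
To make that concrete, here's a rough sketch of that division of labor with every piece stubbed out. None of these names are a real Cyc (or any other) API; they just mark where the small translator model and the knowledge base would sit:

    // Hypothetical pipeline: a small model translates, a Cyc-style KB reasons.
    // Every function below is a stub standing in for a real component.
    type LispQuery = string;

    const toLisp = (question: string): LispQuery =>
      "(is-a 'glue 'food)";                                 // small LM: NL -> Lisp query
    const queryKb = (q: LispQuery): string => "nil";        // KB: evaluate the query
    const toNatural = (result: string, lang: string): string =>
      result === "nil" ? "No, it is not." : "Yes, it is.";  // small LM: Lisp result -> NL

    function answer(question: string, lang: string): string {
      return toNatural(queryKb(toLisp(question)), lang);
    }

    console.log(answer("Is glue a food?", "en"));           // "No, it is not."

The point is that the knowledge lives in the KB, so only the translation skill has to live in the weights.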


Beyond just checking for mistakes, it would be interesting to see if Cyc has concepts that the LLMs don't or vice versa. Can we determine this by examining the models' internals?


An aspect of these self-improvement thought experiments that I’m willing to tentatively believe, but want more resolution on, is the exact work involved in “improvement”.

E.g. today there are billions of dollars being spent just to create and label more data, which is a global act of recruiting, training, organization, etc.

When we imagine these models self improving, are we imagining them “just” inventing better math, or conducting global-scale multi-company coordination operations? I can believe AI is capable of the latter, but that’s an awful lot of extra friction.


This is exactly what makes this scenario so absurd to me. The authors don't even attempt to describe how any of this could realistically play out. They describe sequence models and RLAIF, then claim this approach "pays off" in 2026. The paper they link to is from 2022. RLAIF also does not expand the information encoded in the model, it is used to align the output with a set of guidelines. How could this lead to meaningful improvement in a model's ability to do bleeding-edge AI research? Why wouldn't that have happened already?

I don't understand how anyone takes this seriously. Speculation like this is not only useless, but disingenuous. Especially when it's sold as "informed by trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes". This is complete fiction which, at best, is "inspired by" the real world. I question the motives of the authors.


Are you comfortable sharing the video & lip-sync stack you use? I don't know anything about the space but am curious to check out what's possible these days.


For my last video I used https://github.com/warmshao/FasterLivePortrait with a PNG of the character on my RTX 3090 desktop and recorded the output in real time. For the next video I'm going to spin up a RunPod instance and run FasterLivePortrait in the cloud after the fact, because then I can get a smooth 60fps, which looks better.

I think the only real-time cloud way to do AI vtubing right now is my own GenDJ project (a fork of https://github.com/kylemcdonald/i2i-realtime tweaked for cloud real-time), but that just doesn't look remotely as good as LivePortrait. Somebody needs to rip out and replace insightface in FasterLivePortrait (it's prohibited for commercial use) and fork https://github.com/GenDJ so that the RunPod instance it spins up runs the de-insightfaced LivePortrait instead of i2i-realtime. I'll probably get around to doing that in the next few months if nobody else does and nothing else comes along that makes LivePortrait obsolete (both are big ifs).

AIWarper recently released a simpler way to run FasterLivePortrait for vtubing purposes (https://huggingface.co/AIWarper/WarpTuber), but I haven't tried it yet because I already have my own working setup, and as I mentioned, I'm shifting that workload to the cloud anyway.


Do you mind sharing your YouTube account, if you're okay with linking it to your HN account? I'd quite like to see the results.


I was curious as well.

Not OP but via their website linked in their profile -

https://youtu.be/Tl3pGTYEd2I


This looks fantastic — congrats on the launch.

As an armchair observer, the agents + browser space feels like it’s waiting for someone to make the open source framework that everyone piles on to.

Proxy rotation sounds like a solid way to monetize for businesses.


Yeah that’s precisely why we introduced the cloud version!


There was an SF office of 18F -- IIRC it was in the building to the right of the Civic Center park as you looked at it from the BART stop. They were great folks in every encounter I had with them.


Let's presume this gets developed to ShadCN/Tailwind level quality.

In that world, what would be the tradeoffs between:

- NextJS + Tailwind + ShadCN

- NextJS + ?? + Shoelace

I don't have a good sense of how web components compare to the practice of "copy-by-value, then compile-into-binary" UI shipping that's common in the Next + Shad world these days.


I think this is more useful for an HTMX + AlpineJS based model of development. ShadCN et al. are tied to a frontend framework (React/Vue/Svelte, etc.). But with Shoelace/daisyUI etc., one gets the same level of polish out of the box for an HTMX stack.
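
The reason that works: a web component registers itself with the browser, so any server-rendered markup (including whatever HTMX swaps in) can use the tag with no framework runtime or build step. A minimal hand-rolled example (not Shoelace's actual code, just the standard custom-elements API):

    // Minimal custom element using only standard browser APIs.
    class CopyButton extends HTMLElement {
      connectedCallback() {
        const text = this.getAttribute("text") ?? "";
        this.innerHTML = "<button>Copy</button>";
        this.querySelector("button")!.addEventListener("click", () => {
          navigator.clipboard.writeText(text);
        });
      }
    }
    customElements.define("copy-button", CopyButton);

    // Server-rendered (or HTMX-swapped) markup can now just use the tag:
    //   <copy-button text="hello"></copy-button>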


Have you ever managed a complex, dynamic, changing system and found that the optimal size based on current conditions was 20% less than it was at some prior time?

I can think of all sorts of examples.


Why can't the CEO share the fate of laid off employees? They did nothing wrong either.


The answer to your question is definitely no. Nobody who has ever run a business for long would think that every layoff is due to CEO incompetence.

