Hacker News | ozim's comments

I hate bugs. I specifically like the cold of late autumn, winter, and early spring because there are almost no bugs. I don't mind snow and ice as much.

I agree, but there are downsides, like shallow feedback.

There is still going to be loads of explanation, because „just a button” can have loads of stuff behind it, like starting a whole processing pipeline that is not visible to non-technical people.


I think that is easy to understand for a lot of people but I will spell it out.

This looks like AI company marketing, something along the lines of a „1+1” or „buy 3 for 2” promotion.

Money you don’t spend on tokens is the only money saved, period.

With employees you have to pay them anyway; you can’t just say „these requirements make no sense, park for two days until I get them right”.

You would have to be damn sure that you are doing the right thing to burn $1k a day on tokens.

With humans I can see many reasons why you would pay anyway, and it is on you to provide sensible requirements to build and to make good use of employees' time.


OK, but who is saying that to the LLM? Another LLM?

We got feedback in this thread from someone who supposedly knows Rust about common anti-patterns, and someone from the company came back with 'yeah that's a problem, we'll have agents fix it.'[0].

Agents are obviously still too stupid to have the metacognition needed for deciding when to refactor, even at $1,000 per day per person. So we still need the butts in seats. So we're back at the idea of centaurs. Then you have to make the case that paying an AI more than a programmer is worth it.[1]

[0] which has been my exact experience with multi-agent code bases I've burned money on.

[1] which in my experience isn't when you know how to edit text and send API requests from your text editor.


Here we can start debating what „better code” means.

I haven’t seen HFT code, but I have seen examples of exploit code, and most of it is amateur hour when it comes to building large systems.

They are of course efficient in getting to the goal. But exploits are one-off code that is not meant to be maintained.


Exactly. There was this study where they tried to make an LLM reproduce an HP book word for word, like giving it the first sentences and letting it cook.

Basically, with some tricks, they managed to get it 99% word for word. The tricks were needed to bypass the security measures that are in place exactly to stop people from retrieving training material.


This reminds me of https://en.wikipedia.org/wiki/Pierre_Menard,_Author_of_the_Q... :

> Borges's "review" describes Menard's efforts to go beyond a mere "translation" of Don Quixote by immersing himself so thoroughly in the work as to be able to actually "re-create" it, line for line, in the original 17th-century Spanish. Thus, Pierre Menard is often used to raise questions and discussion about the nature of authorship, appropriation, and interpretation.


Do you remember how to get around those tricks?

This is the paper: https://arxiv.org/abs/2601.02671

Grok and Deepmind IIRC didn’t require tricks.


This really makes me want to try something similar with content from my own website.

I shut it down a while ago because bot traffic overtook human traffic. The site had quite a bit of human traffic (enough to bring in a few hundred bucks a month in ad revenue, and a few hundred more in subscription revenue). However, the AI scrapers really started ramping up, and the only way I could realistically continue would be to pay a lot more for hosting/infrastructure.

I had put a ton of time into building out content...thousands of hours, only to have scrapers ignore robots, bypass cloudflare (they didn't have any AI products at the time), and overwhelm my measly infrastructure.

Even now, with the domain pointed at NOTHING, it gets almost 100,000 hits a month. There is NO SERVER on the other end. It is a dead link. The stats come from Cloudflare, where the domain name is hosted.

I'm curious if there are any lawyers who'd be willing to take someone like me on contingency for a large copyright lawsuit.


Can we help get your infra cost down to negligible? I'm thinking things like pre-generated static pages and CDNs. I won't assume you hadn't thought of this before, but I'd like to understand more about where your non-trivial infra costs come from.

I would be tempted to try and optimise this as well. 100000 hits on an empty domain and ~200 dollars worth of bot traffic sounds wild. Are they using JS-enabled browsers or sim farms that download and re-download images and videos as well?

> only to have scrapers ignore robots, bypass cloudflare

Set the server to require Cloudflare's SSL client cert, so nobody can connect to it directly.

Then make sure every page is cacheable and your costs will drop to near zero instantly.

It's like 20 mins to set these things up.
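
For anyone who wants to see the shape of those two steps, here is a minimal sketch in Python, not a drop-in config. It assumes the origin terminates TLS itself; the function names, file paths, and the one-day `max-age` are made up for illustration. Cloudflare calls the client-cert feature "Authenticated Origin Pulls", and whether HTML actually gets cached at the edge also depends on the cache rules configured on the CDN side.

```python
import ssl
from typing import Dict, Optional


def build_origin_tls_context(
    client_ca_file: Optional[str] = None,  # hypothetical path to the CDN's client CA
    cert_file: Optional[str] = None,       # hypothetical path to the origin's own cert
    key_file: Optional[str] = None,        # hypothetical path to the origin's private key
) -> ssl.SSLContext:
    """Server-side TLS context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Require a client certificate during the handshake, so direct
    # connections that bypass the CDN fail before any page is served.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if client_ca_file:
        # Only certificates signed by this CA will be accepted.
        ctx.load_verify_locations(cafile=client_ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def cache_headers(max_age_seconds: int = 86400) -> Dict[str, str]:
    """Response headers that mark a page as cacheable by shared caches."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}
```

The context would be passed to whatever server wraps the listening socket, and `cache_headers()` merged into every response the site sends.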


a) As an outside observer, I would find such a lawsuit very interesting/valuable. But I guess the financial risk of taking on OpenAI or Anthropic is quite high.

b) If you don't want bots scraping your content and DDOSing you, there are self-hosted alternatives to Cloudflare. The simplest one that I found is https://github.com/splitbrain/botcheck - visitors just need to press a button and get a cookie that lets them through to the website. No proof-of-work or smart heuristics.


The new Cloudflare products for blocking bots and AI scrapers might be worth a shot if you put so much work into the content.

Further, some low effort bots can be quickly handled with CF by blocking specific countries (e.g., Brazil and Russia, for one of my sites).

I work for a publisher that serves the Chinese market as a secondary market. Sucks that we can’t blanketly do this, since we get hammered by Chinese bots daily. We also have an extremely old codebase (Drupal), which makes blanket caching difficult. Working to migrate from Cloudfront to Cloudflare at least.

What's not clear from the study (at least skimming it) is if they always started the ball rolling with ground truth passages or if they chained outputs from the model until they got to the end of the book. I strongly suspect the latter would hopelessly corrupt relatively quickly.

It seems like this technique only works if you have a copy of the material to work off of, i.e. enter a ground truth passage, tell the model to continue it as long as it can, and then enter the next ground truth passage to continue in the next session.
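
To make the two protocols concrete, here is a toy sketch in Python. It is not the paper's method: the "model" is a hypothetical lookup table that makes exactly one slip, and all names are invented for illustration. It only shows why seeding every step with ground truth can score well while chaining the model's own output falls apart after the first error.

```python
from typing import Callable, List

Model = Callable[[str], str]


def seeded_run(chunks: List[str], model: Model) -> List[str]:
    """Prompt with each ground-truth chunk; requires owning the full text."""
    return [model(chunk) for chunk in chunks[:-1]]


def chained_run(first_chunk: str, model: Model, steps: int) -> List[str]:
    """Prompt with the model's previous output; requires only the opening."""
    outputs = []
    prompt = first_chunk
    for _ in range(steps):
        prompt = model(prompt)
        outputs.append(prompt)
    return outputs


def accuracy(outputs: List[str], expected: List[str]) -> float:
    """Fraction of positions where the output matches the expected chunk."""
    hits = sum(o == e for o, e in zip(outputs, expected))
    return hits / len(expected)


def make_toy_model(chunks: List[str], bad_index: int) -> Model:
    """Toy stand-in for an LLM: knows the next chunk, except for one slip."""
    table = {chunks[i]: chunks[i + 1] for i in range(len(chunks) - 1)}
    table[chunks[bad_index]] = "<wrong>"
    # Any prompt the table has never seen yields garbage.
    return lambda prompt: table.get(prompt, "<lost>")


chunks = ["a", "b", "c", "d", "e"]
model = make_toy_model(chunks, bad_index=1)
seeded_score = accuracy(seeded_run(chunks, model), chunks[1:])
chained_score = accuracy(chained_run(chunks[0], model, 4), chunks[1:])
```

With one slip at the second chunk, the seeded run recovers on the next prompt and gets 3 of 4 chunks right, while the chained run never gets back on track and gets 1 of 4.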


Oh! That’s a huge caveat if that’s indeed the case.

Around puberty the brain drops loads of connections to become an adult brain.

More than 40% of all synapses are eliminated.


Source? Why would the organism build all those synapses for 14 years just to drop half of them?


Same with „building custom business stuff”: you can already do it quicker with existing CRM configuration, without burning tokens.

Yeah I know a friend (small business owner) vibe coding a feature as an add-on for Odoo. He has dreams of selling it (he usually has wild plans), but for now it's just a feature they want, and it does seem to be good enough for their use.

And the Google AI subscription is cheaper than any of the SaaS offerings.


I like to bring up the JIRA example. You could replace it in-house, yeah, it is just tickets with statuses. /s

But then keep in mind that whoever builds the replacement will become the owner of an application that the business doesn’t want to pay for, and that person will be a cost center for the company.

That person had better get the marketing and negotiating skills that Atlassian has on board, because that person will be responsible for the app and will not be getting salary increases for working on something that is not the core business of the company.

Even if you can make an LLM build the app for you.


You guys keep using services like Jira, Salesforce, Stripe, Datadog, etc. While those are definitely the biggest names, I don't think people are referring to those SaaS platforms as the ones they will replace or try to build an in-house version of. It will be things like ETL pipeline services, data scraping services, maybe some internal analytics SaaS. The niche things that cost a lot because they’re in a sweet spot where only a few people need them, but no one used to have the resources to build them in-house. So, when the salesperson called and offered a perfect solution to their problem, they bought the service. Those are the ones that will be more targeted for in-house solutions.

Yes, but the market is punishing the former right now.

They are though, Atlassian's stock is in the toilet. The world seems to think Jira will be replaced by AI-built in-house replacements, for some reason.

when it takes 10 seconds to do anything on Jira, it's not hard to see why people want alternatives

Except that is not the reason, and that’s not new haha

Don’t forget that the captain of the plane makes the decisions, not Elon.

If the captain of the plane disobeyed a direct threat like that from a nation, his career is going to be limited. Yeah, Elon might throw money at him, but that guy is most likely never allowed to fly near any French territory again. I guess the whole cabin crew as well.

Being cleared to fly anywhere in the world is their job.

It would be quite stupid to lose it, like a truck driver getting his license revoked for a DUI.


>Don’t forget that the captain of the plane makes the decisions, not Elon.

>If the captain of the plane disobeyed a direct threat like that from a nation, his career is going to be limited. Yeah, Elon might throw money at him, but that guy is most likely never allowed to fly near any French territory again. I guess the whole cabin crew as well.

Again, what's France trying to do? Refuse entry to France? Why do they need to threaten shooting down his jet for that? Just harassing/pranking him (eg. "haha got you good with that jet lmao")?


I think in this hypothetical, France would want to force Musk's plane to land in French jurisdiction so they could arrest him.

*Data centers in space only make sense if they are cost effective relative to normal data centers*.

Disagree, there are a bunch of scenarios where data centers in space make sense. Like nuclear annihilation, with vaults across the globe that need to communicate and get back lost information, because ground data centers would be wiped out by EMP from the blasts.


Has it occurred to anyone that you can put computers underground? In this apocalyptic scenario you are describing, how do you expect the ground based command and control infrastructure to survive? Satellites are 100% reliant on ground based operations. That is a hard requirement. And if you put the command and control underground, might as well just skip the whole space based plan and just put the data underground.

Why is it a hard requirement?

You can put some part of operations in a high orbit that won’t decay as much, then more ops in lower orbits that decay faster.

If you put stuff underground it is much harder to communicate.


And here I thought Musk's fans are all about digging holes in the ground. The flamethrower fumes might have caused temporary amnesia.

For the record, I am not a Musk fan. I am a sci-fi fan and I make imaginary/silly stuff up on my own.

I also like reading how people argue not with what I wrote but with what they imagined I wrote.


It was not my intention to single you out, my apologies.

There is nothing wrong with imagining anything you like. But if you do it as a CEO, I personally consider that fraud. Guess I'm weird and old-fashioned like that.


In that case wouldn't space also get wiped by EMP? Seems like disabling satellites would be a good move if you have a few nukes to spare.

After the bulk of humanity is wiped out, it will be a comfort that I can still use AI to generate dank memes.
