
On the other side of the equation, I've been spending much more time on code review for an open-source project I maintain: developers have become much more productive, while I still review at the same speed.

The real issue is that I can't trust the AI-generated code, or trust the AI to do the code review for me. Some recurring issues I see:

- In my experience the AI doesn't integrate well with the code that is already there: it often rewrites existing functionality and tends not to adhere to the project's conventions, using instead whatever it was trained on.

- The AI often lacks depth on more complex issues. And because it doesn't see the broader implications of changes, it often doesn't write the tests that would cover them. Developers who wrote the PRs accept the AI-written tests without much investigation of the code base. Since the changes pass the (also insufficient) tests, they send the PR to code review.

- With AI, I think (?) I'm more often the one doing the careful deep dive into the project and re-designing the generated code during code review. In a way, it's indirect re-prompting.

I'm very happy with the increased number of PRs: they push the project forward with great ideas about what to implement, and I'm very happy about the AI-driven productivity boost. Also, with AI, developers are bolder in their contributions.

But this doesn't scale -- or I'll spend all my time code-reviewing :) I hope the AIs get better quickly.


With respect to the first issue you raise, I would perhaps start including prompts in comments. This is a little sneaky, sure, and maybe explicitly putting them in a markdown file would be better, but there's the risk that the markdown won't be loaded. Perhaps it might be possible to inject the file into context via a comment; I've never tried that, though, and I doubt every assistant would act in a consistent way. The comment method is probably the best bet IMO.
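
For illustration, here's roughly what such a comment might look like in a Go codebase (the wording, and the assumption that assistants will honor it, are untested):

    // NOTE TO AI ASSISTANTS (hypothetical convention):
    // - Reuse the helpers already defined in this package; do not re-implement them.
    // - Follow the error-wrapping and naming style used in this file.
    // - Read docs/CONTRIBUTING.md (if loaded) before proposing structural changes.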

Forgive me, because this is a bit of a tangential rant on the second issue, but Gemini 3 Pro was absolutely heinous about this, so I cancelled my sub. I'm completely puzzled as to what it's supposed to be good for.

To your third issue: you should maybe consider building a dataset from those interactions... you might be able to train a LoRA on it and use it as a first pass before you lift a finger to scroll through a PR.
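
If you go that route, a cheap first step might be just logging (prompt, generated, reviewed) triples as you review. A minimal sketch in Go; the file name and JSONL schema are assumptions, not an established format:

    package main

    import (
        "encoding/json"
        "os"
    )

    // ReviewPair records AI-generated code alongside the human-reviewed
    // version, as one training example for a future LoRA-style first pass.
    type ReviewPair struct {
        Prompt    string `json:"prompt"`    // original task description
        Generated string `json:"generated"` // code as the AI produced it
        Reviewed  string `json:"reviewed"`  // code after human review
    }

    // appendPair appends one JSON object per line (JSONL) to the dataset file.
    func appendPair(path string, p ReviewPair) error {
        f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
        if err != nil {
            return err
        }
        defer f.Close()
        return json.NewEncoder(f).Encode(p)
    }

    func main() {
        _ = appendPair("review_pairs.jsonl", ReviewPair{
            Prompt:    "add retry logic to the HTTP client",
            Generated: "...",
            Reviewed:  "...",
        })
    }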

I think a really big issue is the lack of consistency in the use of AI for SWE. There are a lot of models and poorly designed agents/assistants with really unforgivable performance, and people blindly using them without caring about the outputs amounts to something that is kind of Denial-of-Service-y. I keep seeing this issue raised over and over again.

At the risk of sounding elitist, the world might be a better place for project maintainers when the free money stops rolling into the frontier labs to offer anyone and everyone free use of the models... never give a baby power tools, and so on.


That basically matches, in broad outline, what I see from AI use in an enterprise environment; absent a radical change, I think the near-term impact of AI on software development is going to be to increase velocity while shifting workload toward less (but not zero) code writing and more code reviewing, and knowing when you need to prompt-and-review vs. hand-code.

Is there a public curated list of "good IPs" to whitelist?


> Is there a public curated list of "good IPs" to whitelist?

https://github.com/AnTheMaker/GoodBots
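
If it helps, here's a sketch of how one might consume such a list: parse the CIDR ranges and test addresses against them. The file path and one-CIDR-per-line format are assumptions; check the repo's actual layout:

    package main

    import (
        "bufio"
        "fmt"
        "net"
        "os"
        "strings"
    )

    // loadAllowlist reads one CIDR per line, skipping blanks and comments.
    func loadAllowlist(path string) ([]*net.IPNet, error) {
        f, err := os.Open(path)
        if err != nil {
            return nil, err
        }
        defer f.Close()
        var nets []*net.IPNet
        sc := bufio.NewScanner(f)
        for sc.Scan() {
            line := strings.TrimSpace(sc.Text())
            if line == "" || strings.HasPrefix(line, "#") {
                continue
            }
            if _, n, err := net.ParseCIDR(line); err == nil {
                nets = append(nets, n)
            }
        }
        return nets, sc.Err()
    }

    // allowed reports whether ip falls inside any whitelisted range.
    func allowed(nets []*net.IPNet, ip net.IP) bool {
        for _, n := range nets {
            if n.Contains(ip) {
                return true
            }
        }
        return false
    }

    func main() {
        nets, err := loadAllowlist("goodbots.txt") // hypothetical local copy
        if err != nil {
            panic(err)
        }
        fmt.Println(allowed(nets, net.ParseIP("66.249.66.1"))) // e.g. a Googlebot address
    }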


So, it's relatively easy because there are a limited number of ISPs in my country. I imagine it's a much harder option for the US.

I looked at all the IP ranges delegated by APNIC, along with every local ISP that I could find, and unioned this with

https://lite.ip2location.com/australia-ip-address-ranges

And so far I've not had any complaints, and I think that I have most of them.

At some point in the future, I'll start including https://github.com/ebrasha/cidr-ip-ranges-by-country


IMHO that is just silly... I can see various ways censorship, freedom, and the common good can coexist. Actually, I can imagine different setups where this could work...

But then, you have to define these things. E.g., the freedom of person "A" to kill person "B" infringes on person "B"'s freedom to come and go and not be killed (by "A" or anyone else)... so what is freedom? "Common good" is even more complicated... who should define it? And how?

On the other topic, I for one think that AI-generated content and fake news, as well as AI-generated ordering of results, should be censored. But it's not that easy, and implementing it is an even bigger can of worms.


The issue is: how do you prove the content was written by AI?

> But then, you have to define these things. E.g., the freedom of person "A" to kill person "B" infringes on person "B"'s freedom to come and go and not be killed (by "A" or anyone else)... so what is freedom? "Common good" is even more complicated... who should define it? And how?

Even worse: how do you make sure the definitions of such terms stay up to date with changing times?


Of course one can generalize using the colloquial "Here in Europe". And generalization is useful -- one cannot go into all the complexity and details all the time; at some point one has to summarize/generalize an argument.

Yes, Europe is not a monolithic bloc, but there is a large fraction of it that is less sex-focused; it's a fair generalization, and a fair comment to express that.


Just for another data point: it took me 4 days to cook up a WASM front-end in Go for my otherwise command-line-only Hive game:

https://janpfeifer.github.io/hiveGo/www/hive/

Probably everything JS- and DOM-related is better supported from TS, but I have to say, I was never blocked on my small project.


Nice, but why create yet another language that doesn't introduce any real gains over existing ones? It's not even a research effort...

All modern languages will compile to WASM as well; that is not a new feature.

I'm failing to see the selling point here -- but I only looked at the first pages of the documentation, and some of the tutorial.

And lacking anything really new, I'd rather see efforts to improve existing languages :)


I think there is a misconception there.

Go is as good a language as any for an ML framework -- better, if you buy into its simplicity being a virtue, and clearly worse if you factor in the ecosystem that already exists in Python and want to leverage it.

The C FFI ("foreign function interface") doesn't play a role at the ML framework level. One does not need to call C functions at a fine-grained level: one step of inference can be on the millisecond time scale for LLMs, while the C FFI cost is almost 5 orders of magnitude smaller (10s or 100s of nanoseconds?). Look, Python's C bindings are also costly, and things just work.
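
The overhead is easy to measure yourself. A minimal sketch using cgo (the numbers will vary by machine and Go version; treat it as an experiment, not a benchmark result):

    package main

    /*
    static int add(int a, int b) { return a + b; }
    */
    import "C"

    import (
        "fmt"
        "time"
    )

    func main() {
        const n = 1_000_000
        start := time.Now()
        for i := 0; i < n; i++ {
            C.add(C.int(i), 1) // each call crosses the Go -> C boundary
        }
        fmt.Printf("avg cgo call overhead: %v\n", time.Since(start)/n)
    }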

Plus, Go doesn't necessarily need a bridge to Python. It only needs a bridge to whatever Python itself is using, if it wants to be performant.

Having said that, one cannot expect to implement the lower-level operations efficiently in Go (well, competitively with the best out there), or to JIT-compile a computation graph to Go (doable, but not as performant anyway). But that doesn't matter: fast execution of computation graphs can be implemented in C++/Rust/CUDA/etc. The ML-framework level of operations (building models, layers, etc.) can all be done in Go (or Rust, Elixir, Julia, Occam, Lisp, Basic, etc.)

But even then, like with most things in life, folks think every problem needs the fastest of the fastest... one can do matrix multiplication fast enough in Go for the majority of the small ML models out there -- for many tasks, the time spent in ML inference is just a small fraction of the cost of the total system. It really doesn't matter if it is an order of magnitude slower.
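
To make that concrete: a naive float32 matmul in Go (no SIMD, no cache blocking) is a handful of lines, and as a rough sketch it's plenty for small models:

    package main

    import (
        "fmt"
        "time"
    )

    // matmul computes C = A * B for row-major n x n matrices.
    func matmul(a, b []float32, n int) []float32 {
        c := make([]float32, n*n)
        for i := 0; i < n; i++ {
            for k := 0; k < n; k++ {
                aik := a[i*n+k]
                for j := 0; j < n; j++ {
                    c[i*n+j] += aik * b[k*n+j]
                }
            }
        }
        return c
    }

    func main() {
        n := 256
        a := make([]float32, n*n)
        b := make([]float32, n*n)
        start := time.Now()
        matmul(a, b, n)
        fmt.Printf("%dx%d matmul took %v\n", n, n, time.Since(start))
    }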


Sorry, but this is just plain wrong. The real issue is CUDA, not Python. Go's memory model forces it to trampoline into C; that can cost as much as 300ns per call in Go, which is about 10x Python's overhead. These things add up significantly when doing ML training.


A few considerations come to mind:

1. The O(N^2 * d) computation cost of the attention layers. For large graphs (millions of nodes) it quickly becomes too costly. And in some of the social network problems, the more data you feed, the better the inference, on average (on a ~log scale). See the back-of-the-envelope sketch after this list.

2. As the paper suggests, in some cases the graph structure carries important information. Flattening everything out -- or rather, fully connecting the nodes, which is a more accurate description of what goes on in an attention layer in this scenario -- loses that structure. The structural information can be introduced as a positional encoding; see this paper (edited/fixed): https://arxiv.org/pdf/2207.02505.pdf So remember to do that if you attempt the attention solution.

3. Then there is overfitting, already a big issue in GNNs. Fully connecting every node with attention has less of an "inductive bias", if you will, than the one created by the graph structure. Not sure how much it matters...
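
Back-of-the-envelope on point 1, just to show the scaling (d = 128 is an arbitrary choice):

    package main

    import "fmt"

    func main() {
        // Self-attention over N nodes with embedding dimension d costs on the
        // order of N*N*d multiply-adds per layer.
        d := 128.0
        for _, n := range []float64{1e4, 1e5, 1e6} {
            fmt.Printf("N=%.0e: ~%.1e multiply-adds per attention layer\n", n, n*n*d)
        }
    }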


Noob question: when folks talk about fine-tuning an LLM, do they usually fine-tune the encoder (of the prompt), the decoder (that generates the text), or both?


Both can be done. Fine-tuning the prompt (prompt tuning) is cheaper and can be done on consumer hardware. Fine-tuning the LLM's model weights is expensive and needs cloud support.


Thanks for the reply, but when you say "fine-tuning" the prompt, do you mean fine-tuning the LLM's encoder of the prompt? (The thing that transforms the prompt into a sequence of embeddings?) But that is not cheap/easy to train...

I know some systems also allow extra fixed embedding parameters prepended to the prompt, and those can also be fine-tuned. But that is yet another thing, one that can be fine-tuned very cheaply.
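
For what it's worth, that "extra embedding parameters" variant (often called soft prompt tuning) is conceptually tiny. A sketch in Go -- all names here are illustrative, not any real framework's API:

    package main

    import "fmt"

    // SoftPrompt holds k trainable vectors of the model's embedding dimension.
    // During prompt tuning only these vectors receive gradient updates;
    // the base model's weights stay frozen.
    type SoftPrompt struct {
        Prefix [][]float32
    }

    // Apply prepends the learned prefix to the (frozen) token embeddings
    // before they enter the first transformer layer.
    func (p *SoftPrompt) Apply(tokenEmbeds [][]float32) [][]float32 {
        out := make([][]float32, 0, len(p.Prefix)+len(tokenEmbeds))
        out = append(out, p.Prefix...)
        return append(out, tokenEmbeds...)
    }

    func main() {
        p := SoftPrompt{Prefix: make([][]float32, 8)} // 8 trainable "virtual tokens"
        embeds := make([][]float32, 3)                // 3 real prompt tokens
        fmt.Println(len(p.Apply(embeds)))             // 11 vectors fed to the model
    }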


One way is to use categorical set splits [1] (proposed for categorical-set inputs, but they work for plain categorical features as well), as used in TF-DF [2]. Greedy and expensive to train (cheap inference, though), but it gives great results.

[1] https://arxiv.org/pdf/2009.09991.pdf
[2] https://www.tensorflow.org/decision_forests/text_features
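
At inference time, a categorical set split is just a subset-membership test. An illustrative sketch (not the actual TF-DF implementation):

    package main

    import "fmt"

    // SetSplit routes an example to the left child if any of its categorical
    // values falls in the subset chosen (greedily) during training.
    type SetSplit struct {
        Left map[string]bool
    }

    func (s SetSplit) GoesLeft(values []string) bool {
        for _, v := range values {
            if s.Left[v] {
                return true
            }
        }
        return false
    }

    func main() {
        split := SetSplit{Left: map[string]bool{"red": true, "blue": true}}
        fmt.Println(split.GoesLeft([]string{"green", "blue"})) // true
    }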


