StableAlkyne's comments | Hacker News

The most annoying ones are those that loosely discuss the methodology but then fail to publish the weights or any real algorithms.

It's like buying a piece of furniture from IKEA, except you just get an Allen key, a hint at what parts to buy, and blurry instructions.


This is so egregious. The value of such papers is basically nothing but they're extremely common.

> I'd love to see future reporting that instead of saying "Research finds amazing chemical x which does y" you see "Researcher reproduces amazing results for chemical x which does y. First discovered by z".

Most people (that I talk to, at least) in science agree that there's a reproducibility crisis. The challenge is there really isn't a good way to incentivize that work.

Fundamentally (unless you're independently wealthy and funding your own work), you have to measure productivity somehow, whether you're at a university, a government lab, or in the private sector. That turns out to be very hard to do.

If you measure raw number of papers (more common in developing countries and low-tier universities), you incentivize a flood of junk. Some of it is good, but there is such a tidal wave of shit that most people, as a heuristic, write off your work based on the other people in your cohort.

So, instead it's more common to try to incorporate how "good" a paper is, to reward people with a high quantity of "good" papers. That's quantifying something subjective though, so you might try to use something like citation count as a proxy: if a work is impactful, usually it gets cited a lot. Eventually you may arrive at something like the H-index: the largest number H such that you have written H papers with at least H citations each. Now, the trouble with this metric is that people won't want to "waste" their time on incremental work.
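
To make the metric concrete, here's a minimal sketch of how an H-index is computed from a list of per-paper citation counts (plain Python, purely for illustration):

    def h_index(citations):
        """Largest H such that H papers have at least H citations each."""
        counts = sorted(citations, reverse=True)
        h = 0
        for rank, cites in enumerate(counts, start=1):
            if cites >= rank:
                h = rank
            else:
                break
        return h

    # Ten incremental papers with 2 citations each score worse than
    # three widely cited ones:
    print(h_index([2] * 10))     # 2
    print(h_index([50, 20, 8]))  # 3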

And that's the struggle here; even if we funded and rewarded people for reproducing results, they would always be bumping up the citation count of the original discoverer. But it's worse than that, because literally nobody is going to cite your reproduction. In 10 years, they'll just see the original paper and a few citing works reproducing it, and to save time they'll cite the original paper only.

There's clearly a problem with how we incentivize scientific work. And clearly we want to be in a world where people test reproducibility. However, it's very, very hard to get there when one's prestige and livelihood are directly tied to discovery rather than reproducibility.


I'd personally like to see top conferences grow a "reproducibility" track. Each submission would be a short tech report that chooses some other paper to re-implement. Cap 'em at three pages, have a lightweight review process. Maybe there could be artifacts (git repositories, etc) that accompany each submission.

This would especially help newer grad students learn how to begin to do this sort of research.

Maybe doing enough reproductions could unlock incentives. Like if you do 5 reproductions, then the AC would assign your next paper double the reviewers. Or, more invasively, maybe you can't submit to the conference until you complete some reproduction.


The problem is that reproducing something is really, really hard! Even if something doesn't reproduce in one experiment, it might be due to slight changes in some variables we don't even think about. There are ways to mitigate this (e.g. the team being reproduced cooperating with the reproducing team and agreeing on which variables matter for the experiment and which don't), but it's really hard. The solutions you propose will unfortunately incentivize bad reproductions, and we might reject theories that are actually true because of that. I think one of the best ways to fight the crisis is to actually improve the quality of science: articles whose authors refuse to share their data should be automatically rejected. We should also move towards requiring preregistration with strict protocols for almost all studies.

Yeah, this feels like another reincarnation of the ancient "who watches the watchmen?" problem [1]. Time and time again we see that the incentives _really really_ matter when facing this problem; subtle changes can produce entirely new problems.

1. https://en.wikipedia.org/wiki/Quis_custodiet_ipsos_custodes%...


That's fine! The tech report should talk about what the researchers tried and what didn't work. I think submissions to the reproducibility track shouldn't necessarily have to be positive to be accepted, and conversely, I don't think the presence of a negative reproduction should necessarily impact an author's career negatively.

Every time someone reaches for the easy "Reproducibility is hard / not worth the effort," I hear "The original research wasn't meaningful or valuable."

And that's true! It doesn't make sense to spend a lot of resources on reproducing things when there is low hanging fruit of just requiring better research in the first place.

Is it time for some sort of alternate degree to a PhD beyond a Master's? Showing, essentially, "this person can learn, implement, validate, and analyze the state of the art in this field"?

That's what we call a Staff-level engineer. Proven ability to learn, implement, and validate is basically the "it factor" businesses are looking for.

If you are thinking about this from an academic angle then sure, it sounds weird to say "Two Staff jobs in a row from the University of LinkedIn" counts as a degree. But I submit this as basically the certificate you desire.


No, this is not at all being a staff engineer. One is about delivering high-impact projects toward a business's needs, with all the soft/political things that involves, and the other is about implementing and validating cutting-edge research, with all the deep academic and technical knowledge and work that that involves. They're incredibly different skillsets, and many people doing one would easily fail in the other.

That sounds precisely like the function of a Ph.D. to me.

A PhD is for making new contributions to a field, not validating existing ones.

> The challenge is there really isn't a good way to incentivize that work.

What if we got Undergrads (with hope of graduate studies) to do it? Could be a great way to train them on the skills required for research without the pressure of it also being novel?


Those undergrads still need to be advised and they use lab resources.

If you're a tenure-track academic, your livelihood is much safer if you have them try new ideas (that you will be the corresponding author on, increasing your prestige and ability to procure funding) instead of incremental replication work.

And if you already have tenure, maybe you have the undergrad do just that. But the tenure process heavily filters for ambitious researchers, so it's unlikely this would be a priority.

If instead you did it as coursework, you could get them to maybe reproduce the work, but if you only have the students for a semester, that's not enough time to write up the paper and make it through peer review (which can take months between iterations)


Most interesting results are not so simple to recreate that we could reliably expect undergrads to perform the replication, even if we ignore the cost of the equipment and consumables that replication would need and the time/supervision required to walk them through the process.

Unfortunately, that might just lead to a bunch of type II errors instead, if an effect requires very precise experimental conditions that undergrads lack the expertise for.

Could it be useful as a first line of defence? A failed initial reproduction would not be seen as disqualifying, but it would bring the paper to the attention of more senior people who could try to reproduce it themselves. (Maybe they still wouldn't bother, but hopefully they'd at least be more likely to.)

> Eventually you may arrive at something like the H-index: the largest number H such that you have written H papers with at least H citations each.

It's the Google search algorithm all over again. And it's the certificate trust hierarchy all over again. We keep working on the same problems.

Like the two cases I mentioned, this is a matter of making adjustments until you have the desired result. Never perfect, always improving (well, we hope). This means we need liquidity with the rules and heuristics. How do we best get that?


Incentives.

First X people that reproduce Y get Z percent of patent revenue.

Or something similar.


Patent revenue is mostly irrelevant, as it's too unpredictable and typically decades in the future. Academics rarely do research that can be expected to produce economic value in the next 10–20 years, because industry can easily outspend academia on such topics.

Most papers generate zero patent revenue and most never lead to patents at all. For major drugs maybe that works, but we already have clinical trials before a drug goes to market that validate its efficacy.

I'm delighted to inform you that I have reproduced every patent-worthy finding of every major research group active in my field in the past 10 years. You can check my data, which is exactly as theory predicts (subject to some noise consistent with experimental error). I accept payment in cash.

> I'd love to see future reporting that instead of saying "Research finds amazing chemical x which does y" you see "Researcher reproduces amazing results for chemical x which does y. First discovered by z".

But nobody wants to pay for it.


usually you reproduce previous research as a byproduct of doing something novel "on top" of the previous result. I don't really see the problem with the current setup.

sometimes you can just do something new and assume the previous result, but that's more the exception. you're almost always going to at least in part reproduce the previous one. and if issues come up, it's often evident.

that's why citations work as a good proxy: X number of people have done work based around this finding and nobody has seen a clear problem.

there's a problem of people fabricating and fudging data and not making their raw data available ("on request", or with not enough metadata to be useful), which wastes everyone's time and almost never leads to negative consequences for the authors.


It's often quite common to see a citation say "BTW, we weren't able to reproduce X's numbers, but we got fairly close number Y, so Table 1 includes that one next to an asterisk."

The difficult part is surfacing that information to readers of the original paper. The Semantic Scholar people are beginning to do some work in this area.


yeah, that's a good point. the citation might actually be pointing out a problem and not be a point in favor. it's a slog to figure out... but it seems like the exact type of problem an LLM could handle.

give it a published paper and it runs through the papers that have cited it and gives you an evaluation.
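
something like this rough sketch, say — it uses Semantic Scholar's public Graph API to pull the citing papers (endpoint path and field names assumed from memory of their docs, so double-check before relying on it), with ask_llm() standing in as a placeholder for whatever model you'd actually call:

    import requests

    def evaluate_citations(paper_id, ask_llm):
        # fetch the papers that cite the given work
        url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/citations"
        resp = requests.get(url, params={"fields": "title,abstract", "limit": 100})
        resp.raise_for_status()

        verdicts = []
        for item in resp.json().get("data", []):
            citing = item.get("citingPaper", {})
            # ask_llm() is a placeholder for your LLM call of choice
            prompt = (
                "Does this citing paper confirm, contradict, or merely assume "
                "the result it cites? Answer in one word.\n\n"
                f"Title: {citing.get('title')}\n"
                f"Abstract: {citing.get('abstract')}"
            )
            verdicts.append((citing.get("title"), ask_llm(prompt)))
        return verdicts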


That feels arbitrary as a measure of quality. Why isn't new research simply devalued and replication valued higher?

"Dr Alice failed to reproduce 20 would-be headline-grabbing papers, preventing them from sucking all the air out of the room in cancer research" is something laudable, but we're not lauding it.


> you have to measure productivity somehow,

No, you do not have to. You give the money to people with the skills and interest in doing research. You need to ensure it's spent correctly, that is all. People will be motivated by wanting to build a reputation and by the intrinsic reward of the work.


> If you measure raw number of papers (more common in developing countries and low-tier universities), you incentivize a flood of junk.

This is exactly what rewarding replication papers (that reproduce and confirm an existing paper) will lead to.


And yet if we can't reproduce an existing paper, it's very possible that existing paper is junk itself.

Catch-22 is a fun game to get caught in.


> The challenge is there really isn't a good way to incentivize that work.

Ban publication of any research that hasn't been reproduced.


If we did that, CERN could not publish, because nobody else has the capabilities they do. Do we really want to punish CERN (which has a good track record of scientific integrity) because their work can't be reproduced? I think the model in many of these cases is that the lab publishing has to allow some number of postdocs or competitor labs to come to their lab and work on reproducing it in-house with the same reagents (biological experiments are remarkably fragile).

> Ban publication of any research that hasn't been reproduced.

Unless it is published, nobody will know about it and thus nobody will try to reproduce it.


Just have a new journal of only papers that have been reproduced, and include the reproduction papers.

lol, how would the first paper carrying some new discovery get published?

There's also the communications aspect:

SO was/is a great site for getting information if (and only if) you properly phrase your question. Oftentimes, if you had an X/Y problem, you would quickly get corrected.

God help you if you had an X/Y Problem Problem. Or if English wasn't your first language.

I suspect the popularity is also boosted by the last two; it will happily tell you the best way to do whatever cursed thing you're trying to do, while still not judging over English skills.


> Hard to trust it if it isn't fully OSS

It's an emulated PDP-11, could you elaborate on the threat model here?

I get that companies are being gross about logging everything online, but come on. It's okay to have fun.

Who in their right mind is using this for anything other than curiosity's sake?


Little bit of banking on an emulator on a random website, why not?


bitcoin will not be mined on its own.


It's less the fact that someone owns JS's trademark, and more that it's specifically Oracle (they got it when they bought Sun).

Oracle is an incredibly litigious company. Their awful reputation in this respect means that the JS ecosystem can never be sure they won't swoop in and attempt to demand rent someday. This is made worse by the army of lawyers they employ; even if they're completely in the wrong, whatever project they go after probably won't be able to afford a defense.


> Oracle is an incredibly litigious company. Their awful reputation in this respect means that the JS ecosystem can never be sure they won't swoop in and attempt to demand rent someday. This is made worse by the army of lawyers they employ; even if they're completely in the wrong, whatever project they go after probably won't be able to afford a defense.

That is why on one level I am surprised by the petition. They are talking to a supercharged litigation monster and asking it "Dear Oracle, ... We urge you to release the mark into the public domain". You know what a litigation-happy behemoth does in that case? It goes and asks some AI to write a "Javascript: as She Is Spoke" junk book on Amazon just so they can hang on to the trademark. Before, they didn't care, but now that someone has pointed it out, they'll go out of their way to assert their usage of it.

On the other hand, maybe someone there cares about their image and would be happy to improve it in the tech community's eyes...


> It goes and asks some AI to write a "Javascript: as She Is Spoke" junk book on Amazon just so they can hang on to the trademark.

IANAL, but I don't think that would be enough to keep the trademark.

Also, the petition was a "we'll ask nicely first so we can all avoid the hassle and expense of legal proceedings"; they are now in the process of getting the trademark invalidated, but Oracle, illogically but perhaps unsurprisingly, is fighting it.


I was just using it as an example of doing the absolute minimum. They could write a dumb Javascript debugger or something with minimal effort.

But yeah, IANAL either and am just guessing. I just know Oracle is shady, and if you challenge them legally they'll throw their weight around. And I'm not sure if responding to a challenge with a new "product" is enough to reset the clock on it. Hopefully the judge will see through their tricks.


That's why courts don't take hypothetical cases. Someone has to be injured to demonstrate actual harm.

Are there any examples of Oracle using their JavaScript trademark to sue anyone? If they did, that petition would have merit.

Unless Deno was, this feels like a marketing project. And it's working, too, so kudos.


Trademark law is kind of about hypotheticals though. The purpose of a trademark is to prevent theoretical damages from potential confusion, neither of which you ever have to show to be real

In this case the trademark existing and belonging to Oracle is creating more confusion than no trademark existing, so deleting it is morally right. And because Oracle isn't actually enforcing it it is also legally right

Imho this is just the prelude to getting better press. "We filed a petition to delete the JavaScript trademark" doesn't sound nearly as good as "We collected 100k signatures for a letter to Oracle and only got silence, now we formally petition the USPTO". It's also a great opportunity to find pro-bono legal counsel or someone who would help fund the petition.


It's the specter of a lawsuit that's the problem.


The other aspect here is that general knowledge (citation needed) says that if a company doesn't actively defend their trademark, they often won't be able to keep it if challenged in court. Or perhaps general knowledge is wrong.


At this point I'm going to assume that adding -Script to a trademarked name allows me to use that name freely.


JavaScriptScript?


JavaScript-Script


Kleenex-Script


Unless that suffixed version is itself already trademarked, like AppleScript.


iPhoneScript should be fine though?


Oracl3Script?


Yeah, that's how I'm going to call my LLM-based law-firm.


Turn it around: Scriptacle.


Assuming Oracle did decide to go down that route, who would they sue? No one really uses the JavaScript name in anything official except for "JavaScriptCore" that Apple ships with Webkit.


Afaik they already sued Deno: https://deno.com/blog/deno-v-oracle2

Edit: Seems I'm incorrect, see below


I had no idea this was a thing! I'm surprised this didn't attract more attention.


My bad, after reading more it seems Deno is trying to get Oracle's trademark revoked, but I found out that "Rust for Javascript" devs have received a cease and desist from Oracle regarding the JS trademark, which may have triggered Deno to go after Oracle.


> who would they sue

Anyone they feel like. Lawnmower gonna mow.


The incredibly litigious company here is Deno. Deno sued on a whim, realized they were massively unprepared, then asked the public to fund a legal campaign that will benefit Deno themselves, a for-profit, VC-backed company.

This personal vendetta will likely end with the community unable to use the term JavaScript. Nobody should support this.


Your comment seems incredibly confused.

1. Oracle is the litigious one here. My favorite example is that time they attacked a professor for publishing less-than-glowing benchmarks of their database: https://danluu.com/anon-benchmark/ What's to stop them from suing anyone using the term JavaScript in a way that isn't blessed by them? That's what Deno is trying to protect against.

2. Deno is filing a petition to cancel the trademark, not claim it themselves. This would return it to the public commons.

It should be obvious from these two facts that any member of the public that uses JavaScript should support this, regardless of what they think of Deno-the-company.


> This personal vendetta will likely end with the community unable to use the term JavaScript. Nobody should support this.

Why would that be the case, if not for Oracle's litigiousness?


Hi Larry Ellison! Will you mow my lawn?


I think what the GP was referring to was the "new" owner of Sears, who reorganized the company into dozens of independent business units in the early 2010s (IT, HR, apparel, electronics, etc). Not departments, either; full-on internal businesses intended as a microcosm of the free market.

Each of these units was then given access to an internal "market" and directed to compete with the others for funding.

The idea was likely to try and improve efficiency... But what ended up happening is that siloing increased, BUs started infighting for a dwindling set of resources (beyond the normal politics you'd expect at an organization that size; actively trying to fuck each other over), and cohesion decreased.

It's often pointed to as one of the reasons for their decline, and worked out so badly that it's commonly believed their owner (who also owns the company holding their debt and stands to immensely profit if they go bankrupt) desired this outcome... to the point that he got sued a few years ago by investors over the conflict of interest and, let's say "creative" organizational decisions.


This happened at a place where I worked years ago, but not as 'on purpose.' We were a large company where most pieces depended on other pieces, and everything was fine - until a new CEO came in who started holding the numbers of each BU under a microscope. This led to each department trying to bill other departments as an enterprise customer, who then retaliated, which then led to internal departments threatening to go to competitors who charged less for the same service. Kinda stupid how that all works - on paper it would have made a few departments look better if they used a bottom barrel competitor, but in reality the company would have bled millions of dollars as a whole...all because one rather large BU wanted to goose its numbers.


Why is that a bad thing? If an internal department that’s not core to their business is less efficient than an external company - use the external company.

Anecdote: Even before Amazon officially killed Chime, everyone at least on the AWS side was moving to officially supported Slack.


I guess it depends on circumstances, but it boils down to each department only costing the others some marginal amount in practice.

Imagine a hosting company and a DNS company, both with plenty of customers and capacity. The hosting company says... I'll host your site if you provide DNS for ours. Drop in the bucket for each.

One year the DNS company decides it needs to show more revenue, so it will begin charging the hosting company $1000/yr, and guess what, the hosting company does the same. Instead, they each get mad and find $500/yr competitors. What was accomplished here?

Further, it just looks bad in many cases. Imagine if Amazon.com decided AWS was too expensive, and decided to move their stuff off to say, Azure only. That wouldn't be a great look for AWS and in turn hurts...Amazon.

I do get your point, but there are a lot of... intangibles about being in a company together.


There is more politics than you think within Amazon Retail about moving compute over to AWS. I’m not sure how much of Amazon Retail runs on AWS instead of its own infrastructure (CDO).

I know one project from Amazon got killed because their AWS bill was too high. Yeah AWS charges Amazon Retail for compute when they run on AWS hardware.

https://www.lastweekinaws.com/blog/the-aws-service-i-hate-th...


As a rule, organizations are created to avoid the transaction costs on those detail tasks. If you externalize every single supporting task into a market, you will be slowed down to a drag, won't be able to use most competitive advantages, and will pay way more than doing them in house.

But removing the market competition is a breeding ground for inefficiency. So there's a balance there, and huge conglomerates tying their divisions together serves only to make the competitive ones die by the need to use the services of the inefficient ones.


My four years at AWS kind of indoctrinated me. As they said, every time you decide to buy vs build, you have to ask yourself "does it make the beer taste better"?

Don’t spend energy on undifferentiated heavy lifting. If you are Dropbox it makes sense to move away from S3 for instance.


to put a finer point on it, it wasn't just competition or rewarding-the-successful, the CEO straight up set them at odds with each other and told them directly to battle it out.

basically "coffee is for closers... and if you don't sell you're fired" as a large scale corporate policy.


Yes, this is what I was referring to. I should have provided more context, thanks for doing so.


That was a bullshit separation of a single horizontal cut of the market (all of those segments did consumer retail sales) without overlap.

The part about no overlaps already made it impossible for them to compete. The only "competition" they had was in the sense of TV gameshow competition where candidates do worthless tasks, judged by some arbitrary rules.

That has absolutely no similarity to how Samsung is organized.


Do you have a source for this? I don't see any indication from a quick Google other than this thread as the second result.

The license at: https://github.com/juce-framework/JUCE/blob/master/LICENSE.m...

indicates you can just license any module under the AGPL and avoid the JUCE 8 license (which, to be fair, I'm not bothering to read)


https://forum.juce.com/t/archived-juce-8-eula/60947/149

And sure, you can license under AGPL. It should be obvious that's undesirable.


It boggles the mind that they built a "low code" interface to designing websites, with the express purpose of making it easy to use...

..and then used Excel formulas of all things as the basis for its scripting language.

It's as if they wanted these things to be as clunky and spaghettified as possible.


At some point, doing things the "low code/no code" way turns out to be more painful than just . . . writing code.


for those that can write code. if you can't write code, the more painful way is just the way


A lot of those people end up writing code without realizing they’re writing code.


I don't know the MS offering, but places like Wix/Square or using WordPress definitely do not end up with the user writing code.


Instead, you end up installing an endless list of plugins that are sometimes so poorly written that I've decided to call WordPress "RCE-as-a-Service".


that just sounds more like a case of a square peg in a round hole. Yes, WP is a nightmare, just like NPM and its ilk are to me as well. Adding WP to my list invited this level of response, and I realize now I should have left it off. It really doesn't do much for moving the conversation in the right direction.


That's my point. At some point, people's fear of learning code is causing them to do things in ways that are unnecessary and overcomplicated, which is quite a bit ironic.


You say fear. I say unnecessary for the task at hand. My mom doesn't need to learn how to code to make a website for her florist shop. She just needs a site that can host some basic information like contact info, a gallery of example images, and maybe some cheesy "about" page that people feel is oh so important.

We're obviously reading a developer centric forum where people seem to have a hard time seeing things from anything other than a developer's point of view. Have hammer, everything is a nail situation. People just not wanting to become a coder isn't because they are scared of it. They just don't want to do it. I don't want to be a florist. I don't go bitching to florists that there's not an easy way to make floral arrangements without learning basics nor does it make me scared of it. Whatever "fear" you want to imply really makes you sound out of touch with non-developers.


I realize that for the simple use cases like that it's fine. I'm talking about people at work using complicated workflows in "low code" tools or spreadsheets full of macros. At some point it's equally or more complex, just in a different way.


Having been involved in a “no code” product, I’ll just say that it’s a really crappy way to write programs. You’re better off creating a DSL of some sort and asking people to type. Demanding that people click the mouse three times to open an input box where they can type something and then doing that a few hundred times is not “better.” It’s infuriating.


It's crazy how much RAM has inflated in the last month. I checked the price history of a few DDR5 kits and most have tripled since September.


Why specifically just now? It doesn't seem that much has materially changed very recently.


It's due to every hyperscaler building out new AI datacenters. For example, you have Google recently saying things like "Google tells employees it must double capacity every 6 months to meet AI demand", and that they need to increase capacity by 1000x within 4-5 years (the two figures line up: doubling every 6 months is ten doublings in five years, and 2^10 ≈ 1024, i.e. roughly 1000x).


There's legitimately interesting research in using it to accelerate certain calculations. For example, usually you see a few talks at chemistry conferences on how it's gotten marginally faster at (very basic) electronic structure calculations. Also some neat stuff in the optimization space. Stuff you keep your eye on hoping it's useful in 10 years.

The most similar comparison is AI stuff, except even that has found some practical applications. Unlike AI, there isn't really much practicality for quantum computers right now beyond bumping up your h-index

Well, maybe there is one. As a joke with some friends after a particularly bad string of natural 1's in D&D, I used IBM's free tier (IIRC it's 10 minutes per month) and wrote a dice roller to achieve maximum randomness.
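
For the curious, the whole thing fits in a few lines of Qiskit. A minimal sketch (this version runs on the local Aer simulator rather than IBM's free hardware tier, and uses rejection sampling to fold 5 random bits into a d20):

    from qiskit import QuantumCircuit, transpile
    from qiskit_aer import AerSimulator

    n = 5  # 5 qubits -> 32 outcomes, enough to cover a d20
    qc = QuantumCircuit(n, n)
    qc.h(range(n))                  # put every qubit into equal superposition
    qc.measure(range(n), range(n))  # collapse to 5 classical random bits

    sim = AerSimulator()
    while True:
        counts = sim.run(transpile(qc, sim), shots=1).result().get_counts()
        value = int(next(iter(counts)), 2)  # single bitstring, e.g. '10110'
        if value < 20:                      # keep 0..19, reroll 20..31
            print(f"d20 roll: {value + 1}")
            break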


that was my understanding too - in the fields of chemistry, materials science, pharmaceutical development, etc... quantum tech is somewhat promising and might be pretty viable in those specific niche fields within the decade.

