HN has been under downward quality pressure for a long time. Figuring out how to withstand it has been the core idea all along:
HN is an experiment. As a rule, a community site that becomes popular will decline in quality. Our hypothesis is that this is not inevitable—that by making a conscious effort to resist decline, we can keep it from happening. - https://news.ycombinator.com/newswelcome.html
pg wrote that over 15 years ago. I've been saying for (a mere) 10 years that we're trying to stave off the arrow of internet entropy: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... – not doable forever, no doubt, but doable for a while if we expend enough energy.
Based on experience so far, the way to do this is through a combination of a dedicated community, software that does what software can, and moderation to bump the system out of its failure modes.
Trying to hide from the outside world is mostly not the way. pg used to do tricks like Erlang Day on occasion, but having that as the main strategy would be like trying to avoid infection by never going outside. Far better is to have a robust immune system, if possible. Trying to avoid (resistible) pathogens weakens the immune system, and hiding HN from new users is a path to decrepitude. The latter is probably a greater threat IMO, and too easy for established users to discount.
Spam, and its cousins like content marketing, could kill HN if it became orders of magnitude greater—but from my perspective, it isn't the hardest problem on HN. That's because of the dedicated community, which flags these things, reports them when they escape in the wild, and is vigilant about quality. Without such a community, HN would have died long ago.
By far the harder problem, from my perspective, is low-quality comments, and I don't mean by bad actors—the community is pretty good about flagging and reporting those; I mean lame and/or mean comments by otherwise good users who don't intend to and don't realize they're doing that. There's an unholy dynamic between those and the upvote system, so worse comments often get upvoted more than better comments do—often enough to choke the threads with weeds. That's the high-order bit and what I spend more time worrying about—not Cassandra E Oakley and her trading system*, nor the latest startup voting ring and whatnot. If those ever become the high bit, we might be doomed, but we should see it happening long enough in advance to readjust.
p.s. (@dang doesn't work - I only saw this by accident. Well, not by accident because it ended up at the top of the thread)
> There's an unholy dynamic between those and the upvote system, which means that (by default) worse comments get upvoted more than better comments do...
Can you elaborate on this a bit? I don't see why "lame and/or mean comments by otherwise good users who don't intend to and don't realize they're doing that" should have an "unholy dynamic" with the upvote system.
(FWIW, what I have observed is that once a comment becomes established as the top comment in a thread -- and it doesn't take much for that to happen -- it is nearly impossible to dislodge it. That means that getting into a thread early is crucial for getting noticed. I've pretty much stopped commenting on threads that are older than an hour or two because I can be 99.9% certain that whatever I write will never be noticed no matter how good it might be. And FWIW2, the comment I'm responding to is 50 minutes old as I write this.)
The dynamic being referred to is that low quality comments in the form of memes, distasteful jokes, attacks on other people, and similar comments tend to get upvoted a lot as they provide some entertainment to the upvoter, but said upvoted comment is highly damaging to the community in the kind of tone it sets for the thread, as well as the example it sets for the future.
Optimizing for this kind of low-effort but highly upvoted comment is called “karma whoring” in some places.
Most upvoting is reflexive rather than reflective [1], so posts which generate a quick response are more likely to get upvoted. I think that mostly happens when the reader has a rapid feeling response: could be indignation (how dare $THEY!), could be familiarity (no way! I like $THING too!), could be a quick association from $Familiar-A to $Obvious-B [2], but whatever it is, it's likely to be something that doesn't take much processing.
The reflective circuitry, which takes in new information, turns it over, and generates an unpredictable response, is much slower and harder to run. I suppose it's a bit like the difference between a sugar hit and eating nutritious food with fiber. The latter makes you feel better in the long run, but when it comes to mass dynamics, the sugar hit wins out every time.
> once a comment becomes established as the top comment in a thread -- and it doesn't take much for that to happen -- it is nearly impossible to dislodge it
Moderators downweight top subthreads that are generic or otherwise lame, and repeat this until the top subthread is no longer lame—if possible. The trouble is that this is an intensive manual process. Most likely the software needs to be adjusted as well.
[2] This is probably the basis for the generic subthreads which are the bane of this forum: not bad enough to flag, but predictable enough to suffocate.
> The trouble is that this is an intensive manual process.
And the other problem is that even when it works perfectly, this process as described can only produce non-lameness at best.
I have a suggestion based on something I did at Google 20+ years ago: compute PageRank on commenters. I did this 24 years ago in the Google Translation Console, which was a (now long-since-retired) public interface for volunteers to translate Google's site content into other languages. Translators would not just input their own translations but also rate the translations submitted by others. Translators whose translations were upvoted more often were deemed more reliable raters, just as links from highly ranked web pages are weighted more highly when computing PageRank. This successfully prevented any translation spam from ever making it through to the public site (as far as we could tell). It never reached the scale of HN, but it's a lot easier to implement such a thing nowadays too, so I think it might be worth a try.
This is an interesting idea, but some of the lamest commenters have tens of thousands of karma points, and many of the best commenters have relatively little karma because they don't spend all their time on HN.
How do you structure this to avoid that problem? How would the system work for a new account just created to get established?
It has been over 20 years since I wrote the TranslatorRank code so I have forgotten a lot of the details. But as a first cut, I would divide the raw score by the age of the account, and maybe weight more recent activity over older activity. Brand new accounts would probably need to be handled as a special case.
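For anyone who wants to play with the idea, here is a minimal sketch of what a PageRank-style commenter score with the age normalization might look like. To be clear, this is not the TranslatorRank code (its details are forgotten, as noted above) and not anything HN runs. The function names, the damping factor, and the 30-day floor for new accounts are all assumptions made up for illustration.

```python
from collections import defaultdict


def rater_rank(votes, iterations=50, damping=0.85):
    """PageRank-style score over an upvote graph (illustrative only).

    votes: list of (voter, author) pairs, one pair per upvote.
    Returns a dict mapping each user to a score; scores sum to 1.
    An upvote from a highly scored user counts for more than an
    upvote from a low-scored one.
    """
    users = sorted({u for pair in votes for u in pair})
    n = len(users)
    out_count = defaultdict(int)
    for voter, _ in votes:
        out_count[voter] += 1

    score = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        nxt = {u: (1.0 - damping) / n for u in users}
        # Each voter splits their current score across the upvotes they cast.
        for voter, author in votes:
            nxt[author] += damping * score[voter] / out_count[voter]
        # Users who never vote spread their score evenly over everyone.
        dangling = sum(score[u] for u in users if out_count[u] == 0)
        for u in users:
            nxt[u] += damping * dangling / n
        score = nxt
    return score


def age_normalized(score, account_age_days, min_age_days=30):
    """Divide each score by account age (hypothetical first cut).

    min_age_days is an assumed floor so that brand new accounts
    are not boosted by dividing by a tiny age.
    """
    return {
        u: s / max(account_age_days[u], min_age_days)
        for u, s in score.items()
    }


votes = [("a", "b"), ("c", "b"), ("b", "a")]
scores = rater_rank(votes)
ages = {"a": 3000, "b": 300, "c": 10}
normalized = age_normalized(scores, ages)
```

Weighting recent activity over older activity could be layered on by giving each vote a time-decayed weight instead of counting every vote as 1, but that is left out here to keep the sketch short.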
Just out of curiosity, who would you cite as an example of a lame commenter with tens of thousands of karma points?
>The reflective circuitry ... is much slower and harder to run.
considering this along with the comment below in the thread:
>most activity on a thread seems to happen in the first 24 hours of posting it. the discussion tapers off beyond that.
Supports more software experiments to encourage "slower" or longer discourse, and thus reflection.
This is my longtime wish as a user: reading via RSS most conversations are long dead when I get to them. I subscribe to replies on my comments, but since few other people do, it's not that useful. Simply promoting such features might help, as I bet few users are aware of them.
Well, when I say slower I mean something like (to be generous) a minute, as opposed to 500ms. So we're talking about very different time scales! But you make a good point.
I think reddit has the best system for ranking comments and threads of any I have seen. I haven't studied the source, but it seems to hinge on what I have taken to call 'piloting' new posts: Allow it a brief time in the top spot (possibly only for a random subset of users), and see how well it performs (upvote wise) compared with the other comments. And, importantly, the quality requirement for the post increases the higher in the total hierarchy it is: root-level post, top-level reply to a top comment, and so on.
I know HN does something similar, but it is not quite as good as reddit. From observation, specifically the 'penalty', or added performance requirement, of latching on to a top post is too weak. The result is that all HN comment threads consist of only a few top level posts, with subthreads growing off them, because you can easily 'jump the queue' just by commenting to a top comment. This is also what contributes to the idea that it is pointless to make new root level comments after an hour - because almost all the action is in subcomments to top comments.
Edited to add: Reddit soft-hides (collapses) subthreads that are deemed lower importance, which is probably key to making the ranking system work. Anyone interested in a subthread may expand the hidden/collapsed sections, and they may even be upvoted back to uncollapsed state. But by default they don't muscle into the main conversation. HN already has the collapse feature, which could be reused for this. It's just a client-side collapse, as is the reddit one (though in huge threads, deeper threads will be loaded on demand).
> And, importantly, the quality requirement for the post increases the higher in the total hierarchy it is: root-level post, top-level reply to a top comment, and so on
How?
> the 'penalty', or added performance requirement, of latching on to a top post is too weak
I'm confused by what you mean by 'quality' and 'performance', unless you just mean upvotes.
By 'quality' or 'performance' I mean the metric by which a post (and its children) is shown higher or lower, or even auto-collapsed.
I think reddit just counts a ratio of upvotes to views (ofc downvotes too). It is possible that users collapsing a comment/subthread also has some weight. Would make sense.
> And, importantly, the quality requirement for the post increases the higher in the total hierarchy it is: root-level post, top-level reply to a top comment, and so on
What I mean by this, is that the more prominent a place a post holds, the better it must 'perform' (per the above definition). Prominence being mostly just how high on the page it is.
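A rough sketch of that idea, for concreteness. This is a guess at the shape of such a rule, not reddit's actual algorithm (which, as said above, hasn't been checked against the source) and not HN's. Every name and constant here (the base threshold, the discount factors, the minimum view count for the 'pilot' window) is a made-up assumption.

```python
def performance(upvotes, downvotes, views):
    """Net votes per view; 0.0 if nobody has seen the comment yet."""
    if views == 0:
        return 0.0
    return (upvotes - downvotes) / views


def required_performance(depth, position, base=0.02,
                         depth_discount=0.5, position_discount=0.8):
    """Bar a comment must clear to keep its slot.

    depth: 0 for a root-level comment, 1 for a reply to it, and so on.
    position: 0 for the top slot among siblings, 1 for the next, etc.
    The bar is highest for the most prominent slot and relaxes
    the deeper or lower on the page the comment sits.
    """
    return base * (depth_discount ** depth) * (position_discount ** position)


def should_collapse(upvotes, downvotes, views, depth, position, min_views=50):
    """Collapse only after the comment has had a fair 'pilot' run."""
    if views < min_views:
        return False
    return performance(upvotes, downvotes, views) < required_performance(depth, position)
```

With these invented numbers, a comment with 3 net upvotes over 200 views (0.015 per view) would be collapsed in the top root-level slot, where the bar is 0.02, but would stay open as a first reply one level down, where the bar is 0.01.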
most activity on a thread seems to happen in the first 24 hours of posting it. the discussion tapers off beyond that.
context: i read the "past" page every day, so often am several hours or over a day behind discourse. interestingly enough this post is still relatively new and was popular enough before midnight utc to show up in today's past page.
i wonder how much positive feedback posts receive if they miss the initial window.
> “once a comment becomes established as the top comment in a thread -- and it doesn't take much for that to happen -- it is nearly impossible to dislodge it”
Sometimes moderators actively bump down top comments.
It’s probably related to what dang wrote above about the “unholy dynamic” of low-content comments that trigger easy upvotes. I remember myself having written some (frankly) lightweight low-effort snark, see it end up at the top of the comments for a few hours, and then it went mysteriously halfway down the list even though the upvotes didn’t stop.
To be clear, I think it’s a good thing the moderators do this kind of weighting, and the invisibility of comment upvote counts to non-authors is an important feature because it enables this.
Indeed, the sad reality I've discovered is that vote count != quality. In fact some of the highest quality comments of mine tend to languish in obscurity. I don't care about karma and invisible internet points so this doesn't bother me at all, but it does seem like a reflection of a sub-optimal system, though I of course don't have answers for how to make it better.
Maybe it's not the growth of a community which really causes decline in quality but where the growth comes from. Folks that come because they saw a discussion link in a git message or startup conversation are likely to have very different interactions than folks that come from a search engine or a social media share. If 100k users showed up tomorrow from the former category would quality here really go down?
I agree it probably doesn't make sense to try to block growth sources though. Just considering the perspective "it's not effective to try to play whack a mole anyways" already seems enough evidence of that alone.
All this reminds me though I've been using the site too much lately and my interaction quality has gone down. Time for another short break :).
Anyways
Yes absolutely. The source makes a massive difference! Every community site suffers from this problem whether it's HN, Reddit, or private torrent trackers. Blocking new users or making things difficult for them is not the answer; that just filters out potentially great users. I don't know how you control for source though without turning it into "you have to know somebody" which tends to isolate people who don't know people IRL.
However, while I completely agree that spam isn't anywhere close to being a threat to Hacker News, I'm not so sure about subtle, well-executed content marketing/influence campaigns. By design, these are much less overt than mere spam links to a porn website, and there are plenty of places for them to pop up on HN - for instance, in any number of recent threads lamenting the sad state of modern appliances/tools (like vacuum cleaners, e.g. [1]) - and it would be extremely difficult to even approximate the scale of the problem.
I'm not advocating hiding from the outside world in the same sense that one might try to intentionally isolate their community to prevent others from joining - just in the sense of avoiding a specific discovery mechanism (Google) that results in extremely strong incentives for content marketing companies to infiltrate and influence. More specifically, I posit that the amount of organic discovery and community growth that we would get would be significantly outweighed by the attention from content marketers - most people that I know discovered HN through blog posts and individual recommendations, and not through search engine results.
Ultimately, of course, it's up to you - but please think about this issue (and particularly the tradeoff of the upsides vs downsides of being highly-ranked in Google results), and look for indicators of content marketing. Hacker News has, by far, the highest quality-size product of any community that I've seen on the internet ever, and I would dearly love for it to stay that way for many decades to come - which I don't think will happen if "Hacker News product recommendations" becomes a currency with supply comparable to GitHub stars[2].
I don’t know enough about web dev to know the performance penalty that might be incurred but, has there been any experimentation with hiding the voting buttons for the first 5-15 seconds after a comment is viewed (scaled up the longer the comment is, to force the user to actually think about the comment before reacting)?
Similarly to how reply is hidden for the first few minutes for hot threads.
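The delay calculation itself would be cheap. A minimal sketch, assuming the delay tracks an estimated reading time clamped to the 5-15 second range suggested above. The function name and the words-per-second figure are invented for illustration, and nothing here reflects how HN actually works.

```python
def vote_delay_seconds(comment_text, min_delay=5.0, max_delay=15.0,
                       words_per_second=4.0):
    """Seconds to wait before showing vote buttons (hypothetical).

    Longer comments get a longer delay, based on a rough reading-time
    estimate, clamped to the range [min_delay, max_delay].
    """
    words = len(comment_text.split())
    reading_time = words / words_per_second
    return max(min_delay, min(max_delay, reading_time))
```

The harder part is presumably not performance but deciding when a comment counts as "viewed", which would need client-side tracking of what is on screen.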
So many places on the internet used to be wonderful places of discussion. I remember as a teenager I would come to the internet and be in awe at this system that mankind created to allow for all humans to come together and discuss.
But I’ve noticed since 2015 every single discussion place I used to frequent online has become horrible.
I discovered hacker news only recently and this website seems like the last remaining jewel of the internet that still exists. I think the reason is because of you and the philosophy you use to moderate the website.
Just wanted to express my gratitude to you. Not sure how often people say thank you to you for the work you do but they should say it more often!
I found it because one of my friends used to post links to insightful posts here on our group chat. Eventually my curiosity led me to explore the site more and I stuck around.
At first I thought it was a site entirely devoted to computer stuff but it was only later that I realized what this site was really about.
I found it relatively recently and because it was frequently mentioned in tech spaces. (From videos to articles). After a while I checked it out and started reading it regularly myself.
I was browsing the archives of HN the other day and I noticed the quality of most comments to be pretty much the same. Similar proportion of off topic or just replying to the title style comments. The length of the comments seemed longer now compared to back in the day. There were a similar number of joke / pun / silly comments which didn't get upvoted. I think there were fewer lame/mean/snarky comments in the past. I didn't see as much flame.
On that note, Discourse nudges you to do a tutorial on how to use their forum through the private messages from Discobot. I wonder if something like that which shows the kinds of comments that the community needs, along with the upvote limitations I’ve talked about elsewhere and privately messaged you about, may provide a sufficient mitigation to this concern and help users be better contributors in this forum.
Also, stuff like Erlang day isn’t a bad idea: it is also a kind of nudge to remind the community about what kinds of discussions and comments to prioritize.
I disagree about which problem is harder. I agree that the problem is low-quality comments, but not the lame or mean kind. When I see comments on a topic I happen to be an expert in, it's obvious to me that the majority of them are misinformation or come from uninformed users. These people don't mean harm, they are legitimately misinformed or inexperienced. There typically are insightful comments as well, and often they are upvoted, but they have become the minority compared to the low quality comments. This ratio has steadily gotten worse over the past years. It's not as bad as reddit, but every pseudonymous community seems to suffer this problem as it's directly tied to trust and identity. There is very little to be gained from comments on HN anymore, I mostly come here for the links now since it's a decent source of news curation.
I think I'd include those under "lame". Misinformed comments aren't lame if they're curious and (therefore) open to correction, but when they are categorical statements (usually some form of denunciation or grandiose generalization), yeah that's lame.
> This ratio has steadily gotten worse over the past years
I'm not sure. Everything has always been getting worse, or feels like it has. You'd have to discount for this bias in order to tell whether things have really gotten worse, and no one really knows how to do that, nor really wants to. It's more satisfying to feel the decline of civilization setting in as one ages. Yes, that's a grandiose generalization and therefore lame.
Or, to flip it into a positive, you're more experienced and informed than you used to be, so more comments appear misinformed or inexperienced.
Some titration of open-minded misinformed comments is probably vital to the functioning of the site; without them, a lot of interesting true stuff is buried as shared subtext among the specialists here, and sails over the heads of everyone else.
This is a problem I've seen in specialist forums elsewhere; the conversation runs out, because nobody's poking the community with wrong stuff, and correcting a wrong statement is intensely more motivating than just making a plain factual statement for its own benefit (also: when you do that, the rest of the community will go "uh, yeah, and?").
* https://news.ycombinator.com/item?id=39425371 - unkilled for the occasion