
Oh, this whole song and dance thing again.

I have no idea where on Wikipedia this is a problem. Since this is Hacker News, I humbly request that you expand your stomping grounds to include articles on algorithms. Coverage and quality are woefully inadequate, and I've yet to see the tracks of a deletion brigade running through any of them.

Also, I've found learning then explaining algorithms is a wonderful way to retain them long-term. Benefits all around.



Here's a page I created years ago:

http://en.wikipedia.org/wiki/Mary_Ann_Davidson

This is an article about an individual non-famous executive at a tech company.

I've never once had to defend it (it's been more than a year since I even looked at it). Why hasn't it been deleted by roving bands of "deletionists" trying to score points?

Because it cites sources and makes a clear statement of notability, as the Wikipedia project asks.

Does the Wikipedia project make mistakes and delete articles it shouldn't? Sure! All the time. But, for the most part, if you do what Wikipedia asks you to do, the system works fine. If you write an article about an algorithm that cites the academic literature, it'll most likely survive.

On the other hand, if you write an article about a well-known algorithm but fail to cite sources or include a single-sentence lede about why the algorithm is important, it is somewhat likely that some Wikipedian patrolling new articles will nominate it for deletion. Why? Because it'll be a member of a cohort of similar-looking articles most of which will be the CS equivalent of cold fusion research, and without citations, Wikipedians will have nothing to judge it by.

To me, it's a small miracle that Wikipedia works as well as it does, and that Wikipedia has more or less replaced "real" encyclopedias. That they'll occasionally jump the gun on deleting articles that don't look legitimate seems like a very, very small price to pay for that.


> To me, it's a small miracle that Wikipedia works as well as it does, and that Wikipedia has more or less replaced "real" encyclopedias. That they'll occasionally jump the gun on deleting articles that don't look legitimate seems like a very, very small price to pay for that.

A price is something you must give up in exchange for something else. I am not convinced that the wikilawyers are the price of Wikipedia any more than a crazy man punching people outside of McDonald's is the price of a hamburger. In reality, you could probably get rid of one and still have the other.

The value of Wikipedia to most people is in the massive amount of work that is put into improving the articles and the breadth of information it contains despite all the deletion — I know not a single normal person who looks at Wikipedia and says, "Thank God I cannot find anything non-notable on here. I was worried about that."


You haven't actually engaged my argument. My argument is that judgement calls about which articles not to host on Wikipedia are one of a small number of driving forces that make the project actually work. You've responded to that by equating judgement calls about not hosting articles with a guy punching people outside a McDonald's. You'll be upvoted for that, because it's good, funny writing, but there's nothing intellectually honest about the point you've made.


I'm not sure if we're just talking past each other or what, but I feel that you're ignoring my point rather than vice-versa. Deleting an article on a computer scientist because he was written up on LWN instead of ComputorEdge does not make the article on Intel any better; it doesn't make the article on the American Revolution any better; it does not have any external impact besides pissing off the guy who wrote the article. Deleting articles does nothing but get rid of those articles. I believe that the site would survive just fine if one day the admins decided to reinstate every good-faith article that was ever deleted.


Where's a link to a debate where a computer scientist's points were deleted from WP because "ComputorEdge" (or any trade rag) trumped LWN?

That's a point you didn't make anywhere upthread, so it's disingenuous to say I'm avoiding it the way you avoided my argument, but I'm happy to stay on track. Point me to the pervasive class of mistakes WP is making by trusting some sources and not others?

I'm sympathetic to this argument, because when I was actually volunteering for WP back in 2007, I spent a lot of time beating back vanity pages that were anchored in one line mentions in trade press articles that were merely regurgitating press releases. So I'm with you about the low value of ComputorEdge. But when you say that computer scientists are systematically disadvantaged because of WP:V rules that prioritize ComputorEdge over LWN, you lose me, because I don't see that happening.


My apologies — I communicated that poorly. I had meant that as a facetious way of saying "niggling issues," rather than a specific indictment. My point was primarily that deleting a questionably notable article does not contribute significantly to the value that people get out of Wikipedia. They are largely orthogonal concerns. I don't think Wikipedia would lose one iota of value in the common person's eye if (without loss of generality) an article on a band in Wisconsin were allowed to remain. Wikipedia was not richer during the period Nemerle's article was deleted.

Incidentally, I just looked over Wikipedia's notability rules and they seem to be a bit more reasonable than they were when I used to edit things there, so props to them for making progress on that issue.


No apology necessary. So, I mostly agree with you: the value of deleting a barely-non-notable article is marginal. But as I've shown I think pretty effectively upthread, with the link to the URL pattern for AfD debates, that's not the problem that confronts Wikipedia; instead, editors on WP are dealing with a torrent of extremely non-notable articles, into which valid articles are, due most often to poor editing, occasionally getting caught up.


It has sat unmolested because you wrote a good bio about a computer security professional. It's difficult for a random Wikipedia denizen to quickly reach the conclusion that Ms. Davidson isn't notable enough. She works for a powerful, well-known company, and is notable enough that she was asked to testify before Congress on a topic.

Wikipedia gets fuzzy when you step outside the basics. Is a comprehensive list of "Two and a Half Men" episodes from 2003 notable? Are the results and player profiles of the 1959 NBA draft worthy? A stub article about a village in rural Poland?

In those cases, the answer is "yes", because there is a constituency for NBA fans and TV fans. When you step outside these types of topics, you are stepping off of a cliff, and wikipedians will capriciously and relentlessly enforce whatever rules they deem important.


From experience on HN: articles about specific living people are the hardest to support. The site has a specific policy (WP:BLP) that raises the sourcing standards for articles about living people.

But I didn't have to do anything to keep my article on the site. All I did was (a) write a clear statement of why the topic was notable, and (b) cite sources. That is not a difficult pair of rules to remember.

But if you believe the prevailing sentiment on HN about how WP and "deletionism" works, it should have been extremely difficult for me to keep Mary Ann Davidson on WP. I should have been in multiple AfD debates defending the article. Instead, I wrote it, walked away, and 5 years later there it stands.

More often than not, what's actually happening in specific deletion freakouts is, the article in question cites no sources, and makes no claim about why the subject is notable.


> If you write an article about an algorithm that cites the academic literature, it'll most likely survive.

Most of the time you actually aren't allowed to do that; there seem to be exceptions in a few fields, such as medicine, but the overall policy of Wikipedia is that you cannot cite primary sources, preferring, very specifically, newspapers. Of course, there are secondary sources in academia (summary papers), but they are few and far between, making it difficult to defend some newer topics. The article on "Coppersmith%27s_Attack" against RSA, for example, is seemingly forced to cite a summary paper. (The article on RSA itself has a couple of citations to an original paper, but only where they can be backed with a secondary source.)


This is what Wikipedia actually says about citing journal articles:

    Where available, academic and peer-reviewed publications are usually
    the most reliable sources, such as in history, medicine, and
    science. But they are not the only reliable sources in such areas. You
    may also use material from reliable non-academic sources, particularly
    if it appears in respected mainstream publications. Other reliable
    sources include university-level textbooks, books published by
    respected publishing houses, magazines, journals, and mainstream
    newspapers. You may also use electronic media, subject to the same
    criteria. See details in Wikipedia:Identifying reliable sources and
    Wikipedia:Search engine test.

Their policy appears to be the opposite of the one you suggested they had.


If coverage and quality are woefully inadequate, that means people haven't been doing much with them, so of course nobody is going around harassing these nonexistent people. Only pages that people are actually interested in get the jerk brigade's attention.


What editors are interested in and what's important aren't necessarily the same, though. I try to focus my efforts on areas with the lowest ratio of editors to value of the content. I find that provides a much better effort-to-results ratio than trying to throw in my $0.02 on Israel-Palestine with 100 other people. It also leads to more pleasant colleagues, because when I'm writing about ancient Greek archaeological sites or mathematical theorems, my co-editors are typically other idealistic people who genuinely want to improve the Wikipedia articles on those subjects. If you're writing on something political, then a lot of your co-editors are going to be people with political agendas.

On contentious areas where a lot of people are interested and strongly disagree, I'm not really sure how to do it better. Wikipedia is often suboptimal, but so is the opposite, "expert-based" model. I'm an academic, and if you get a bunch of us in a room, from different strongly opposed viewpoints, and ask us to try to come up with a consensus survey article, that experience is usually going to end up being painful. I think I'd actually rather wade into a Wikipedia edit war than serve on those kinds of document-writing committees.


It's not just political "hot button" topics. No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people — who are not the normal contributors to the articles — over stupid legalistic crap. This article with five sections and 20 references should really be a short subsection of that article because, well, he likes shuffling things around; another one should be deleted because the guy has never heard of it and most of the publications that covered it are online-only, etc.

If your particular niche has avoided this, bully for you. But I personally wouldn't want to join it, because if more people come on board, that means more attention and thus (in my mind, at least) greater likelihood of attracting the wikilawyers.


So, where did this happen to you, or anyone you know? It should be very easy to cite a source to "the jerk brigade" glomming onto some innocuous bit of good content. Most of what happens on the project is logged. Deleted articles "vanish", because the point of deleting content from the project is not to host it at all, but the discussions and talk page articles and AfD debates are still there.


Remember the great programming language purge? http://news.ycombinator.com/item?id=2215168

Just Googling around Hacker News, I find a number of innocuous pages whose maintainers have had to defend themselves against deletion.

http://news.ycombinator.com/item?id=2752285

http://news.ycombinator.com/item?id=216723

http://news.ycombinator.com/item?id=288200

You might agree or disagree with some particular cases, and sometimes things go the right way, but either way it's still a hassle that you have to fight. Personally, I don't remember what the specific pages were I used to help maintain. I was doing it because I wanted to help out and I saw those pages could use it rather than because I cared a lot about the topics. But I do remember it was unpleasant and I wouldn't want to deal with such people again.


Oh come on. Clojure has a Wikipedia page. _why has a Wikipedia page. Y Combinator has a Wikipedia page, as does Paul Graham.

Cite the Wikipedia debates you say are happening. Citing HN freakouts isn't a valid argument, because anyone in the world can spark one at any time, because anybody can nominate any article for deletion.


What are you even talking about? I don't know about these "Wikipedia debates" I apparently brought up, unless you mean the AfD discussions that go along with the HN articles I just linked. Yes, Clojure and Why and all those things have pages, but that's because people fought for them. Nemerle wouldn't have a page if people hadn't fought for it — nor would Why or any number of other things. You yourself said in that thread you found the state of the Y Combinator discussion offputting.

The thing is, even when you succeed in fending off all that, it doesn't really feel good to have spent time on Wikipedia politics.

The fact that anybody can nominate any article for deletion doesn't really run counter to my point, which is, to reiterate:

> No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people … it was unpleasant and I wouldn't want to deal with such people again.


It is trivially easy to create a bullshit freakout on HN about deletionism. Anyone in the world can go, right now, and nominate Don Knuth for deletion. That's how WP works. And, in at least two of the HN freakouts you cited, that's exactly what happened: articles that had no chance of actually being deleted were marked by someone for deletion, and, predictably, weren't deleted.

In fact, when these freakouts happen, people who believe in the articles and know how WP works have to take pains to tell people not to jump into the AfD debate and start "voting" for the articles, because that actually makes the process work worse. Most of the time, when a self-evidently valuable article is proposed for deletion, WP's editors do a just-fine job of making sure it isn't actually deleted.

What you did here was move the goalposts. You claimed that computer scientists are getting shot down in content debates on WP. I asked you to point us to one of those debates happening, where the "jerk brigade" of editors and admins on WP were shouting contributors off of topics. All those debates and shoutings-down are logged.

You responded by highlighting discussions on HN of people freaking out about deletionism. We already know people on HN are freaking out about deletionism. Stipulated! The point is: those people freaking out on HN are mostly wrong.


All those programming languages actually were deleted. According to the result summary of Why's case, he probably would have been deleted if more "serious" sources hadn't been added in the interim. It happens.

But no, I am not moving the goalposts. I'm pretty convinced at this point that you've projected other people's goalposts onto my playing field. I'll quote myself again:

> No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people

> Just Googling around Hacker News, I find a number of innocuous pages whose maintainers have had to defend themselves against deletion.

> Personally, I don't remember what the specific pages were I used to help maintain. I was doing it because I wanted to help out and I saw those pages could use it rather than because I cared a lot about the topics. But I do remember it was unpleasant and I wouldn't want to deal with such people again.

Those have been the goalposts all along. I didn't say anything about getting "shouted off topics." I just said it involves more headaches than I feel it ought to. (Clearly some people like mjn and yourself haven't had that experience, and I'm glad, but that doesn't take the bad taste out of the mouths of people who have.)


There were no sources on the original _why article. Sources were added because they had to be. It doesn't mean anything to say "the article would have been deleted if sources hadn't been added"; you can say that about any article in the whole encyclopedia.

If the "fight" we're talking about here is (a) one simple sentence stating why the subject is notable and (b) a couple of links, I'm not sure "fight" is the right term.


I guess I haven't actually found that to be true, despite editing in many niches. Sure, sometimes stuff gets shuffled around, but that's okay too: I want people to improve on my work, which sometimes means moving it elsewhere, splitting it up, whatever. I don't feel the need to "own" the articles I write. If anything, I want more people to do so! I've written articles that are 100% untouched years later even though I know they are not really "done", and someone could improve them. It can be quite nice when someone comes by and tidies up a rough draft: fixes some spelling, adds an infobox and geographical coordinates, rearranges my text-dump into some nice sections, formats my citations nicely and adds ISBNs and DOIs to them, etc.

The people I've seen run into problems most often are either in political hot-button areas, or have too close a connection to a subject: someone writing an article on their own programming language, on their own academic contributions, their own company, or that of someone/something they have a close relationship to. If anything, that kind of CoI editing is still rather laxly tolerated, rather than too strongly policed. I know of at least one university that actually has paid staff writing puff pieces on their professors, and most of them are sadly still there, untouched, because there honestly isn't that much close scrutiny.

Imo experiences are much better if you start from the perspective of wanting to improve the encyclopedia, rather than from the perspective of wanting to get a particular thing into it and then maintain/defend that article. The way I usually work: start with a good source I have no personal connection to, and write articles based on it (and citing it!). For example, pick up Knuth's TAOCP, find some interesting subjects it discusses that are not yet covered in Wikipedia, and write a well-cited article. There are >99% odds that people are going to be happy with that kind of contribution, not try to delete it. I've recently been doing that with some books on archaeological sites, and I've gotten only positive comments about it; people are generally happy that I'm filling in articles on important sites that Wikipedia still lacks articles on, and that I'm doing so with references to good scholarly literature on them.

That's not to say every encounter with another editor I've had is positive, just that I think it works reasonably well on the whole, especially given the scope of the endeavor, which I would have guessed was frankly impossible, if you had asked me 10 years ago ("random people on the internet writing the superset of all subject encyclopedias?! won't it just be filled with cranks and nonsense?!"). Actually that's an interesting aspect of the HN reaction: HN is generally worried about Wikipedia being too closed, too deletionist, etc, whereas I'd guess that 90% of the world is worried about Wikipedia being too permissive, not fact-checking or insisting on sources strongly enough, etc. The most common criticism outside HN is pressure to add more reviewing and quality-control mechanisms, which gets especially strong in the wake of occasional hoaxes or libel scandals.


> …focus my efforts on areas with the lowest ratio of editors to value of the content…

This sounds quite interesting. Do you do this by rough intuitive feel, or is there any quantitative/analytical data from Wikipedia that can be used to prioritize attention this way? (For example: a list of places where the imbalance between readers/inbound-Google-referrals and content/active-editors is greatest?)


I generally do it by feel, partly because I'm treating "value" somewhat subjectively, something like "intellectual value" or "scholarly value" rather than "number of hits". So I look for things that Wikipedians don't seem to be spending a lot of effort on, but which imo a good encyclopedia should cover.



