Reading academic computer science papers

lazyjeff · on April 8, 2022

If you're looking for a "menu" of papers from various subfields, I've been collecting a list of all best paper awards from a set of 30 computer science conferences for the past 25 years. This is just a personal interest of mine, and is a long page with no ads or upsell.

https://jeffhuang.com/best_paper_awards/

These are papers that were deemed "best papers" in that year, though obviously may not have turned out to be as influential in retrospect, i.e. they're likely not the papers we consider "best" when looking back today.

nevster · on April 8, 2022

This Harvard course has a pretty good list : https://canvas.harvard.edu/courses/34992/assignments/syllabu...

Links to the papers: https://docs.google.com/spreadsheets/d/1wS6O7-ZoFL7Cfjgt-kdh...

I've been slowly reading through them - some easier than others. (Some I've pretty much skipped - way too long for my interests).

Vervious · on April 8, 2022

Tbh I always direct people who want to read old important papers to lecture notes or textbooks, at least in distributed algorithms or CS theory.

I've found that the original papers are always super dense and the material has usually evolved to be more explainable, especially when someone has put in the time to compile it as part of a course.

dachryn · on April 8, 2022

any recommendations? I read the Cormen introduction to algorithms 3rd ed book, and then a whole bunch of machine learning related books, but I always feel there is a world of algorithms out there that are pretty neat that I haven't uncovered yet.

playing_colours · on April 8, 2022

Advanced Algorithms and Data Structures from Manning publisher: https://www.manning.com/books/advanced-algorithms-and-data-s...

I have not finished it yet, just went through a few chapter, but I can definitely recommend it!

playing_colours · on April 8, 2022

The 4th edition of Cormen book got published recently! Now in colour! I think it added a bit to better comprehension. It is slightly thinner than 3rd edition as they removed computational geometry algorithms and added some ML stuff.

nevster · on April 8, 2022

To give you an indication of how technical the papers are, you can read some of my reviews: https://www.goodreads.com/review/list/5348644-neville-ridley...

nevster · on April 8, 2022

It's taking a while because I go off on tangents - spent ages looking at the Mother of all Demos and its offshoots. Lately I watched everything I could find by Bret Victor...

vardhanw · on April 8, 2022

Are the video lectures to the course available by any chance?

nevster · on April 8, 2022

Not publicly as far as I can tell

Ar-Curunir · on April 8, 2022

Yeah, I think test-of-time awards are a better indicator of which papers actually had solid impact; there's often very little overlap b/w best paper awards and test-of-time awards.

srvmshr · on April 8, 2022

Wow. I am so glad to see you here. I have been following your webpage for the last decade perhaps. Thank you for putting those paper references. (Also, on a different note how many different Jeff Huangs have contacted you so far?)

lazyjeff · on April 8, 2022

Thanks, I guess you're referring to my page from about 12 years ago when I offered to give emails to other people with my name? I think 5-6 other people reached out and I gave them emails that are still running now. I know one other Jeff Huang who I'm pretty friendly with (and we've had random encounters in person), and the others are strangers.

srvmshr · on April 9, 2022

>Thanks, I guess you're referring to my page from about 12 years ago when I offered to give emails to other people with my name?

Yep. I didn't realize so much has changed with the rest of the site There was a bit of slowdown I presume in mid-2010s where the table felt a tad behind schedule. I did think of personally writing you an email to help in updating if necessary. In fact, I secretly wanted to just mirror your publication page - but stealing someone's thunder wasn't my ballgame :)

bschne · on April 8, 2022

Another favs list: https://lethain.com/some-of-my-favorite-technical-papers/

galaxyLogic · on April 8, 2022

That's a great resource, thanks. Looking at the list the paper titles could be be more readable, perhaps bigger and perhaps not in blue. There's a lot of information but you can always scroll.

Simon_O_Rourke · on April 8, 2022

Nice, thanks for sharing.

Question though, have you got a process for reading individual papers? In college my old CS professors insisted on reading the abstract, then the conclusions, then the references, and then the rest of it.

rramadass · on April 8, 2022

See the famous How to Read a Paper by S.Keshav - https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/HTR...

Summary at - https://derekchia.com/how-to-read-a-research-paper-3-pass-ap...

More Advice from S.Keshav - https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Adv...

rfonseca · on April 8, 2022

Here's one way (YMMV) of reading academic papers. It usually takes time, and a couple of passes, and you may try to understand the ideas and impact first, to then move to the design/implementation, and the experiments that demonstrate the paper claims. http://ccr.sigcomm.org/online/files/p83-keshavA.pdf

patrick451 · on April 9, 2022

Caveat: my training is in control system,s not computer science. I assume these types of papers to be similar enough that the following will be useful.

I have heard that advice before, but I generally disagree. Unless you are doing a literature search, reading the references is a waste of time. And most conclusions aren't really worth reading either. They typically read as though the author has completely exhausted themselves by writing the rest of the paper, and simply restart the last paragraph of the introduction in new words.

If you are trying to get up to speed on a new field, a good introduction can really help with you reference search.

Part of "how to read a paper" depends on what you are trying to do. If it's a seminal paper that are are just going to read, that is a very different thing from reading a paper in search of a solution (or trying to figure out of your idea is novel).

In this second case, your main task is to decide if the paper is even worth reading. IMO, this takes a lot practice. Fully reading a paper to the extent you really understand it can be extremely time consuming. It's important to be willing to throw out the paper if at any point (no matter how much time you've put into it) it becomes clear it doesn't work for you.

I will generally skim the abstract and/or the last two paragraphs (or so) of the introduction. If it still sounds promising I look through the next section or two, which are usually some kind of "problem setup" and "proposed solution scheme". I skip things I don't immediately understand. If my interest is still piqued, I look for the results section where the plots are (if it's a practical paper). If I'm still interested, I go back to Section II and start reading more carefully. I'll spend more time with tricky math, but not too much. Save staring at the same four equations for two hours for at least the third pass.

Oh, and if a paper is tricky (for me) and seems worth my time, printing it out single sided and laying the pages out side by side on my desk can be really helpful.

whitten · on April 8, 2022

Did you focus on any particular sub field when you chose which CS conferences?

Also, do you think ACM opening their archives will have a high impact on which ones you recommend in the future?

lazyjeff · on April 8, 2022

About 13 years ago when I started this, I selected what I considered the most well-known broad conferences in each subfield (I'm trying to avoid using the words "top" or "best"), though the list is notably missing SIGGRAPH which didn't have such an award, and architecture conferences which I considered more in ECE than CS.

ACM archives -- not really, I haven't added any new conferences because it makes each year even more work to update. And I find that nearly all CS papers are accessible through various sources that you can find in Google Scholar or Semantic Scholar (e.g. author homepages, course websites, arXiv, etc.).

alimov · on April 8, 2022

I’ve seen this list shared before, and want to thank you for sharing it again! I had forgotten about this resource and am pretty stoked to be seeing it

toxik · on April 8, 2022

FYI the scroll is strange on iOS, in that the “go to top” doesn’t work (ie tapping the top of the screen.)

herodoturtle · on April 8, 2022

Nickname does _not_ check out.

This collection is great - thanks so much for your effort!

stewvsshark · on April 8, 2022

Unless you’re working on something that has the potential to be directly influenced by a paper, I’ve found trying to stumble through academic papers to be a huge waste of time. Remember that academic papers are by and large written for other academics, not for general purpose or even specialized engineers (obviously it depends how specialized you are). Take something like merge sort - I’d recommend the Khan academy video before recommending the academic paper if you want to understand how it works. I get that you should understand “why” it works from an academic perspective, but I don’t know that you really get much value from that level of understanding.

ramraj07 · on April 8, 2022

This is generally true, but some original papers are beautiful and elegant, and give you amazing insight even with little domain knowledge. Some of my favorites in biology include Shinya Yamanakas original iPS paper (1), Sydney Brenners paper on C. elegans (2). I also like the original BLaST paper.

Anyone with the money and general scientific interest should consider subscribing to Nature or Science. It’s a fun browse every week and you never know what fields interesting finding might fancy you until you look at the articles.

1. https://pubmed.ncbi.nlm.nih.gov/16904174/ 2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1213120/?report...

Labo333 · on April 8, 2022

Anyone with common sense will read Nature or Science from sci-hub :)

ramraj07 · on April 8, 2022

Couldn’t give a shit if someone pirates or pays the publisher, I like to have a book in my hand in the loo and browsing through a magazine is in some ways much better to take all of the science in fast that I’ve never managed to replicate in a computer. Except with google reader but let’s not open that wound now shall we?

leemailll · on April 9, 2022

Reading from a printed issue is quite a different experience from PDFs

leemailll · on April 9, 2022

Reading the news on both nature and science websites could be fun. And it's kind of more accurate than those reported by news agencies and often lack the hyperbole from those nonsense PR

armchairhacker · on April 8, 2022

I'm a PhD student. Maybe I'm not a good PhD student. But honestly, when I read most papers I usually just end up confused.

The concepts are complicated enough, but then the writing style in papers is just really strange and unnecessarily verbose and over-complicates even simple concepts.

I end up getting most of my knowledge through blog posts and slides which cover the papers vs. the papers themselves.

vaylian · on April 8, 2022

My perspective: Most people are terrible writers. And that also applies to many scientists. And in addition, there are cargo cults of writing styles that people follow, just so that their writings are perceived as part of the scientific culture.

When regular people publish books, they get proofread and you actually get feedback from people who are trained in proofreading. In academia this not something you can expect from a publisher. And that's why we have lower standards for published academic writings.

geoalchimista · on April 8, 2022

Some journals have a proofreading stage before publication. But proofreading can't save bad writing. I blame the overemphasis on "critical thinking" in modern liberal arts education at the expense of fundamental writing skills and thinking in clear terms.

jltsiren · on April 8, 2022

The typical CS paper is written by a grad student who speaks English as a foreign language. They can't express their ideas as well in English as their native language, and they probably don't have that much experience in writing in any language. When they write, they often end up copying phrases and sentence structures from other research papers.

Most papers about new results are published in early stages of research. The authors barely understand the results enough to be convinced that they are correct. They can't explain the results well, because they don't understand them yet. If they continue working on the same topic, they will probably come up with a good explanation after a few years and two or three complete rewrites.

musicale · on April 8, 2022

> the writing style in papers is just really strange and unnecessarily verbose and over-complicates even simple concepts

Specialized terminology and complete descriptions can sometimes help to make writing more precise and correct. They can also make writing less clear - intentionally or unintentionally.

Why would the writer of a paper want it to be less clear? Perhaps to artificially inflate the importance of the work (and/or the length of the paper) or to make it seem non-obvious and "novel." Or perhaps to "fit in" with a certain style of writing or language ("academese" or "paperish") commonly used in the discipline.

My advice is: be as clear (and as simple) as possible without sacrificing precision or correctness. Clear papers with good and significant work are incredibly valuable and are likely to be more influential than unclear papers, even if the underlying work is good.

Since so many papers are unclear and poorly written, you may also find that clarity and good writing can help to differentiate your work.

cerved · on April 8, 2022

Academese is often overly verbose; with sentences that run on forever. Good writing, readable writing, requires effort. I've found this especially difficult for non native English speakers, myself included.

cxr · on April 8, 2022

You're not wrong. Most authors of academic papers are not good writers. Perversely, regardless of where they are on the spectrum spanning from good to mediocre to terrible, writers' output is usually made worse when they know they're writing for an academic context—this is why most of the blog posts you're thinking of, with their informal language, make better papers than the papers themselves.

I wish there were a culture of (a) always writing a blog post to accompany any paper, and (b) rewriting historical* papers to make them more accessible. On the latter point, undergrads should probably be doing this for at least one paper per semester, despite the general expectation that they ordinarily do not deal with "research".

* Doesn't have to be distinguished ones; just any paper you come across and feel is too hard for you/your peers to follow without more effort than should be necessary, or any paper you thought was interesting but didn't get its due in public, or ones that did but have fallen out of cross-generational memory.

patrick451 · on April 9, 2022

>You're not wrong. Most authors of academic papers are not good writers. Perversely, regardless of where they are on the spectrum spanning from good to mediocre to terrible, writers' output is usually made worse when they know they're writing for an academic context—this is why most of the blog posts you're thinking of, with their informal language, make better papers than the papers themselves.

This isn't necessarily the writer's fault at all. I have had reviewers complain that the language in my paper was too informal. I wasn't using slang or something like that. But in so many words, the reviewer wanted _more_ academese. It was the last paper of my grad school career and I was sick of academese. In so many words, I told them to pound sand. That was my only paper to never get published.

mschuetz · on April 8, 2022

That's normal, most papers are written unnecessarily complex, likely to make them appear more impressive than they actually are. Many blog posts that contain just as much information as papers, but presented in a much more intelligible way.

drBonkers · on April 8, 2022

After editing a lot of my peers papers before publication, I think the weird writing style comes from poor ESL skills.

vaylian · on April 8, 2022

ESL = English as a second language?

pid-1 · on April 8, 2022

Had a similar experience when working in a research lab.

I think the issue is that I have a hard time learning things without doing myself. And generally trying to reproduce stuff from a paper is really hard.

musicale · on April 8, 2022

I like it when the authors provide digital artifacts including code and data to make it easier to analyze and reproduce the work.

substructure · on April 8, 2022

There seems to be some improvements in this area. Machine learning papers sometimes provide their data sets and other artifacts on websites like paperswithcode.com.

Have you ever contacted the authors to request their data? I personally have not.

chaxor · on April 8, 2022

That should continue to increase. Hopefully it will be academic misconduct to not release code - but also further, it should be misconduct for the code not to be able to reproduce the paper's output with a one line command.

UK-AL · on April 8, 2022

I find academic papers optimise for conciseness, so they tend to use in group terms.

suyjuris · on April 8, 2022

Much depends on the subfield, but here are two shortcuts to understanding a dense paper:

1. Many CS papers are presented at conferences, and many of these talks are recorded and available.

2. Look for a paper citing the one you are interested in. The “related works” section often contains brief summaries, which are written with the benefit of hindsight.

Somewhat related, an earlier comment of mine on how to acquire copies of a paper without resorting to unauthorised copying [1]

[1] https://news.ycombinator.com/item?id=23711206

gandalfgeek · on April 8, 2022

Shameless plug but I've been making videos covering CS papers: https://youtube.com/c/VivekHaldar

Motivation is to get more people into it, and also give enough of a flavor of the paper for those who aren't used to reading academic papers, or don't have the time.

DeathArrow · on April 8, 2022

For someone who might be new to CS I don't think reading papers will be the optimum path towards gaining knowledge.

If the concepts are new and bleeding edge, the reading papers is the only way to become familiar to them. Otherwise, time is better spend reading books and courses, where the said concepts are explained in relation with the others, forming a bigger picture.

galaxyLogic · on April 8, 2022

Some say you should and some say you shouldn't. But it would be nice to have a resource that explained new CS research for practicing programmers. Something like "Popular Mechanics" or "Scientific American" for programmers. Is there such a thing?

bombcar · on April 8, 2022

It’s called Hacker News but I can’t remember the URL - something weird about combining whys.

texaslonghorn5 · on April 8, 2022

Haha, though for accuracy the website is named after the Y-combinator which is a combinator in Haskell Curry's system of combinatory logic (not to be confused with combinatorial [digital] logic).

jdkee · on April 8, 2022

I found this book to be a fantastic reference of the leading papers in computer science.

See https://mitpress.mit.edu/books/ideas-created-future

siraben · on April 8, 2022

I’ve been reading CS papers for a few years now (mostly FP or PL related) and wrote some summaries for a selection of them[0]. FP and PL literature is very accessible, many authors have a desire to communicate some aspect of programming languages or type systems that is readable even if you have only an introductory knowledge of the field. Also, papers are a great source for knowledge that is not easily found in books and documentation online.

[0] https://github.com/siraben/fp-notes/blob/master/Papers.md

sideproject · on April 8, 2022

I read research papers often and I found reading it together with other peers quite helpful when I did research.

So I recently created a small tool to read research papers together with other people with annotations and comments.

https://www.scholars.io

sudosysgen · on April 8, 2022

twominutepapers on YouTube is a great resource to get an intro to some interesting CS research, though it's mostly infographics, ML, and simulation focused.

kklisura · on April 8, 2022

Is it legal to publish links to research papers?

I was researching a lot about NASA's Mars helicopter - Ingenuity and found couple of research papers, so I though I'll publish the list on my blog. List would include title of papers with appropriate links to PDFs. The PDFs are hosted _somewhere_ and they are easily reachable if you just google the research title. I gave up eventually since I'm not sure if this is legal or not.

BlueTemplar · on April 8, 2022

Why would you assume it wouldn't be ?

A known counter-example was someone condemned mostly because of admitting browsing up the directory and realizing that the files were supposed to be protected by a login + password :

https://arstechnica.com/tech-policy/2014/02/french-journalis...

bryanrasmussen · on April 8, 2022

although some countries have these linking rules I don't think they generally apply to research, that is to say if they research is available it is legal to link to. Research from American governmental agencies is also paid for by American tax money and thereby, under the American model, owned by the public and publicly linkable.

tchalla · on April 8, 2022

The now discontinued "Morning Paper" has a great set of Archives.

https://blog.acolyer.org

behnamoh · on April 8, 2022

the following article talks about why you shouldn’t read some scientific papers:

https://medium.com/@parttimeben/how-not-to-read-scientific-p...

0x20cowboy · on April 8, 2022

Please correct me if I just haven’t found the right place, but when I’ve tried to read academic papers in the past I found I’d have to pay upwards of $50.00 to read anything more than just the abstract.

Is there a site that gives free access to these research papers?

uluyol · on April 8, 2022

Some conferences are open access and the materials are freely available.

Anything USENIX falls into this category. In my area (networking/distributed/operating systems) the other (non-USENIX) major venues are also open access (SOSP and SIGCOMM).

Ar-Curunir · on April 8, 2022

sci-hub, but generally most papers are available from the authors' websites, or a preprint server like ArXiv or IACR ePrint

the_benno · on April 8, 2022

The best source is usually authors' websites. You can find a free copy of just about any modern CS paper by googling its title in quotes with "filetype:pdf".

anonymousDan · on April 8, 2022

$200 a year gives you ACM membership with free access to the DL library (amongst other things including a monthly hardcopy of communications of the ACM). Great value IMHO.

schroeding · on April 8, 2022

Asking the author(s) for a copy (by writing a short mail to them) does also often work, if there is no other free source available :D

Spivakov · on April 8, 2022

Here is my kettle logic for you to not even try getting around that paywall:

Old papers that are not publicly accessible on the web are very likely just a pile of craps and not worth reading; even if it was worth reading, its information should have been compiled in better ways in textbooks/blog posts; if neither were applicable, then you are supposed to be some researcher working in academia and can ask your institute to offer the access.

Short answer: scihub

prionassembly · on April 8, 2022

Email the authors.

chrisseaton · on April 8, 2022

Authors put free copies on their websites.

Laakeri · on April 8, 2022

arxiv.org

substructure · on April 8, 2022

Solving problems by studying tomes of knowledge is the job description of wizards/witches. Large improvements towards optimality, for some problems, are effectively locked away in some of these papers. As the article points out, there generally isn't much benefit in the context of building CRUD apps.

Some contexts have larger research communities. For example, there isn't nearly as many papers on real-time path planning for agent mutable environments vs static environments. I assume this is because we still don't have Boston Dynamics robots in people's homes. If we could get the cost low enough it may be more profitable to send mining robots to mars than people, but I guess there are other applications as well.

I spent some months trying to find, understand, and implement the state-of-the-art algorithms in real-time path planning within mutable environments(Minecraft). I started with graph algorithms like A*[0] and their extensions. For my problem this was very slow. D* lite[1] seemed like an improvement, but it has issues with updates near its root. Sample based planners came next such as rrt[2], rrt*, and many others.

I built a virtual reality website to visualize and interact with the rrt* algorithm. I can release this if anyone is interested. I've found that many papers do a poor job describing when their algorithms perform poorly. The best way I've found to understand an algorithm's behavior is to implement it, apply it to different problems, and visualize the execution over many problem instances. This is time consuming, but yields the best understanding in my experience.

Sample based planners have issues with formations like bug traps. For my use case this was a large issue. Moving over to Monte Carlo Tree Search(MCTS)[3] worked very well given the nature of decision making when moving through an environment the agent can change. The way it builds great plans from random attempts of path planning is still shocking.

Someone must incorporate these papers' best aspects into novel solutions. There exists an opportunity to extract value from the information differential between research and industry. For some reason many papers do not provide source code. A good open source implementation brings these improvements to a larger audience.

Some good resources I've found are websites like Semantic Scholar[4] and arxiv[5] along with survey papers such as one for MCTS[3]. The later half of this article is what gets me excited to build new things. I would encourage people to explore the vast landscape of problems to find one that interests them then look into the research.

[0] https://en.wikipedia.org/wiki/A*_search_algorithm

[1] https://en.wikipedia.org/wiki/D\\\*

[2] https://en.wikipedia.org/wiki/Rapidly-exploring_random_tree

[3] https://www.semanticscholar.org/paper/A-Survey-of-Monte-Carl... /c37f1baac3c8ba30250084f067167ac3837cf6fd

[4] semanticscholar.org

[5] https://arxiv.org/

cyber_kinetist · on April 8, 2022

Maybe a bit tangential to the original comment, but have you also checked some practical implementations of path planning in Minecraft, such as Baritone (https://github.com/cabaletta/baritone)? Practical as in, it is actually deployed widely to automate various kinds of complex tasks (building structures, automate mining, killing other human players) for bots in anarchy servers?

Although the method uses a variant of A* and might not be that “fancy” in academia terms, it’s astonishing how far it can achieve (see demos like [1] and [2]) and might actually be far more useful to study it closely instead of more theoretical papers.

[1] https://youtu.be/c2OFyoECOhQ

[2] https://youtu.be/rQFPvU2ew9Q

wanderingmind · on April 8, 2022

Unless you are paid to read papers and implement cutting edge stuff, your time is better spent leetcoding that has a direct short term impact on maximizing your TC.

tomasreimers · on April 8, 2022

Can't tell if sarcasm.

If not, do you believe there is value in education beyond optimizing TC?

gautamdivgi · on April 8, 2022

Probably after you retire or if you don't have any dependents. The moment you have dependents (kids), parents you need to take care of and the subsequent load of mortgages, university fees, medical insurance and what not... optimizing TC suddenly looks like the best investment compared to any other education.

I'm speaking from experience. I got a PhD while working full-time. I enjoyed it and it gave me a lot of perspective. But - if I'd spent even 1/6th the time LC'ing I'd probably have a much higher TC at this point :). No regrets - but its an amusing thought that kinda lurks in the background.

nnadams · on April 8, 2022

Any advice on doing a PhD while working? I've been considering doing that in the back of my mind.

gautamdivgi · on April 8, 2022

Here you go.. I put this blog together a long time back - https://gautamdivgi.wordpress.com/2017/03/08/my-part-time-ph...

wanderingmind · on April 8, 2022

Belief in value of education beyond TC is (1) inversely proportional to the number of mouths to feed and (2) directly proportional to the outstanding equity in trust accounts

rfrey · on April 8, 2022

That's a weird thing to say on a website that values curiosity above almost everything else.

Silamoth · on April 8, 2022

But reading papers is fun and cool. You get to learn new things. If you read recent papers, you learn things that aren't just new to you but new to the entire world; you can literally learn cutting-edge things that were discovered recently. And if you read older papers, you can better understand why the world is the way it is right now.

Putting TC aside (don't most programmers make plenty anyways?), isn't it just cool to learn new and cutting-edge things? Personally, I think so. Curiosity is central to what it means to be a hacker.

sundarurfriend · on April 8, 2022

> on maximizing your TC.

What's TC in this context?

Shugarl · on April 8, 2022

Total compensation.

It really annoyed me to not know what it meant, so I spend a bit of time finding that

briffid · on April 8, 2022

Significant part of CS papers are master-bation.

sydthrowaway · on April 8, 2022

It's more $$$ rewarding grinding Leetcode