Behind the scenes: the struggle for each paper (2021)

(jeffhuang.com)

147 points by lazyjeff 2y ago | 44 comments

schneems 2y ago

I had an assignment in the OMSCS course where we had to turn the results of a project into a paper and a presentation. It was eye opening on why so many CS papers are difficult to decipher.

I’m used to writing on the web where the scroll is unlimited and everything is hyperlinkable and potentially interactive. Journal papers are limited by length and so was our assignment. I had to cut virtually all helpful explanation needed to reproduce my results which was deeply frustrating. We were implementing an algorithm based on another paper and it was hard because key details were omitted or assumptions not stated. After that exercise I have to think some of it was intentional to get it down to size.

I find most people aren’t good at technical communication and teaching others without a LOT of practice. Even then it requires feedback and iteration to make sure the ideas are communicated well. Forcing people to be more succinct and omit details makes the final product worse to consume. I don’t know how common such limitations are these days, but I do know that the average paper is still out of reach of the average programmer (where it would likely have the most benefit).

godelski 2y ago

> Journal papers are limited by length and so was our assignment

I have always thought this was a bit silly and that it creates really weird effects that also decrease readability. An interesting point is that reviewers are not required to read the appendix of works. So everything is required to be in the front matter. This is a bit silly when we do things like research graphics or do generative works and such. You want to include images and samples but then your space is eaten up. What if you want to discuss analysis on those images and explore some? You could easily do this on a blog but you're forced to throw this into the appendix. But then a reviewer can ask a question that's explained there and your work can still get rejected because it isn't in the front matter. Another weird incentive is that people end up padding works to fit page limits. This is because if you turn in a shorter paper reviewers will frequently reject your work the same way your boss might not think you're working if they don't see you at your desk.

We live in the 21st century and we still publish like it's the 15th. Computers gave us the ability to embed images, which is why there are so many more graphs and charts now, and it's not like more pages cost more. So just remove it. Some papers should be only a few pages and there's nothing wrong with that. Some papers should be far larger and there's nothing wrong with that. It's just weird to set these up considering they were likely created under other constraints but momentum continued and we back justify the continued decisions (there is something to be said about readability, but that can just be a reason to reject).

Side note: CS groups typically publish in conferences

jltsiren 2y ago

Page limits force you to focus. As a researcher, you are often expected to communicate your ideas in 1 page, 3 pages, 10 pages, or 30 pages, for various purposes. If a journal asks for a 10-page paper, you write a 10-page paper. If a conference asks for a 1-page abstract, you write a 1-page abstract. Most people reading a paper are not interested in going through all the details, and those details should usually not be in the main paper.

It's also easier to find reviewers for short papers than for long ones.

Some of the issues you mention are specific to CS conferences. Because there is only time for 1-2 rounds of reviews, the reviews focus more on accepting/rejecting the paper and less on clearing any misunderstandings before judging it. Conferences are also more likely to have one-size-fits-all page limits, while journals often have several categories of papers with different expectations of length.

godelski 2y ago

> Page limits force you to focus.

This can be solved in better ways, which is, in fact, reviewers. I'm okay with a soft requirement but a standardization is what I'm getting at as being problematic. Some papers are noisy because they should be 3 pages but are 10. Some papers are noisy because they are 10 pages and should be 30. There is no universal rule, and that's what I'm getting at.

> It's also easier to find reviewers for short papers than for long ones.

That's a separate problem that needs to be addressed, but is not easy.

> Some of the issues you mention are specific to CS conferences.

Yes, but the author here is CS and we are on a CS focused website. But in general what I said isn't specific to conferences. If conferences are the problem then let's abandon them in favor of good science instead of keeping them around (or turn them into being meetup focused). Certainly the lack of back and forth between authors and reviewers is not a meaningful review process (most author rebuttals are limited to one page and often reviewers are not aligned in critiques). Are we all on the same team (better science) or strictly competing against one another?

sideshowb 2y ago

I think desirability of page limits is very subject specific. Some people will just waffle if you don't give them a page limit. Other times it means there's not room for the technical details.

outrun86 2y ago

Distill.pub was one effort to modernize publishing in CS. Chris Olah wrote some thoughts [1] about why he didn’t feel it was tenable. Seems like the primary challenge was the additional effort and skill involved in crafting rich-content/interactive material.

[1] https://distill.pub/2021/distill-hiatus/

godelski 2y ago

Honestly, I don't get why we don't just submit to OpenReview and call it a day. Paper is visible and distributed. There are comment sections where peer review can not just happen, but happen in the open (added bonus!). You can iterate and even see the difference between submissions. What is the conference/journal providing that isn't covered here? A stamp of approval? From a well known noisy system that creates other disincentives?

ketzo 2y ago

What a great resource, both for self-reflection and for a student who wanted to chase a similar career. I should really do something similar for my history of paid work.

It’s not like I have a crazy illustrious career or anything, but it can feel like kind of a blur, just a rollercoaster that led inexorably towards the present, which couldn’t be further from the truth; I would love to be able to reflect on my successes (and failures!) and see the small, concrete steps I took towards each.

Even without writing it out, I know the connections I have made and the mentors / coworkers / friends who have helped me deserve much more credit than any individual strokes of brilliance on my part! Another thing that’s very easy for me to forget, day-to-day.

ShadowBlades512 2y ago

I have started to at least write 4-5 bullet points per month of my job in my personal notes as a reminder of what stuff I have done. I find I will remember a lot of details as long as I have notes that remind me that a project even existed or an event happened. That has been enough for me.

ketzo 2y ago

That's a great idea, I think I'll start doing that. Sounds super worthwhile for very little effort.

halgir 2y ago

No way - reading this I thought I recalled one of the papers (Starcraft from the Stands). Pulled up my Zotero library, and sure enough, I cited it in my BA thesis almost ten years ago.

What a pleasant coincidence - thanks for the contribution!

amadeuspagel 2y ago

There's this new thing that some academics are working on at CERN - kind of like academic papers, with references and so forth, but on the computer.

Once this is ready, people will just be able to publish their "papers" there. I guess they'll be called something else then. But this sort of struggle to publish a "paper" will no longer be necessary.

jll29 2y ago

Thanks for sharing a behind-the-curtain view on the history of your publications.

Thank you even more for publishing WebGazer and for following a "systems" approach in your research, when most people produce only papers. It's systems as research artifacts that encode the exact methods as described in the papers but in sufficient detail to be executable that drive innovation. Sadly, system papers are rather hard to publish, despite taking longer (software that is released needs to be much more polished than software that you are going to keep to yourself).

fallon54 2y ago

AKA why you probably don't want to be in academia

vladms 2y ago

I think most of the things in life come with a lot of struggle, strange things, things that should be different and so on.

Making a startup? Go and check how hard and crazy that is. Make a family? Similar convoluted process with ups and downs.

What I think is wrong is that people have a very "idealized" image of a scientist scribbling an equation on a board and getting some prize (or defeating the aliens). These images are good for kids but after high school I think people should give it a thought and say "ok, things are not exactly how I imagined in life, let's try to understand more what I like and want". You know, the same process that makes people realize there is no Santa Claus.

ajsnigrutin 2y ago

Yep... Everyone in academia complains about publishing papers, about the high prices of publishing, about "publish or perish", and then when they get high enough in "academia", require the same pain from the newcomers. It's like a closed circle of people both requiring papers for maintaining and advancing your career and at the same time complaining about those papers (and the publishing process), and not even thinking about some kind of "change".

academia_hack 2y ago

I've given "Accept - Minor Revisions" to every paper I've peer reviewed since getting my PhD other than two that were outright plagiarism. Figure it's important to the morale of grad students to get some positive validation and the vast majority of published research is garbage anyways so I don't feel particularly inclined to defend the trash heap as an unpaid reviewer. In practice, I find that I've tipped the scales in favor of a lot of borderline papers over the years and am quite happy about that.

JohnKemeny 2y ago

This is only true to some extent, having myself been on a fair number of hiring committees.

While the institution and national agencies measure impact in terms of number of "level 1/level 2" papers, colleagues don't care at all about this value. What's important is number of single-author papers, number of papers without their advisor, number of different small group collaborations, and most of all, having papers accepted in the top venues.

A person with 50 shit papers will not even be considered for the job.

ajsnigrutin 2y ago

Sure, but the same group of people that complains about (the publishing of) papers is then in a position to change that, but doesn't. All those people that went through this process and complained, then (well.. a few years later) sit in university committees that decide what the hiring (and scoring) rules are, what the requirements are for TAs, for professors and tenure, etc., and decide that the "pain of publishing" is an ok thing to subject new generations to.

edit: I'm from Slovenia, universities here are "autonomous" (not a direct part of the government.. except for being government funded), and they decide all the internal rules themselves.

godelski 2y ago

> But this paper was critical to getting me accepted to a Ph.D. program. Why do I think that? Well I was rejected by every Ph.D. program I applied to before this publication (but that's another story), a story about people and opportunity.

This is an interesting note. We're talking about a student from one of the top CS schools (UIUC) and applying to another top school (UW). If you think about this a bit carefully, the paper being published did not change who he was or his capabilities, it was simply a difference in measured (distinct from measurable) signal.

It's incredible how many extremely noisy signals we use in academia but act as if we use a clear meritocracy. The review process is extremely noisy itself, with computer science in particular being generally more noisy given its preference for conferences over journals. I'm glad Jeff mentions people and opportunities, and it reminds me of the old saying about there being no self-made man. But I think this is a very clear example of an instance where we need to think harder and more carefully. Counterfactually, it is almost certain that had that paper been rejected, but all else stayed the same (i.e. getting into UW), his success story would also not change. Signals are definitely hard to measure and certainly schools are getting a lot of applicants, so I don't blame anyone for doing this, but I think it is incredibly important to remember these counterfactuals. To remember that metrics are guides and not causal variables themselves. Because there's a great irony in that metrics destroy meritocracies.

nkurz 2y ago

Your point is correct, but I'm not surprised by the difference. I think "legibility" is the term of art here. Writing a paper like this makes it almost[1] certain to the institution he is applying to that he is capable of writing a paper of this quality, while all the other metrics (GPA, GRE, etc) are much more probabilistic. Since someone incapable of writing such a paper is probably unsuited for a PhD, it seems entirely appropriate to choose applicants who have demonstrated ability to clear this bar over those that have not.

[1] "Almost" to account for the slight chance that he didn't actually author the paper but somehow managed to get his name put on it anyway.

godelski 2y ago

I agree and there's a lot in my comment to point to that. But my point is to distinguish between the metrics and the goals. I'm certain the author included in their CV that they had a pending paper when applying, so there is a signal, albeit a weaker one, but publishing is a weak signal to begin with.

I agree that you need to use metrics. But we need to be clear that metrics are not enough and very incomplete themselves. With something like admissions, I'm not sure there's anything except noisy signals and the strongest one by far is the interview.

> Since someone incapable of writing such a paper is probably unsuited for a PhD,

I very much disagree with this. The explicit purpose of schooling is to train people. Many undergrads are not going to have the opportunities to publish. It is not hard to train someone to write something publishable and this is not something I would be much concerned with myself given how much writing they're going to be doing over the next few years. The far more valuable skills are in being able to perform research which is quite ambiguous (there are at least 2 ways to read this sentence and both are correct: research type v measure). Your first 2 years of your PhD are almost exclusively training, with more class work and learning how to begin research. This isn't a job you're applying for, it is a training program.

nkurz 2y ago

>> Since someone incapable of writing such a paper is probably unsuited for a PhD

> I very much disagree with this.

Your disagreement is justified. I phrased that poorly. I meant it as a shorthand for "incapable of being trained to write such a paper". Showing that you already have the skill is proof, everything else just points to the possibility with varying degrees of accuracy.

I in turn disagree that "the purpose of schooling is to train people", at least if "schooling" refers to PhD programs. I think it's more that there aren't enough applicants who are able to perform without extensive training, so in practical terms PhD programs need to be willing to provide training. But at the same time, it's perfectly understandable that they would prefer to take applicants who have demonstrated ability to perform over those with statistical potential.

I'd prefer something like "The purpose of PhD programs is to advance the field". I'm personally in the odd category that I've co-authored several computer science research papers despite having dropped out to become a programmer prior to my BA. I've demonstrated my ability to perform much of the role of a PhD while simultaneously demonstrating that I perhaps shouldn't be relied upon to finish!

BrandoElFollito 2y ago

Another thing is that there is not enough pushback from the community at large.

My PhD thesis was less than 40 pages long. The introduction was 1/2 a page (basically "if you need an introduction you should not read this, here are 3,4 books to get you started").

Then I copied/pasted from my articles and then came the acknowledgments (which I actually find valuable because I wanted to thank my advisor for his non-science-related help and a friend for her magnificent idea that turned around the thesis. And my parents, wife, dog etc.)

Then the conclusion ("brilliant work")

And then a discussion with myself about everything that I fucked up and what could be improved (my advisor fainted on that one).

The jury was 8 people. The younger/more dynamic ones were super happy (especially that they made their review a page long as well). The older ones were disgusted and said that clearly. I got my PhD.

I fought in Academia for a few years to bring some change but eventually left (also for other reasons). If I was to stay for my whole career I would have tried again and again to change the status quo.

dotnet00 2y ago

A friend who recently finished her PhD had a similar experience, where all the senior scientists at our lab were concerned because her thesis was "only" 100 pages long and she didn't go through a professional editor to have it perfected.

My preliminary defense thesis had to be 50+ pages, but during the presentation, it was pretty obvious that the committee had at best looked at the table of contents. It all feels like such an unnecessary waste of effort. Even with my own thesis, over half of it is just padding with very fundamental background information because the work isn't really so complicated as to require that many pages to discuss, it's just demonstrating more advanced simulation capabilities by implementing GPU acceleration for a niche but simulation heavy field.

BrandoElFollito 2y ago

Theses at my time were about 200 pages long. A friend of mine wrote two tomes.

I clearly stated that I would not waste my time and the reviewers are free to provide comments and we will see during the defense.

I found out that a lot of these "rules" are traditions that one can challenge and suddenly they are not traditions anymore.

dotnet00 2y ago

Yes, the only hard rule in my department is having a minimum of 50 pages, the idea that 100 pages is not enough came from the scientists applying their own experiences from years ago. Technically there was nothing they could do about her thesis having fewer pages, but as inexperienced students, it's obviously a little scary when people you look up to sound concerned (since academia is full of all sorts of unproductive and unstated expectations).

A friend at another department only had a minimum requirement of 5 pages, and his thesis ended up being just a collection of his publications.

taopai 2y ago

Papers... our new religion...

darthoctopus 2y ago

[2021]

patrickmay 2y ago

Nearly three times the number of papers published by Claudine Gay. Why isn't he President of Harvard?

vaidhy 2y ago

I downvoted this comment because it is not pertinent to this topic and just flamebait.

lapinot 2y ago

The tone is probably flamebait, but the content is on topic imho. 15 papers before getting a PhD?! Being a PhD myself and having serious issues with a lot of the academic practices, reading the title I thought I would identify with the author and situations, which I didn't. I don't think this kind of experience is representative of the common "struggle with papers" among young researchers I have seen around me.
