In "How I do research", TurnTrout writes:
[I] Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what’s confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it’s confused and unhelpful, and I can do better by just thinking hard.
The MIRI alignment research field guide has a similar sentiment:
It’s easy to fall into a trap of (either implicitly or explicitly) conceptualizing "research" as "first studying and learning what’s already been figured out, and then attempting to push the boundaries and contribute new content."
The problem with this frame (according to us) is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding. (Be mindful of what you’re optimizing in your research!) [...]
… we recommend throwing out the whole question of authority. Just follow the threads that feel alive and interesting. Don’t think of research as "study, then contribute." Focus on your own understanding, and let the questions themselves determine how often you need to go back and read papers or study proofs.
Approaching research with that attitude makes the question "How can meaningful research be done in an afternoon?" dissolve. Meaningful progress seems very difficult if you try to measure yourself by objective external metrics. It is much easier when your own taste drives you forward.
And I’m pretty sure that I have also seen this notion endorsed elsewhere on LW: do your own thinking, don’t anchor on the existing thinking too much, don’t worry too much about justifying yourself to established authority. It seems like a pretty big theme among rationalists in general.
At the same time, it feels like there are fields where nobody would advise this, or where trying to do this is a well-known failure mode. TurnTrout’s post continues:
I think this is pretty reasonable for a field as young as AI alignment, but I wouldn’t expect this to be true at all for e.g. physics or abstract algebra. I also think this is likely to be true in any field where philosophy is required, where you need to find the right formalisms instead of working from axioms.
It is not particularly recommended that people try to invent their own math instead of studying existing math. Trying to invent your own physics without studying real physics just makes you into a physics crank, and most fields seem to have some version of "this is an intuitive assumption that amateurs tend to believe, but is in fact wrong, though the reasons are sufficiently counterintuitive that you probably won’t figure it out on your own".
But "do this in young fields, not established ones" doesn’t seem quite right either. For one, philosophy is an old field, yet it seems reasonable that we should indeed sometimes do it there. And it seems that even within established fields where you normally should just shut up and study, there will be particular open questions or subfields where "forget about all the existing work and think about it on your own" ought to be good advice.
But how does one know when that is the case?
My field is theoretical physics, so this is where my views come from. (Disclaimer: I have not had a research position since finishing my PhD in General Relativity some 10 years ago.) Assuming you want to do original research, and you are not a genius like Feynman (in which case you would not be interested in my views anyway; what do you care what other people think?):
Map the landscape first. What is known, which areas of research are active, which are inactive. No need to go super deep, just get a feel for what is where.
Gain a basic understanding of why the landscape is the way it is. Why are certain areas being worked on? Is it fashion, ease of progress, tradition, something else? Why are certain areas ignored or stagnant? Are they too hard, too boring, unlikely to get you a research position, just overlooked, or something else?
Find a promising area which is not well researched and does not appear super hard, yet which you find interesting. An interdisciplinary outlook could be useful.
Figure out what you are missing to make a meaningful original contribution there. Evaluate what it would take to learn the prerequisites. Alternate between learning and trying to push the original research.
Most likely you will gain unexpected insights, not into the problem you are trying to solve, but into the reason why it’s not being actively worked on. Go back and reevaluate whether the area is still promising and interesting. Odds are, your new perspective will lead you to get excited about something related but different.
Repeat until you are sure that you have learned something no one else has. Whether a question no one asked, or a model no one constructed or applied in this case, or maybe a map from a completely unrelated area.
Do a thorough literature search on the topic. Odds are, you will find that someone else tried it already. Reevaluate. Iterate.
Eventually you might find something where you can make a useful original contribution, no matter how small. Or you might not. Still, you will likely end up knowing more and having a valuable perspective and skill set.

Physics examples: don’t go into QFT, string theory or loop quantum gravity. There is no way you can do better than, say, Witten and Maldacena and thousands of theorists with IQ 150+ and the energy and determination of a raging rhino. Quantum foundations might still have some low-hanging fruit, but the odds are against it. No idea about condensed matter research. A positive example: numerical relativity hit a sweet spot about 15 years ago, because the compute and the algorithms converged and there were only a few groups doing it. Odds are something similar is possible again; you just need to find where. Also, Kaj, your research into multi-agent models of the mind, for example, might yield something really exciting and new if looked at in the right way, whatever that is.
I basically disagree with the recommendation almost always, including for AI alignment. I do think that
I often see the sentiment, "I’m going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I’ll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI’s bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me. See also my shortform post about this.
+1, I agree with the "be lazy in the CS sense" prescription; that’s basically what I’m recommending here.
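For readers unfamiliar with the phrase, "lazy in the CS sense" refers to lazy evaluation: a value is computed only at the moment something actually needs it, rather than eagerly up front. A minimal Python sketch of the analogy (the names here are purely illustrative, not anything from the comments above):

```python
import functools

def lazy(compute):
    """Defer an expensive computation until its result is first requested, then cache it."""
    @functools.lru_cache(maxsize=None)
    def thunk():
        return compute()
    return thunk

# Eager approach: "study every prerequisite first", paying the cost whether or not it is ever used.
# knowledge = {name: study(name) for name in ALL_TEXTBOOKS}

# Lazy approach: register what *could* be studied; pay the cost only when a research
# question actually demands it.
knowledge = {
    "linear_algebra": lazy(lambda: "worked through a linear algebra text"),
    "deep_rl":        lazy(lambda: "worked through a deep RL course"),
}

# Only the prerequisite needed right now gets evaluated.
print(knowledge["linear_algebra"]())
```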
IMO the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind. Some nuances:
Sometimes thinking about the problem yourself is not useful because you don’t have all the information to start. For example: you don’t understand even the formulation of the problem, or you don’t understand why it is a sensible question to ask, or the solution has to rely on empirical data which you do not have.
Sometimes you can so definitively solve the problem during the first step (unprimed thinking) that the rest is redundant. Usually this is only applicable if there are very clear criteria to judge the solution, for example: mathematical proof (but, beware of believing you easily proved something which is widely considered a difficult open problem) or something easily testable (for instance, by writing some code).
As John S. Wentworth observed, even if the problem was already definitively solved by others, thinking about it yourself first will often help you learn the state of the art later, and is a good exercise for your mind regardless.
The time you should invest in the first step depends on (i) how fast you realistically expect to make progress and (ii) how much progress you expect other people to have made by now. If this is an open problem on which many talented people have worked for a long time, then expecting to make fast progress yourself is unrealistic, unless you have some knowledge to which most of those people had no access, or your talent in this domain is truly singular. In this case you should think about the problem enough to understand why it is so hard, but usually not much longer. If this is a problem on which only a few people have worked, or only for a short time, or which is obscure enough that you doubt it got the attention of talented researchers, then making comparatively fast progress can be realistic. Still, I recommend proceeding to the second step (learning what other people did) once you reach the point where you feel stuck, on the "metacognitive" level where you don’t believe you will get unstuck soon; beware of giving up too easily.
After the third step (synthesis), I also recommend doing some retrospective: what have those other researchers understood that I didn’t, how did they understand it, and how can I replicate it myself in the future.
This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning. In fact his Nobel prize was for a technique (Feynman diagrams) which he developed on the fly in a lecture he was attending. What the speaker was saying didn’t make sense to him so he developed what he thought was the same theory using his own notation. Turns out what he made was more powerful for certain problems, but he only realized that much later when his colleagues questioned what he was doing on the whiteboard. (Pulled from memory from one of Feynman’s memoirs.)
One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.
Personally I always try to make sure I can re-derive what I learn from first principles or the evidence. Only when I’m having particular trouble, or when I have the extra time, do I try to work it out from scratch in order to learn it. But when I do, I come away with a far deeper understanding.
I think we may be talking past each other. You say
I like this answer, but do question the point about Feynman’s gift being mainly traditional rationality.
I agree that Feynman portrays it that way in his memoirs, but accounts from other physicists and mathematicians paint a different picture. Here are a few example quotes as evidence that Feynman’s gifts also involved quite a bit of "magic" (i.e. skills he developed that one would struggle to learn from observation or even imitation).
First, we have Mark Kac, who worked with Feynman and was no schlub himself, describing two kinds of geniuses of which Feynman was his canonical example of the "magician" type (source):
(Note that, according to this source, Feynman did actually have 35 students, many of whom were quite accomplished themselves, so the point about seldom having students doesn’t totally hold for him.)
Sidney Coleman, also no slouch, shared similar sentiments (source):
"There are lots of people who are too original for their own good, and had Feynman not been as smart as he was, I think he would have been too original for his own good," Coleman continued. "There was always an element of showboating in his character. He was like the guy that climbs Mont Blanc barefoot just to show that it can be done."
Feynman continued to refuse to read the current literature, and he chided graduate students who would begin their work on a problem in the normal way, by checking what had already been done. That way, he told them, they would give up chances to find something original.
"I suspect that Einstein had some of the same character," Coleman said. "I’m sure Dick thought of that as a virtue, as noble. I don’t think it’s so. I think it’s kidding yourself. Those other guys are not all a collection of yo-yos. Sometimes it would be better to take the recent machinery they have built and not try to rebuild it, like reinventing the wheel. Dick could get away with a lot because he was so goddamn smart. He really could climb Mont Blanc barefoot."
Coleman chose not to study with Feynman directly. Watching Feynman work, he said, was like going to the Chinese opera. "When he was doing work he was doing it in a way that was just—absolutely out of the grasp of understanding. You didn’t know where it was going, where it had gone so far, where to push it, what was the next step. With Dick the next step would somehow come out of—divine revelation."
In particular, note the last point about "divine revelation".
Admittedly, these are frustratingly mystical. Stephen Wolfram describes it less mystically, and also sheds light on why Feynman’s descriptions of his discoveries always made them seem so obvious after the fact even though they weren’t (source):
He always had a fantastic formal intuition about the innards of his calculations. Knowing what kind of result some integral should have, whether some special case should matter, and so on. And he was always trying to sharpen his intuition.
Now, it’s possible that all this was really typical rationality, but I’m skeptical, given that even other all-star physicists and mathematicians found it so hard to understand or replicate.
All that said, I think Feynman’s great, as is deriving stuff from scratch. I just think people often overestimate how much of Feynman’s Feynman-ness came from good old-fashioned rationality.
I suspect it’s mostly proportional to the answer to the question "how much progress can you expect to make building on the previous work of others?" in a particular field. This is why (for example) philosophy is weird (you can make a lot of progress without paying attention to what previous folks have said), physics and math benefit from study (you can do a lot more cool stuff if you know what others know), and AI safety may benefit from original thinking (there’s not much worth building off of (yet)).
I basically agree with Vanessa: the correct rule is almost always "first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind." Thinking about the problem myself first often helps me understand existing work, as it is easier to see the motivations, and solving already-solved problems is good training. I would argue this is the case even in physics and math. (My background is in theoretical physics, and during my high-school years I took some pride in not remembering physics and re-deriving everything when needed. It stopped being a good approach for physics roughly since 1940, and it somewhat backfired.)

The mistake members of "this community" (LW/rationality/AI safety) sometimes make is skipping the second step, or bouncing off it if it is actually hard. A second mistake is not doing the third step properly, which leads to a somewhat strange and insular culture that can be off-putting to external experts (e.g. people partially crediting themselves for discoveries which are already known to outsiders).
I think one important context for not reading the existing literature first is calibration. Examining the difference between how you are thinking about a question and how others have thought about the same question can be instructive in a couple of ways. You might have found a novel approach that is worth exploring, or you might be way off in your thinking. Perhaps you’ve stumbled upon an obsolete way of thinking about something. Figuring out how your own thinking process lines up with the field can be extremely instructive, and super useful if you want your eventual original work to be meaningful. At the very least, you can identify your own common failure modes and work to avoid them.
The fastest and easiest way to accomplish all this is by using a sort of research loop where you collect your own thoughts and questions, then compare them with the literature and try to reconcile the two, then repeat. If you just read all the literature first, you have no way to calibrate your explorations when you finally get there.
I think this is mainly a function of how established the field is and how much time you’re willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out.
Thus, if you don’t plan to spend a large amount of time in a field, it’s far quicker and more effective to just read the literature. However, if you’re going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone’s approaches from clustering together.
Likewise, in a very well established field like math or physics, we can expect everyone to have already clustered around the "correct answer". It doesn’t make as much sense to try and look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don’t immediately bias your thinking towards solutions that others are already working on.