By Michael Nielsen, December 9, 2022
It's conventional to think of discoveries as the basic unit of advance in science. In these notes I explore some limitations of this view, and implications for both metascience and individual contribution to proto-sciences. In particular, I explore the idea that the basic unit of advance in science is communities of practice. This is meant as an (under-explored) complementary point of view, not a complete substitute. The notes are more exploratory than my working notes usually are -- this is an experiment in exploring incomplete ideas in public, and as such it's all tentative, quarter-baked ideas! But if you're interested in the questions discussed here, then thoughtful comments are very welcome -- see the comment section at the end.
Suppose you were granted a very special, very limited time traveling ability: the ability to kill off one single cell in 25-year-old Adolf Hitler's body. Which cell would you pick, in order to minimize the damage Hitler does to the world? It's a hard question to answer. Human beings seem remarkably resilient to the loss of nearly all our individual cells. Indeed, in speaking with biologists it seems hard to identify any individual cells whose sudden absence would make a huge difference in our lives1. And yet just because we're quite resilient to the loss of single cells doesn't mean we're resilient to the loss of all our cells, or even to the loss of a significant fraction of our cells2.
I was recently talking with a friend about the value of certain types of scientific research. They were pressing me for examples of discoveries and scientists that made a unique difference – a true "marginal impact"3. And for almost everything I named, they dismissed it with: well, if that discovery hadn't been made by that particular person, someone else would very soon have made the same or a very similar discovery. For the most part, I agreed with these assessments. Indeed, the phenomenon of independent scientists making the same discovery is so common that it has been named and extensively studied – it's the phenomenon of so-called "multiples"4 (as in multiple discoverers). Outside big science – where access to unique facilities can sometimes confer a long-lasting edge – I found it difficult to think of any truly important discovery where it seems unlikely that someone else would have made it within a short time5. And, as a result, my friend seemed a little dismissive about the value of improving how we fund science.
This seems to me a mistake. It's much like saying cells aren't important in biology because none (or almost none) are crucial for a lifeform. However, it does suggest some important questions. Most of all: if you want to assess counterfactual or marginal advance in science, how to do it?6 In other words, for purposes of analysis, what's a good unit of advance in science?
This question is challenging at many levels. It's challenging for individual scientists in a very practical, everyday sense: why are you doing your work if someone else will make the same discovery anyway7? Are you bringing anything unique to it? It can be unpleasant to feel as though you are merely an extension of a social machine that will do what you do, no matter what. It's also difficult for funders: why fund any particular individual project if the likely outcomes are either: the work is relatively unimportant (and may or may not be done also by others); or it's very important, but someone else will almost certainly make the same discovery8? More broadly, it's a problem for metascience: if metascience is to evaluate and amplify the best social processes in science9, then how should that evaluation be done? Suppose you develop a new approach to funding that is really good at funding very important discoveries; unfortunately, it also tends to fund discoveries that would certainly have been made anyway, at about the same time. How should such a funding scheme be compared to a funding approach that is really good at funding somewhat less important discoveries that were, however, considerably less likely to have been made by others? Which of these approaches should be considered better?
I won't solve this problem here, I'm just identifying it and exploring it in these notes. One idea is that what one wants to do is to give credit to the creation of the conditions that made the discovery near-inevitable. Perhaps the easiest approach is to accept that multiples are common, and to ignore that for purposes of evaluation. E.g., if some approach to funding results in a major breakthrough – say, CRISPR – then give it "full" credit for the discovery, even if other labs discover CRISPR at very nearly the same time – indeed, perhaps even earlier10. But at the same time also (independently) monitor the frequency of multiples, as a check you're not merely swooping in and claiming the credit for something that would have happened anyway. If some intervention seems to change the frequency of multiples a lot, then it's worth further investigation; otherwise, it seems reasonable to ignore the multiples.
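To make the "monitor the frequency of multiples" check a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the record format, the scheme names, the baseline rate, and the tolerance are all hypothetical, and deciding in practice whether a discovery counts as a multiple is itself a hard judgment call that the code simply takes as given.

```python
from collections import defaultdict

# Hypothetical records: (funding_scheme, discovery_id, was_multiple).
# The data is invented purely to illustrate the bookkeeping.
records = [
    ("scheme_a", "d1", True),
    ("scheme_a", "d2", False),
    ("scheme_a", "d3", True),
    ("scheme_a", "d4", False),
    ("scheme_b", "d5", True),
    ("scheme_b", "d6", True),
    ("scheme_b", "d7", True),
    ("scheme_b", "d8", False),
]


def multiples_rate(records):
    """Return, for each funding scheme, the fraction of its discoveries that were multiples."""
    counts = defaultdict(lambda: [0, 0])  # scheme -> [number of multiples, total discoveries]
    for scheme, _discovery, was_multiple in records:
        counts[scheme][0] += int(was_multiple)
        counts[scheme][1] += 1
    return {scheme: m / total for scheme, (m, total) in counts.items()}


def flag_for_investigation(rates, baseline, tolerance):
    """List schemes whose multiples rate differs from the baseline by more than the tolerance."""
    return [scheme for scheme, rate in rates.items() if abs(rate - baseline) > tolerance]


rates = multiples_rate(records)
# baseline and tolerance are assumed values, not empirical estimates.
flagged = flag_for_investigation(rates, baseline=0.5, tolerance=0.2)
```

The design mirrors the text: credit for discoveries is assigned in full elsewhere, and the multiples rate is tracked separately, only as a trigger for closer investigation.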
A very different approach, suggested by Pierre Azoulay and Danielle Li11, is to change the unit of analysis:
It is also important to consider the unit of analysis. An individual-level analysis typically yields an estimate of the average effect of being “treated” by funding, that is, the impact of funding for a typical scientist. Funders, however, may be interested in understanding their impact on a field of research as a whole. In this view, it is not enough to compare treatment and control outcomes at the level of an individual scientist because two applicants may have similar ideas: if funding enables one scientist to publish her results ahead of another, that yields a big impact from the perspective of her individual output, but it may not yield as large an impact on her field because that research idea would have been performed regardless. In order to assess the impact of funding on an entire area, one can still apply the same techniques as the ones described above, but focusing on fields rather than individuals as the unit of “treatment.”
We can, for instance, imagine randomizing entire (proto-)fields into different funding regimes, and doing it at enough scale that we can start to learn things about what effect different approaches to funding have. Indeed, DARPA already does a (very weak!) version of this, placing considerable focus on unusual programs as their unit of analysis12. At the least, they seem to think in terms of "such-and-such a program was or was not successful [and here's why]" at least as much as "such-and-such a grant was or was not successful [and here's why]". I've spoken with other funders who at least say some of this, but in practice it rarely seems baked in in quite the same way.
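As a rough illustration of what treating fields as the unit of "treatment" might look like mechanically, here is a minimal Python sketch. The field names, regime names, and outcome numbers are all hypothetical placeholders; a real study would need far more fields, a defensible outcome measure, and proper statistical analysis, none of which is attempted here.

```python
import random


def assign_fields(fields, regimes, seed=0):
    """Randomly assign whole fields (not individual scientists) to funding regimes.

    Fields are shuffled and then dealt out round-robin, so the regimes
    end up with as equal a number of fields as possible.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(fields)
    rng.shuffle(shuffled)
    return {field: regimes[i % len(regimes)] for i, field in enumerate(shuffled)}


def mean_outcome_by_regime(assignment, outcomes):
    """Average a field-level outcome measure within each funding regime."""
    by_regime = {}
    for field, regime in assignment.items():
        by_regime.setdefault(regime, []).append(outcomes[field])
    return {regime: sum(values) / len(values) for regime, values in by_regime.items()}


# Hypothetical proto-fields and funding regimes, for illustration only.
fields = [f"proto_field_{i}" for i in range(1, 7)]
regimes = ["standard_grants", "program_based"]
assignment = assign_fields(fields, regimes, seed=0)

# Hypothetical field-level outcome scores (e.g. some later measure of field growth).
outcomes = {field: float(i) for i, field in enumerate(fields, start=1)}
means = mean_outcome_by_regime(assignment, outcomes)
```

The point of the sketch is only that both the randomization and the comparison happen at the level of fields; which outcome to measure is exactly the open question.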
I must admit, when considered at this scale, it's challenging to think of this as entirely practical, and that dampens my enthusiasm for thinking it through. A challenge with metascience is that it takes a long time and a lot of resources to evaluate alternate social processes, even if the unit of evaluation is individual projects. How much more challenging this becomes if the unit of analysis is entire fields! But sometimes it's worth thinking about even extremely unlikely ideas. In this case: people are (AFAICT) firmly anchored on the perspective that says discoveries are the main unit of advance in science. And so I wonder: if you change perspective, might that generate useful ideas? Even regardless of practicality. Not as a full substitute – discoveries are still important – but taking the idea of "fields as the unit of advance" seriously as a complement. Imagine you're the Directly Responsible Individual for evaluating field advance at some funder. What does your job look like? What do you care about that no-one else does? What do you disagree with or find annoying in other people's perspectives? At the very least it's easy to generate dozens of plausible hypotheses and concomitant challenges:
Maybe variance-based funding approaches produce new fields at a higher rate? (Plausibly true of many non-standard funding approaches, though of course the explanations may vary.)
Challenge: funders rarely operate in isolation. This makes analysis difficult: maybe exogenous factors were crucial? But it may also be an opportunity. For example, it's interesting to ponder randomizing field-founding activities between a US and an EU-based funder.
Other variables to consider: rate of growth of a field; rate of major discoveries produced by a field; whether the field spawns other new fields; whether the field eventually dies; whether the field gets stuck in ruts for long periods of time; diversity of approaches being pursued within a field.
In general, the advancement of fields in different countries (or at different universities) is an interesting question. The conventional presumption is that this is largely dependent upon how much money is invested. It's certainly not unrelated. But (a) it's not the only thing that matters; and (b) I doubt it's the main causal factor. In particular, I wouldn't be surprised if money invested is mostly downstream of more important factors. What are those factors?
Is it possible to silo overlapping fields? It's conventional wisdom that it's "bad" for silos to exist. But a moment's thought shows this conventional wisdom is full of holes. Would it have been good if Blockbuster and Netflix had an enforced merger, early on? Or if Blockbuster executives had a say in the performance reviews of people at Netflix? Siloing can preserve crucial differences, and enable their consequences to be played out; think of the separate silos as wildlife preserves for endangered ideas to be developed13. In some (not all) ways it's a pity that it's so hard to maintain strong siloing in science, over extended periods.
I'm flailing here: there are some local points of interest, but I don't have much real purchase. It needs more ideas or a deeper context (one such would be to be at a funder who was doing this, which would make it all pretty real!). Basically: the starting problems need to be a lot sharper, to improve pressure for good solutions! In any case, it's an idea to flag for possible future development: if I wanted to come back to it, I'd need to talk it over with a half dozen people, and come up with 10x as many starting ideas, and develop the most striking much more aggressively. But instead of doing any of that, I will change angle on the problem, and come at it from a different direction.
In many fields, the earliest work – the work done while it's still a proto-field – often seems almost pre-scientific. Think about the Dynabook paper, or the proposal for the world wide web, or early writings on psychology and psychiatry, from William James to Sigmund Freud to Noam Chomsky. These and many similar works are often no more technical than a detailed popular science article. They may involve no data, but rather a combination of speculation and anecdotes. When experiments and data collection are done, the techniques are often14 primitive, cobbled together. For instance, when Eugene Garfield co-founded the field of scientometrics he compiled by hand some very rudimentary scientometric data on a handful of articles.
What distinguishes work in this proto-field stage from "just writing"? This is a problem of general interest; it's also a problem I've often felt acutely, personally, since much of my own work has been done in the proto-field stage: on quantum computing, open science, tools for thought, metascience15. With enough time that work became more obviously a contribution to a genuine field. But in the early days there's a great deal of fumbling around, attempting to figure out basic problems and concepts and connections. Sometimes this work is fruitful; much of it is not. And it's only in hindsight one can see what is important; indeed, it's only in hindsight that it's obvious the work is a contribution to an emerging field at all. It's emotionally strange too: work that seems of little import at the time, and of little interest to others, is sometimes retroactively considered to have been seminal. Only a tiny few things in history need to have been different for TimBL's proposal for the web to look like an obscure footnote16; and many of those things were either out of his control entirely, or certainly not entirely under his control. Put another way: it wasn't just the quality of the work, but how it ultimately fit into a larger ecosystem of effort.
So what needs to happen to go from that proto-field stage to an actual field17? I won't try to define a field here – that's a much harder problem than I want to get into. But I do want to note a few things that need to happen: fields distill shared canons18 of techniques and results that experts are expected to master; many of those techniques and results are in some sense "composable" and "combinable"19. This is the praxis an expert applies in order to obtain new results – to do science20. And fields are carried by a community: a community that (collectively) trains new people to attain expertise, assesses their work, and continually updates the canon and training methods. This is the sense in which fields like physics or mathematics or (the more theoretical part of) economics are different from "merely" writing: they have a community-held shared praxis for what it means to make progress.
(Note that this does not at all imply that all such fields of "expertise" are equal. The description I've just given applies also to astrology and to many other very-to-moderately dubious endeavors as well: they also have shared canons of expertise, and so on. But there are enormous differences in how fields of expertise are grounded and the underlying soundness of their foundations. Theoretical physics was used to predict gravitational waves requiring accuracy of one part in 10 to the 22 to detect; astrology makes no worthwhile predictions at all.)
Those are the two points I want to emphasize here: some notion of developing a body of expertise; and that this notion is carried and certified and transmitted and maintained (and changed!) by a community. This does not mean that these things are fixed or immutable – not at all. The canon will typically change; what it means for someone to be an expert will change. Nor is the notion cut-and-dried. But it is often a good and useful approximate model of the real situation.
In many ways, our research institutions and social processes are a system for producing such communities, far more than any individual discovery. Put another way: they are a system for producing new types of expertise. It's in that sense that fields – or communities of practice – are the basic unit of advance in science. And it is what early work in proto-fields is aspiring toward: the (eventual) creation of a new type of expertise, and a community to carry that forward.
I realize this all seems rather obvious. "New fields produce new kinds of expertise" isn't exactly a shocking new insight! And I know, upon brief acquaintance, that there are many people who have thought much more deeply about this process than I have, and entire disciplines of thought my naive account ignores. But I think it is interesting to shine a spotlight on this process, and to think about how it happens. Usually it's a combination of old types of expertise, as well as just plain speculative thinking. As I say: often – though not always – it's not so easy to say how it's different from just plain writing, since it may only draw loosely on existing bodies of expertise. This is often true even of quite technical subjects: you can look at field-founding documents for things like computer science, quantum computing, and gravitational wave astronomy, and they're surprisingly accessible. They do draw on past areas of expertise, but much of it is quite bare hands. For instance, Turing's founding paper for computer science was written when he was in his early 20s; it uses some mathematical notation, but it's mostly fairly incidental. Mostly, this is just a guy thinking, at a level not far removed from freshman mathematics. Which makes sense: Turing wasn't really all that far removed from freshman mathematics!
This is all quite vague! Part of this is due to my ignorance – I have no doubt I could be more precise and more accurate if I knew far more about science, metascience, the history of science, the philosophy of science, and the sociology of science. But I know enough about those fields to suspect that in fact there's much we simply don't know about these questions21. And I suspect part of the trouble is intrinsic: there is no sharp notion of a field, or a community of practice, or a canon; they are intrinsically somewhat vague.
For purposes of analysis of effectiveness of social processes, what is the right unit of analysis in science? Using discoveries is surprisingly difficult. But if not discoveries-as-the-unit-of-analysis, then what?
Using communities or fields as the unit of analysis is an interesting idea. It also runs up against many problems – of scale, of sharp definition, and of evaluating change. But at least it's different, and that is perhaps useful as a perspective shift!
Of course, the impact of discoveries (or any other unit of advance) is not the only thing that matters. Understanding the molecular machinery of a cell usually does not tell you much about an animal's evolutionary fitness; but that certainly does not mean understanding the machinery of the cell is of no interest.
This issue of "unit of advance" can be safely ignored in the immediate term in my work on metascience. It may be valuable as stimulus – it's a nice point of view – but it doesn't need to be addressed. On the other hand, the transition from no-praxis to field-of-expertise really is deeply interesting.
A good question: can new fields be kickstarted through novel forms of online community? Wikipedia, Linux, Less Wrong, Effective Altruism, and many others have all done things which approximate this. Of course, for the most part they're not aiming at new types of creative or scientific expertise; insofar as they are, it seems likely that they're in some ways rather less successful than are traditional approaches. What are the ways they are better? Worse? (Some thoughts on the latter: they rely overmuch on online charisma; are insufficiently grounded; and typically have poor norms and mechanisms for the accumulation of knowledge.)
What steps does a proto-field go through to become a well-grounded, effective field? How should individuals who are participating think about their contribution? What are the crucial types of contribution?
Thanks to ChatGPT, Laura Deming, Tim Hwang, Adam Marblestone, Kanjun Qiu, and Toby Ord for conversations about related subjects.
There are many similar examples that may be used as the basis for similar analogies: any individual piece of a strongly coupled system with unexpected high-order collective effects is a good candidate. For example, if you remove a car from a traffic jam it typically won't affect the existence of the traffic jam. What matters are, instead, collective parameters. Of course, sometimes there are tiny pieces which are crucial. Alexander the Great was just a single man, but his death certainly had repercussions for ancient Macedonia and surrounds.↩︎
In the sense of economics – meaning the counterfactual difference generated by that work, compared to a world in which the work hadn't been done. Not "marginal" as in "small"(!)↩︎
See: Robert K. Merton, "Singletons and Multiples in Scientific Discovery: A Chapter in the Sociology of Science", Proceedings of the American Philosophical Society (1961).↩︎
There is a canonical example often given of such a discovery: Einstein's discovery of general relativity. (Sometimes muddled by the incorrect assertion that Hilbert "independently" discovered it: in fact, Hilbert learned most of the physical content of the theory from Einstein, and only put a few finishing touches on, a few days ahead of Einstein). But such examples seem relatively uncommon. The best modern example I can think of is Kitaev's invention of topological quantum computing, which I suspect accelerated that field by more than a decade, and made a considerable difference to the entire progress of science. Another example is Dave Wineland's work cooling single ions to their motional ground state. Many other laboratories worked hard to do this, but for more than a decade Wineland's lab was the only one capable, due to several competitive advantages. But in general it's surprisingly hard to think of examples.↩︎
It's fascinating to contrast this with other creative areas, such as novel-writing and painting. It's difficult not to believe that only Jane Austen could have written the novels of Jane Austen. Of course, Austen owes much to the milieu in which she wrote, and to prior writers, but much of her greatness lies not merely in mastering existing forms, but in her unique individual power of observation and understanding. By contrast, it seems likely that the works of particular painters are often downstream of broader artistic movements. Cubism was "in the air", and if not Picasso someone else would have emerged as its greatest exponent – perhaps Braque, perhaps someone else. In this sense, writing novels is a moderately collective activity; painting is considerably more collective, and science may be even more collective still. Incidentally, there is, of course, an analogous problem in the startup ecosystem. Most successful large companies have relatively little marginal impact: if Google hadn't been started, someone would likely have done something very similar. The more correct the efficient market hypothesis is, the less the contributions of individual entrepreneurs make any meaningful difference.↩︎
Of course, there are many possible retorts to this question. One is the simple enjoyment of discovery and of understanding, even if one isn't "first". Contrariwise, I know scientists who love to compete, and strongly desire to be first in such races. A (rough) quote from a well-known scientist: "After all, isn't that why we do science? The competition and desire to be first?" He appeared unaware of the incredulous stares from several other people nearby. It's interesting that such people are often very successful as scientists.↩︎
I've talked with surprisingly many funders who feel this point about unique impact as an evaluation problem ("why fund work that will be done anyway by someone else?"), but seem surprisingly unaware of the psychological impact on individual scientists. But of course the latter tends to be far more keenly felt: it's not just an intellectual problem, but a deeply felt emotional problem.↩︎
See the following, and references therein: Michael Nielsen and Kanjun Qiu, "A Vision of Metascience: An Engine of Improvement for the Social Processes of Science", https://scienceplusplus.org/metascience/index.html (2022).↩︎
It's interesting to ponder what happens when someone is just slightly too late in making the discovery. Traditionally, they are given far less (or no) credit, except in unusual circumstances. (An example: Wolfgang Ketterle wasn't the first to do Bose-Einstein condensation, but made so many seminal contributions that he was jointly awarded the Nobel together with the people who made Bose-Einstein condensates first.) But for metascientific purposes, in many cases I suspect it makes a lot of sense to give a lot of "partial credit". This is still tricky: it can be easy to say with hindsight that you were near making a discovery when you were missing something crucial, and might have continued to miss it for quite some time.↩︎
Pierre Azoulay and Danielle Li, "Scientific Grant Funding", NBER Working Paper (2020).↩︎
Other funders are certainly very aware of individual programs, especially operationally. But DARPA seems unusually focused on programs – including unusual programs, which wouldn't be funded by anyone else – as their unit of analysis. This impression may merely reflect my limited experience.↩︎
I've adapted phrasing from Murray Gell-Mann's "wildlife preserve for endangered theorists".↩︎
Though far from always. Sometimes a new field is created when a new means of experiment is created: the first radio telescopes, for instance, or synchrotrons, and so on.↩︎
Note that quantum computing is a science, while work on open science, tools for thought, and metascience has a somewhat different character. I think of all three as part of a broader field: tools to enhance human creativity and discovery. That's in considerable part a design science or design field, but also has other aspects. For instance, there is an important component of activism in work on open science.↩︎
Most obviously: if CERN hadn't allowed him to open the protocol, but had instead gone a more commercial route – as several similar protocols did at the same time – I expect the web would be a footnote, just as those protocols are now considered.↩︎
More generally, it's interesting to ponder field dynamics: how fields are created, splinter, merge, and die.↩︎
What actually lasts in science? For many discoveries I suspect a body of living, engaged experts is almost necessary for those discoveries to retain their full meaning. I wonder if there are parts of our scientific history that are now more or less inert, with no living experts? I presume it would be easier to re-animate them than to rediscover them, but I'm not sure.↩︎
Or at least the aspiration to composable results. Something that bothers me a lot about the replication crisis in psychology is this: why wasn't replication happening as a matter of course, not because it was being bureaucratically imposed, but for the more prosaic reason that it was needed en passant to do the work people wanted to do? In physics, wrong papers are written and published all the time. They're not usually "found out" upon failed replication. Rather, when someone discovers something important, other people try to improve on it, use those ideas as part of more ambitious projects. Often, they don't do exact replications, but rather simply try to leap to the better thing, a kind of super-replication. "Oh, you've built a laser emitting coherent light? Now, what happens when you shine it on atoms?" "Oh, now you've got really good lasers and understand how they interact with atoms, can you use that to cool those atoms down?" And so on and so forth – a line of ideas building upon one another. Why wasn't that happening to anywhere near the same extent in psychology? Presumably the answer is that the psychologists aren't trying to compose and build upon and combine ideas in quite the same way. But I don't understand this in much detail at all.↩︎
Incidentally, funders and wealthy individuals sometimes think they can use capital to establish new fields, more or less by fiat. "This is an interesting question, I will throw a bunch of money at it, and call it a field." Certainly, money can be used to create Potemkin fields. But no amount of money can magically result in deep ideas whose time has not yet come. Confusing power with understanding is surprisingly common.↩︎
The most relevant works I know of for these questions are perhaps Randall Collins's "Sociology of Philosophies" and Andrew Abbott's "The System of Professions". But neither seems all that relevant for the specific questions I have been most concerned with.↩︎