Ep 110: Tempest in a barcode: how rapidly can we (and should we) identify new species? (with Michael Sharkey)
How do biologists categorize species? What’s the best and quickest way to describe millions of unknown species?
On this episode, we talk with Michael Sharkey, an entomologist and taxonomist who spent much of his career at the University of Kentucky, and is now the director of the Hymenoptera Institute. Since its inception, taxonomy has relied on careful morphological analysis of specimens to delineate species. In the past few decades, the COI “barcode” region of the mitochondrial genome has become a key additional piece of genetic evidence used to characterize species. In a much-discussed 2021 paper, Michael and colleagues used barcoding to identify over 400 new species of braconid wasps. The backlash from scientists who adhere to traditional taxonomic methods was swift, and at times harsh, with critics claiming that relying primarily on COI to define species is simply unacceptable. Sharkey, however, remains convinced that taxonomy should embrace molecular tools, especially because millions of species are yet to be discovered and rates of extinction are ramping up. We talk with Michael about how many insect species there are, how barcoding can make taxonomy accessible to more scientists, and what the future of taxonomy might look like.
Cover photo: Keating Shahmehri
-
Cameron Ghalambor 0:08
Hey Art, do you like jelly beans?
Art Woods 0:11
They're okay. But why do you ask?
Cameron Ghalambor 0:12
Well, since I was a kid, whenever I get jelly beans, I always sort them into their different colors. And then I decide which order I want to taste them, you know, so I can kind of combine the flavors and eat them in a particular order.
Art Woods 0:24
I do the same thing!
Cameron Ghalambor 0:27
I think we as humans are great at categorizing our worlds into these like sort of nested subsets.
Art Woods 0:33
Yeah, take some foods that you see regularly at the grocery store. Cherries are well, cherries, but they're also stone fruits, which includes plums, and apricots and stone fruits are part of a larger grouping that we call simply fruits.
Cameron Ghalambor 0:47
The problem, of course, is that these categories can lead to confusion, especially once you start thinking carefully about where they come from, and what their boundaries are. For example, is a bell pepper a fruit or a vegetable?
Art Woods 1:00
Most of us would say vegetable, but of course, technically, it's a fruit because it's a seed-containing reproductive structure.
Cameron Ghalambor 1:06
I think the same is true of tomatoes, also a fruit.
Art Woods 1:09
So perhaps a biological misclassification there, even if it's perfectly appropriate based on things you might cook or eat together.
Cameron Ghalambor 1:16
In many ways, species suffer from exactly the same problems.
Art Woods 1:20
Yep. Most of us characterize species in the same way that Supreme Court Justice Potter Stewart famously characterized pornography in 1964. Quote, "I know it when I see it."
Cameron Ghalambor 1:31
For example, we know that a starling is a starling when we see it, because it's more or less distinct from other medium-sized blackish birds out there. And a giraffe is a giraffe, because it's got long legs, a long neck, and the right kind of spots. Again, distinct from everything else out there.
Art Woods 1:48
This "I know it when I see it" approach suffers from at least two major problems. The first is that our visually based categories often misrepresent groups based on other reasonable ways of putting them together, such as whether or not individuals in the group can mate with one another, or whether they share DNA sequences that are closely related enough.
Cameron Ghalambor 2:06
For example, what we used to call just giraffes have recently been recategorized into eight or more species based on combinations of morphology, ecology and genetic evidence. It turns out this is common: we find out that what we thought was one species is actually a cryptic species complex.
Art Woods 2:23
The second problem is that biologists have described only a fraction of the biodiversity on Earth.
Cameron Ghalambor 2:28
Indeed, it's possible that most biodiversity still remains to be discovered. And that traditional ways of doing taxonomy, which involve careful observations and description, may be moving too slowly in the face of anthropogenic disruptions and the looming extinction crisis.
Art Woods 2:44
Today on the show, we talk about species, how to identify them, and how many there are with Michael Sharkey, a retired professor from the University of Kentucky and director of the Hymenoptera Institute.
Cameron Ghalambor 2:56
In his career, Michael has worked mostly on the taxonomy and biodiversity of braconid wasps, a large group of species that make their living as parasitoids, in which wasp larvae grow up inside other insects.
Art Woods 3:09
A major inspiration for the bad guys in the Alien franchise.
Cameron Ghalambor 3:12
Although he started out doing traditional taxonomy based on morphology, he later started adding information from DNA sequences.
Art Woods 3:20
In particular, he used what's called molecular barcoding, which treats the sequence of the cytochrome oxidase one gene like a scannable barcode on that red pepper fruit in your grocery basket.
Cameron Ghalambor 3:31
Ugh. Are red peppers still fruit?
Art Woods 3:33
That genetic barcode lets you assign individuals to known species groups or can suggest that it's part of something new.
Cameron Ghalambor 3:39
A few years ago, Michael and colleagues published several papers that caused a big stir among taxonomists.
Art Woods 3:45
In a 2021 paper, he and coauthors described over 400 new species of braconid wasps from Costa Rica, based essentially on barcodes and photographs alone.
Cameron Ghalambor 3:55
To put it mildly, many taxonomists were not amused. In multiple journal articles, the pushback was pretty intense.
Art Woods 4:02
In the show today, we talk with Michael about this controversy, along with a host of fundamental questions about the diversity of life on Earth.
Cameron Ghalambor 4:09
These include: What is a species? How many species of insects are there on earth, and how do we know? And also what's the best and most efficient way to describe new species rapidly, especially before they go extinct due to climate and land use changes?
Art Woods 4:23
Along the way, we learn some surprising facts about the basics of biodiversity.
Cameron Ghalambor 4:27
For example, it turns out that God is only reasonably fond of beetles with inordinate fondness reserved for flies.
Art Woods 4:36
Think of that next time you find a maggot in your plum.
Cameron Ghalambor 4:39
I'm Cameron Ghalambor.
Art Woods 4:40
And I'm Art Woods.
Cameron Ghalambor 4:41
And you're listening to Big Biology.
Art Woods 4:43
So Michael Sharkey, thanks so much for joining us on Big Biology. It's a pleasure to have you on the show.
Michael Sharkey 5:01
Well, thanks for having me.
Art Woods 5:02
Yeah, we're gonna talk about some, I think, profoundly interesting issues today about what species are. How much biodiversity there is in the world. Some controversies about how one should go about doing taxonomy and figuring out what levels of biodiversity are in different places and in and among different groups of organisms.
Cameron Ghalambor 5:21
Yeah. So Michael, I think one of the groups of wasps that we'll talk a lot about today are the Braconidae, or what we call Braconids. Can you tell us a little bit about the Braconids and how they're different from other kinds of wasps?
Michael Sharkey 5:37
Okay. Well, in the Hymenoptera, you have, termed by laypeople, bees, wasps and ants, and sawflies. What I work with are called parasitic Hymenoptera, and that's about 70% of all wasps are parasitoid wasps. And this includes the Braconids. And what they do is they lay their eggs on top of or usually inside of other insects and arthropods. And those eggs then develop and consume and kill the host insect or other arthropod. So that's what Braconids do. They mostly attack beetles and caterpillars, caterpillars of moths and butterflies. So they're very important in controlling other insects. They're typically used in biological control. So if there's a pest outbreak in an orange grove, for example, they may release Braconids among other Hymenopterans to attack the pest insects, whatever they might be.
Art Woods 6:47
I had a couple of follow up questions about just the parasitoid lifestyle. So I've seen a lot of parasitism in my own work, working on caterpillars of different kinds, and especially in the southwestern US. And it feels like parasitoid wasps are one of the main sources of mortality, along with some parasitoid flies. Just walk us through this parasite or parasitoid life cycle. It's just fascinating to me, things like, you know, how do the female wasps find the host to oviposit into? How do they defeat the host immune system? Like how does the wasp larva grow up inside the host?
Michael Sharkey 7:23
Well, we'll talk perhaps about... I mean, there's so many variations in that. I'll give you one generalized example. So how do they find their host? Well, typically, parasitoid wasps are fairly host specific. So they're going to be feeding on, in this case, a caterpillar that's going to be feeding on a particular species or genus of tree or plant. So they're tuned to that plant first. They'll be hanging around that particular plant. And many of them have the ability to recognize the chemicals that are emitted by the plant when the caterpillar chews on a leaf, for example. Tuning into the smell, their antennae smell, and they'll be looking for damaged leaves. And when they get that close, then they use visual cues to look for the caterpillar.
Art Woods 8:21
What about vibrations? Can they detect vibrations?
Michael Sharkey 8:24
Oh they sure can, yeah. So there are many beetles and some Lepidoptera, for example, that bore into wood. And most of the wood feeding beetles and, to a lesser extent, Lepidoptera are not actually feeding on the wood, they're feeding on fungus that is inside the wood. In a general sense, I think that the braconids are typically looking for that fungus and for the feces, with smell, and that gets them close. And then once they're there, they have these specialized organs, usually on their hind legs and on their antennae, and they use those to feel vibrations. And with antennae in the front and legs spread wide apart in the back, they can absolutely precisely locate the beetle larva that's inside the wood. Then they drill through the wood with their ovipositors.
Or even more interestingly, the biggest ones like Megarhyssa, which is an enormous parasitoid hymenopteran in the sister group to the braconids, the Ichneumonidae: it exudes an enzyme and presses its ovipositor into the wood. Its ovipositor is about as thin as a horse hair, and the enzymes break down the lignin and cell walls of the wood. And it just turns the wood into butter until it reaches its host.
We were working on that for a little while, trying to find the enzymes, find the gene that produces the enzymes, and then clone that into bacteria so that they could produce the enzymes on an industrial scale to lyse wood for alcohol production. But it became too complicated. It looks like, in fact, there are several genes and several enzymes involved.
Art Woods 10:16
Oh so you guys never identified the set of enzymes that are responsible.
Michael Sharkey 10:20
Well, we found one enzyme, and we saw that it worked, but it didn't work efficiently enough. And we figured there were other enzymes too. So that's one case, but you could, but every species has a different story basically. They are as diverse as possible.
Cameron Ghalambor 10:36
So Michael, I, you know, kind of grew up I think in the time when the lore was that beetles were, you know, far and away the most diverse group of insects or maybe even the most diverse group of animals on earth. And I think most biologists are familiar with Haldane's famous quote, you know, when asked about what he's learned about the nature of the world, he said something along the lines that, you know, he learned that God had an inordinate fondness for beetles. But am I correct that the thinking has now changed in that the braconids are actually more diverse or thought to be the most diverse group of insects?
Michael Sharkey 11:20
You're correct in that we no longer think beetles are the most diverse group. It switched from beetles to Hymenoptera. That is, you know, bees, wasps and ants. And there was an interesting publication on that. And people for a short time thought that Hymenoptera were the most diverse, followed by beetles and Diptera. But there's been a lot of work done now barcoding insects in the Barcode of Life Project at the University of Guelph. And now we're absolutely convinced that Diptera are much more diverse than both the Hymenoptera and the beetles.
Art Woods 12:00
And how many fold more diverse are flies than wasps, do you think?
Michael Sharkey 12:05
Yeah, again, we're still guessing on that, but I would say it's like, if flies were 100, Hymenoptera would be 60 or 70, and beetles would be around 50. So, Paul Hebert at the University of Guelph at the Center for Biodiversity Genomics, I think it's called, he did a prediction that one family, the Cecid gall flies, has 2 million species. This was published about 10 years ago. That's just one family. There's hundreds of families of Diptera, just amazingly diverse.
Cameron Ghalambor 12:41
So one thing that really fascinates me about that is it seems that some of these very speciose groups are also parasitoids. So either in terms of like attacking galls or attacking other insect host species. Is that a fair generalization? Is there something about the parasitoid, parasite lifestyle that sort of favors this kind of hyper diversity?
Michael Sharkey 13:09
You could say that. It depends on how you consider gall flies, for example, Cecid flies, they're mostly gall makers, but they'd be more like a parasite, really, they don't kill the tree, they just live on the tree. But in the case of the Hymenoptera, other than the ants, it's the parasitoids that are the most diverse, so there must be something going on there. The other thing that seems linked to diversity is the haplo-diploid lifestyle where females are diploid and males are haploid. We're not sure, at least I'm not sure, what kind of advantage this gives them. But some of the most diverse groups of flies, wasps, and some other groups are haplo-diploid. So there's certainly a correlation there.
Art Woods 13:56
Let me float an idea that kind of follows up on Cam's question about diversity and maybe the intimacy of these parasitic or parasitoid lifestyles. So does that drive diversification because it also requires sort of extreme specialization to exist within the tissues of another organism, you know, either a plant or an animal? And something about that requires extreme specialization in a way that drives macro diversification of clades.
Michael Sharkey 14:28
Well, that would make sense. If you look at the parasitoid lifestyle, they're very host specific, usually with one or just a very small handful of species. So the correlation is certainly there. So you have a tremendous diversity of beetles and moths and butterflies and flies. And so you have this enormous diversity of potential hosts, it lends itself to tremendous species diversity in the parasitoids. We do have generalist parasitoids in the Hymenoptera. And we find that those generalists are not very species rich. So that also lends credence to your thought.
Cameron Ghalambor 15:15
So if we had to put a number, a ballpark estimate, how many different insect species do you think there actually are right now? If we just took everything together and combine them and have to, you know, come up with some estimate for a total number do we have such estimates?
Michael Sharkey 15:33
We have lots of estimates, actually.
Art Woods 15:38
Everybody has one, even I do.
Michael Sharkey 15:42
That's right. Yeah. So you know, typically, numbers like 6 to 10 million are published. And now if you're looking at the literature, most people are saying it's 10 million. But right now, there's a group of us working on insects, based in Costa Rica, where there's been a tremendous amount of research done by Dan Janzen and his wife Winnie. And they've been Malaise trapping insects and rearing Lepidoptera, rearing caterpillars, for 40 years, and have barcoded literally millions of specimens. And so we now have a fairly good handle on how many insects are in Costa Rica, at least in the far northwestern corner of Costa Rica, where Dan and Winnie work.
Michael Sharkey 16:40
So I've been crunching numbers from their Malaise traps and rearings. And the number that I come up with is 30 million species of insects, which would then mean something like 50 million species of terrestrial arthropods. And this is even larger than some of the biggest numbers that have been proposed. And I'm quite convinced, and a lot of people that have worked with Dan Janzen and the tremendous number of species that he's barcoded in Costa Rica are in accord with me.
Art Woods 17:18
To be clear here, are you suggesting 30 million species in Costa Rica? Or are you using the Costa Rican work to extrapolate to the world?
Michael Sharkey 17:25
Yes, looking at Costa Rican work to extrapolate to the world.
Art Woods 17:29
Maybe walk us through that extrapolation process, cause that's really interesting. So how do you study the diversity in one place like that, and then extrapolate to much bigger geographic regions?
Michael Sharkey 17:39
This is still a work in progress, but what we're doing is looking at the number of species of a very species-rich group of Braconids, this is the Microgastrinae, and they attack caterpillars. So in Dan's little corner of Costa Rica, there are about 1400 species of these Microgastrinae Braconids. He's reared a thousand of these, and he's Malaise-trapped a thousand of these, more or less. We looked at the overlap between the ones that Dan has reared versus the ones that we have captured in Malaise traps, and they hardly overlap at all. They overlap about 20%.
So using that, we can predict how many are yet to be reared. And we get a number of something like 4000 species for Dan's little area of Guanacaste in Costa Rica. Now Guanacaste has about 60% of the butterflies and moths and trees of Costa Rica. So we can then extrapolate by using that ratio, presuming it's the same for microgastrines. And Costa Rica has about 4% of the trees and butterflies and moths of the world, these groups that are well studied, and so we can use that number to extrapolate to the world for microgastrines. Then, in multiple parts of the world, we've looked at what proportion the microgastrines make up compared to all other insects, and using that number we can predict the number of insects in the world.
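Editor's note: the chain of ratios Michael describes can be written out as a short calculation. This is only a sketch of the arithmetic, not the group's actual analysis. The 4,000, 60%, 4%, and 30 million figures are the approximate numbers quoted in the conversation; the function name `extrapolate_up` is made up for illustration. The episode does not state what fraction of all insects the microgastrines make up, so the sketch works backwards from the 30 million figure to show what fraction that estimate would imply.

```python
def extrapolate_up(local_count: float, *shares: float) -> float:
    """Scale a local species count up through successively larger regions.

    Each share is the fraction of the next-larger region's diversity
    that the current region is assumed to hold.
    """
    estimate = local_count
    for share in shares:
        if not 0 < share <= 1:
            raise ValueError("each share must be in (0, 1]")
        estimate /= share
    return estimate


# Approximate figures quoted in the conversation.
guanacaste_microgastrines = 4000  # predicted species in the Guanacaste study area
guanacaste_share_of_cr = 0.60     # Guanacaste holds ~60% of Costa Rica's species
cr_share_of_world = 0.04          # Costa Rica holds ~4% of the world's species

world_microgastrines = extrapolate_up(
    guanacaste_microgastrines, guanacaste_share_of_cr, cr_share_of_world
)

# Assumption: work backwards from the quoted 30 million insect species to the
# microgastrine share that figure would imply (this share is not in the episode).
world_insects_quoted = 30_000_000
implied_fraction = world_microgastrines / world_insects_quoted

print(f"World microgastrines: about {world_microgastrines:,.0f}")
print(f"Implied microgastrine share of all insects: {implied_fraction:.2%}")
```

With these inputs the chain gives roughly 167,000 microgastrine species worldwide, which would correspond to a bit over half a percent of all insect species if the 30 million total holds. Every step assumes the ratio measured in well-studied groups (trees, butterflies, moths) carries over to the wasps.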
Cameron Ghalambor 19:37
This might be a good place to sort of transition and talk a little bit about the sort of traditional ways that taxonomists describe species, based primarily on either morphology or genetic differences. And can you maybe just give us a little brief overview of this kind of traditional process of identifying what a species actually is?
Michael Sharkey 20:02
Well, it really differs from group to group. So we're talking now about how taxonomists that work on various species-rich groups go about doing their taxonomy. And when I started doing this, it was entirely based on morphological evidence. And unfortunately, in the braconids, we don't have a really nice tool like male genitalia, which act as very good species delimiters in other groups, like Lepidoptera and Diptera, in moths and butterflies and in flies. We don't have a very nice character suite like that to see what's different between different species. What we used to do was go ask various museums across the world to send us all of their specimens in genus X. And we'd put those all in our collection and look at them one by one and group them, the yellow ones here, and the blue ones there, and the big ones there, and the small ones there, and start to write a key of some sort that would help to sort them out. We just basically visually tried to separate all those that are different. So when it came down to specimens that look sort of similar, we would simply make a guess and say, "Oh, well, ok I'll split them right here." And other people would lump them there. So it was very much guesswork. In the case of the Ichneumonoidea, the superfamily that the braconids and ichneumonids make up, the error rate there was over 50%.
Art Woods 21:45
And by error rate you mean, what?
Michael Sharkey 21:48
Oh by error rate I mean- this is now in retrospect, we have more molecular data that allows us to test our old species concepts, and we now know that what we used to do was lump species together, that's one sort of error. The other sort of error is splitting species up so that you called the same species several different names. And then there are cases where you both lump and split, so the species is all over the place.
Michael Sharkey 22:14
So actually, this is what turned me on to using molecular data. I did a big revision for my PhD thesis. This took me eight years to actually publish. Three years to do my PhD, but I needed to travel to all the major museums in the United States and Europe in order to look at type specimens to compare my species concepts with the type specimens. So it was a very expensive and time consuming production. So then, I don't know, 10 years ago I did a revision of the same group that I did my PhD on but used molecular data. And I discovered how horrible my species concepts were. My error rate there was 50 percent.
Art Woods 23:04
Must have been a shock.
Michael Sharkey 23:06
I think I was depressed for about two weeks. And then I thought: "Well, eureka!" you know. And it changed my whole mind. I mean, it so thoroughly convinced me that the molecular data are far richer and more precise than the morphological data.
Art Woods 23:23
So maybe for the listener, can you describe the molecular approach? So what do you do? What sorts of sequences do you get? And how do you use them to determine whether you've got good species or not?
Michael Sharkey 23:35
Well, the molecular approach is basically using one gene. It's called cytochrome oxidase one. And it's a mitochondrial gene, and we use a portion of that for barcoding. So we take a leg off of an insect, typically, and obtain cytochrome oxidase one, COI, I'll call it, and we get a readout of its 658 base pairs. And there are four different nucleotides that can be there, so there's an enormous amount of variation. And it turns out that in most groups of insects, and animals for that matter, the barcoding region of COI is pretty species specific. There's a little bit of variation, but there are big gaps between species and very little variation within species. So they're good markers for species. We use the term barcoding because this long set of nucleotides is basically like a barcode you'd see in a supermarket. This is something that Paul Hebert, who invented the idea, came up with.
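Editor's note: the "big gaps between species and very little variation within species" idea can be illustrated with a small calculation. The sketch below is a toy, not the method used in the paper. It assumes sequences are already aligned to the same length, uses the simple proportion of differing sites as the distance, and runs on made-up 20-base strings rather than real 658-base COI reads. The function names and species labels are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(specimens: dict[str, list[str]]) -> tuple[float, float]:
    """Return (largest within-species distance, smallest between-species distance)."""
    within = [
        p_distance(a, b)
        for seqs in specimens.values()
        for a, b in combinations(seqs, 2)
    ]
    between = [
        p_distance(a, b)
        for seqs1, seqs2 in combinations(specimens.values(), 2)
        for a in seqs1
        for b in seqs2
    ]
    return max(within, default=0.0), min(between, default=float("inf"))


# Made-up toy sequences (20 bases each), two specimens per hypothetical species.
toy_specimens = {
    "species_A": ["ATGCATGCATGCATGCATGC", "ATGCATGCATGCATGCATGA"],
    "species_B": ["ATGGTTGCAAGCTTGCATCC", "ATGGTTGCAAGCTTGCATCT"],
}

max_within, min_between = barcode_gap(toy_specimens)
print(f"largest within-species distance:   {max_within:.2f}")
print(f"smallest between-species distance: {min_between:.2f}")
```

When the smallest between-species distance is clearly larger than the largest within-species distance, as in this toy data, the barcode separates the groups cleanly. The hard cases Michael mentions later are the ones where those two numbers overlap.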
Cameron Ghalambor 24:46
So I guess I'm kind of curious, a little bit about COI. So cytochrome oxidase one is a mitochondrial gene. And, you know, so my understanding of like, what makes a good marker for delineating species is that it should be neutral, so it should evolve under sort of neutral processes. And it should be conservative enough that it's not changing too fast, but it should have enough variation that it sort of falls in this like, sweet spot where, you know, it becomes very species specific. But then it's also tied up within this very important organelle, the mitochondrion, which is presumably under pretty strong natural selection for, you know, maintaining various metabolic associated processes. So does COI sort of check all the boxes, as far as we know, or are there reasons why, you know, other markers potentially might be better down the line in retrospect?
Michael Sharkey 25:52
Well, COI is the best one that we've discovered to this point. And it works for the vast majority of animal life forms. There are some groups that it doesn't work for, and then other genes have been searched for to work for those particular groups. But yeah, COI checks all the boxes in most groups. And what convinced me of this, and it's convinced other people too, is working with Dan Janzen's rearing material. So, Dan sends me 1000 specimens from his rearings, all of which have been barcoded.
Michael Sharkey 26:30
And what I typically do then is I look at the barcodes and separate the specimens based on their barcodes, put them into separate little containers, little trays inside my collection, and then I look at them for morphological conformity, you know. So COI is telling me that's one species, and I look at them to see if morphologically they look like one species. And if they don't, I kind of separate them inside that box and look again. And then I look at the biological data. Now, that's an interesting thing that we have with Dan's reared material, is that they're reared from different caterpillars. And all those caterpillars have been barcoded and identified to the species level. So we can then see if, okay, well, in that particular box that I thought might be one species, there are in fact three different host species of caterpillars. So let's have a look at them that way. And then I look at it all together, the COI data, the rearing data, and the morphology, and determine whether or not they're species. And in my particular group, the COI is over 90% accurate in determining the species. And rarely the morphology combined with the biology will upset that.
Cameron Ghalambor 27:54
And so collectively, that is what we refer to as integrative taxonomy, when you're using all of those kinds of lines of information?
Michael Sharkey 28:02
Yes, that's the big hot word now: "integrative taxonomy."
Art Woods 28:06
You know, when you talk about COI having 658 base pairs, I think immediately, wow, it seems like it's become so much easier now to get a lot of sequence data, that why not, why not use a broader swath of the mitochondrial genome? Why not also include nuclear genetic information? If you include nuclear genetic information, does it conflict with the mitochondrial information? Or would it increase your resolving power?
Michael Sharkey 28:34
Well, I mean, if we could afford it, we’d do entire genomes for all the specimens. I mean, that's the in fact-
Art Woods 28:41
Right, right. There's obviously a cost here.
Michael Sharkey 28:44
That's the only thing I think, eventually, that's what we'll be doing. I expect it will be very cheap to get whole genomes in a decade or two, maybe a much shorter time period than that. If you want to get any sort of handle on the 30 million species of insects that are extant, before 30, or 40% of them become extinct, we better do it fast and cheap. And so that's why we've settled only on COI. But there are some groups where COI is not as accurate. And in those cases, people are now looking at multiple genes. But typically, when you look at those publications, they're not describing 400 species of insects, they're describing 10 or 12, and spending a year to get that done. So it's not efficient, if you really want to describe the life that is yet to be described on earth.
Cameron Ghalambor 29:42
You know, when you use whole genome sequences, you'd have a lot of information, and dealing with that much information, especially if you have different genes telling you potentially different patterns of relatedness among the different groups, can get a bit more complicated. And maybe it's based on more assumptions. Whereas, I guess, with COI, you basically have a library of existing barcodes, and you generate one for your specimen, and then you see whether it matches something that's already in the library, or if it's something unique. That's a fairly straightforward process, whereas, you know, even low-coverage whole genome sequencing can be computationally pretty intense.
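Editor's note: the library lookup Cam describes can be sketched in a few lines. This is a simplified illustration, not how any real barcode database works internally. It assumes pre-aligned sequences of equal length, a plain proportion-of-differences distance, and a cutoff (`max_distance`) chosen by the user; the cutoff value, function names, and toy sequences below are all hypothetical.

```python
from typing import Optional


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(query: str, library: dict[str, str], max_distance: float) -> Optional[str]:
    """Return the name of the closest reference barcode.

    Returns None when nothing in the library is within max_distance,
    which flags the specimen as a candidate for something new.
    """
    best_name, best_dist = None, float("inf")
    for name, reference in library.items():
        dist = p_distance(query, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Made-up toy reference library (20 bases each; real barcodes are ~658 bases).
toy_library = {
    "species_A": "ATGCATGCATGCATGCATGC",
    "species_B": "ATGGTTGCAAGCTTGCATCC",
}

known = identify("ATGCATGCATGCATGCATGA", toy_library, max_distance=0.1)
unknown = identify("T" * 20, toy_library, max_distance=0.1)
print(known)    # closest match within the cutoff
print(unknown)  # None: nothing close enough in the library
```

The whole decision rests on the cutoff, which is exactly where the borderline cases discussed in the conversation come from.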
Michael Sharkey 30:32
Yeah, absolutely. We've noticed that with phylogenomics. You know, we're getting whole genomes now for organisms. And we can't quite use those to determine phylogenies. What people do is they pick and choose, you know, dozens or hundreds of genes out of the whole genome to do their phylogenetic work, rather than the whole genome itself. And so, you know, there's, obviously, some shortcomings to just using COI. And you're gonna get a handful of cases in any sort of taxonomic approach using COI where you're gonna get COI sequences between the two groups of potential species that are not quite differentiated, and morphologically, you won't be able to tell, and you're gonna have to sort of almost guess whether these are two different species or one species.
So we have a choice to make, we can either dive into that particular problem and solve that one species problem, spending a lot of time and effort and money. Or we can do massive amounts of work and get most of the species resolved with just certain problems. So it's a matter of what one wants to do. Does one want to describe the bulk of life on Earth in a short period of time? Or does one want to solve these smaller species problems first and go piece by piece, and take what I estimate to be 4000 years to finish the job?
Art Woods 32:19
That's a long time.
Cameron Ghalambor 32:20
So what I was going to ask about is the distinction between taxonomy and phylogenetics. And maybe this is just because I'm also confused on the distinction. But my understanding is that taxonomy is primarily about delimiting what are different species, whereas phylogenetics is actually more interested in what the relationships are, and trying to, for example, uncover the most recent common ancestor of different groups. Is that a correct distinction between the two fields?
Michael Sharkey 33:00
Well, yeah, that's fairly accurate. I might just add to that, though, that taxonomy is also not just about differentiating species and delimiting species. It's also about how to cut up the pie. For example, in the groups that I work with, say I'm working with a particular genus, and I see that there are three or four distinct groups within that genus. And by distinct, I mean monophyletic. And by monophyletic, I mean they share a unique common ancestor. So I have the option then of breaking that genus up into four new genera, or one old genus and three new ones, or I can keep it all as one genus. So making these higher level decisions is also basically a thing of taxonomy. And in this aspect of taxonomy, these higher level taxonomic jobs are dependent on phylogeny. But they're still taxonomy decisions.
Art Woods 34:03
Yeah. Yeah, it feels like there's some kind of a large area of overlap there between taxonomy and phylogeny. And that, you know, in a significant way they're aimed at similar ends.
Michael Sharkey 34:16
They are. I think that phylogeny is basically the evidence that one uses to make higher taxonomic decisions.
Art Woods 34:36
Okay, so, Michael, you've alluded to some of the issues that I think we're about to discuss. But you published a really interesting paper in 2021 in the journal ZooKeys. And we became aware of you and your work by reading about this in this really interesting article by Brooke Jarvis in Wired Magazine, that sort of laid out, you know, what that paper was about and the ensuing controversy, which was relatively intense. And let's just talk about what you did in that paper. And I'll just say you identified around 400 species of braconid wasps from Costa Rica. But you did it in a way that seemed to set off lots of resistance, if not even vitriol, in the taxonomy community. So what did you do and why was it so controversial?
Michael Sharkey 35:27
Well, typically, what descriptive taxonomy like that article does is it describes species morphologically, and includes an identification key. So an identification key would be something like several different couplets. The first one would be over six millimeters in length, or less than six millimeters in length. And that would take you to two different places on the key-
Art Woods 35:54
The so-called "dichotomous key."
Michael Sharkey 35:56
Which it can have trichotomies in it, but that's still a dichotomous key for some reason. So everyone does that if they're going to do descriptions. And the second thing they do is a little morphological diagnosis: this species can be separated from all other species in this genus, in this country, with the following combination of characters, and they list a short list of five or six characters that allow you to distinguish it from other species. And then that's followed by often a full page or two pages of details on the number of bristles that are on the prothorax and the sculpture of the propodeum. And the-
Art Woods 36:38
So this is a very traditional taxonomic approach.
Michael Sharkey 36:41
This is the old taxonomic approach. Instead of doing that, what I did was, I looked at the COI sequences for each species, and presented a consensus sequence for all the specimens that I thought belonged to that species. And then I took a picture, an image of the holotype for that species. And that was pretty well it. I just left it at that. Now, people can identify those species, you know, if they have specimens in any of the genera that I dealt with. They can simply obtain COI and compare it, more or less BLAST it, on BOLD, which is the database at the Center for Biodiversity Genomics. And they'll get an automatic identification if they have a specimen of a species that I described. So it's extremely efficient in terms of being able to identify specimens.
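To make the idea of a consensus sequence concrete, here is a minimal sketch, assuming the specimen barcodes are already aligned to the same length. The function name, the toy six-base sequences, and the choice to mark ties as "N" are illustrative assumptions, not the procedure used in the 2021 paper.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Hypothetical helper for illustration only.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    out = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        top_base, top_count = counts.most_common(1)[0]
        # If two or more bases tie for most common, mark the site as ambiguous.
        tied = [base for base, c in counts.items() if c == top_count]
        out.append(top_base if len(tied) == 1 else "N")
    return "".join(out)


# Toy stand-ins for COI barcodes from four specimens of one putative species.
specimens = ["ATGGCA", "ATGGCA", "ATGACA", "ATGGTA"]
print(consensus(specimens))
```

Real COI barcodes are several hundred bases long and would need to be aligned first; the sketch only shows the column-by-column majority vote.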
Michael Sharkey 37:47
Now, if it were the other way around, and you did a morphological key, you would never know, if you arrive at a particular endpoint in the key, if that's your species or not without carefully reading the description. And even then you wouldn't be sure, because you only have a small number of the species in a particular genus. You would have no idea. So that's the problem. People in these various species-rich groups never know if they're arriving at the correct answer.
Art Woods 38:20
Okay, so this sounds really reasonable to me. And it sounds like the kind of high throughput method that you need for groups that are speciose, and for groups that you fear may be losing biodiversity, and so we need to know now. Why was there so much outrage about this paper and this approach?
Michael Sharkey 38:38
I don't know.
Art Woods 38:41
Maybe let's just talk about some of the, you know, the complaints. So there were a number of essays that were written and published in different venues by different groups of taxonomists that were sort of decrying this idea of identifying a bunch of new species based on a, you know, a consensus sequence and a photograph. Let me just read you a quote from Zamani et al. They say: "Overall, Sharkey, Baker et al. and Sharkey, Brown et al. are asking taxonomists to abandon their own scientific and intellectual goals, because the requirement of research, scholarship, and deep thought is inconvenient. While we do believe that barcode clusters are indeed useful as grouping statements, there is no compelling reason why they should be described as species. Quite the opposite. We present compelling reasons for not doing so." I mean, to me, that's like pretty strong academic language. And you can just sort of feel the weight of their emotions about this. And that just surprises me because your approach seems so, so reasonable.
Michael Sharkey 39:40
Yeah, there has been an emotional response. There's no question about that. So why do people have emotional responses? People have emotional responses because they feel threatened. And I think that it's very difficult for people to switch paradigms, to switch, you know, decades of their own research and work, and admit to themselves that what they've been doing is not the right way to do things. So that's, I think, the source of the problem. I read something not too long ago about Einstein when he presented his E = mc squared theory. I think there were 100 physicists who wrote a paper in a major journal, calling him out for being, you know, just horribly wrong. And he was right, of course. And I think the point of that is that whenever there's a paradigm switch, or a potential paradigm switch, since we're not sure how this molecular thing is going to go, people feel threatened.
Cameron Ghalambor 40:48
But at the same time, it sounds like you may have anticipated that some of these critiques were coming because I noticed in the introduction to your 2021 paper, you've included in the introduction, some of the sorts of critiques and explicitly list them and address them, which I thought was really unusual. Maybe because I don't read taxonomy literature as much as I should, but I found it unusual to see in the introduction of a paper, sort of the anticipation of the criticisms that were coming in and addressing them. So is that fair? Did you anticipate that there was going to be some pushback?
Michael Sharkey 41:28
Well, there was a previous publication, a graduate student of mine, Sarah Meierotto published a paper using the same methodology in 2019. But because she described something like 19 or 20 species, it wasn't very threatening, and there wasn't much of a to-do about it. But enough of a backlash that I was familiar with what would come after this big paper was published, so that's why I was a little bit prepared. And the other thing was that I had sent that paper to several different journals in various formats, or at least one other journal. And it was vehemently rejected. So that also got me into a point of being a little bit defensive in the introduction.
Art Woods 42:19
So in some of the response papers that I read to your 2021 paper, and I guess also the 2019 one, I read the phrase quite often, quote unquote, "taxonomic impediment." What does that mean?
Michael Sharkey 42:33
Well, it means two things, really. The fact that we do not have a methodology to deal with the tremendous diversity of species that are yet to be described. So that's the first impediment is well, what's preventing us from doing that what's impeding us from doing that? And, you know, typically taxonomists will say, well, we don't have enough money, that's what's impeding us or we don't have enough taxonomists, that's what's impeding us, or we can't collect in these countries that's impeding us. Those are the standard reasons for that part of the impediment. The other impediment is, if you want to describe a pile of new species, you have to look at type specimens in various museums, and to visit those museums and look at the specimens and compare them to yours is extremely expensive and time consuming. So the taxonomic work that has already been completed, is in fact, impeding progress that we're trying to make today. So there's sort of two different angles to impediment.
Art Woods 43:42
You know, one of the other complaints about the barcoding approach to taxonomy is that it's elitist, right? So it requires access to machines and money that, you know, maybe many scientists in many parts of the world where biodiversity is the greatest don't have access to. But it almost feels like what you just said is that there's a kind of elitist angle to doing traditional taxonomy in the sense that it requires lots of travel and time and money to go see all the type specimens.
Michael Sharkey 44:13
It is, and I don't get that at all. Everyone in taxonomy is now telling us that we have to do the integrated approach, which means using morphology, and typically multiple genes, to delimit species. This is the standard now, the new standard. And of course, that's even much more expensive than a strictly COI molecular approach. The other thing is that there's a new tool now put out by Oxford Nanopore, called MinION technology. And this costs about $1,000, and a bit more for the ingredients that you need. And with that you can barcode tens of thousands of specimens.
Cameron Ghalambor 45:03
Yeah. So, and I may have missed this in some of the critiques and the back and forth with the controversy regarding your 2021 paper. But one thing that, you know, I had sort of pounded into my head when I was looking at phylogenetic trees was that these trees represent hypotheses for what we think are the relationships among these different taxa. And isn't it the same for taxonomy, to say that, you know, we are using these DNA barcodes, and we're putting forth a hypothesis that these are actually different species. And if people want to, you know, pursue this further, and test whether this is the right hypothesis or not, you know, they're more than welcome to do so. But as a sort of a first step in, you know, proposing potential differences based on some evidence, in this case COI and a photograph, that seems fairly reasonable to me, if it's framed in the context of, you know, some uncertainty, that we're hypothesizing these are differences as opposed to saying, you know, we are 100% confident that these are, you know, true species. Is that a fair way of depicting kind of the approach?
Michael Sharkey 46:30
Oh, Cameron, you're generalizing my research much better than I am. So appreciate that. I'm sure you're making it very clear to listeners. So that is very accurate. And I just wrote and published a paper about a month ago on Costa Rican ichneumonids. And I said exactly that in the introduction to that paper. I talked about several species, where I thought the COI data and rearing data and morphological data were not convincing. I said, just as in any science, we have some strong hypotheses and some weak hypotheses. And I pointed out some of the weak ones for future investigation.
Cameron Ghalambor 47:21
Oh, good. Yeah. That, to me, it just seems very reasonable. But I'm not a taxonomist. And I know taxonomy and phylogenetics and systematics have sort of a long history of, you know, people being quite ornery and having lots of disagreements.
Michael Sharkey 47:40
Yeah, we do have the reputation, don't we? Yeah. And why is that do you think?
Cameron Ghalambor 47:46
Well, my feeling, not being a taxonomist, is that, you know, when I look at, let's say, for example, systematics, it feels like it's very methodologically oriented. And so different groups gravitate towards different methodologies, you know, maximum likelihood versus Bayesian, and they have strong reasons, you know, why they choose one or the other. And so then the disagreements often arise because, you know, you're using one approach, and I'm using a different approach, and then these camps form. And then I think what happens is that to acknowledge that somebody is using a different method than you suggests, maybe, that what you're doing could be wrong. And that may be very threatening, and that might lead to a lot of the acrimony between them. But as an outsider, it's like, well, shouldn't you just use both approaches and see if they tell you the same thing, and if they're different, you know, why are they different? I've never really fully understood it. I've just kind of been fascinated.
Michael Sharkey 48:55
Yeah, well, it gets to be quite complicated. And if you're going to do parsimony and maximum likelihood and every other possible thing, then you have thousands of different ideas, and it just becomes super complicated. You said both, but you could do very intense morphological data and add that to the COI approach the barcoding approach, but it would take forever. So my personal feeling is taxonomy, since the time of Linnaeus, was based on just a typological approach, and experts in the field would say this is a genus. And another expert would say this is a genus. And the third expert would say this is a genus. And they all disagree with each other. And there's no empirical test to decide from the three, which one is the best. So it's basically driven by how popular the, or how important the taxonomist was. So this, I think, set the stage for acrimony in taxonomy from the get go. And we're just carrying on the tradition.
Cameron Ghalambor 50:05
But in a lot of ways, then, the COI approach is also kind of very egalitarian, maybe, in that way, because, you know, anybody can go into the field, collect some specimens, barcode them and find differences, and not be an expert. Is that threatening to traditional taxonomy?
Michael Sharkey 50:27
I don't know. Nobody's doing that yet. Yeah, we don't really see that quite yet. But I can see that coming. So I worked at the Canadian National Collection, and people from all over the country and all over the world, to some extent, would send specimens to me of braconids. And I would identify them. And I'm sure I did a horrible job because I didn't have molecular techniques. But nonetheless, I did the best I could. And now these very same clients could simply obtain the COI very inexpensively, and, you know, put their sequence on BOLD and search, it'd just take seconds, and get an identification as close as anybody could possibly get one. So it would certainly tell them what family it is, it would almost certainly tell them what genus it is. And if there were species data in the BOLD database, then they would get a species name with some sort of idea of how likely that species name is to be an accurate one. So yeah, it's gotta go that way.
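The lookup described above, a query barcode compared against a reference library and returned with a name plus some indication of how close the match is, can be sketched roughly as follows. This is a toy illustration only: the function names, species labels, and ten-base sequences are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and nothing here reflects how BOLD's own identification engine is actually implemented.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match(query, library):
    """Return (name, percent identity) of the closest reference sequence.

    `library` is a hypothetical dict mapping a species name to its
    reference barcode.
    """
    name, ref = max(library.items(),
                    key=lambda item: percent_identity(query, item[1]))
    return name, percent_identity(query, ref)


# Made-up reference barcodes and query, far shorter than real COI barcodes.
library = {
    "species_A": "ATGGCATTAC",
    "species_B": "ATGACGTTTC",
}
query = "ATGGCATTTC"

name, identity = best_match(query, library)
print(name, identity)
```

The similarity score is what lets a user judge how much to trust the returned name: a near-perfect match suggests the same species, while a weaker one may only support a genus or family placement.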
Art Woods 51:43
I'd like to ask some maybe even more forward looking questions and sort of build on that. And, you know, it feels to me like there's just two super major problems here, right? One is that we know that there's a lot of species out there that we haven't identified, and we need to do that. And the other is that, you know, we're in the Anthropocene with lots of ecological changes stemming from climate change, and changes in land use and that sort of thing. And so we know that a lot of biodiversity out there is at risk of disappearing soon, and so it's really pressing to do this now. And if you had to envision, like, the highest of high throughput ways to approach the species identification and taxonomy problem, what would it be? You know, if you were given a bunch of money at NSF to spearhead this, what would you do?
Michael Sharkey 52:33
Well, it's being done now, more or less, here in California. In California, amazingly, the state government has given taxonomists in various institutions and universities in California $10 million over three years to inventory the insects of California. And that's not nearly enough to do a complete job, but it'll put a big dent in insect diversity in California. Various institutions, which are spread throughout California, set up teams to collect insects with Malaise traps, so you're catching thousands of insects. And they're also using sweep nets and, you know, various kinds of traps to catch insects, and sending these all to BOLD, and having them barcode the entire samples. And the result will be some several million specimens being identified to a barcode level identification. So that's the best that can be done at this point in time.
Art Woods 53:40
Okay, so it sounds like you're advocating sort of, you know, expanding the scope of this effort to do barcoding and using easy to get genetic information. Are there other crazier ideas on the horizon? That would involve other like non genetic approaches? Like I could imagine high throughput imaging that used AI systems to try to help taxonomists do rapid descriptions or see subtle differences among insects that are hard to pick up by the human eye. Is that in the offing?
Michael Sharkey 54:13
Yeah, there are several groups around the world that are working on this right now. There's a group in Germany that's doing it. And there's another group in London, Ontario, I believe, that's doing the same thing, and there are probably others that I'm not aware of. And so what they're doing is, at the same time that they're barcoding, they're also imaging the specimens and then trying to use AI to be able to identify the insects automatically. But the problem is, the first thing that they have to do is decide on species limits. And then they have to teach the AI system: okay, these are the limits of the species. But that, of course, would be wonderful.
Michael Sharkey 54:59
The other thing they're doing is environmental DNA and metabarcoding. So people are collecting samples in the air of all kinds of insect parts that are floating around, and then barcoding the entire sample, whatever is caught in the filter. And so with a good library, in the future, we'll be able to inventory insects rapidly using techniques like that.
Michael Sharkey 55:27
Or the other thing they're doing is metabarcoding Malaise trap samples. So you take the Malaise trap sample, and instead of separating each specimen, putting it in a different well for molecular analysis, you grind up, this is simplifying, but you grind up the whole shooting match, and barcode everything that's in it. So you'll never know, you'll never be able to associate necessarily the specimen with the barcode, but you'll know all the species that were caught in that Malaise trap. And this will allow us to very quickly differentiate different ecosystems or, you know, ecological areas. These are things that will become more prevalent. Yeah.
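As a rough sketch of how bulk barcodes from two trap samples could be turned into species lists and compared, consider the toy example below. It assumes each read matches a reference barcode exactly, which real metabarcoding pipelines do not require, and all names, sequences, and the choice of Jaccard similarity as the comparison measure are illustrative assumptions rather than anything described in the conversation.

```python
def species_detected(reads, library):
    """Set of species whose reference barcode appears among the reads.

    `library` is a hypothetical dict mapping a barcode sequence to a
    species name. Reads with no exact match are ignored.
    """
    return {library[read] for read in reads if read in library}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up reference library and bulk reads from two trap samples.
library = {"ATGGCA": "sp1", "ATGACA": "sp2", "TTGGCA": "sp3"}
trap_forest = ["ATGGCA", "ATGGCA", "ATGACA", "CCCCCC"]
trap_meadow = ["ATGACA", "TTGGCA"]

forest = species_detected(trap_forest, library)
meadow = species_detected(trap_meadow, library)
print(sorted(forest), sorted(meadow), jaccard(forest, meadow))
```

Note that the result is a species list per sample, not a link between any individual specimen and its barcode, which mirrors the trade-off described above.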
Cameron Ghalambor 56:07
So looking also, I guess, more to the future. I mean, you know, I think we haven't talked about it enough. We've touched on it. Art mentioned, you know, being in the age of the Anthropocene, and you've mentioned, you know, the pressing need for quantifying all of this biodiversity that has yet to be described. I'm curious, as you look to the future, do you see kind of the next generation of taxonomists coming? Is it a field that is generating a lot of excitement among, you know, graduate students and people interested in going into this line of research? Or, you know, do you feel like it's a field that has kind of been underappreciated and maybe not as popular as other fields of biology?
Michael Sharkey 56:59
Well, even you, Cam, have probably read about articles where taxonomists are whining about how the field is degenerating. And we're losing taxonomists, and we're losing expertise. And we're not training the next generation. That's been something that I've heard my entire career. And there's some definite truth to that. And in the United States, I don't see any Renaissance coming at all.
But the good news is that in Europe, things are really coming along very well. So you have massive projects in Norway, and in Germany, and other countries as well, where tens of millions of dollars are being invested into barcoding and other molecular approaches, and AI for taxonomy. They seem to really be much more concerned about the loss of biodiversity and the impact of humans on the environment than we are here in the United States. I expect that this very phenomenon will infect the United States in the near future. And we can expect the same thing to happen here. You know, you need a few key people to cause a revolution, you know, at a bureaucratic and institutional level.
Art Woods 58:32
Well, it sounds like we need the California approach, but, you know, elsewhere. It'd be nice to get the money and the people and the institutions to activate on that.
Art Woods 58:41
Okay, well, thanks so much, Michael. That was a really fun conversation.
Cameron Ghalambor 58:44
Yeah, thanks. Thanks so much. We really enjoyed talking to you.
Michael Sharkey 58:48
Likewise, thanks for both of your insights.
Cameron Ghalambor 59:01
Thanks for listening. If you like what you hear, let us know via X, Facebook, Instagram, or leave a review wherever you get your podcasts. And if you don't, we'd love to know that too. Write us at info at bigbiology.org.
Art Woods 59:14
Thanks to Steve Lane, who manages the website and Molly Magid for producing the episode.
Cameron Ghalambor 59:19
Thanks also to Dayna De La Cruz for her amazing social media work. Keating Shahmehri produces our awesome cover art.
Art Woods 59:26
Thanks to the College of Public Health at the University of South Florida and the National Science Foundation for support.
Cameron Ghalambor 59:31
Music on the episode is from Podington Bear and Tieren Costello.