An overview of the number of unique categories and the number and proportion of expert-labeled samples in the BIOSCAN-1M Insect dataset at each taxonomic rank. The last column provides Barcode Index Number (BIN) information, a genetic alternative to taxonomic labels that serves as a species proxy. All samples have an associated BIN, and there are roughly 10× more unique BINs than species labels.
| | Phylum | Class | Order | Family | Subfamily | Tribe | Genus | Species | BIN |
|---|---|---|---|---|---|---|---|---|---|
| Categories | 1 | 1 | 16 | 491 | 760 | 535 | 3,441 | 8,355 | 90,918 |
| Labeled Samples | 1,128,313 | 1,128,313 | 1,128,313 | 1,112,968 | 265,492 | 60,477 | 254,096 | 84,397 | 1,128,313 |
| Labeled (%) | 100.0 | 100.0 | 100.0 | 98.6 | 23.5 | 5.4 | 22.5 | 7.5 | 100.0 |
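
The labeled percentages follow directly from the sample counts, taking the 1,128,313 samples labeled at the phylum rank as the dataset total. The following is a minimal sketch of that arithmetic in Python; the names `TOTAL_SAMPLES`, `labeled_counts`, and `labeled_percent` are illustrative and not part of any BIOSCAN-1M tooling.

```python
# Counts copied from the table; the dataset total is assumed to equal
# the number of samples labeled at the phylum rank.
TOTAL_SAMPLES = 1_128_313

labeled_counts = {
    "Phylum": 1_128_313,
    "Class": 1_128_313,
    "Order": 1_128_313,
    "Family": 1_112_968,
    "Subfamily": 265_492,
    "Tribe": 60_477,
    "Genus": 254_096,
    "Species": 84_397,
    "BIN": 1_128_313,
}


def labeled_percent(count: int, total: int = TOTAL_SAMPLES) -> float:
    """Return the share of labeled samples as a percentage, to one decimal."""
    return round(100.0 * count / total, 1)


# Percentage of labeled samples at each rank.
percentages = {rank: labeled_percent(n) for rank, n in labeled_counts.items()}

# Ratio of unique BINs to unique species labels (category counts from the table).
bin_to_species_ratio = 90_918 / 8_355

for rank, pct in percentages.items():
    print(f"{rank:<10} {pct:5.1f}%")
print(f"BINs per species label: {bin_to_species_ratio:.1f}")
```

The computed values reproduce the "Labeled (%)" entries, and the BIN-to-species ratio comes out slightly above 10, in line with the "roughly 10×" statement.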