Machine learning (ML) models are already powerful tools for identifying features, like species, in large biodiversity datasets. But while they’re advancing rapidly, there’s still a long way to go for complete accuracy. So, accurate and detailed annotation of vast biodiversity datasets requires a global network of ecosystem experts to quality control ML results and reveal the true state of nature.
Picture a forest. Sunlight dapples through diverse canopy layers. Towering trees with thick trunks stretch skywards, branches teeming with mosses and lichens. Below, the understory is abundant with a multitude of shrubs and small trees. Bird calls echo through the vegetation. On the forest floor the fallen leaves and branches are home to a thriving community of fungi and insects, as small mammals dart in the undergrowth.
The diversity found in that snapshot appears almost infinite: indeed, a rainforest like the Amazon is home to as many as 3 million species. Imagine a technology that could capture and analyse that forest snapshot, identifying every bird, plant, tree, fungus, and mammal, while delivering metrics on the diversity and abundance of different populations – and how this reflects ecosystem health and integrity – with all that data presented on clear and intuitive dashboards.
With advances in machine learning (ML) this ecologist’s dream may be a reality in a few years. Over the last decade these models have grown better and better at identifying species and habitat features in digital biodiversity data, and we deploy them every day at Pivotal to make our analytics faster, better, and more scalable.
However, while some of today’s ML models are good at identifying common species and features – the observations they’re most exposed to – they are currently much less reliable at identifying rare ones. As any ecologist will know, when it comes to measuring and tracking ecosystem health, it’s the rare species that really matter. Most of the species in an ecosystem or community are locally rare, and to identify these we need to combine ML models with expert human intelligence.
Rare species have a greater range of roles (functions) within an ecosystem than common species, and therefore contribute disproportionately highly to healthy ecosystem functioning, which in turn is critical to delivery of ecosystem services. Rare species are also often the most sensitive to environmental changes, making them good indicators of ecological shifts or degradation: the disappearance of rare species can be an effective early warning system for larger ecological problems.
To provide high-quality and accurate biodiversity data, we need to be able to identify a wide range of species across our datasets, including the rare ones. It’s vital we develop the ability to do this now, even while training machine learning models to reliably do more and more of the analysis over time. This is where our global community of world-class ecosystem experts comes in – already 250 people strong and growing fast.
A global network of ecosystem experts
Biodiversity is exquisitely location specific. Our planet is home to immense diversity – this is part of what makes nature so dynamic and beautiful. It also means we need to find the right expertise for each dataset, matched to its specific taxa and geography. An expert botanist who knows the plants of the Iberian Peninsula might know very little about the plants of the Americas. So, while our datasets are widely dispersed across nearly every continent, the expertise each one needs is highly localised.
Over the last year, we’ve built a network of rare expertise by recruiting from first-rate institutions around the world. Among them are professors, senior consultants, world renowned field guides and leading natural history experts. Collectively, their expertise spans the globe, covering a panoply of bird, bat, plant, tree and mammal species.
“Pivotal is building the largest database of accurate biodiversity data, consisting of georeferenced images that are timestamped, processed by data scientists and verified by biologists,” explains Joao Santos, one of our botanists from Portugal. “These databases are extremely important as they allow both the study of present organism distributions, but also enable us to build models based on temporal successions to observe changes in distribution, occurrence of invasive species, or the extinction of native species.”
Experts like Joao analyse datasets remotely through Pivotal’s online portal. This means they can continue their important work, research and teaching from wherever they’re based. Collaborating with a remote network also means we’re able to overcome the global shortfall in expertise on unique ecosystems, giving our customers access to specialised knowledge that provides accurate insights on the health and integrity of ecosystems.
So, how does it work?
Working remotely with precision: how our experts annotate biodiversity datasets on an unprecedented scale
The first step is sourcing vast amounts of data with our customers, working with a local community where possible. We use affordable and easily deployable digital tools like camera traps, drones, high-res cameras, acoustic sensors, and eDNA sampling, alongside remote sensing data.
Once the data is collected, our machine learning models efficiently sort, cluster and categorise features in the images, video and audio. Then our experts dig in remotely, using our proprietary online platform to analyse, annotate and verify species and habitat features in the data. It takes time: each dataset requires a unique set of expertise, and our experts go into micro detail. As a result, they help train and quality control our machine learning models to continually improve the accuracy and efficiency of analysing biodiversity data captured on the ground.
“I was able to remotely identify Trifolium vesiculosum, also known as Arrowleaf Clover,” explains Joao. “The plant is found in thick grassland and Montado systems in Portugal. It is a minute species, but important to identify, especially due to ecological and agricultural significance.”
But we know it’s not only ML models that make errors – humans do too – so we need measures to quality control the quality controllers. Firstly, we check every expert’s capability in a competency test, which they must pass before they work on a dataset. Then, to minimise bias, we always get multiple experts to quality control each set of machine learning outputs. They work independently of each other, with cross-checks where there’s any disagreement.
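To make that quality-control logic concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of Pivotal’s actual platform: the function name `reconcile`, the expert identifiers, and the rule that anything short of unanimous agreement triggers a cross-check are all hypothetical.

```python
from collections import Counter


def reconcile(labels_by_expert, passed_competency):
    """Combine independent expert labels for one ML output.

    labels_by_expert: dict mapping expert id -> species label (hypothetical shape).
    passed_competency: set of expert ids who passed the competency test.
    Returns (label, needs_cross_check).
    """
    # Only experts who passed the competency test may contribute.
    eligible = {
        expert: label
        for expert, label in labels_by_expert.items()
        if expert in passed_competency
    }
    # "Multiple experts" is taken here to mean at least two (an assumption).
    if len(eligible) < 2:
        raise ValueError("need at least two qualified, independent experts")

    counts = Counter(eligible.values())
    if len(counts) == 1:
        # Every eligible expert gave the same label: accept it.
        return next(iter(counts)), False
    # Any disagreement is escalated to a cross-check rather than outvoted.
    return None, True


if __name__ == "__main__":
    labels = {
        "expert_a": "Trifolium vesiculosum",
        "expert_b": "Trifolium vesiculosum",
    }
    print(reconcile(labels, {"expert_a", "expert_b"}))
```

Escalating every disagreement, rather than taking a majority vote, is one possible design choice; it keeps a confident minority view on a rare species from being silently overruled.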
Other experts have been able to use audio recordings to correctly annotate a migratory bird that mimics local bird species, something that only an expert in that biome would even know to listen out for. Being able to identify these nuances remotely is what makes our network so special.
Our testing showed there’s also a risk of leading the witness, so when our automated workflows send out data to our experts, it’s always sent ‘blind’ – meaning we always ask, “what is this?” not “is this species X?”.
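As a rough illustration of what sending data ‘blind’ could look like in an automated workflow, the sketch below strips the model’s own prediction from a record before it goes to an expert. The field names (`ml_label`, `ml_confidence`, `media_path`) and the function `make_blind_task` are hypothetical, chosen only for this example.

```python
# Fields that would reveal the model's guess to the expert (assumed names).
HIDDEN_FIELDS = {"ml_label", "ml_confidence"}


def make_blind_task(record):
    """Build an expert task that asks an open question about one record.

    The input record is left unchanged; the returned task omits every
    field that could lead the witness.
    """
    task = {key: value for key, value in record.items() if key not in HIDDEN_FIELDS}
    # Always an open question, never "is this species X?".
    task["prompt"] = "What is this?"
    return task


if __name__ == "__main__":
    record = {
        "clip_id": "clip_001",
        "media_path": "clip_001.wav",
        "ml_label": "Species X",
        "ml_confidence": 0.62,
    }
    print(make_blind_task(record))
```

The model’s prediction stays in the original record, so it can still be compared with the expert’s answer afterwards.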
The result is robustly vetted, ‘ground-truthed’ intelligence on the state of nature that’s affordable and scalable. This approach opens up a whole new way of working for some in our network:
“Pivotal provided a chance to contribute to science at a scale I had not had access to before in the public sector,” says Pablo de la Fuente Brun, one of our experts based in Spain. “Using technology and working remotely, I can identify species that inform the understanding of the state of nature and evidence real changes in biodiversity.”
Pushing the boundaries for biodiversity data analytics
A global network of experts working alongside the latest machine learning technology allows us to push the boundaries for the analysis of large biodiversity datasets. In the coming years, as we further integrate expert knowledge and advanced machine learning, the breadth and depth of our data will grow, and make ground-level outcome monitoring increasingly accessible and scalable.
In-depth insights on the health, integrity and functioning of ecosystems are critical for reversing nature decline and ensuring sustainable practices that foster biodiversity. What our network of ecosystem experts makes possible through remote annotation lays the groundwork for a deeper and more actionable understanding of our planet’s invaluable biodiversity.
If you’re interested in joining Pivotal’s network of world-class ecological experts, please contact us at info@pivotal.earth.
To discuss how we can help you secure real, auditable evidence of on-the-ground changes in nature, so you can make solid decisions and trustworthy claims, get in touch.