Meowmixforme/Squirrel-Red

Red Squirrel Population Prediction in Scotland

Project Overview

This project uses machine learning and spatial regression techniques to predict red squirrel populations across Scottish local authorities using mixed-source citizen science data. The system includes a two-stage duplicate detection algorithm and a Random Forest Regressor model to process and analyse data from the Scottish Squirrel Database.

Key Features

  • Two-Stage Duplicate Detection: Identifies and removes duplicate observations using both exact spatial matching and proximity-based detection that accounts for coordinate uncertainty
  • Spatial-Temporal Modelling: Predicts future squirrel populations using a Random Forest Regressor trained on spatial and temporal features
  • Interactive Visualisation: Streamlit web application with dynamic maps, charts, and prediction tools
  • Uncertainty Quantification: Provides confidence intervals for predictions to support conservation decision-making
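
The two-stage duplicate detection described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the record fields (`date`, `lat`, `lon`, `uncertainty_m`), the same-day requirement, and the rule that two sightings are proximity duplicates when their distance is no greater than the sum of their coordinate uncertainties are all assumptions made for the example.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6_371_000.0
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def remove_duplicates(records):
    """Return records with exact and proximity duplicates removed.

    Field names and the matching rule are hypothetical (see the text above).
    """
    # Stage 1: exact spatial matching on (date, lat, lon).
    seen = set()
    stage1 = []
    for rec in records:
        key = (rec["date"], rec["lat"], rec["lon"])
        if key not in seen:
            seen.add(key)
            stage1.append(rec)

    # Stage 2: proximity matching that allows for coordinate uncertainty.
    # A record is dropped if an already-kept record from the same date lies
    # within the sum of the two uncertainty radii.
    kept = []
    for rec in stage1:
        is_duplicate = any(
            rec["date"] == other["date"]
            and haversine_m(rec["lat"], rec["lon"], other["lat"], other["lon"])
            <= rec["uncertainty_m"] + other["uncertainty_m"]
            for other in kept
        )
        if not is_duplicate:
            kept.append(rec)
    return kept


# Made-up sample data, for illustration only.
sample = [
    {"date": "2019-05-01", "lat": 56.0, "lon": -3.5, "uncertainty_m": 100},
    {"date": "2019-05-01", "lat": 56.0, "lon": -3.5, "uncertainty_m": 100},     # exact duplicate
    {"date": "2019-05-01", "lat": 56.0005, "lon": -3.5, "uncertainty_m": 100},  # about 56 m away
    {"date": "2019-05-01", "lat": 56.1, "lon": -3.5, "uncertainty_m": 100},     # about 11 km away
    {"date": "2019-05-02", "lat": 56.0, "lon": -3.5, "uncertainty_m": 100},     # different date
]
unique = remove_duplicates(sample)
```

The pairwise comparison in stage 2 is quadratic in the number of records; for a dataset of this size a spatial index would likely be needed in practice.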

Technologies Used

  • Python: Primary programming language
  • MongoDB: Database storage for squirrel observation records
  • Streamlit: Web application framework for interactive visualisation
  • Scikit-learn: Implementation of Random Forest Regressor model
  • GeoPandas: Spatial data processing and analysis
  • Folium: Interactive mapping
  • Plotly: Statistical visualisations

Dataset

The system uses the Scottish Squirrel Database obtained via the NBN Atlas, containing over 93,000 squirrel sightings recorded between 1905 and 2021. The data is filtered and cleaned so that the analysis covers only red squirrel records in Scotland from 2010 to 2021.
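
As a rough illustration of the filtering step, the sketch below keeps only red squirrel records in the 2010 to 2021 window. The field names `species` and `year` are hypothetical and may not match the columns in the NBN Atlas export; *Sciurus vulgaris* is the scientific name of the red squirrel.

```python
RED_SQUIRREL = "Sciurus vulgaris"


def filter_red_squirrels(records, start_year=2010, end_year=2021):
    """Keep red squirrel records whose year falls in [start_year, end_year]."""
    return [
        rec
        for rec in records
        if rec["species"] == RED_SQUIRREL and start_year <= rec["year"] <= end_year
    ]


# Made-up sample rows, for illustration only.
rows = [
    {"species": "Sciurus vulgaris", "year": 2015},
    {"species": "Sciurus vulgaris", "year": 1998},      # outside the window
    {"species": "Sciurus carolinensis", "year": 2015},  # grey squirrel
]
filtered = filter_red_squirrels(rows)
```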

Results

  • Successfully identified and removed 5.36% of records as potential duplicates
  • Achieved an R² score of 0.849 during cross-validation and 0.79 on the test set
  • Identified current year density (50.8%) and previous year count (35.5%) as the most influential predictors
  • Created an intuitive application for exploring population trends and generating future predictions
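
To show how feature importances and prediction intervals of this kind can be obtained from a Random Forest, here is a minimal sketch using scikit-learn. It runs on synthetic data, and the two feature names simply mirror the predictors listed above. The interval is taken from the spread of per-tree predictions, which is one common heuristic and an assumption here; the project may compute its intervals differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data for illustration only; not the Scottish Squirrel Database.
rng = np.random.default_rng(0)
n_samples = 200
density = rng.uniform(0, 5, n_samples)
prev_count = rng.integers(0, 100, n_samples).astype(float)
X = np.column_stack([density, prev_count])
y = 10 * density + 0.5 * prev_count + rng.normal(0, 2, n_samples)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances; they sum to 1 across features.
feature_names = ["current_year_density", "previous_year_count"]
importances = dict(zip(feature_names, model.feature_importances_))


def predict_with_interval(forest, X_new, lower=2.5, upper=97.5):
    """Return the forest mean plus percentile bounds over individual trees."""
    per_tree = np.stack([tree.predict(X_new) for tree in forest.estimators_])
    return (
        per_tree.mean(axis=0),
        np.percentile(per_tree, lower, axis=0),
        np.percentile(per_tree, upper, axis=0),
    )


mean, low, high = predict_with_interval(model, X[:5])
```

The per-tree spread reflects disagreement between trees rather than a formal statistical confidence interval, so the bounds are best read as an indicative range.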

Installation and Usage

  1. Clone this repository
  2. Install required packages: pip install -r requirements.txt
  3. Configure MongoDB connection in config.py
  4. Run the application: streamlit run app.py
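
The contents of config.py are not shown here, so the following is only a guess at what a minimal configuration might look like. The variable names, environment variable names, and the database and collection names are all hypothetical; check the file in the repository for the settings it actually expects.

```python
# config.py (hypothetical layout; the repository's actual file may differ)
import os

# Connection string for the MongoDB instance holding the observation records.
# Falls back to a local instance on MongoDB's default port.
MONGO_URI = os.environ.get("MONGO_URI", "mongodb://localhost:27017")

# Database and collection names are illustrative placeholders.
MONGO_DB = os.environ.get("MONGO_DB", "squirrels")
MONGO_COLLECTION = os.environ.get("MONGO_COLLECTION", "observations")
```

Reading the connection string from an environment variable keeps credentials out of version control.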

GDPR Compliance

This project adheres to GDPR principles by:

  • Anonymising all personal data
  • Implementing appropriate security measures
  • Using data only for the specific purpose of wildlife population modelling
  • Displaying reduced precision coordinates in visualisations
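
Reduced-precision display can be as simple as rounding coordinates before they are sent to the map. The sketch below assumes two decimal places, which corresponds to roughly 1.1 km in latitude; the precision the application actually uses is not stated here.

```python
def reduce_precision(lat, lon, decimals=2):
    """Round coordinates for display; 2 decimals is roughly 1.1 km in latitude."""
    return round(lat, decimals), round(lon, decimals)


# Made-up coordinates, for illustration only.
display_point = reduce_precision(56.123456, -3.987654)
```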

Future Development

Potential enhancements include:

  • Dynamic coordinate uncertainty based on environmental factors
  • Observer identity integration for improved duplicate detection
  • Environmental covariates to improve prediction accuracy
  • Real-time data collection interface for citizen scientists

Acknowledgments

  • The citizen scientists who contributed observations to the Scottish Squirrel Database

  • NBN Atlas for providing access to the dataset


About

My final-year Bachelor's degree project to predict future red squirrel populations in Scotland using spatial-temporal modelling. It identifies and removes duplicate observations using both exact spatial matching and proximity-based detection that accounts for coordinate uncertainty.
