Riverfish Computer Vision

Overview

RiverEye is a personal project that aims to improve the monitoring of local freshwater fish populations in Midwestern rivers using deep-learning-based object detection and image classification models.

Target Problem

Manual monitoring of fish populations is invasive, time-consuming, and prone to human inconsistency. This tool would allow researchers and Natural Resource Departments to gather accurate data on local populations without manually filtering through large swaths of video or physically netting fish. This saves time and lets researchers focus on interpreting data and making informed ecological decisions, while minimizing the impact of monitoring on aquatic wildlife. Additionally, enabling Natural Resource Departments to identify and quantify populations of invasive species more quickly and effectively means they can act rapidly to protect native species.

Goals

This project has three main goals:
1. Develop an object detection model with high accuracy and precision on low- to medium-quality images under varying light and visibility conditions (target mAP >= 0.98).
2. Develop an image classification model that can identify common Midwestern species, game species, and invasive species reliably (Target mAP >= 0.95).
3. Deploy both models into an application for use by researchers and Natural Resource Departments.

Dataset

Data was acquired from Dr. Cory Suski's lab (Justin Lombardo, MS student) in the Natural Resources Department at the University of Illinois. The dataset included ~2 hours of total footage, captured with a stationary camera aimed at bass nests and with an underwater camera while diving. This dataset was specifically acquired to be diverse in quality, lighting, and general visibility so that the model can maintain accuracy under these conditions. Data for the preliminary classifier model was found primarily online from reputable, legal sources.

After receiving the raw video data, I split the videos into frames and ran software to remove images with a high degree of similarity (>0.85). I then uploaded the resulting base set of images to CVAT and began labeling for the object detection model. The preliminary object detection training set consisted of 500 labeled images.
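The near-duplicate filtering step can be illustrated with a minimal sketch. The tool and similarity metric actually used are not specified here, so this example assumes an average hash over small grayscale thumbnails and compares each frame only to the most recently kept frame. The function names and the thumbnail representation (a flat list of pixel intensities) are hypothetical.

```python
def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return [p > mean for p in pixels]


def hash_similarity(hash_a, hash_b):
    """Fraction of matching bits between two equal-length hashes (1.0 = identical)."""
    matches = sum(a == b for a, b in zip(hash_a, hash_b))
    return matches / len(hash_a)


def select_distinct_frames(thumbnails, threshold=0.85):
    """Return indices of frames to keep.

    A frame is dropped when its similarity to the last kept frame exceeds
    `threshold` (assumed metric; the project's real metric may differ).
    """
    kept = []
    last_hash = None
    for index, pixels in enumerate(thumbnails):
        current = average_hash(pixels)
        if last_hash is None or hash_similarity(last_hash, current) <= threshold:
            kept.append(index)
            last_hash = current
    return kept
```

Comparing only against the last kept frame suits sequential video, where duplicates tend to be adjacent. Comparing against every kept frame would catch repeats across scenes at a higher cost.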

Data for the preliminary classifier was separated into 10 folders, one for each species of fish. Each folder contained 15 images, for a total training set of 150 images.

Preliminary Models

Preliminary models for both object detection and classification were trained using an 80-20 train/val split and used a pretrained YOLOv8 model as a foundation. They were trained for 30 and 60 epochs, respectively.
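As a rough sketch of this setup, the example below performs a per-class 80-20 split and then shows how the two models might be fine-tuned with the Ultralytics package. Several details are assumptions rather than facts about the project: the nano-sized checkpoints (`yolov8n.pt`, `yolov8n-cls.pt`), the helper names, the fixed seed, and the dataset paths passed in by the caller.

```python
import random


def split_train_val(items_by_class, val_fraction=0.2, seed=0):
    """Split each class's items into train/val so every class keeps the same ratio."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label, items in sorted(items_by_class.items()):
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_val = round(len(shuffled) * val_fraction)
        val += [(path, label) for path in shuffled[:n_val]]
        train += [(path, label) for path in shuffled[n_val:]]
    return train, val


def train_preliminary_models(detect_yaml, classify_dir):
    """Fine-tune pretrained YOLOv8 checkpoints (model size is an assumption)."""
    from ultralytics import YOLO  # third-party; imported lazily

    detector = YOLO("yolov8n.pt")  # pretrained detection weights
    detector.train(data=detect_yaml, epochs=30)

    classifier = YOLO("yolov8n-cls.pt")  # pretrained classification weights
    classifier.train(data=classify_dir, epochs=60)
    return detector, classifier
```

With 10 species folders of 15 images each, this split yields 12 training and 3 validation images per species.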

Validation of the preliminary object detection model showed a mean average precision (mAP) of 0.91. This demonstrates proof of concept and suggests that expanding the dataset and tuning hyperparameters should yield further performance improvements.
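For context on the metric, detection mAP depends on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below shows that building block only, assuming boxes in `(x_min, y_min, x_max, y_max)` form. The IoU threshold behind the 0.91 figure is not stated here, so none is assumed.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Degenerate (zero-area) boxes have no meaningful overlap.
    return intersection / union if union > 0 else 0.0
```

A prediction counts as a true positive when its IoU with a ground-truth box meets the chosen threshold. Average precision is then computed per class from the resulting precision-recall curve and averaged to give mAP.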

Preliminary classifier validation also demonstrated proof of concept and room for improvement: the model struggled when species were closely related but performed reliably on distinctive target species such as the invasive Asian carp. Results are summarized in the adjacent confusion matrix.
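A confusion matrix of this kind can be built from paired true and predicted labels, as in the sketch below. The function names are hypothetical, and any labels passed in are illustrative only; this is not the project's evaluation code or its results.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Return (classes, matrix) where matrix[i][j] counts true class i predicted as class j."""
    counts = Counter(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix


def per_class_recall(classes, matrix):
    """Fraction of each true class that was predicted correctly."""
    recall = {}
    for i, name in enumerate(classes):
        total = sum(matrix[i])
        recall[name] = matrix[i][i] / total if total else 0.0
    return recall
```

Off-diagonal counts concentrated between two classes indicate the pairwise confusion described above for closely related species, while a row with counts only on the diagonal corresponds to a reliably recognized species.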

Current Objectives

Data labeling is the current objective of RiverEye. I have ~5000 images that will make up the dataset for the next version of the detection model, which will be trained locally using YOLOv11. The images vary in quality and environment and include fish from sections of the Illinois River, which I hope will contribute strongly to the robustness of subsequent detection models.