A Study on Recognizing Mosquito Sounds in Audio Using Convolutional Neural Networks

Zhou Zhou1


1 Beijing University of Posts and Telecommunications

Introduction

The project aims to use Convolutional Neural Networks (CNN) to identify mosquito sounds.

  • Audio data processing
  • Efficient detection with CNN models

Methods

  1. Data Collection: HumBugDB: A large-scale acoustic mosquito dataset.

  2. Spectrogram Conversion:The sound of mosquitoes has distinct frequency characteristics.

spectrogram

Figure 1: spectrogram

  1. Program & CNN Model Design: Figure 2 shows a Convolutional Neural Network (CNN) architecture used to detect the presence of mosquito sounds in audio.
Architecture

Figure 2: Architecture

Experiment and Performance

To address the data sample imbalance, two different methods were employed to train the model.

Model 1: Trained on small datasets using 6-fold cross-validation.

Model 2: Trained on large-scale data, with adjustments made to the weights of different categories.

Performance

model loss and performance

Figure 3: model loss and performance

Model 1 has weaker performance but trains very quickly, whereas Model 2 has slower training speed but better overall performance, particularly in mosquito sound detection.

Tips

There are several tips to speed up model training:

Speed up computation

  • Using GPU instead of CPU

Improve I/O speed

  • Using Linux instead of Windows
  • Use multithreaded read/write operations

By applying these methods, the model can converge within one minute on a small dataset of 1800 images and within approximately five minutes on a large-scale dataset of 30000 images.

Next Steps

To further enhance the performance and accuracy of the model, we will implement a series of optimization measures.

  • Simultaneously use narrowband spectrograms and wideband spectrograms.
  • Use more advanced models, such as Transformer.
  • Incorporate data augmentation methods .