Paper Key : IRJ************462
Author: Varun Toshniwal, Danish Nagaonkar, Vrushabh Ghodke, Tushar Shinde, Prof. Tanuja Mulla
Date Published: 13 Nov 2024
Abstract
This study explores the integration of audio-based and image-based recognition techniques to identify bird species through a workflow built on spectrograms and Convolutional Neural Networks (CNNs). As biodiversity continues to decline, effective monitoring of avian populations is crucial for conservation efforts. Our approach uses audio recordings of bird calls, which are transformed into spectrograms (visual representations of sound), enabling the extraction of temporal and frequency features. Simultaneously, images of birds captured in their natural habitats are analyzed to complement the audio data. The workflow begins with the collection of diverse datasets comprising both audio and image samples of various bird species. Audio data is processed to generate spectrograms using the Short-Time Fourier Transform (STFT), while images are preprocessed for uniformity. A CNN model is then designed to process these spectrograms and images, with a dual-input architecture that allows simultaneous learning from both modalities. This model is trained using transfer learning on pre-trained networks to enhance performance and reduce computational requirements.
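The STFT step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`hann_window`, `stft_magnitude`), the frame length of 64, the hop of 32, and the synthetic 1000 Hz tone sampled at 8000 Hz are all assumptions chosen for illustration, and a direct DFT from the Python standard library stands in for the FFT routine a real pipeline would likely use.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (hypothetical helper)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 magnitudes, one per
    non-negative frequency bin. A direct DFT is used for clarity;
    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann_window(frame_len)
    # Precompute the complex exponentials exp(-2*pi*j*k/frame_len).
    twiddle = [cmath.exp(-2j * math.pi * k / frame_len) for k in range(frame_len)]
    frames = []
    # Slide the analysis window over the signal in steps of `hop` samples.
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * twiddle[(k * n) % frame_len] for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic stand-in for a bird call: a pure tone (assumed values).
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = stft_magnitude(signal)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` is one time frame and each column one frequency bin, so the nested list can be treated as a single-channel image and passed to a CNN in the way the abstract describes. In practice the magnitudes are often log-scaled before being used as network input, although the abstract does not state whether that was done here.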
DOI Requested