Paper Key : IRJ************534
Authors: Nidhi Pathak, Sunita Joshi, Dhanshri Londhe, Aditi Satav
Date Published: 09 Nov 2024
Abstract
This paper presents the development and implementation of an AI-powered image and speech recognition device that uses the GPT-4 model to give users a seamless, interactive experience. The device is built on a Raspberry Pi 4 single-board computer and equipped with a camera, microphone, speaker, and touchscreen display. Users interact through typed or spoken commands; the device passes these inputs to GPT-4 and returns responses in real time on the display or through the speaker. The project explores the use of GPT-4 for handling visual and auditory data, allowing the device to function as both an image processing system and a virtual assistant. It can also provide live updates such as weather, news, sports scores, and election results, which makes it useful for assistive technology and everyday tasks. Testing indicated that the system is efficient in terms of response accuracy and processing speed, demonstrating its potential for practical, real-world applications. The findings highlight the utility of combining image and speech recognition in a single, portable device that offers users a reliable, interactive tool.
DOI Link: 10.56726/IRJMETS63334 (https://www.doi.org/10.56726/IRJMETS63334)