Using large language models for video processing in the agricultural industry
Abstract
Modern artificial intelligence technologies, particularly large language models (LLMs), are increasingly applied in agriculture to improve automation, decision-making, and sustainability. This study analyzes LLMs and their integration with computer vision and video processing for real-time livestock monitoring. A software system was developed that uses multimodal LLMs to analyze poultry behavior from video streams, enabling it to detect anomalies, predict potential health issues, and automatically generate recommendations for farmers. The system has a modular architecture and combines OpenCV, FastAPI, and Streamlit. A comparative evaluation of GPT-4o, Claude 3.7, and LLaVA shows how well each model suits different agricultural tasks. The results confirm that LLM-based solutions can improve operational efficiency, reduce the need for human intervention, and support precision agriculture. Despite its high computational demands, the proposed approach significantly simplifies the deployment of intelligent monitoring systems and opens new opportunities for innovation in smart farming.
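As an illustration of the video-to-model step described above, the following is a minimal sketch, not the system's actual implementation. It assumes that frames are sampled at a fixed time interval and sent to a multimodal model as base64-encoded JPEG images together with a text instruction. The function names and the JSON field names (`instruction`, `images`) are hypothetical and do not correspond to any vendor's API; in a real pipeline the frames would be decoded and encoded with a library such as OpenCV.

```python
import base64
import json


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> list[int]:
    """Return indices of frames spaced roughly `every_seconds` apart."""
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    # Number of frames to skip between two sampled frames (at least 1).
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def build_request(jpeg_frames: list[bytes], instruction: str) -> str:
    """Pack an instruction and base64-encoded frames into a JSON string.

    The field names are hypothetical, chosen only for this sketch.
    """
    payload = {
        "instruction": instruction,
        "images": [base64.b64encode(frame).decode("ascii") for frame in jpeg_frames],
    }
    return json.dumps(payload)
```

Sampling one frame per second or slower, rather than sending every frame, is one way to limit the computational cost noted in the abstract; the appropriate interval would depend on how quickly the monitored behavior changes.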