Using large language models for video processing in the agricultural industry
DOI:
https://doi.org/10.15276/aait.08.2025.7
Keywords:
artificial intelligence, large language models, multimodal models, computer vision, video analysis, poultry monitoring, behavior recognition, precision agriculture, OpenCV, Streamlit application
Abstract
Modern artificial intelligence technologies, particularly large language models, are increasingly being applied in agriculture to enhance automation, decision-making, and sustainability. This study presents a comprehensive analysis of large language models and their integration with computer vision and video processing for real-time livestock monitoring. A software system was developed that utilizes multimodal large language models to analyze poultry behavior from video streams, enabling the detection of anomalies, prediction of potential health issues, and automatic generation of recommendations for farmers. The system is based on a modular architecture and combines technologies such as OpenCV, FastAPI, and Streamlit. Comparative evaluation of models including GPT-4o, Claude 3.7, and LLaVA demonstrates their suitability for different agricultural tasks. The results confirm the effectiveness of large language model-based solutions in improving operational efficiency, reducing human intervention, and supporting precision agriculture. Despite high computational demands, the proposed approach significantly simplifies the deployment of intelligent monitoring systems and opens new opportunities for smart farming innovations.
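The abstract does not describe how video frames reach the multimodal model, so the following is only a minimal sketch of one plausible preprocessing step: sampling frames at a fixed time interval with OpenCV and encoding them as base64 JPEG strings, a form that multimodal APIs commonly accept. The function names `sample_frame_indices` and `iter_encoded_frames`, the 5-second default interval, and the choice of JPEG encoding are illustrative assumptions, not details taken from the described system.

```python
import base64
from typing import Iterator, List


def sample_frame_indices(frame_count: int, fps: float, every_seconds: float) -> List[int]:
    """Return indices of frames spaced roughly `every_seconds` apart.

    Hypothetical helper: the sampling policy is an assumption, not the paper's.
    """
    if frame_count <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    # Convert the time interval into a whole number of frames (at least 1).
    step = max(1, round(fps * every_seconds))
    return list(range(0, frame_count, step))


def iter_encoded_frames(video_path: str, every_seconds: float = 5.0) -> Iterator[str]:
    """Yield base64-encoded JPEG frames sampled from a video file."""
    import cv2  # imported lazily so the index helper works without OpenCV

    cap = cv2.VideoCapture(video_path)
    try:
        fps = cap.get(cv2.CAP_PROP_FPS)
        frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        for idx in sample_frame_indices(frame_count, fps, every_seconds):
            # Seek to the sampled frame and decode it.
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
            ok, frame = cap.read()
            if not ok:
                break
            # Compress to JPEG in memory, then base64-encode for transport.
            ok, buffer = cv2.imencode(".jpg", frame)
            if ok:
                yield base64.b64encode(buffer.tobytes()).decode("ascii")
    finally:
        cap.release()
```

Sampling a few frames per minute rather than every frame is one way to limit the computational cost the abstract mentions, at the price of possibly missing short-lived behavior; the appropriate interval would depend on the behaviors being monitored.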