Optical information support for detection of artifacts against complex backgrounds by a robotized system with neural elements
ISARD-2025-remote004
In this work, the artifact to be detected is an unmanned aerial vehicle (UAV). In terms of their observable signatures, large fixed-wing UAVs are comparable to conventional aerodynamic targets. Because radar methods are of limited effectiveness against such objects, a promising direction is the use of robotic optical-electronic reconnaissance (OER) systems. The most promising OER systems operate in the infrared (IR) range, since they are capable of functioning around the clock.
In this regard, a neural network model based on the YOLOv11 architecture has been developed that automatically detects and classifies rotary-wing and fixed-wing UAVs, as well as birds, in the IR spectrum. The training dataset combined open-source data with proprietary video recordings obtained with a mIR-1280C12S thermal camera against uniform and complex backgrounds (a complex background here means emission from inhomogeneities of the background radiation field (BRF), e.g., broken cloud cover), under various weather conditions and at various ranges.
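As an illustration, a minimal training sketch using the open-source Ultralytics implementation of the YOLO11 architecture is given below. The dataset configuration file, weight file, and hyperparameters are placeholders, not the actual training setup used in this work.

```python
# Minimal fine-tuning sketch with the Ultralytics package (pip install ultralytics).
# Dataset YAML, weight name, and hyperparameters are illustrative assumptions.
from ultralytics import YOLO

# Start from pretrained YOLO11 weights and fine-tune on the IR dataset.
model = YOLO("yolo11s.pt")

model.train(
    data="ir_uav_birds.yaml",  # hypothetical dataset config: rotary-wing UAV, fixed-wing UAV, bird
    imgsz=640,                 # assumed input resolution
    epochs=100,
    batch=16,
)

# Evaluate precision/recall on the validation split.
metrics = model.val()
print(metrics.box.map50)       # mAP@0.5
```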
The trained model demonstrates an average precision of 82.7% with a recall of 82%, successfully identifying rotary-wing and fixed-wing UAVs, as well as birds. However, detection efficiency for small objects (occupying fewer than 50 pixels) remains limited, especially against a complex (inhomogeneous) BRF formed by broken clouds of various classes and/or by significant humidity gradients of small angular size.
To improve detection accuracy for small targets, the SAHI (Slicing Aided Hyper Inference) method was investigated, in which each video frame is divided into segments that are processed separately before the detections are merged.
To select the optimal segment size, experimental studies of the spatial structure of cloud radiation in the IR range were conducted. The collected statistics on the spatial spectra of BRF inhomogeneities show that a characteristic feature of the spatial radiation structure of various cloud classes and types is the angular size of emitting inhomogeneities in the vertical and horizontal directions, defined by the interval over which the mutual correlation function exceeds 0.5 (the correlation radius).
A remarkable property of these patterns is that the correlation radii define angular intervals within which the mutual correlation coefficients remain significant, i.e., a spatial region over which the radiation structure of BRF inhomogeneities in elevation and azimuth does not change sharply and can be treated as uniform.
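A simple sketch of how such a correlation radius can be estimated along one angular direction is shown below. The radiance profile and the per-pixel angular step are synthetic placeholders, and a one-dimensional autocorrelation is used as a stand-in for the mutual correlation function described above.

```python
# Sketch: estimate the correlation radius of BRF inhomogeneities along one
# angular direction (e.g., azimuth). Profile and angular step are synthetic.
import numpy as np

def correlation_radius(profile: np.ndarray, angular_step_mrad: float) -> float:
    """Angular lag (mrad) at which the normalized autocorrelation of a
    zero-mean radiance profile first falls below 0.5."""
    x = profile - profile.mean()
    acf = np.correlate(x, x, mode="full")[x.size - 1:]  # one-sided autocorrelation
    acf /= acf[0]                                       # normalize so acf[0] == 1
    below = np.nonzero(acf < 0.5)[0]
    lag = below[0] if below.size else acf.size - 1
    return lag * angular_step_mrad

# Example with a synthetic cloud-like profile (smoothed noise).
rng = np.random.default_rng(0)
profile = np.convolve(rng.normal(size=2048), np.ones(64) / 64, mode="same")
print(correlation_radius(profile, angular_step_mrad=0.1))
```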
To evaluate the effectiveness of the proposed method, thermal video footage was used in which a UAV flew at a constant altitude of 500 m and constant speed, closing the range to the observation point from 3400 m to 1200 m.
Based on this study, the segment size was set to 512 pixels with a 10% overlap between adjacent segments.
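For illustration, a minimal sliced-inference sketch with the open-source SAHI package is given below, using the chosen segment size and overlap (512 px, 10%). The weight path, confidence threshold, and frame filename are illustrative assumptions.

```python
# Sliced inference sketch with SAHI (pip install sahi); parameters follow the
# segment size and overlap chosen above. Paths and thresholds are placeholders.
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",        # wraps YOLO11 weights (older SAHI versions use "yolov8")
    model_path="ir_uav_yolo11.pt",   # hypothetical fine-tuned weights
    confidence_threshold=0.25,
    device="cuda:0",
)

result = get_sliced_prediction(
    "frame_0001.png",                # a single extracted video frame
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.1,        # 10% overlap between adjacent segments
    overlap_width_ratio=0.1,
)

for pred in result.object_prediction_list:
    print(pred.category.name, pred.score.value, pred.bbox.to_xyxy())
```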
As a result, it was established that segment-based processing with the chosen parameters increased detection probability by 25-30% compared with conventional full-frame processing. The improvement was especially pronounced at ranges over 2500 m, where the detection efficiency of the segmented approach reached 70-80%, whereas standard full-frame inference did not exceed 30%.