AI Summary • Published on Jan 8, 2025
Knife safety is a critical but often neglected aspect of food preparation, and unsafe handling carries a high risk of injury. Traditional methods for detecting unsafe knife handling face challenges such as varying lighting conditions, occlusion by other objects, poor image quality, and confusion with visually similar kitchen tools. Existing computer vision models have struggled with accuracy, processing time, and the need for constant manual supervision when recognizing complex hazards such as improper finger placement or blade contact during food preparation.
This study applied YOLOv7 to identify knife-handling hazards. A dataset of 6,004 frames was extracted from video captured with an Apple iPhone 15 Pro at a resolution of 1920x1080 pixels. The frames were manually annotated in Label Studio into six categories: cutting board, hand, vegetable, knife, hazard 1 (curled fingers), and hazard 2 (hand touching blade). To improve robustness and generalization, extensive data augmentation was applied: horizontal and vertical flipping, 20% cropping, 15-degree rotations, 15% grayscale conversion, 10-degree horizontal and vertical shearing, ±25-degree hue shifts, ±25% saturation adjustments, ±15% brightness adjustments, ±10% exposure adjustments, 2.5 px Gaussian blur, 1.01% noise, and 10% cutout (an illustrative pipeline is sketched below).

The YOLOv7 model was trained and validated in a PyTorch environment on NVIDIA GPUs for 40 epochs. Training used the AdamW optimizer with a learning rate of 0.001 and a momentum factor of 0.9, applying L2 regularization to biases with a decay rate of 0.0005 to prevent overfitting (see the optimizer sketch below). The architecture uses the Extended Efficient Layer Aggregation Network (E-ELAN) as the computational block in its backbone, together with a Feature Pyramid Network (FPN) and multiple detection heads for object classification, bounding-box regression, and objectness-score prediction.
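The source does not name the augmentation tooling, but the listed transforms map naturally onto an Albumentations pipeline. The sketch below is a minimal approximation under that assumption; the probabilities and parameter mappings (for example, folding cropping, rotation, and shear into one affine transform) are illustrative choices, not the authors' exact settings.

```python
# Illustrative augmentation pipeline approximating the transforms listed
# above. Albumentations is an assumed tool, and all probabilities and
# parameter mappings below are assumptions for illustration.
import albumentations as A

train_augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.5),
        # ~20% crop/zoom, ±15° rotation, and ±10° shear folded into one affine
        A.Affine(scale=(0.8, 1.0), rotate=(-15, 15), shear=(-10, 10), p=0.5),
        A.ToGray(p=0.15),  # 15% grayscale conversion
        # ±25 hue / ±25 saturation shifts (OpenCV HSV units, approximate)
        A.HueSaturationValue(hue_shift_limit=25, sat_shift_limit=25,
                             val_shift_limit=0, p=0.5),
        A.RandomBrightnessContrast(brightness_limit=0.15,  # ±15% brightness
                                   contrast_limit=0.0, p=0.5),
        A.RandomGamma(gamma_limit=(90, 110), p=0.5),  # ≈ ±10% exposure
        A.GaussianBlur(blur_limit=(3, 5), p=0.3),     # ≈ 2.5 px Gaussian blur
        A.GaussNoise(var_limit=(5.0, 25.0), p=0.3),   # light pixel noise
        # Cutout: rectangles up to ~10% of each image dimension (1920x1080)
        A.CoarseDropout(max_holes=4, max_height=108, max_width=192, p=0.1),
    ],
    # YOLO-format boxes must be transformed together with the image
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)
```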
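Likewise, a minimal PyTorch sketch of the reported optimizer configuration (AdamW, learning rate 0.001, momentum/beta1 of 0.9, decay rate 0.0005). The parameter-group split is an assumption; the summary states the L2 decay was applied to bias terms, which the sketch mirrors.

```python
# Minimal sketch of the reported optimizer configuration: AdamW with
# lr=0.001, beta1=0.9 (the momentum factor), and a 0.0005 decay rate.
# The parameter-group split is an assumption; per the summary, the L2
# decay is applied specifically to bias terms.
import torch

def build_optimizer(model: torch.nn.Module) -> torch.optim.AdamW:
    biases, others = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (biases if name.endswith(".bias") else others).append(param)
    return torch.optim.AdamW(
        [
            {"params": biases, "weight_decay": 0.0005},  # decayed, per summary
            {"params": others, "weight_decay": 0.0},
        ],
        lr=0.001,
        betas=(0.9, 0.999),  # beta1 = 0.9 acts as the momentum factor
    )
```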
YOLOv7 demonstrated strong performance in detecting kitchen knife safety hazards. The model achieved its best overall performance at epoch 31, with a mAP50-95 of 0.7879, a precision of 0.9063, and a recall of 0.7503. Across all classes, it reached a mean Average Precision (mAP) of 0.821 at an Intersection over Union (IoU) threshold of 0.5.

The precision-confidence curve reached perfect precision (1.00) for all classes at a confidence threshold of 0.874; "cutting board" showed consistently high precision, while "hazard 2 (hand touching blade)" had reduced precision at lower thresholds. The recall-confidence curve peaked at a recall of 0.94 at a confidence threshold of 0.0, with "cutting board" and "hand" maintaining high recall. The F1-confidence curve peaked at an F1 score of 0.75 at a confidence threshold of 0.102 (the derivation of this curve is sketched below). The confusion matrix showed excellent accuracy for "cutting board" (1.00) and "hand" (1.00), and high performance for "vegetable" (0.99) and "knife" (0.85). However, some misclassifications remained: "knife" was sometimes confused with "vegetable" (0.32), and "hazard 1 (curled fingers)" with "hand" (0.12), indicating room for improvement.
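For context, an F1-confidence curve like the one reported is derived by sweeping the detection confidence threshold, recomputing precision and recall at each step, and taking F1 = 2PR / (P + R). The short sketch below illustrates that computation; the function name and its inputs (per-detection confidences and true-positive flags from IoU-0.5 matching) are hypothetical, not the authors' code.

```python
# Sketch of how an F1-confidence curve is derived: sort detections by
# confidence, sweep the threshold, recompute precision and recall at
# each step, and take F1 = 2PR / (P + R).
import numpy as np

def f1_confidence_peak(confidences: np.ndarray,
                       is_tp: np.ndarray,       # boolean TP flags per detection
                       num_ground_truth: int) -> tuple[float, float]:
    order = np.argsort(-confidences)      # high-confidence detections first
    tp = np.cumsum(is_tp[order])          # cumulative true positives
    fp = np.cumsum(~is_tp[order])         # cumulative false positives
    precision = tp / (tp + fp)
    recall = tp / num_ground_truth
    f1 = 2 * precision * recall / np.clip(precision + recall, 1e-9, None)
    best = int(np.argmax(f1))
    # Returns the threshold at the F1 peak and the peak F1 value
    return float(confidences[order][best]), float(f1[best])
```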
The robust performance of YOLOv7 highlights its potential for real-time kitchen safety systems that accurately identify knife-related hazards, enabling improved safety alerts and preventative measures that reduce accidents and injuries during food preparation. Future research directions include expanding the dataset to cover more knife shapes, food types, and additional hazards, and investigating the model's robustness when kitchen tools obstruct visibility. The study also suggests extending this line of investigation to other sectors, such as renewable energy, healthcare, and wireless sensor networks, for wide-scale edge-based deployments.