Automating Manufacturing Surveillance Processes Using External Observers
An automated assembly system is an integral part of many manufacturing industries because it reduces production cycle time, which lowers costs and raises the rate of production. The modular system design integrates main assembly workstations and parts-feeding machines to build a fully assembled product or a sub-assembly of a larger product. Machine operation failures within the subsystems and errors in parts loading lead to slower production and a gradual accumulation of parts, and repeated human intervention is required to manually clear jams at varying locations in the subsystems. To improve operator safety and reduce cycle time, visual surveillance plays a critical role by providing real-time alerts of spatiotemporal parts irregularities. In this study, surveillance videos are obtained using external observers to conduct spatiotemporal object segmentation within three systems: a digital assembly, a linear conveyance system, and a vibratory bowl parts-feeder machine. As the datasets have different anomaly specifications and visual characteristics, we follow a bottom-up architecture for motion-based and appearance-based segmentation using computer vision techniques and deep-learning models. To perform motion-based segmentation, we evaluate deep learning-based and classical techniques for computing optical flow for real-time moving-object detection. Because the classical local and global methods assume brightness constancy and flow smoothness, they produced fewer detections in the presence of illumination variance and occlusion. Therefore, we use RAFT for optical flow and apply its iteratively updated flow field to create a pixel-based object tracker. The tracker distinguishes previous and current moving parts with differently colored segments and simultaneously visualizes the flow field to illustrate movement direction and magnitude.
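As a rough illustration of how a dense flow field yields movement direction, magnitude, and a moving-pixel mask, the following minimal sketch uses NumPy only. The function names, the `(H, W, 2)` array layout, and the `min_magnitude` threshold are assumptions made for this example; the flow itself would come from an estimator such as RAFT, which is not reproduced here, and this is not the tracker described in the thesis.

```python
import numpy as np

def flow_to_polar(flow):
    """Split an (H, W, 2) flow field into per-pixel magnitude and direction."""
    u, v = flow[..., 0], flow[..., 1]      # horizontal and vertical displacement
    magnitude = np.hypot(u, v)             # displacement length in pixels per frame
    direction = np.arctan2(v, u)           # angle in radians (image y axis points down)
    return magnitude, direction

def moving_mask(flow, min_magnitude=1.0):
    """Boolean mask of pixels displaced by at least min_magnitude (assumed threshold)."""
    magnitude, _ = flow_to_polar(flow)
    return magnitude >= min_magnitude

# Toy flow field: a 2x2 patch moving 3 px right and 4 px down on a static background.
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[1:3, 1:3] = (3.0, 4.0)

mag, ang = flow_to_polar(flow)
mask = moving_mask(flow)
```

In a visualization, the direction is commonly mapped to hue and the magnitude to brightness, so that a single color image conveys both quantities.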
We compare the segmentation performance of the optical flow-based tracker with that of a space-time graph neural network (ST-GNN); the ST-GNN shows more accurate boundary mask IoU alignment than the pixel-based tracker. Because the ST-GNN also addresses the limited-dataset challenge in our application by learning visual correspondence as a contrastive random walk over palindrome sequences, we proceed with the ST-GNN for motion-based segmentation. As the ST-GNN requires a first-frame annotation mask for initialization, we explore appearance-based segmentation methods to enable automatic ST-GNN initialization. We evaluate pixel-based, interactive, and supervised segmentation techniques on the bowl-feeder image dataset. Results show that K-means combined with watershed segmentation and Gaussian blur reduces superpixel oversegmentation and generates segmentation aligned with part boundaries. Using watershed segmentation on the bowl-feeder image dataset, 377 of the 476 parts present within the machine were detected and segmented. We find that GLCM and Gabor filters perform better at segmenting dense part regions than graph-based and entropy-based segmentation: the GLCM and the Gabor filter segment 467 and 476 parts, respectively, of the 476 parts present within the bowl-feeder. Although manual annotation decreases efficiency, the GrabCut annotation tool generates more accurate segmentation masks than the pre-trained interactive tool. Using the GrabCut annotation tool, all 216 parts present within the bowl-feeder machine are segmented. To ensure segmentation of all parts within the bowl-feeder, we train Detectron2 with data augmentation and find that supervised segmentation outperforms pixel-based and interactive segmentation. To address illumination variance within the datasets, we apply color-based segmentation by converting the image datasets to the HSV color space.
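For readers unfamiliar with the metric, mask IoU (intersection over union) can be sketched as below. This is the plain region-based IoU between two binary masks, shown only to make the comparison concrete; it is not necessarily the exact boundary-based variant used in the thesis, and the function name and toy masks are assumptions for this example.

```python
import numpy as np

def mask_iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(np.logical_and(pred, target).sum() / union)

# Toy masks: two 4x4 squares offset by one pixel in each direction.
target = np.zeros((6, 6), dtype=bool)
target[1:5, 1:5] = True
pred = np.zeros((6, 6), dtype=bool)
pred[2:6, 2:6] = True

iou = mask_iou(pred, target)  # overlap of 9 px over a union of 23 px
```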
We use the value channel of the HSV representation as input to background subtraction techniques that detect moving bowl-feeder parts in real time. To resolve image registration errors caused by low image resolution, we create a synthetic Flex-Sim dataset containing various anomaly instances captured from multiple camera viewpoints. We apply preprocessing methods and an affine transformation estimated with RANSAC for robust image registration. We compare color-based and texture-based handcrafted features of the registered images to verify complete image alignment. We evaluate the PatchCore anomaly detection method, pre-trained on the MVTec industrial dataset, on the Flex-Sim dataset. We find that the generated segmentation maps detect various anomaly instances within the Flex-Sim dataset.
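The idea behind estimating an affine transformation with RANSAC can be sketched as follows, assuming matched keypoint coordinates are already available. All function names, the iteration count, the inlier tolerance, and the synthetic correspondences are assumptions made for this illustration; the thesis does not state its parameters, and a production pipeline would more likely rely on an existing library routine.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A such that dst ~ src @ A[:, :2].T + A[:, 2]."""
    X = np.hstack([src, np.ones((len(src), 1))])
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)  # shape (3, 2)
    return sol.T

def apply_affine(A, pts):
    """Map (N, 2) points through the 2x3 affine matrix A."""
    return pts @ A[:, :2].T + A[:, 2]

def ransac_affine(src, dst, n_iters=200, inlier_tol=2.0, seed=0):
    """Fit an affine map while ignoring mismatched point pairs."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=3, replace=False)  # minimal sample
        A = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(A, src) - dst, axis=1)
        inliers = err < inlier_tol
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("too few inliers to fit an affine transform")
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best], dst[best]), best

# Synthetic correspondences: 30 matches, 8 of which are corrupted.
rng = np.random.default_rng(42)
A_true = np.array([[0.9, -0.2, 5.0],
                   [0.2,  0.9, -3.0]])
src = rng.uniform(0, 100, size=(30, 2))
dst = apply_affine(A_true, src)
dst[:8] += rng.uniform(50, 100, size=(8, 2))  # gross mismatches

A_est, inliers = ransac_affine(src, dst)
```

Refitting on every inlier of the best hypothesis, instead of keeping the three-point estimate, is a common choice because it averages out localization noise in real keypoint matches.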
Cite this version of the work
Gauri Sharma (2022). Automating Manufacturing Surveillance Processes Using External Observers. UWSpace. http://hdl.handle.net/10012/18851
Showing items related by title, author, creator and subject.
Faris, Nesma (University of Waterloo, 2013-04-26) Speech Endpoint Detection, also known as Speech Segmentation, is an unsolved problem in speech processing that affects numerous applications including robust speech recognition. This task is not as trivial as it appears, ...
Chiu, Bernard (University of Waterloo, 2003) Prostate segmentation is a required step in determining the volume of a prostate, which is very important in the diagnosis and the treatment of prostate cancer. In the past, radiologists manually segment the two-dimensional ...
Color Image Edge Detection and Segmentation: A Comparison of the Vector Angle and the Euclidean Distance Color Similarity Measures. Wesolkowski, Slawomir Bogumil (University of Waterloo, 1999) This work is based on Shafer's Dichromatic Reflection Model as applied to color image formation. The color spaces RGB, XYZ, CIELAB, CIELUV, rgb, l1l2l3, and the new h1h2h3 color space are discussed from this perspective. ...