
dc.contributor.author: Li, Chao
dc.date.accessioned: 2022-09-28 17:45:19 (GMT)
dc.date.available: 2022-09-28 17:45:19 (GMT)
dc.date.issued: 2022-09-28
dc.date.submitted: 2022-09-24
dc.identifier.uri: http://hdl.handle.net/10012/18837
dc.description.abstract: Object detection technology has been widely used in many real-world applications. With the development of deep learning methods, the accuracy and speed of object detection have improved significantly, showing great promise for increasing the efficiency of security-related business activities. Nevertheless, existing object detection methods still lack robustness on security video datasets. This can substantially reduce performance in complex application scenarios, such as varying target size, target occlusion, and bad weather. These problems cannot be fully solved by image-based object detection because the information in a single image is limited. A video dataset, on the other hand, consists of a series of still images rich in temporal and spatial information, which can supplement the detection methods. Based on this idea, this thesis proposes an incremental optimization method that addresses these problems. We first improve the accuracy of the image-based object detection method by adding new features, and then aggregate the temporal and spatial information of the target to enhance the performance of the video-based object detection method. Furthermore, a multi-layer feature cascade aggregation pyramid structure is adopted based on the traditional Faster-RCNN model. Faster-RCNN, first proposed in 2015, is one of the best-known convolutional neural networks for object detection and recognition tasks. It replaced the traditional selective search method with the region proposal network (RPN), which improved detection speed significantly. Because of its excellent detection performance, many recently proposed approaches still use it as the backbone network.
The new multi-layer feature cascade aggregation feature pyramid network (MCA-FPN) combines deep and shallow semantic feature information to optimize feature utilization and improve feature representation for targets of any size. To address the negative effects of the imbalanced distribution of samples, a sample asymmetric weighted loss function (SAW-Loss) is proposed, which improves the efficiency of network training. Experimental results show that the proposed MCA-FPN and SAW-Loss modules improve the mAP of the traditional FPN by 2.4% and 1.5%, respectively, and the final improved object detection algorithm with both modules achieves an mAP of 86.0% on the Pascal VOC dataset, higher than the 82.1% obtained with FPN. The proposed method performs significantly better than most existing methods, such as FCOS with an mAP of 78.7%, RFBNet with an mAP of 82.2%, and PFPNet with an mAP of 84.1%. Video-based methods can make use of two types of information: local information, obtained from adjacent frames, and global information, extracted from the whole video sequence. We propose two information aggregation methods, local information aggregation and global information aggregation, based on feature similarity and the attention mechanism, so as to aggregate features selectively by including more of the correlated feature information and less of the uncorrelated feature information. As a result, the network can extract and learn more useful target features and discard interfering features. The proposed local and global information aggregation methods improve accuracy by 0.9% and 1.1%, respectively, compared with MEGA, one of the most advanced video-based object detection methods. With both modules, the mAP of the proposed method reaches 84.6% on the public dataset ImageNet VID, which is 1.7% higher than the mAP of MEGA.
The proposed method also shows potential to detect occluded targets with high confidence. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.title: Video-Based Object Detection in Security Monitoring System [en]
dc.type: Master Thesis [en]
dc.pending: false
uws-etd.degree.department: Electrical and Computer Engineering [en]
uws-etd.degree.discipline: Electrical and Computer Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Master of Applied Science [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Ban, Dayan
uws.contributor.advisor: Wang, Zhou
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
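
The abstract describes aggregating features selectively, based on feature similarity and an attention mechanism, so that correlated features contribute more and uncorrelated ones less. The record does not give the thesis's actual formulation, so the following is only a minimal generic sketch of similarity-weighted aggregation. The function names, the use of cosine similarity, the softmax weighting, and the `temperature` parameter are all assumptions made for illustration, and plain Python lists stand in for real feature maps.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax; a lower temperature sharpens the weights."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_features(reference, supports, temperature=0.1):
    """Weight each support-frame feature by its similarity to the reference
    feature and return (aggregated_feature, weights).

    Hypothetical helper: illustrates the general idea only, not the
    aggregation modules proposed in the thesis.
    """
    scores = [cosine_similarity(reference, s) for s in supports]
    weights = softmax(scores, temperature)
    aggregated = [
        sum(w * s[i] for w, s in zip(weights, supports))
        for i in range(len(reference))
    ]
    return aggregated, weights


if __name__ == "__main__":
    reference = [1.0, 0.0, 1.0]      # feature of the target in the current frame
    supports = [
        [0.9, 0.1, 1.1],             # correlated feature from another frame
        [-1.0, 2.0, -0.5],           # uncorrelated (interfering) feature
    ]
    aggregated, weights = aggregate_features(reference, supports)
    print("weights:", weights)
    print("aggregated:", aggregated)
```

In this toy input the correlated support feature receives nearly all of the weight, so the aggregated vector stays close to it and the interfering feature is effectively ignored.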


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
