Ternary Weight Pruning for SqueezeNet

About this project

Convolutional neural networks are the backbone of modern computer vision, but they currently demand substantial computation and power for both training and inference. An architecture that performs well on low-power mobile devices with limited compute and memory would have many uses. We therefore propose a new fully convolutional architecture that builds on SqueezeNet and is optimized for mobile use, with a smaller model size, faster inference, and comparable accuracy. We show that this compressed architecture is state of the art for object detection and image classification, can be implemented on mobile devices, and is capable of real-time recognition. We also design a lighter model optimized for inference that trades some accuracy for real-time recognition on cheap, lightweight hardware such as a Raspberry Pi.

Project Members
  • Last Year


    Kunal wrote an update for Ternary Weight Pruning for SqueezeNet.

    We have been building a generalized method for ternary pruning of a network, and testing and benchmarking our approach. The efficiency and power of ternary pruning surprised us: in our benchmarks, accuracy dropped only from 98.16% to 98.07%, with a 16x reduction in model size. We also saw 83% accuracy with ResNet-18 on the CIFAR-10 dataset, compared to a baseline of 77% without ternary pruning.
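
    The update does not spell out how the weights are ternarized, so the following is only a minimal sketch under stated assumptions, not the project's actual method. It assumes a threshold-based scheme in the style of ternary weight networks: weights whose magnitude falls below a threshold are pruned to 0, and the rest are replaced by a shared scale times their sign. The names `ternarize`, `dequantize`, `compression_ratio`, and the `delta_scale` default of 0.7 are all hypothetical choices made for illustration. The sketch also shows one plausible reading of the 16x figure: storing each weight in 2 bits instead of a 32-bit float.

```python
from typing import List, Tuple


def ternarize(weights: List[float], delta_scale: float = 0.7) -> Tuple[List[int], float]:
    """Map float weights to codes in {-1, 0, +1} plus one shared scale.

    Assumption (not from the project update): the pruning threshold is
    delta_scale * mean(|w|), and the scale alpha is the mean magnitude of
    the weights that survive pruning.
    """
    if not weights:
        return [], 0.0

    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_scale * mean_abs

    # Small-magnitude weights are pruned to 0; the rest keep only their sign.
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]

    # One scale per tensor: the average magnitude of the surviving weights.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, alpha


def dequantize(codes: List[int], alpha: float) -> List[float]:
    """Reconstruct approximate float weights from ternary codes."""
    return [alpha * c for c in codes]


def compression_ratio(float_bits: int = 32, ternary_bits: int = 2) -> float:
    """Size ratio when each weight shrinks from float_bits to ternary_bits.

    Assumption: ternary codes are packed at 2 bits each and the per-tensor
    scale is negligible next to the weight count.
    """
    return float_bits / ternary_bits


if __name__ == "__main__":
    layer = [0.9, -0.05, 0.4, -0.7, 0.02, -0.3]  # made-up example weights
    codes, alpha = ternarize(layer)
    print("codes:", codes)
    print("alpha:", alpha)
    print("approx weights:", dequantize(codes, alpha))
    print("compression ratio:", compression_ratio())
```

    In a real network this would typically be applied per layer, with the full-precision weights kept during training and only the ternary codes and scales shipped for inference; whether the project does this is not stated in the update.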