A toolkit for optimizing Keras and TensorFlow ML models for deployment, including quantization and pruning.
machine-learning
sparsity
compression
deep-learning
tensorflow
optimization
keras
ml
pruning
quantization
model-compression
quantized-training
quantized-neural-networks
quantized-networks
Updated Mar 1, 2021 - Python
The idea is to add a more advanced filter pruning method so that we can show SOTA results in model compression/optimization.
I suggest reimplementing the method from https://github.com/cmu-enyac/LeGR and, as a first step, reproducing the baseline results for MobileNet v2 on CIFAR100.
cc'ed @vshampor, @vanyalzr.
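For context, here is a minimal sketch of what filter pruning does, assuming plain L2-norm ranking within a single layer. This is a simple magnitude baseline for illustration, not the LeGR method from the linked repository; the function names and the toy layer are hypothetical.

```python
import math


def filter_l2_norms(filters):
    """Return the L2 norm of each filter (each filter is a flat list of weights)."""
    return [math.sqrt(sum(w * w for w in f)) for f in filters]


def select_filters_to_prune(filters, prune_ratio):
    """Return the sorted indices of the lowest-norm filters to remove."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    norms = filter_l2_norms(filters)
    n_prune = int(len(filters) * prune_ratio)
    # Rank filter indices from smallest to largest norm.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i])
    return sorted(ranked[:n_prune])


def prune_filters(filters, prune_ratio):
    """Return the layer with the selected filters dropped entirely."""
    drop = set(select_filters_to_prune(filters, prune_ratio))
    return [f for i, f in enumerate(filters) if i not in drop]


# Hypothetical layer: four filters with three weights each.
layer = [
    [0.5, -0.5, 0.5],
    [0.01, 0.02, -0.01],
    [1.0, 2.0, -1.0],
    [0.0, 0.1, 0.0],
]
pruned_indices = select_filters_to_prune(layer, 0.5)
kept = prune_filters(layer, 0.5)
```

Unlike weight-level sparsity, dropping whole filters shrinks the layer's output channels, so the following layer's input channels would need to be trimmed to match in a real network.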