SUMMARY
The main computational argument for quantization is that quantized weights and activations occupy far less memory, at the cost of some model performance. Most prior works evaluate the proposed DL model with all layers quantized to the same bit-width. As a result, there are no methodologies or guidelines for designing quantized DL models that let the designer trade off the two main optimization objectives: the model's performance (e.g., accuracy) against its complexity (e.g., model size). To the best of the authors' knowledge, this is the first work that proposes a methodology . . .
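The memory-versus-accuracy trade-off described above can be illustrated with a minimal sketch of uniform symmetric quantization. This is a generic illustration, not the paper's method: the function names (`quantize`, `dequantize`) and the int8 setting are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical illustration (not the paper's methodology): uniform
# symmetric quantization of a float32 weight tensor to int8.
def quantize(weights, bits=8):
    """Map float weights to signed integers using a per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits
    scale = np.max(np.abs(weights)) / qmax  # per-tensor scale factor
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from integers."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)

q, scale = quantize(w, bits=8)
w_hat = dequantize(q, scale)

# Memory shrinks 4x (float32 -> int8), at the cost of rounding error
# bounded by scale / 2 per weight.
print(w.nbytes, q.nbytes)  # → 4000 1000
print(float(np.max(np.abs(w - w_hat))))
```

The example shows the trade-off in its simplest form: a fixed 4x memory reduction with all weights quantized at the same bit-width, which is exactly the uniform setting the summary says most works restrict themselves to.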