This week I focused on paper reading and learning basic concepts.
First, I summarised the differences between QAT and PTQ.
Quantization-Aware Training (QAT) Core Methods
Simulated quantization during training, full-model fine-tuning, and quantization of both weights and activations (see the sketch below).
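As a quick reference, here is a minimal sketch of the "simulated (fake) quantization" step that QAT inserts into the forward pass, assuming PyTorch; `fake_quantize` and its per-tensor asymmetric scheme are my own illustration, not a specific library API.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Simulate integer quantization in the forward pass: map x onto the
    # integer grid, then dequantize back to float ("quantize-dequantize").
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-x.min() / scale) + qmin
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    x_dq = (q - zero_point) * scale
    # Straight-through estimator: the forward value is x_dq, but the
    # backward pass treats the rounding as identity so gradients flow.
    return x + (x_dq - x).detach()

# During QAT, weights (and often activations) pass through this op,
# so the network learns to tolerate the rounding error.
w = torch.randn(4, 4, requires_grad=True)
loss = fake_quantize(w).sum()
loss.backward()  # w.grad is all ones thanks to the straight-through trick
```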
Post-Training Quantization (PTQ) Core Methods
Quantization applied after training is complete, in either static form (ranges calibrated offline) or dynamic form (activations quantized at inference time); see the sketch below.
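For the dynamic variant, PyTorch offers a one-call conversion, so no retraining is involved; the toy `nn.Sequential` model below is just a stand-in for a real trained network.

```python
import torch
import torch.nn as nn

# A small model stands in for a real trained network.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Dynamic PTQ: Linear weights are converted to int8 ahead of time,
# activations are quantized on the fly at inference; no retraining.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # torch.Size([1, 10])
```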
Here is a table comparing the two.
| Feature | Quantization-Aware Training (QAT) | Post-Training Quantization (PTQ) |
|---|---|---|
| Training | Requires retraining with simulated quantization | No retraining needed; applied after training |
| Use Case | Best for high-precision tasks on resource-constrained devices | Fast deployment, suited for simpler quantization tasks |
| Accuracy Loss | Minimal, close to floating-point accuracy | Potentially higher accuracy loss |
| Efficiency Gains | High efficiency on low-precision hardware | Also boosts efficiency, but less optimal for some models |
| Complexity | Higher, due to simulated quantization during training | Simpler implementation |
Secondly, I reviewed the Transformer architecture; my notes are in GoodNotes.
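Since those notes live in GoodNotes, I am keeping the core equation here as well: a minimal PyTorch sketch of scaled dot-product attention, the heart of the Transformer; the function is my own illustration.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(1, 8, 16, 64)  # batch=1, 8 heads, seq 16, d_k=64
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```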
I always feel as though I am living on the high seas, under threat, yet with an immense happiness in my heart. (Camus)