
tools:Move Embed/MHA/RNN/LSTM/GRU weight scale generation to ncnn2table#6688

Open
Roundaboutt wants to merge 8 commits into Tencent:master from Roundaboutt:opt-quantize-int8
Conversation

@Roundaboutt

Description

This PR moves static weight scale generation for several non-convolution layers from ncnn2int8 to ncnn2table, following the same table-driven workflow already used by other quantized layers.

Changes

  • Add Embed and MultiHeadAttention weight scale generation to ncnn2table
  • Add RNN, LSTM, and GRU weight scale generation to ncnn2table
  • Update ncnn2int8 to read these scales from the calibration table instead of recomputing them locally
  • Make calibration dataset optional for models that only need static weight scales and do not require activation calibration
  • Keep SDPA unchanged, since it uses dynamic activation quantization in forward_int8
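
The static weight scales being moved are per-output-channel absmax scales. A minimal sketch of the idea (illustrative only, not the actual ncnn implementation; the 127.0 int8 range and per-row scale layout are assumptions):

```python
import numpy as np

def weight_scales_per_row(weight: np.ndarray) -> np.ndarray:
    """One scale per output row, mapping the row's absmax to the int8 range."""
    absmax = np.abs(weight).max(axis=1)
    absmax = np.where(absmax == 0, 1.0, absmax)  # guard all-zero rows
    return 127.0 / absmax

# Example: a 2x2 weight matrix with row absmax values 1.0 and 0.25.
w = np.array([[0.5, -1.0],
              [0.25, 0.1]], dtype=np.float32)
print(weight_scales_per_row(w))  # -> [127. 508.]
```

Generating these scales in ncnn2table means ncnn2int8 can simply look them up in the calibration table, the same way it already does for convolution weights.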

Test

Tested with minimal RNN, LSTM, GRU, and Embed-Attn networks:

Embed-Attn

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
Embed                    embed_0                  1 1 in0 1 0=8 1=16 3=128 18=2
MultiHeadAttention       attention_1              1 1 1 out0 0=8 1=2 2=64 3=8 4=8 6=5.000000e-01 18=2

precision analysis:

fp32 model : tiny_embed_attn.ncnn.param/.bin
int8 model : tiny_embed_attn_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00712827
  mean_abs = 0.00212720
  rmse     = 0.00247913
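
The max_abs / mean_abs / rmse figures reported here and below compare fp32 and int8 model outputs elementwise. They can be reproduced along these lines (a sketch; `fp32_out` and `int8_out` stand in for outputs stacked over the 100 random samples):

```python
import numpy as np

def precision_metrics(fp32_out: np.ndarray, int8_out: np.ndarray) -> dict:
    """Elementwise error statistics between fp32 and int8 model outputs."""
    diff = np.abs(fp32_out - int8_out)
    return {
        "max_abs": float(diff.max()),
        "mean_abs": float(diff.mean()),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
    }

fp32_out = np.array([0.0, 1.0, 2.0])
int8_out = np.array([0.1, 1.0, 1.8])
print(precision_metrics(fp32_out, int8_out))
```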

RNN

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
RNN                      rnn_1                    1 1 in0 1 0=8 1=64 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_rnn.ncnn.param/.bin
int8 model : tiny_rnn_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.04329279
  mean_abs = 0.00797669
  rmse     = 0.01239488

GRU

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
GRU                      gru_1                    1 1 in0 1 0=8 1=192 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_gru.ncnn.param/.bin
int8 model : tiny_gru_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00559735
  mean_abs = 0.00107971
  rmse     = 0.00136703

LSTM

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
LSTM                     lstm_1                   1 1 in0 1 0=8 1=256 3=8 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_lstm.ncnn.param/.bin
int8 model : tiny_lstm_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00386286
  mean_abs = 0.00055465
  rmse     = 0.00072828
