
tools:Move Embed/MHA/RNN/LSTM/GRU weight scale generation to ncnn2table#6688

Open
Roundaboutt wants to merge 8 commits into Tencent:master from Roundaboutt:opt-quantize-int8
Conversation

@Roundaboutt

Description

This PR moves static weight scale generation for several non-convolution layers from ncnn2int8 to ncnn2table, following the same table-driven workflow already used by other quantized layers.

Changes

  • Add Embed and MultiHeadAttention weight scale generation to ncnn2table
  • Add RNN, LSTM, and GRU weight scale generation to ncnn2table
  • Update ncnn2int8 to read these scales from the calibration table instead of recomputing them locally
  • Make calibration dataset optional for models that only need static weight scales and do not require activation calibration
  • Keep SDPA unchanged, since it uses dynamic activation quantization in forward_int8
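
The static weight scales being moved are per-output-channel absmax scales. A minimal sketch of the idea (illustrative only, not the actual ncnn implementation; the 127.0 int8 range and per-row scale layout are assumptions):

```python
import numpy as np

def weight_scales_per_row(weight: np.ndarray) -> np.ndarray:
    """One scale per output row, mapping the row's absmax to the int8 range."""
    absmax = np.abs(weight).max(axis=1)
    absmax = np.where(absmax == 0, 1.0, absmax)  # guard all-zero rows
    return 127.0 / absmax

# Example: a 2x2 weight matrix with row absmax values 1.0 and 0.25.
w = np.array([[0.5, -1.0],
              [0.25, 0.1]], dtype=np.float32)
print(weight_scales_per_row(w))  # -> [127. 508.]
```

Generating these scales in ncnn2table means ncnn2int8 can simply look them up in the calibration table, the same way it already does for convolution weights.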

Test

Tested with minimal RNN, LSTM, GRU, and Embed-Attn networks:

Embed-Attn

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
Embed                    embed_0                  1 1 in0 1 0=8 1=16 3=128 18=2
MultiHeadAttention       attention_1              1 1 1 out0 0=8 1=2 2=64 3=8 4=8 6=5.000000e-01 18=2

precision analysis:

fp32 model : tiny_embed_attn.ncnn.param/.bin
int8 model : tiny_embed_attn_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00712827
  mean_abs = 0.00212720
  rmse     = 0.00247913
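
The max_abs / mean_abs / rmse figures reported here and below compare fp32 and int8 model outputs elementwise. They can be reproduced along these lines (a sketch; `fp32_out` and `int8_out` stand in for outputs stacked over the 100 random samples):

```python
import numpy as np

def precision_metrics(fp32_out: np.ndarray, int8_out: np.ndarray) -> dict:
    """Elementwise error statistics between fp32 and int8 model outputs."""
    diff = np.abs(fp32_out - int8_out)
    return {
        "max_abs": float(diff.max()),
        "mean_abs": float(diff.mean()),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
    }

fp32_out = np.array([0.0, 1.0, 2.0])
int8_out = np.array([0.1, 1.0, 1.8])
print(precision_metrics(fp32_out, int8_out))
```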

RNN

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
RNN                      rnn_1                    1 1 in0 1 0=8 1=64 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_rnn.ncnn.param/.bin
int8 model : tiny_rnn_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.04329279
  mean_abs = 0.00797669
  rmse     = 0.01239488

GRU

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
GRU                      gru_1                    1 1 in0 1 0=8 1=192 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_gru.ncnn.param/.bin
int8 model : tiny_gru_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00559735
  mean_abs = 0.00107971
  rmse     = 0.00136703

LSTM

quantized param files:

7767517
3 3
Input                    in0                      0 1 in0
LSTM                     lstm_1                   1 1 in0 1 0=8 1=256 3=8 8=2
Gemm                     gemm_0                   1 1 1 out0 3=1 5=1 6=1 7=4 8=4 9=8 10=4 18=2

precision analysis:

fp32 model : tiny_lstm.ncnn.param/.bin
int8 model : tiny_lstm_int8.ncnn.param/.bin
samples    : 100
seq_len    : 4
input_size : 8
seed       : 0

overall metrics
  max_abs  = 0.00386286
  mean_abs = 0.00055465
  rmse     = 0.00072828
