1. Load the model and inspect the log

INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow: name = bert/embeddings/word_embeddings:0, shape = (21128, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/position_embeddings:0, shape = (512, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_1/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_2/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_3/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_4/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_5/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_6/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_7/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_8/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_9/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_10/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/bias:0, shape = (512,)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/bias:0, shape = (512,)
INFO:tensorflow: name = project/hidden/W:0, shape = (256, 128)
INFO:tensorflow: name = project/hidden/b:0, shape = (128,)
INFO:tensorflow: name = project/logits/W:0, shape = (128, 11)
INFO:tensorflow: name = project/logits/b:0, shape = (11,)
INFO:tensorflow: name = crf_loss/transitions:0, shape = (11, 11)

The log shows how the BERT-based model is put together:

(1) Initialize the three embedding tables: word_embeddings, token_type_embeddings and position_embeddings.
INFO:tensorflow: name = bert/embeddings/word_embeddings:0, shape = (21128, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/position_embeddings:0, shape = (512, 768), INIT_FROM_CKPT

(2) Pass through 12 consecutive encoder layers (the excerpt shows the LayerNorm applied to the embedding output, followed by most of layer_0):
INFO:tensorflow: name = bert/embeddings/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), INIT_FROM_CKPT

(3) Take the output of the last encoder layer (the log lists the pooler dense layer at this point):
INFO:tensorflow: name = bert/pooler/dense/kernel:0, shape = (768, 768), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/bias:0, shape = (768,), INIT_FROM_CKPT

(4) Feed it through a bidirectional LSTM to obtain the sequence representation.
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/bias:0, shape = (512,)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow: name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/bias:0, shape = (512,)

(5) Apply two fully connected layers, 256 -> 128 -> 11, which produce a score vector over the 11 label classes.
INFO:tensorflow: name = project/hidden/W:0, shape = (256, 128)
INFO:tensorflow: name = project/hidden/b:0, shape = (128,)
INFO:tensorflow: name = project/logits/W:0, shape = (128, 11)
INFO:tensorflow: name = project/logits/b:0, shape = (11,)

(6) Pass the scores through the CRF layer to obtain the final loss.
INFO:tensorflow: name = crf_loss/transitions:0, shape = (11, 11)
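
The shapes in steps (4) to (6) can be cross-checked against each other. The sketch below is my own bookkeeping, not code from the training script: it assumes the logged basic_lstm_cell follows the TF1 BasicLSTMCell layout (input and hidden state concatenated, four gates in one kernel), and LSTM_UNITS = 128 is a value inferred from the logged shapes rather than read from the config.

```python
# Shape bookkeeping for the layers stacked on top of BERT.
# LSTM_UNITS is inferred from the log, not read from the training config.
BERT_HIDDEN = 768
LSTM_UNITS = 128
NUM_LABELS = 11


def basic_lstm_kernel_shape(input_dim, num_units):
    """Kernel shape of a TF1-style BasicLSTMCell.

    The cell concatenates the input with the previous hidden state and
    computes all four gates in a single matrix multiplication.
    """
    return (input_dim + num_units, 4 * num_units)


lstm_kernel = basic_lstm_kernel_shape(BERT_HIDDEN, LSTM_UNITS)
# Forward and backward outputs are assumed to be concatenated.
bilstm_out = 2 * LSTM_UNITS

print(lstm_kernel)               # (896, 512), matches basic_lstm_cell/kernel
print((bilstm_out, 128))         # (256, 128), matches project/hidden/W
print((128, NUM_LABELS))         # (128, 11), matches project/logits/W
print((NUM_LABELS, NUM_LABELS))  # (11, 11), matches crf_loss/transitions
```

Under these assumptions, 896 = 768 + 128 and 512 = 4 × 128, and the 256-wide input of project/hidden/W is the two LSTM directions side by side.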
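
2. Count the parameters. For many variables the second dimension of the shape is empty (for example, shape = (768,)), so I used Excel to set those empty cells to 1. The formula is: =IF(H1="",1,H1)

3. Sum the last column (the product of each variable's dimensions): 103,220,612

But the official BERT parameter count is quoted as 110 million, so why is the number here so much smaller?
Answer: the official figure is for a vocabulary of 30522 tokens, whereas the Chinese BERT we normally use has a vocabulary of 21128. The word embedding table alone is smaller by (30522 - 21128) × 768 = 7,214,592 parameters. That explains it.

4. So how well does NER built on BERT actually perform?
Precision and recall on the test set: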
eval_f = 0.95210534
eval_precision = 0.960108
eval_recall = 0.944235
global_step = 195600
loss = 4.2875824
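
As a quick consistency check on the numbers above, the reported eval_f can be recomputed from precision and recall. This is a minimal sketch that assumes eval_f is the usual F1 score (the harmonic mean of precision and recall); the evaluation code itself is not shown in this post.

```python
# Values copied from the test-set evaluation output above.
precision = 0.960108
recall = 0.944235

# F1 as the harmonic mean of precision and recall (assumed definition of eval_f).
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 6))  # 0.952105
```

The result agrees with eval_f = 0.95210534 to the precision of the rounded inputs.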
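I was quite struck by this result. I had expected something around 80% and did not think BERT would be this strong. It made me reassess what the transformer is capable of.
On the training set, the scores are a flat 100%: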
eval_f = 1.0
eval_precision = 1.0
eval_recall = 1.0
global_step = 195600
loss = 0.00012448594
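
The spreadsheet count from steps 2 and 3 can also be reproduced in a few lines of Python. This is a sketch rather than the method used above: instead of parsing the log, it rebuilds the list of shapes from the pattern visible in it (the helper names bert_shapes and count_params are my own), and math.prod plays the role of the Excel trick, since the product of a one-element shape such as (768,) is just 768.

```python
from math import prod

HIDDEN, FFN, LAYERS = 768, 3072, 12


def bert_shapes(vocab_size):
    """Variable shapes of a BERT-base encoder, following the log above."""
    shapes = [
        (vocab_size, HIDDEN),  # word_embeddings
        (2, HIDDEN),           # token_type_embeddings
        (512, HIDDEN),         # position_embeddings
        (HIDDEN,), (HIDDEN,),  # embeddings LayerNorm beta and gamma
    ]
    per_layer = (
        [(HIDDEN, HIDDEN), (HIDDEN,)] * 4  # query, key, value, attention output
        + [(HIDDEN,), (HIDDEN,)]           # attention output LayerNorm
        + [(HIDDEN, FFN), (FFN,)]          # intermediate dense
        + [(FFN, HIDDEN), (HIDDEN,)]       # output dense
        + [(HIDDEN,), (HIDDEN,)]           # output LayerNorm
    )
    shapes += per_layer * LAYERS
    shapes += [(HIDDEN, HIDDEN), (HIDDEN,)]  # pooler
    return shapes


# BiLSTM, projection and CRF variables, copied from the end of the log.
HEAD_SHAPES = [
    (896, 512), (512,), (896, 512), (512,),
    (256, 128), (128,), (128, 11), (11,),
    (11, 11),
]


def count_params(shapes):
    return sum(prod(shape) for shape in shapes)


bert_zh = count_params(bert_shapes(21128))
total = bert_zh + count_params(HEAD_SHAPES)
bert_en = count_params(bert_shapes(30522))

print(f"{bert_zh:,}")  # 102,267,648  BERT with the 21128-token vocabulary
print(f"{total:,}")    # 103,220,612  including BiLSTM, projection and CRF
print(f"{bert_en:,}")  # 109,482,240  same encoder with a 30522-token vocabulary
```

The second figure matches the Excel sum, and the third is where the rounded "110 million" comes from: the only difference from the first is the size of the word embedding table.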