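Articles in this series
Exploring Named Entity Recognition (NER) (1) https://duanzhihua.blog.csdn.net/article/details/108338970
Exploring Named Entity Recognition (NER) (2) https://duanzhihua.blog.csdn.net/article/details/108391645
Exploring Named Entity Recognition (NER) (3): Bi-LSTM+CRF model https://duanzhihua.blog.csdn.net/article/details/108392532
Exploring Named Entity Recognition (NER) (4): implementing HMM and CRF models with scikit-learn and PyTorch https://duanzhihua.blog.csdn.net/article/details/108650903
The Viterbi algorithm in practice (weather changes, part-of-speech prediction) https://duanzhihua.blog.csdn.net/article/details/104992597
About this article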
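This article explores recognizing key-information entities in Chinese text with a BERT+Bi-LSTM+CRF model.
- Use the pretrained BERT model to obtain a vector representation for each token
- Feed those vectors into a BiLSTM to learn the contextual relationships within the text
- Use a CRF layer to obtain the predicted tag for each token
BERT+BiLSTM+CRF model diagram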
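To make the last step concrete: a CRF layer scores a whole tag sequence as the sum of per-token emission scores and tag-to-tag transition scores, and prediction usually picks the highest-scoring sequence with the Viterbi algorithm. The code later in this post only shows the training loss, so the following is a minimal pure-Python sketch of the decoding idea, not the model's own implementation. The function name `viterbi_decode`, the three-tag set, and all scores are made up for illustration.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for one sequence.

    emissions:   per-token score lists, shape [num_steps][num_tags]
    transitions: transitions[i][j] is the score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    # scores[t] = best score of any path ending in tag t at the current step
    scores = list(emissions[0])
    backpointers = []
    for step_emissions in emissions[1:]:
        new_scores, pointers = [], []
        for cur in range(num_tags):
            # Best previous tag for reaching `cur` at this step
            prev = max(range(num_tags), key=lambda p: scores[p] + transitions[p][cur])
            pointers.append(prev)
            new_scores.append(scores[prev] + transitions[prev][cur] + step_emissions[cur])
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final tag
    last = max(range(num_tags), key=lambda t: scores[t])
    best_score = scores[last]
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path, best_score


# Hypothetical tag set and scores, for illustration only
TAGS = ["M", "Q-B", "Q-E"]
FORBIDDEN = -1000.0
transitions = [
    # to:  M          Q-B        Q-E
    [0.0, 0.0, FORBIDDEN],        # from M: an entity cannot end before it begins
    [FORBIDDEN, FORBIDDEN, 0.0],  # from Q-B: must be followed by Q-E
    [0.0, 0.0, FORBIDDEN],        # from Q-E
]
emissions = [
    [2.0, 0.5, 0.1],  # token 0 looks like M
    [0.2, 1.0, 0.3],  # token 1 looks like Q-B
    [1.2, 0.1, 1.0],  # token 2 on its own slightly prefers M
]
path, score = viterbi_decode(emissions, transitions)
decoded = [TAGS[i] for i in path]
print(decoded, score)
```

Picking the best tag per token independently would give M, Q-B, M here, which leaves an entity unclosed. The transition scores are what let the CRF rule that out and prefer a well-formed sequence.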
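Dataset
The dataset consists of internal call records from a customer service hotline. The speech recorded during calls answered by service agents is automatically transcribed to text, and specific address information is then extracted from that text. A record looks like this: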
,工号8888,为您服务,,唉,您好,我想咨询一下,就是这种呃建筑工地深跟半夜还在施工,噪音这种,呃有什么规定和要求吗?能,呃有什么方式能让他反映给
他们,处理嘛,这种,对对,,对的,对的对的,,地址是在普陀区东兴路这里,东,爱心叫新旧的新,,唉,东兴路呃,88弄,,唉,,呃,新湖明珠,,新旧的新,河,
湖水的湖,,唉,明白的明唉,明珠的朱,,对对对,对,他现在是有一半,我们是住在对面嘛,他现在那边还正在,,呃就是还在施工,就是现在这会还在施工,唉,
,早上有早上大概8888点钟就有了,,呃,中坚,,唉,晚上晚上现在到88点多还没停对的,,嗯嗯,那声音比较吵那个,,嗯嗯,,能够对对对对对因为或者是至少
有个,您好,请问什么帮您,?呃您好女士这边您主要是反映,嗯就是说是建筑工地施工噪音扰民对吧?嗯,那么我想问一下他这个施工的时间段具体呃就地址是哪
里,,普陀区东兴路是东西南北的东,呃,新是哪个新啊,?新旧的新噢,东兴路哪里呢,?是88弄的啊,呃,叫什么名字呢他,新湖名松筠就地深,,湖水的湖呃,,
明珠,到上面做民族,,呃这个是小区是吧,它新建的这个工地的,,现在那么他这个是,嗯嗯是施工时间大概是从早上有来,早上有时候吗?还是,早上这边嗯,,
五六点钟,嗯,然后呢一直到设备嗯,,到现在还没有听到吧,,噢,好的好的,噢,那知道了,我帮您反映一下,那就是来电,你的诉求是希望管理部门,能够吃制
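The tag set builds on the standard BIOES scheme (B = beginning of an entity, I = inside an entity, E = end of an entity, O = not an entity, S = single-token entity) and uses in-house address entity tags. For example, 'Q-B', 'Q-I', and 'Q-E' mark the first, middle, and last characters of a district (区). In the sample below, Z tags a subdistrict (街道), L tags a road (路), and non-entity characters carry the tag M.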
浦 Q-B
东 Q-I
新 Q-I
区 Q-E
.....
, M
, M
就 M
是 M
说 M
呃 M
徐 Q-B
汇 Q-I
区 Q-E
康 Z-B
健 Z-I
街 Z-I
道 Z-E
桂 L-B
平 L-I
路 L-E
, M
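A tag sequence like the one above can be turned back into entity spans with a short loop. This is a minimal sketch, assuming every entity opens with an `X-B` tag and closes with an `X-E` tag of the same type; the function name `extract_entities` and the output keys are illustrative choices that mirror the result format shown later in this post, except that `type` holds the bare entity prefix (for example `Q`) rather than the end tag.

```python
def extract_entities(chars, tags):
    """Collect entity spans from per-character tags such as Q-B / Q-I / Q-E.

    Assumes each entity starts with X-B and ends with X-E of the same type X.
    Any other tag (for example the non-entity tag M) discards an open span.
    """
    entities = []
    start, current_type = None, None
    for i, tag in enumerate(tags):
        entity_type, _, position = tag.partition("-")
        if position == "B":
            # Open a new span
            start, current_type = i, entity_type
        elif position == "I" and entity_type == current_type:
            # Still inside the open span
            continue
        elif position == "E" and entity_type == current_type:
            entities.append({
                "word": "".join(chars[start:i + 1]),
                "start": start,
                "end": i + 1,  # exclusive end index
                "type": entity_type,
            })
            start, current_type = None, None
        else:
            # Non-entity tag or inconsistent sequence: drop any open span
            start, current_type = None, None
    return entities


chars = list("徐汇区康健街道桂平路,")
tags = ["Q-B", "Q-I", "Q-E", "Z-B", "Z-I", "Z-I", "Z-E", "L-B", "L-I", "L-E", "M"]
entities = extract_entities(chars, tags)
for entity in entities:
    print(entity)
```

On the sample this yields three spans: the district 徐汇区, the subdistrict 康健街道, and the road 桂平路.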
BERT+BiLSTM+CRF model code
The model is an improved version of a BERT+BiLSTM+CRF baseline published on GitHub. The key code is as follows:
import tensorflow as tf
from tf_utils.bert_modeling import BertModel, BertConfig, get_assignment_map_from_checkpoint
from tensorflow.contrib.crf import crf_log_likelihood
from tensorflow.contrib.layers.python.layers import initializers
from tf_utils import rnncell as rnn
class Model:
    def __init__(self, config):
        self.config = config
        # Input placeholders
        self.input_x_word = tf.placeholder(tf.int32, [None, None], name="input_x_word")
        self.input_x_len = tf.placeholder(tf.int32, name='input_x_len')
        self.input_mask = tf.placeholder(tf.int32, [None, None], name='input_mask')
        self.input_relation = tf.placeholder(tf.int32, [None, None], name='input_relation')  # gold NER tags
        self.keep_prob = tf.placeholder(tf.float32, name='dropout_keep_prob')
        self.is_training = tf.placeholder(tf.bool, None, name='is_training')
        # BERT embedding
        self.init_embedding(bert_init=True)
        output_layer = self.word_embedding
        # Hyperparameters
        self.relation_num = self.config.relation_num
        self.initializer = initializers.xavier_initializer()
        self.lstm_dim = self.config.lstm_dim
        self.embed_dense_dim = self.config.embed_dense_dim
        self.dropout = self.config.dropout
        self.model_type = self.config.model_type
        print('Run Model Type:', self.model_type)
        # IDCNN hyperparameters
        self.layers = [
            {
                'dilation': 1
            },
            {
                'dilation': 1
            },
            {
                'dilation': 2
            },
        ]
        self.filter_width = 3
        self.num_filter = self.lstm_dim
        self.embedding_dim = self.embed_dense_dim
        self.repeat_times = 4
        self.cnn_output_width = 0
        # CRF inputs: real sequence lengths (count of non-zero token ids) and batch shape
        used = tf.sign(tf.abs(self.input_x_word))
        length = tf.reduce_sum(used, reduction_indices=1)
        self.lengths = tf.cast(length, tf.int32)
        self.batch_size = tf.shape(self.input_x_word)[0]
        self.num_steps = tf.shape(self.input_x_word)[-1]
        if self.model_type == 'bilstm':
            lstm_inputs = tf.nn.dropout(output_layer, self.dropout)
            lstm_outputs = self.biLSTM_layer(lstm_inputs, self.lstm_dim, self.lengths)
            self.logits = self.project_layer(lstm_outputs)
        elif self.model_type == 'idcnn':
            model_inputs = tf.nn.dropout(output_layer, self.dropout)
            model_outputs = self.IDCNN_layer(model_inputs)
            self.logits = self.project_layer_idcnn(model_outputs)
        else:
            raise KeyError
        # Compute the loss
        self.loss = self.loss_layer(self.logits, self.lengths)
    def biLSTM_layer(self, lstm_inputs, lstm_dim, lengths, name=None):
        """
        :param lstm_inputs: [batch_size, num_steps, emb_size]
        :return: [batch_size, num_steps, 2*lstm_dim]
        """
        with tf.name_scope("char_BiLSTM" if not name else name):
            lstm_cell = {}
            for direction in ["forward", "backward"]:
                with tf.name_scope(direction):
                    lstm_cell[direction] = rnn.CoupledInputForgetGateLSTMCell(
                        lstm_dim,
                        use_peepholes=True,
                        initializer=self.initializer,
                        state_is_tuple=True)
            outputs, final_states = tf.nn.bidirectional_dynamic_rnn(
                lstm_cell["forward"],
                lstm_cell["backward"],
                lstm_inputs,
                dtype=tf.float32,
                sequence_length=lengths)
        return tf.concat(outputs, axis=2)
    def project_layer(self, lstm_outputs, name=None):
        """
        hidden layer between lstm layer and logits
        :param lstm_outputs: [batch_size, num_steps, 2*lstm_dim]
        :return: [batch_size, num_steps, num_tags]
        """
        with tf.name_scope("project" if not name else name):
            with tf.name_scope("hidden"):
                W = tf.get_variable("HW", shape=[self.lstm_dim * 2, self.lstm_dim],
                                    dtype=tf.float32, initializer=self.initializer)
                b = tf.get_variable("Hb", shape=[self.lstm_dim], dtype=tf.float32,
                                    initializer=tf.zeros_initializer())
                output = tf.reshape(lstm_outputs, shape=[-1, self.lstm_dim * 2])
                hidden = tf.tanh(tf.nn.xw_plus_b(output, W, b))
            # project to score of tags
            with tf.name_scope("logits"):
                W = tf.get_variable("LW", shape=[self.lstm_dim, self.relation_num],
                                    dtype=tf.float32, initializer=self.initializer)
                b = tf.get_variable("Lb", shape=[self.relation_num], dtype=tf.float32,
                                    initializer=tf.zeros_initializer())
                pred = tf.nn.xw_plus_b(hidden, W, b)
            return tf.reshape(pred, [-1, self.num_steps, self.relation_num], name='pred_logits')
    def IDCNN_layer(self, model_inputs, name=None):
        """
        :param model_inputs: [batch_size, num_steps, emb_size]
        :return: [batch_size * num_steps, cnn_output_width]
        """
        model_inputs = tf.expand_dims(model_inputs, 1)
        with tf.variable_scope("idcnn" if not name else name):
            shape = [1, self.filter_width, self.embedding_dim,
                     self.num_filter]
            print(shape)
            filter_weights = tf.get_variable(
                "idcnn_filter",
                shape=[1, self.filter_width, self.embedding_dim, self.num_filter],
                initializer=self.initializer
            )
            layerInput = tf.nn.conv2d(model_inputs,
                                      filter_weights,
                                      strides=[1, 1, 1, 1],
                                      padding="SAME",
                                      name="init_layer")
            finalOutFromLayers = []
            totalWidthForLastDim = 0
            for j in range(self.repeat_times):
                for i in range(len(self.layers)):
                    dilation = self.layers[i]['dilation']
                    isLast = True if i == (len(self.layers) - 1) else False
                    with tf.variable_scope("atrous-conv-layer-%d" % i,
                                           reuse=tf.AUTO_REUSE):
                        w = tf.get_variable(
                            "filterW",
                            shape=[1, self.filter_width, self.num_filter,
                                   self.num_filter],
                            initializer=tf.contrib.layers.xavier_initializer())
                        b = tf.get_variable("filterB", shape=[self.num_filter])
                        conv = tf.nn.atrous_conv2d(layerInput,
                                                   w,
                                                   rate=dilation,
                                                   padding="SAME")
                        conv = tf.nn.bias_add(conv, b)
                        conv = tf.nn.relu(conv)
                        if isLast:
                            finalOutFromLayers.append(conv)
                            totalWidthForLastDim += self.num_filter
                        layerInput = conv
            finalOut = tf.concat(axis=3, values=finalOutFromLayers)
            keepProb = tf.cond(self.is_training, lambda: 0.8, lambda: 1.0)
            # keepProb = 1.0 if reuse else 0.5
            finalOut = tf.nn.dropout(finalOut, keepProb)
            finalOut = tf.squeeze(finalOut, [1])
            finalOut = tf.reshape(finalOut, [-1, totalWidthForLastDim])
            self.cnn_output_width = totalWidthForLastDim
            return finalOut
    def project_layer_idcnn(self, idcnn_outputs, name=None):
        """
        :param idcnn_outputs: [batch_size * num_steps, cnn_output_width]
        :return: [batch_size, num_steps, num_tags]
        """
        with tf.name_scope("project" if not name else name):
            # project to score of tags
            with tf.name_scope("logits"):
                W = tf.get_variable("PLW", shape=[self.cnn_output_width, self.relation_num],
                                    dtype=tf.float32, initializer=self.initializer)
                b = tf.get_variable("PLb", initializer=tf.constant(0.001, shape=[self.relation_num]))
                pred = tf.nn.xw_plus_b(idcnn_outputs, W, b)
            return tf.reshape(pred, [-1, self.num_steps, self.relation_num], name='pred_logits')
    def loss_layer(self, project_logits, lengths, name=None):
        """
        Compute the CRF loss
        :param project_logits: [batch_size, num_steps, num_tags]
        :return: scalar loss
        """
        with tf.name_scope("crf_loss" if not name else name):
            small = -1000.0
            # pad logits for crf loss
            start_logits = tf.concat(
                [small * tf.ones(shape=[self.batch_size, 1, self.relation_num]), tf.zeros(shape=[self.batch_size, 1, 1])],
                axis=-1)
            pad_logits = tf.cast(small * tf.ones([self.batch_size, self.num_steps, 1]), tf.float32)
            logits = tf.concat([project_logits, pad_logits], axis=-1)
            logits = tf.concat([start_logits, logits], axis=1)
            targets = tf.concat(
                [tf.cast(self.relation_num * tf.ones([self.batch_size, 1]), tf.int32), self.input_relation], axis=-1)
            self.trans = tf.get_variable(
                name="transitions",
                shape=[self.relation_num + 1, self.relation_num + 1],  # 1
                # shape=[self.relation_num, self.relation_num],  # 1
                initializer=self.initializer)
            log_likelihood, self.trans = crf_log_likelihood(
                inputs=logits,
                tag_indices=targets,
                # tag_indices=self.input_relation,
                transition_params=self.trans,
                # sequence_lengths=lengths
                sequence_lengths=lengths + 1
            )  # + 1
            return tf.reduce_mean(-log_likelihood, name='loss')
    def init_embedding(self, bert_init=True):
        """
        Reduce the dimensionality of the BERT embedding
        :param bert_init:
        :return:
        """
        with tf.name_scope('embedding'):
            word_embedding = self.bert_embed(bert_init)
            print('self.embed_dense_dim:', self.config.embed_dense_dim)
            word_embedding = tf.layers.dense(word_embedding, self.config.embed_dense_dim, activation=tf.nn.relu)
            hidden_size = word_embedding.shape[-1].value
        self.word_embedding = word_embedding
        print(word_embedding.shape)
        self.output_layer_hidden_size = hidden_size
    def bert_embed(self, bert_init=True):
        """
        Load the TensorFlow BERT model
        :param bert_init:
        :return:
        """
        bert_config_file = self.config.bert_config_file
        bert_config = BertConfig.from_json_file(bert_config_file)
        # batch_size, max_seq_length = get_shape_list(self.input_x_word)
        # bert_mask = tf.pad(self.input_mask, [[0, 0], [2, 0]], constant_values=1)  # pad 2 columns on the left of the tensor
        model = BertModel(
            config=bert_config,
            is_training=self.is_training,  # fine-tuning
            input_ids=self.input_x_word,
            input_mask=self.input_mask,
            token_type_ids=None,
            use_one_hot_embeddings=False)
        layer_logits = []
        for i, layer in enumerate(model.all_encoder_layers):
            layer_logits.append(
                tf.layers.dense(
                    layer, 1,
                    kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                    name="layer_logit%d" % i
                )
            )
        layer_logits = tf.concat(layer_logits, axis=2)  # concatenate along the third dimension
        layer_dist = tf.nn.softmax(layer_logits)
        seq_out = tf.concat([tf.expand_dims(x, axis=2) for x in model.all_encoder_layers], axis=2)
        pooled_output = tf.matmul(tf.expand_dims(layer_dist, axis=2), seq_out)
        pooled_output = tf.squeeze(pooled_output, axis=2)
        pooled_layer = pooled_output
        # char_bert_outputs = pooled_layer[:, 1: max_seq_length - 1, :]  # [batch_size, seq_length, embedding_size]
        char_bert_outputs = pooled_layer
        if self.config.use_origin_bert:
            final_hidden_states = model.get_sequence_output()  # plain BERT output
            self.config.embed_dense_dim = 768
        else:
            final_hidden_states = char_bert_outputs  # multi-layer fused BERT output
            self.config.embed_dense_dim = 512
        tvars = tf.trainable_variables()
        init_checkpoint = self.config.bert_file  # './chinese_L-12_H-768_A-12/bert_model.ckpt'
        assignment_map, initialized_variable_names = get_assignment_map_from_checkpoint(tvars, init_checkpoint)
        if bert_init:
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
        tf.logging.info("**** Trainable Variables ****")
        for var in tvars:
            init_string = ""
            if var.name in initialized_variable_names:
                init_string = ", *INIT_FROM_CKPT*"
            print(" name = {}, shape = {}{}".format(var.name, var.shape, init_string))
        print('init bert from checkpoint: {}'.format(init_checkpoint))
        return final_hidden_states
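When `use_origin_bert` is false, `bert_embed` fuses all encoder layers instead of taking only the last one: each layer's hidden vector is projected to a single logit, a softmax over those logits gives per-token layer weights, and the output is the weighted sum of the layer vectors. The pure-Python sketch below reproduces that arithmetic for one token. The helper names, the toy vectors, and the `projections` weights are invented for illustration; in the model each projection is a learned `tf.layers.dense(layer, 1)`, whose bias is omitted here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_vectors, projections):
    """Weighted sum of per-layer hidden vectors for a single token.

    layer_vectors: one hidden vector per encoder layer
    projections:   one weight vector per layer, standing in for the
                   per-layer dense(layer, 1) projection (bias omitted)
    """
    # One scalar logit per layer
    logits = [sum(w * x for w, x in zip(weights, vec))
              for weights, vec in zip(projections, layer_vectors)]
    layer_weights = softmax(logits)
    hidden_size = len(layer_vectors[0])
    # Weighted sum over layers, computed per hidden dimension
    fused = [sum(layer_weights[k] * layer_vectors[k][d] for k in range(len(layer_vectors)))
             for d in range(hidden_size)]
    return layer_weights, fused


# Hypothetical toy input: 3 layers, hidden size 2
layer_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projections = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # equal logits give equal weights
layer_weights, fused = fuse_layers(layer_vectors, projections)
print(layer_weights, fused)
```

With all-zero projections every layer gets the same weight, so the fused vector is the plain average of the layers. Training moves the projections so that more useful layers receive larger weights.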
BERT model run log
nohup: ignoring input
WARNING:tensorflow:From /data/Test/12345-BERT-NER/optimization.py:155: The name tf.train.AdamOptimizer is deprecated. Pleas
e use tf.compat.v1.train.AdamOptimizer instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please u
se tf.io.gfile.GFile instead.
WARNING:tensorflow:From train_fine_tune.py:41: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto i
nstead.
WARNING:tensorflow:From train_fine_tune.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-10-21 20:56:51.457173: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this T
ensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-10-21 20:56:51.476659: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-10-21 20:56:51.482845: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4faeb90 initialized for platfor
m Host (this does not guarantee that XLA will be used). Devices:
2020-10-21 20:56:51.482889: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Ve
rsion
2020-10-21 20:56:51.486817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcuda.so.1
2020-10-21 20:56:51.629666: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4faad50 initialized for platfor
m CUDA (this does not guarantee that XLA will be used). Devices:
2020-10-21 20:56:51.629747: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P40, Compu
te Capability 6.1
2020-10-21 20:56:51.632698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:b6:00.0
2020-10-21 20:56:51.633321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudart.so.10.0
2020-10-21 20:56:51.637701: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcublas.so.10.0
2020-10-21 20:56:51.641439: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcufft.so.10.0
2020-10-21 20:56:51.641968: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcurand.so.10.0
2020-10-21 20:56:51.645511: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcusolver.so.10.0
2020-10-21 20:56:51.648178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcusparse.so.10.0
2020-10-21 20:56:51.654540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudnn.so.7
2020-10-21 20:56:51.657031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-10-21 20:56:51.657081: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudart.so.10.0
2020-10-21 20:56:51.658901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor wit
h strength 1 edge matrix:
2020-10-21 20:56:51.658922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2020-10-21 20:56:51.658936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
2020-10-21 20:56:51.661417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localh
ost/replica:0/task:0/device:GPU:0 with 21625 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:b6:00
.0, compute capability: 6.1)
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:13: The name tf.placeholder is deprecated. Please use tf.compat.
v1.placeholder instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:175: The name tf.variable_scope is deprecated.
Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:416: The name tf.get_variable is deprecated. Pl
ease use tf.compat.v1.get_variable instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:497: The name tf.assert_less_equal is deprecate
d. Please use tf.compat.v1.assert_less_equal instead.
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:364: calling dropout (from tensorflow.python.op
s.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:874: dense (from tensorflow.python.layers.core)
is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/layers/core.py:187: Layer.apply (from
tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:282: The name tf.erf is deprecated. Please use
tf.math.erf instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:288: The name tf.trainable_variables is deprecated. Please use t
f.compat.v1.trainable_variables instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:292: The name tf.train.init_from_checkpoint is deprecated. Pleas
e use tf.compat.v1.train.init_from_checkpoint instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:294: The name tf.logging.info is deprecated. Please use tf.compa
t.v1.logging.info instead.
#### /data/Test/12345-BERT-NER
GPU ID: 0
Model Type: bilstm
Fine Tune Learning Rate: 5e-05
Data dir: ./data/12345_entity_recog/clear_csv_data/
Pretrained Model Vocab: ./data/pretrained_model/BERT/vocab.txt
bilstm embedding 256
...
798
Get the train iter data and dev iter data........!
...
198
name = bert/embeddings/word_embeddings:0, shape = (21128, 768), *INIT_FROM_CKPT*
name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_0/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_1/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_2/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_3/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_4/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_5/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_6/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_7/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_8/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_9/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_10/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_10/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
name = bert/encoder/layer_10/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
name = bert/encoder/layer_10/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:93: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/rnn.py:464: dynamic_rnn (from ten
sorflow.python.ops.rnn) is deprecated and will be removed in a future version.
......
BERT+BiLSTM+CRF model results
Input text:
[CLS],工号8079,为您服务,,呃,现在想跟你们反映一下,就是淮海中路41568号,,呃,他们的房子啊,有人在改造,,呃,,,我看也沟通了,,刚才呢跟76589打电话说要跟我们79299联系,7489,那个电话打不通,那么跟你们反映一下,,请有关部门来来看一下,呃,这样做是不是和服务有关的规定,没有,A,,呃黄埔区的,,呃,淮海中路啊,,还在海上呢,还还是,,呃项怀忠的淮啊,三点水,,现在有人在条889847号的强辩在改造房屋在条房屋墙,这个,,唉,这个谁来管一下,,看一下,A,房管部门啊,,258236,,呃骸,,唉,55745打不通,,对,,唉,再跟你们反映一下,,呃,,唉,,好的,,呃,最好给我一个回复吧,,85164036,,没有家里电话,唉,,唉,我叫吴总件,,您好,请问什么可以帮您,,请问一下,这个事情之前有向我们15543电话反映过吗,?你刚才跟我说是淮海中路039弄8号对吗,?请问什么区,,淮海中路的写法是淮海战役的淮海,忠心的忠,对吗,?你要投诉它什么呢,?噢,就投诉他教会群众想了,,你刚才说你刚才说向哪里啊?894947反映啊,,87217是什么,房?管是59531,,呃,154297,然[SEP]
Address entities in the labelled text:
[{'word': '淮海中路', 'start': 30, 'end': 34, 'type': 'L-E'}, {'word': '黄埔区', 'start': 154, 'end': 157, 'type': 'Q-E'},{'word': '039弄8号', 'start': 391, 'end': 397, 'type': 'H-E'}]
Predicted address entities:
[{'word': '淮海中路', 'start': 30, 'end': 34, 'type': 'L-E'}, {'word': '41568号', 'start': 34, 'end': 40, 'type': 'H-E'}, {'word': '黄埔区', 'start': 154, 'end': 157, 'type': 'Q-E'}, {'word': '039弄8号', 'start': 391, 'end': 397, 'type': 'H-E'}]
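The post does not say how predictions were scored. One common choice is exact-match precision, recall, and F1 at the entity level, which the sketch below applies to the single example above. The function name `entity_prf` and the matching rule (an entity counts as correct only if `start`, `end`, and `type` all match) are assumptions for illustration.

```python
def entity_prf(gold, predicted):
    """Precision, recall and F1 over exact (start, end, type) matches."""
    gold_set = {(e["start"], e["end"], e["type"]) for e in gold}
    pred_set = {(e["start"], e["end"], e["type"]) for e in predicted}
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Labelled and predicted entities copied from the example above
gold = [
    {"word": "淮海中路", "start": 30, "end": 34, "type": "L-E"},
    {"word": "黄埔区", "start": 154, "end": 157, "type": "Q-E"},
    {"word": "039弄8号", "start": 391, "end": 397, "type": "H-E"},
]
predicted = gold[:1] + [{"word": "41568号", "start": 34, "end": 40, "type": "H-E"}] + gold[1:]
precision, recall, f1 = entity_prf(gold, predicted)
print(precision, recall, round(f1, 4))
```

On this example all three labelled entities are recovered and there is one extra prediction, 41568号, which directly follows 淮海中路 in the text. It may be a house number missed during labelling rather than a model error, though a single example cannot settle that.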
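Summary
This article briefly introduced the BERT+BiLSTM+CRF model and showed a case study that applies it to extracting address entities from customer service call transcripts.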
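The author has worked in big data and AI development and operations for more than ten years, has been on CSDN for 5 years, and has studied the Spark source code in depth. The author contributed to five books in the Spark+AI series edited by Wang Jialin. The two most recent, published by Tsinghua University Press, are 《Spark大数据商业实战三部曲:内核解密|商业案例|性能调优》 (second edition) and 《企业级AI技术内幕:深度学习框架开发+机器学习案例实战+Alluxio解密》. 《企业级AI技术内幕》 has three parts: development of the Pangu AI framework, machine learning case studies, and the Alluxio distributed memory management system. The second edition of the Spark book is built around data intelligence and covers kernel internals, business cases, performance tuning, and Spark+AI. The author has been blogging since 2015, with 1,059 original posts and 1.55 million views in total.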