命名实体识别NER探索(5) Bert+BiLSTM+CRF模型实战应用

系列文章目录

命名实体识别NER探索(1) https://duanzhihua.blog.csdn.net/article/details/108338970 命名实体识别NER探索(2) https://duanzhihua.blog.csdn.net/article/details/108391645 命名实体识别NER探索(3)-Bi-LSTM+CRF模型 https://duanzhihua.blog.csdn.net/article/details/108392532 命名实体识别NER探索(4) 通过scikit-learn、pytorch实现HMM 及CRF模型 https://duanzhihua.blog.csdn.net/article/details/108650903 Viterbi算法实战案例(天气变化、词性预测) https://duanzhihua.blog.csdn.net/article/details/104992597


本文内容

通过Bert+Bi-LSTM+CRF模型探索中文关键信息实体识别。

  • 使用BERT预训练模型,获取每一个标识的向量表示特征
  • 输入BiLSTM模型学习文本之间的关系
  • 通过CRF层获取每个标识的分类结果
    BERT+BiLSTM+CRF模型图
    在这里插入图片描述

数据集

数据集用的是客服热线的内部话单数据,将客服人员接听的语音数据自动翻译为文本数据,然后从文本数据中提取具体的地址信息。数据记录格式如下:

,工号8888,为您服务,,唉,您好,我想咨询一下,就是这种呃建筑工地深跟半夜还在施工,噪音这种,呃有什么规定和要求吗?能,呃有什么方式能让他反映给
他们,处理嘛,这种,对对,,对的,对的对的,,地址是在普陀区东兴路这里,东,爱心叫新旧的新,,唉,东兴路呃,88,,唉,,呃,新湖明珠,,新旧的新,河,
湖水的湖,,唉,明白的明唉,明珠的朱,,对对对,对,他现在是有一半,我们是住在对面嘛,他现在那边还正在,,呃就是还在施工,就是现在这会还在施工,,
,早上有早上大概8888点钟就有了,,呃,中坚,,唉,晚上晚上现在到88点多还没停对的,,嗯嗯,那声音比较吵那个,,嗯嗯,,能够对对对对对因为或者是至少
有个,您好,请问什么帮您,?呃您好女士这边您主要是反映,嗯就是说是建筑工地施工噪音扰民对吧?嗯,那么我想问一下他这个施工的时间段具体呃就地址是哪
里,,普陀区东兴路是东西南北的东,呃,新是哪个新啊,?新旧的新噢,东兴路哪里呢,?是88弄的啊,呃,叫什么名字呢他,新湖名松筠就地深,,湖水的湖呃,,
明珠,到上面做民族,,呃这个是小区是吧,它新建的这个工地的,,现在那么他这个是,嗯嗯是施工时间大概是从早上有来,早上有时候吗?还是,早上这边嗯,,
五六点钟,嗯,然后呢一直到设备嗯,,到现在还没有听到吧,,噢,好的好的,噢,那知道了,我帮您反映一下,那就是来电,你的诉求是希望管理部门,能够吃制

标注集在标准BIOES(B表示实体开头,E表示实体结尾,I表示在实体内部,O表示非实体)基础上,采用内部的地址实体标注。例如’Q-B’, ‘Q-I’, ‘Q-E’,分别表示区的开头,区的中间词,区的结尾

浦 Q-B
东 Q-I
新 Q-I
区 Q-E
.....
, M
, M
就 M
是 M
说 M
呃 M
徐 Q-B
汇 Q-I
区 Q-E
康 Z-B
健 Z-I
街 Z-I
道 Z-E
桂 L-B
平 L-I
路 L-E
, M

Bert+BiLSTM+CRF模型代码

基于github网上大佬的Bert+BiLSTM+CRF基线模型改进,关键代码如下:

import tensorflow as tf
from tf_utils.bert_modeling import BertModel, BertConfig, get_assignment_map_from_checkpoint
from tensorflow.contrib.crf import crf_log_likelihood
from tensorflow.contrib.layers.python.layers import initializers
from tf_utils import rnncell as rnn


class Model:

    def __init__(self, config):
        self.config = config
        # 模型的数据占位符
        self.input_x_word = tf.placeholder(tf.int32, [None, None], name="input_x_word")
        self.input_x_len = tf.placeholder(tf.int32, name='input_x_len')
        self.input_mask = tf.placeholder(tf.int32, [None, None], name='input_mask')
        self.input_relation = tf.placeholder(tf.int32, [None, None], name='input_relation')  # 实体NER的真实标签
        self.keep_prob = tf.placeholder(tf.float32, name='dropout_keep_prob')
        self.is_training = tf.placeholder(tf.bool, None, name='is_training')

        # BERT Embedding
        self.init_embedding(bert_init=True)
        output_layer = self.word_embedding

        # 超参数设置
        self.relation_num = self.config.relation_num
        self.initializer = initializers.xavier_initializer()
        self.lstm_dim = self.config.lstm_dim
        self.embed_dense_dim = self.config.embed_dense_dim
        self.dropout = self.config.dropout
        self.model_type = self.config.model_type
        print('Run Model Type:', self.model_type)

        # idcnn的超参数
        self.layers = [
            {
                'dilation': 1
            },
            {
                'dilation': 1
            },
            {
                'dilation': 2
            },
        ]
        self.filter_width = 3
        self.num_filter = self.lstm_dim
        self.embedding_dim = self.embed_dense_dim
        self.repeat_times = 4
        self.cnn_output_width = 0

        # CRF超参数
        used = tf.sign(tf.abs(self.input_x_word))
        length = tf.reduce_sum(used, reduction_indices=1)
        self.lengths = tf.cast(length, tf.int32)
        self.batch_size = tf.shape(self.input_x_word)[0]
        self.num_steps = tf.shape(self.input_x_word)[-1]
        if self.model_type == 'bilstm':
            lstm_inputs = tf.nn.dropout(output_layer, self.dropout)
            lstm_outputs = self.biLSTM_layer(lstm_inputs, self.lstm_dim, self.lengths)
            self.logits = self.project_layer(lstm_outputs)

        elif self.model_type == 'idcnn':
            model_inputs = tf.nn.dropout(output_layer, self.dropout)
            model_outputs = self.IDCNN_layer(model_inputs)
            self.logits = self.project_layer_idcnn(model_outputs)

        else:
            raise KeyError

        # 计算损失
        self.loss = self.loss_layer(self.logits, self.lengths)


    def biLSTM_layer(self, lstm_inputs, lstm_dim, lengths, name=None):
        """
        :param lstm_inputs: [batch_size, num_steps, emb_size]
        :return: [batch_size, num_steps, 2*lstm_dim]
        """
        with tf.name_scope("char_BiLSTM" if not name else name):
            lstm_cell = {}
            for direction in ["forward", "backward"]:
                with tf.name_scope(direction):
                    lstm_cell[direction] = rnn.CoupledInputForgetGateLSTMCell(
                        lstm_dim,
                        use_peepholes=True,
                        initializer=self.initializer,
                        state_is_tuple=True)
            outputs, final_states = tf.nn.bidirectional_dynamic_rnn(
                lstm_cell["forward"],
                lstm_cell["backward"],
                lstm_inputs,
                dtype=tf.float32,
                sequence_length=lengths)
        return tf.concat(outputs, axis=2)

    def project_layer(self, lstm_outputs, name=None):
        """
        hidden layer between lstm layer and logits
        :param lstm_outputs: [batch_size, num_steps, emb_size]
        :return: [batch_size, num_steps, num_tags]
        """
        with tf.name_scope("project" if not name else name):
            with tf.name_scope("hidden"):
                W = tf.get_variable("HW", shape=[self.lstm_dim * 2, self.lstm_dim],
                                    dtype=tf.float32, initializer=self.initializer)

                b = tf.get_variable("Hb", shape=[self.lstm_dim], dtype=tf.float32,
                                    initializer=tf.zeros_initializer())
                output = tf.reshape(lstm_outputs, shape=[-1, self.lstm_dim * 2])
                hidden = tf.tanh(tf.nn.xw_plus_b(output, W, b))

            # project to score of tags
            with tf.name_scope("logits"):
                W = tf.get_variable("LW", shape=[self.lstm_dim, self.relation_num],
                                    dtype=tf.float32, initializer=self.initializer)

                b = tf.get_variable("Lb", shape=[self.relation_num], dtype=tf.float32,
                                    initializer=tf.zeros_initializer())

                pred = tf.nn.xw_plus_b(hidden, W, b)

            return tf.reshape(pred, [-1, self.num_steps, self.relation_num], name='pred_logits')

    def IDCNN_layer(self, model_inputs, name=None):
        """
        :param idcnn_inputs: [batch_size, num_steps, emb_size]
        :return: [batch_size, num_steps, cnn_output_width]
        """
        model_inputs = tf.expand_dims(model_inputs, 1)
        with tf.variable_scope("idcnn" if not name else name):
            shape = [1, self.filter_width, self.embedding_dim,
                     self.num_filter]
            print(shape)
            filter_weights = tf.get_variable(
                "idcnn_filter",
                shape=[1, self.filter_width, self.embedding_dim, self.num_filter],
                initializer=self.initializer
            )

            layerInput = tf.nn.conv2d(model_inputs,
                                      filter_weights,
                                      strides=[1, 1, 1, 1],
                                      padding="SAME",
                                      name="init_layer")
            finalOutFromLayers = []
            totalWidthForLastDim = 0
            for j in range(self.repeat_times):
                for i in range(len(self.layers)):
                    dilation = self.layers[i]['dilation']
                    isLast = True if i == (len(self.layers) - 1) else False
                    with tf.variable_scope("atrous-conv-layer-%d" % i,
                                           reuse=tf.AUTO_REUSE):
                        w = tf.get_variable(
                            "filterW",
                            shape=[1, self.filter_width, self.num_filter,
                                   self.num_filter],
                            initializer=tf.contrib.layers.xavier_initializer())
                        b = tf.get_variable("filterB", shape=[self.num_filter])
                        conv = tf.nn.atrous_conv2d(layerInput,
                                                   w,
                                                   rate=dilation,
                                                   padding="SAME")
                        conv = tf.nn.bias_add(conv, b)
                        conv = tf.nn.relu(conv)
                        if isLast:
                            finalOutFromLayers.append(conv)
                            totalWidthForLastDim += self.num_filter
                        layerInput = conv
            finalOut = tf.concat(axis=3, values=finalOutFromLayers)
            keepProb = tf.cond(self.is_training, lambda: 0.8, lambda: 1.0)
            # keepProb = 1.0 if reuse else 0.5
            finalOut = tf.nn.dropout(finalOut, keepProb)

            finalOut = tf.squeeze(finalOut, [1])
            finalOut = tf.reshape(finalOut, [-1, totalWidthForLastDim])
            self.cnn_output_width = totalWidthForLastDim
            return finalOut

    def project_layer_idcnn(self, idcnn_outputs, name=None):
        """
        :param lstm_outputs: [batch_size, num_steps, emb_size]
        :return: [batch_size, num_steps, num_tags]
        """
        with tf.name_scope("project" if not name else name):
            # project to score of tags
            with tf.name_scope("logits"):
                W = tf.get_variable("PLW", shape=[self.cnn_output_width, self.relation_num],
                                    dtype=tf.float32, initializer=self.initializer)

                b = tf.get_variable("PLb", initializer=tf.constant(0.001, shape=[self.relation_num]))

                pred = tf.nn.xw_plus_b(idcnn_outputs, W, b)

            return tf.reshape(pred, [-1, self.num_steps, self.relation_num], name='pred_logits')

    def loss_layer(self, project_logits, lengths, name=None):
        """
        计算CRF的loss
        :param project_logits: [1, num_steps, num_tags]
        :return: scalar loss
        """
        with tf.name_scope("crf_loss" if not name else name):
            small = -1000.0
            # pad logits for crf loss
            start_logits = tf.concat(
                [small * tf.ones(shape=[self.batch_size, 1, self.relation_num]), tf.zeros(shape=[self.batch_size, 1, 1])],
                axis=-1)
            pad_logits = tf.cast(small * tf.ones([self.batch_size, self.num_steps, 1]), tf.float32)
            logits = tf.concat([project_logits, pad_logits], axis=-1)
            logits = tf.concat([start_logits, logits], axis=1)
            targets = tf.concat(
                [tf.cast(self.relation_num * tf.ones([self.batch_size, 1]), tf.int32), self.input_relation], axis=-1)

            self.trans = tf.get_variable(
                name="transitions",
                shape=[self.relation_num + 1, self.relation_num + 1],  # 1
                # shape=[self.relation_num, self.relation_num],  # 1
                initializer=self.initializer)
            log_likelihood, self.trans = crf_log_likelihood(
                inputs=logits,
                tag_indices=targets,
                # tag_indices=self.input_relation,
                transition_params=self.trans,
                # sequence_lengths=lengths
                sequence_lengths=lengths + 1
            )  # + 1
            return tf.reduce_mean(-log_likelihood, name='loss')

    def init_embedding(self, bert_init=True):
        """
        对BERT的Embedding降维
        :param bert_init:
        :return:
        """
        with tf.name_scope('embedding'):
            word_embedding = self.bert_embed(bert_init)
            print('self.embed_dense_dim:', self.config.embed_dense_dim)
            word_embedding = tf.layers.dense(word_embedding, self.config.embed_dense_dim, activation=tf.nn.relu)
            hidden_size = word_embedding.shape[-1].value
        self.word_embedding = word_embedding
        print(word_embedding.shape)
        self.output_layer_hidden_size = hidden_size

    def bert_embed(self, bert_init=True):
        """
        读取BERT的TF模型
        :param bert_init:
        :return:
        """
        bert_config_file = self.config.bert_config_file
        bert_config = BertConfig.from_json_file(bert_config_file)
        # batch_size, max_seq_length = get_shape_list(self.input_x_word)
        # bert_mask = tf.pad(self.input_mask, [[0, 0], [2, 0]], constant_values=1)  # tensor左边填充2列
        model = BertModel(
            config=bert_config,
            is_training=self.is_training,  # 微调
            input_ids=self.input_x_word,
            input_mask=self.input_mask,
            token_type_ids=None,
            use_one_hot_embeddings=False)

        layer_logits = []
        for i, layer in enumerate(model.all_encoder_layers):
            layer_logits.append(
                tf.layers.dense(
                    layer, 1,
                    kernel_initializer=tf.truncated_normal_initializer(stddev=0.02),
                    name="layer_logit%d" % i
                )
            )

        layer_logits = tf.concat(layer_logits, axis=2)  # 第三维度拼接
        layer_dist = tf.nn.softmax(layer_logits)
        seq_out = tf.concat([tf.expand_dims(x, axis=2) for x in model.all_encoder_layers], axis=2)
        pooled_output = tf.matmul(tf.expand_dims(layer_dist, axis=2), seq_out)
        pooled_output = tf.squeeze(pooled_output, axis=2)
        pooled_layer = pooled_output
        # char_bert_outputs = pooled_laRERyer[:, 1: max_seq_length - 1, :]  # [batch_size, seq_length, embedding_size]
        char_bert_outputs = pooled_layer

        if self.config.use_origin_bert:
            final_hidden_states = model.get_sequence_output()  # 原生bert
            self.config.embed_dense_dim = 768
        else:
            final_hidden_states = char_bert_outputs  # 多层融合bert
            self.config.embed_dense_dim = 512

        tvars = tf.trainable_variables()
        init_checkpoint = self.config.bert_file  # './chinese_L-12_H-768_A-12/bert_model.ckpt'
        assignment_map, initialized_variable_names = get_assignment_map_from_checkpoint(tvars, init_checkpoint)
        if bert_init:
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

        tf.logging.info("**** Trainable Variables ****")
        for var in tvars:
            init_string = ""
            if var.name in initialized_variable_names:
                init_string = ", *INIT_FROM_CKPT*"
            print("  name = {}, shape = {}{}".format(var.name, var.shape, init_string))
        print('init bert from checkpoint: {}'.format(init_checkpoint))
        return final_hidden_states

BERT模型运行日志

nohup: ignoring input
WARNING:tensorflow:From /data/Test/12345-BERT-NER/optimization.py:155: The name tf.train.AdamOptimizer is deprecated. Pleas
e use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please u
se tf.io.gfile.GFile instead.

WARNING:tensorflow:From train_fine_tune.py:41: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto i
nstead.

WARNING:tensorflow:From train_fine_tune.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-10-21 20:56:51.457173: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this T
ensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-10-21 20:56:51.476659: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-10-21 20:56:51.482845: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4faeb90 initialized for platfor
m Host (this does not guarantee that XLA will be used). Devices:
2020-10-21 20:56:51.482889: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Ve
rsion
2020-10-21 20:56:51.486817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcuda.so.1
2020-10-21 20:56:51.629666: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4faad50 initialized for platfor
m CUDA (this does not guarantee that XLA will be used). Devices:
2020-10-21 20:56:51.629747: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla P40, Compu
te Capability 6.1
2020-10-21 20:56:51.632698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:b6:00.0
2020-10-21 20:56:51.633321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudart.so.10.0
2020-10-21 20:56:51.637701: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcublas.so.10.0
2020-10-21 20:56:51.641439: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcufft.so.10.0
2020-10-21 20:56:51.641968: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcurand.so.10.0
2020-10-21 20:56:51.645511: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcusolver.so.10.0
2020-10-21 20:56:51.648178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcusparse.so.10.0
2020-10-21 20:56:51.654540: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudnn.so.7
2020-10-21 20:56:51.657031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-10-21 20:56:51.657081: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic lib
rary libcudart.so.10.0
2020-10-21 20:56:51.658901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor wit
h strength 1 edge matrix:
2020-10-21 20:56:51.658922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0
2020-10-21 20:56:51.658936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N
2020-10-21 20:56:51.661417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localh
ost/replica:0/task:0/device:GPU:0 with 21625 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:b6:00
.0, compute capability: 6.1)
WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:13: The name tf.placeholder is deprecated. Please use tf.compat.
v1.placeholder instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:175: The name tf.variable_scope is deprecated.
Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:416: The name tf.get_variable is deprecated. Pl
ease use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:497: The name tf.assert_less_equal is deprecate
d. Please use tf.compat.v1.assert_less_equal instead.

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:364: calling dropout (from tensorflow.python.op
s.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:874: dense (from tensorflow.python.layers.core)
 is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/layers/core.py:187: Layer.apply (from
 tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /data/Test/12345-BERT-NER/tf_utils/bert_modeling.py:282: The name tf.erf is deprecated. Please use
tf.math.erf instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:288: The name tf.trainable_variables is deprecated. Please use t
f.compat.v1.trainable_variables instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:292: The name tf.train.init_from_checkpoint is deprecated. Pleas
e use tf.compat.v1.train.init_from_checkpoint instead.

WARNING:tensorflow:From /data/Test/12345-BERT-NER/model.py:294: The name tf.logging.info is deprecated. Please use tf.compa
t.v1.logging.info instead.

#### /data/Test/12345-BERT-NER
GPU ID:  0
Model Type:  bilstm
Fine Tune Learning Rate:  5e-05
Data dir:  ./data/12345_entity_recog/clear_csv_data/
Pretrained Model Vocab:  ./data/pretrained_model/BERT/vocab.txt
bilstm embedding  256
...
798
Get the train iter data and dev iter data........!
...
198
  name = bert/embeddings/word_embeddings:0, shape = (21128, 768), *INIT_FROM_CKPT*
  name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
  name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
  name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_0/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_1/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_2/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_3/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_4/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_5/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_6/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_7/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_8/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_9/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_10/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_10/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
  name = bert/encoder/layer_10/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
  name = bert/encoder/layer_10/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*WARNING:tensorflow:From /data/Tes
t/12345-BERT-NER/model.py:93: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed
in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/rnn.py:464: dynamic_rnn (from ten
sorflow.python.ops.rnn) is deprecated and will be removed in a future version.
......

Bert+BiLSTM+CRF模型运行结果

文本信息:
[CLS],工号8079,为您服务,,呃,现在想跟你们反映一下,就是淮海中路41568,,呃,他们的房子啊,有人在改造,,呃,,,我看也沟通了,,刚才呢跟76589打电话说要跟我们79299联系,7489,那个电话打不通,那么跟你们反映一下,,请有关部门来来看一下,呃,这样做是不是和服务有关的规定,没有,A,,呃黄埔区的,,呃,淮海中路啊,,还在海上呢,还还是,,呃项怀忠的淮啊,三点水,,现在有人在条889847号的强辩在改造房屋在条房屋墙,这个,,唉,这个谁来管一下,,看一下,A,房管部门啊,258236,,呃骸,,唉,55745打不通,,对,,唉,再跟你们反映一下,,呃,,唉,,好的,,呃,最好给我一个回复吧,85164036,,没有家里电话,,,唉,我叫吴总件,,您好,请问什么可以帮您,,请问一下,这个事情之前有向我们15543电话反映过吗,?你刚才跟我说是淮海中路0398号对吗,?请问什么区,,淮海中路的写法是淮海战役的淮海,忠心的忠,对吗,?你要投诉它什么呢,?噢,就投诉他教会群众想了,,你刚才说你刚才说向哪里啊?894947反映啊,87217是什么,房?管是59531,,呃,154297,然[SEP]
打标文本的地址信息:
[{'word': '淮海中路', 'start': 30, 'end': 34, 'type': 'L-E'}, {'word': '黄埔区', 'start': 154, 'end': 157, 'type': 'Q-E'},{'word': '039弄8号', 'start': 391, 'end': 397, 'type': 'H-E'}]
预测的地址信息:
[{'word': '淮海中路', 'start': 30, 'end': 34, 'type': 'L-E'}, {'word': '41568号', 'start': 34, 'end': 40, 'type': 'H-E'}, {'word': '黄埔区', 'start': 154, 'end': 157, 'type': 'Q-E'}, {'word': '039弄8号', 'start': 391, 'end': 397, 'type': 'H-E'}]

总结

本文简单介绍了Bert+BiLSTM+CRF模型的概念,及Bert+BiLSTM+CRF模型的案例应用。

段智华 CSDN认证博客专家 Spark AI 企业级AI技术
本人从事大数据人工智能开发和运维工作十余年,码龄5年,深入研究Spark源码,参与王家林大咖主编出版Spark+AI系列图书5本,清华大学出版社最新出版2本新书《Spark大数据商业实战三部曲:内核解密|商业案例|性能调优》第二版、《企业级AI技术内幕:深度学习框架开发+机器学习案例实战+Alluxio解密》,《企业级AI技术内幕》新书分为盘古人工智能框架开发专题篇、机器学习案例实战篇、分布式内存管理系统Alluxio解密篇。Spark新书第二版以数据智能为灵魂,包括内核解密篇,商业案例篇,性能调优篇和Spark+AI解密篇。从2015年开始撰写博文,累计原创1059篇,博客阅读量达155万次
已标记关键词 清除标记
©️2020 CSDN 皮肤主题: 大白 设计师:CSDN官方博客 返回首页
实付 19.90元
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、C币套餐、付费专栏及课程。

余额充值