layers in the embeddings, encoder, and pooler.
丢弃概率(嵌入层、编码层、池化层)
attention_probs_dropout_prob: The dropout ratio for the attention
probabilities.
注意力概率的丢弃比例
max_position_embeddings: The maximum sequence length that this model might
ever be used with. Typically set this to something large just in case
(e.g., 512 or 1024 or 2048).
最大序列长度,一般设置大一些以防万一,例如可以设置为512,1024,2048
type_vocab_size: The vocabulary size of the `token_type_ids` passed into
`BertModel`.
token_type_ids的词汇量
initializer_range: The stdev of the truncated_normal_initializer for
initializing all weight matrices.
初始化权重参数的标准差
"""
self.vocab_size = vocab_size
self.hidden_size = hidden_size
self.num_hidden_layers = num_hidden_layers
self.num_attention_heads = num_attention_heads
self.hidden_act = hidden_act
self.intermediate_size = intermediate_size
self.hidden_dropout_prob = hidden_dropout_prob
self.attention_probs_dropout_prob = attention_probs_dropout_prob
self.max_position_embeddings = max_position_embeddings
self.type_vocab_size = type_vocab_size
self.initializer_range = initializer_range
@classmethod 类方法
def from_dict(cls, json_object):
"""Constructs a `BertConfig` from a Python dictionary of parameters."""
从一个参数字典构造配置参数
config = BertConfig(vocab_size=None)
for (key, value) in six.iteritems(json_object):
config.__dict__[key] = value
return config
@classmethod
def from_json_file(cls, json_file): 从JSON文件构造BertConfig对象
"""Constructs a `BertConfig` from a json file of parameters."""
从一个JSON文件构造配置参数
with tf.gfile.GFile(json_file, "r") as reader:
text = reader.read()
return cls.from_dict(json.loads(text))
def to_dict(self): 将BertConfig对象转换为字典
"""Serializes this instance to a Python dictionary."""
output = copy.deepcopy(self.__dict__)
return output
def to_json_string(self): 将BertConfig对象转换为JSON格式字符串
"""Serializes this instance to a JSON string."""
return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
BERT模型类
BertModel
class BertModel(object):
"""BERT model ("Bidirectional Encoder Representations from Transformers").
Transformers模型的双向编码表示
Example usage:示例
```python
已经转换为 词片 id形式
# Already been converted into WordPiece token ids
tf.constant用于创建常量张量
input_ids = tf.constant([[31, 51, 99], [15, 5, 0]])
input_mask = tf.constant([[1, 1, 1], [1, 1, 0]])
token_type_ids = tf.constant([[0, 0, 1], [0, 2, 0]])
创建配置参数对象config
config = modeling.BertConfig(vocab_size=32000, hidden_size=512,
num_hidden_layers=8, num_attention_heads=6, intermediate_size=1024)
创建模型对象model
model = modeling.BertModel(config=config, is_training=True,
input_ids=input_ids, input_mask=input_mask, token_type_ids=token_type_ids)
嵌入层标签、池化输出
label_embeddings = tf.get_variable(...)
pooled_output = model.get_pooled_output()
tf.matmul=matrix multiply矩阵相乘
logits = tf.matmul(pooled_output, label_embeddings)
...
```
"""
def __init__(self,
config,
is_training,
input_ids,
input_mask=None,
token_type_ids=None,
use_one_hot_embeddings=False,
scope=None):
"""Constructor for BertModel.
Args:
config: `BertConfig` instance.配置参数对象
is_training: bool. true for training model, false for eval model. Controls
经验总结扩展阅读
- 【lwip】11-UDP协议&源码分析
- 硬核剖析Java锁底层AQS源码,深入理解底层架构设计
- SpringCloudAlibaba 微服务组件 Nacos 之配置中心源码深度解析
- Seata 1.5.2 源码学习
- MindStudio模型训练场景精度比对全流程和结果分析
- .NET 源码学习 [数据结构-线性表1.2] 链表与 LinkedList<T>
- Redisson源码解读-公平锁
- OpenHarmony移植案例: build lite源码分析之hb命令__entry__.py
- 【深入浅出 Yarn 架构与实现】1-2 搭建 Hadoop 源码阅读环境
- JVM学习笔记——内存模型篇