BERT
BERT is built on the Transformer, an attention-based architecture that learns contextual relations and meaning between words in a text. The basic Transformer contains two separate mechanisms: an encoder that reads the text input, and a decoder that produces the output (prediction).
Directional models read text in a specific direction (left to right, or right to left). The Transformer encoder, by contrast, reads the entire sequence at once, so it can be described as non-directional (bidirectional). This property allows the model to learn the context of a word from its surrounding words in both directions.
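The contrast above can be sketched with a toy self-attention computation (a minimal illustration in NumPy, not BERT itself): in an encoder, every position attends over all positions, while a left-to-right model applies a causal mask that hides future tokens.

```python
import numpy as np

np.random.seed(0)
seq_len, dim = 4, 8
x = np.random.randn(seq_len, dim)        # toy token representations

scores = x @ x.T / np.sqrt(dim)          # raw attention scores

# Causal mask: position i may only attend to positions <= i.
causal = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

bidirectional = softmax(scores)          # encoder: full context, all weights > 0
left_to_right = softmax(scores + causal) # directional: future positions get weight 0

print(bidirectional[0].round(3))  # first token attends to all 4 positions
print(left_to_right[0].round(3))  # first token sees only itself: [1. 0. 0. 0.]
```

The first row of `left_to_right` puts all its weight on the first token, while the same row of `bidirectional` spreads weight across every position, which is the sense in which the encoder sees context "in any direction".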
Segment Embeddings (optional): BERT accepts sentence pairs as input for tasks such as question answering. It learns a distinct embedding for the first and the second sentence, which helps the model differentiate between them.
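How a sentence pair is packed into one input can be sketched as follows (a hypothetical helper, not the real BERT tokenizer): the two sentences are joined with special tokens, and each position receives a segment id, 0 for the first sentence and 1 for the second.

```python
# Hypothetical helper: pack a sentence pair the way BERT-style inputs
# are laid out, with [CLS]/[SEP] markers and per-token segment ids.
def pack_pair(tokens_a, tokens_b):
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment id 0 covers [CLS], sentence A and its [SEP]; id 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids

tokens, segment_ids = pack_pair(["how", "are", "you"], ["i", "am", "fine"])
print(tokens)       # ['[CLS]', 'how', 'are', 'you', '[SEP]', 'i', 'am', 'fine', '[SEP]']
print(segment_ids)  # [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

The segment ids are the same length as the token sequence, so the model can add a per-segment embedding to every token and learn which sentence each token belongs to.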
Token Embeddings: Token embeddings carry the content of the input text. Each unique token in the vocabulary is assigned an integer id, and that id is mapped to a learned embedding vector.