
Commit 0ad2b22

Update transformers_and_llms.ipynb
1 parent b4c43ea commit 0ad2b22

File tree

1 file changed (+11 -10 lines)


Module 9 - GenAI (LLMs and Prompt Engineering)/2. Intro to Transformers and LLMs/transformers_and_llms.ipynb

Lines changed: 11 additions & 10 deletions
@@ -195,20 +195,21 @@
 "2. Individual NLP tasks have traditionally been solved by individual models created for each specific task. That is, until— BERT!\n",
 "3. Tasks - BERT can solve 11+ NLP tasks such as sentiment analysis, named entity recognition, etc...\n",
 "4. Pretrained on: \n",
-" **a.** English Wikipedia - At the time 2.5 billion words \n",
-" **b.** Book Corpus - 800 million words \n",
-"5. BERT's tokenizer handles OOV tokens (out of vocabulary / previously unknown) by breaking them up into smaller chunks of known tokens.\n",
-"6. Trained on two language modeling specific tasks: \n",
+" **a.** English Wikipedia - At the time 2.5 Billion words \n",
+" **b.** Book Corpus - 800 Million words \n",
+"5. Training on a dataset this large takes a long time. BERT’s training was made possible thanks to the novel Transformer architecture and sped up by using TPUs (Tensor Processing Units - Google’s custom circuit built specifically for large ML models). ~64 TPUs trained BERT over the course of 4 days.\n",
+"6. BERT's tokenizer handles OOV tokens (out of vocabulary / previously unknown) by breaking them up into smaller chunks of known tokens.\n",
+"7. Trained on two language modeling specific tasks: \n",
 " **a.** **Masked Language Modeling (MLM) aka Autoencoding Task** - Helps BERT recognize token interaction within the sentence. \n",
 " **b.** **Next Sentence Prediction (NSP) Task** - Helps BERT to understand how tokens interact with each other between sentences. \n",
 "<img style=\"float: right;\" width=\"300\" height=\"300\" src=\"data/images/bert_language_model_task.jpeg\">\n",
-"7. BERT uses three layer of token embedding for a given piece of text: Token Embedding, Segment Embedding and Position Embedding.\n",
-"8. BERT uses the encoder of transformer and ignores the decoder to become exceedingly good at processing/understanding massive amounts of text very quickly relative to other slower LLMs that focus on generating text one token at a time.\n",
-"9. BERT itself doesn't classify text or summarize documents but it is often used as a pre-trained model for downstream NLP tasks. \n",
+"8. BERT uses three layer of token embedding for a given piece of text: Token Embedding, Segment Embedding and Position Embedding.\n",
+"9. BERT uses the encoder of transformer and ignores the decoder to become exceedingly good at processing/understanding massive amounts of text very quickly relative to other slower LLMs that focus on generating text one token at a time.\n",
+"10. BERT itself doesn't classify text or summarize documents but it is often used as a pre-trained model for downstream NLP tasks. \n",
 "<img style=\"float: right;\" width=\"300\" height=\"300\" src=\"data/images/bert_classification.jpeg\">\n",
-"10. 1 year later RoBERTa by Facebook AI shown to not require NSP task. It matched and even beat the original BERT model's performance in many areas.\n",
-"11. Reference: [Click here to read more](https://huggingface.co/blog/bert-101)\n",
-"12. BERT Implementation: [Click here to learn how to use BERT](https://colab.research.google.com/github/jalammar/jalammar.github.io/blob/master/notebooks/bert/A_Visual_Notebook_to_Using_BERT_for_the_First_Time.ipynb)\n",
+"11. 1 year later RoBERTa by Facebook AI shown to not require NSP task. It matched and even beat the original BERT model's performance in many areas.\n",
+"12. Reference: [Click here to read more](https://huggingface.co/blog/bert-101)\n",
+"13. BERT Implementation: [Click here to learn how to use BERT](https://colab.research.google.com/github/jalammar/jalammar.github.io/blob/master/notebooks/bert/A_Visual_Notebook_to_Using_BERT_for_the_First_Time.ipynb)\n",
 "\n",
 "#### **2. GPT (Generative Pre-Trained Transformer)**\n",
 "\n",
