A few days ago, OpenAI announced the successor to their Language Model (LM), GPT-3. It is the largest model trained so far, with 175 billion parameters.
The motivation for this work stems from the drawbacks of fine-tuning on task-specific datasets:
- Getting these datasets is difficult.
- Fine-tuning allows the model to exploit spurious correlations in the training data, which leads to poor out-of-distribution performance.
- A brief directive in natural language is usually enough for humans to understand a given task. This adaptability is a desirable property of NLP systems (a rough sketch of this prompting idea follows below).
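To make the last point concrete, here is a minimal sketch (not from the paper) of how a brief natural-language directive plus a few demonstrations can stand in for fine-tuning. Since GPT-3's weights are not public, the example uses GPT-2 through the Hugging Face `transformers` pipeline purely to illustrate the prompting format:

```python
from transformers import pipeline

# A brief task description followed by a few demonstrations ("few-shot"),
# given purely as text in the prompt -- no gradient updates and no
# task-specific training set involved.
prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivree\n"
    "cheese =>"
)

# GPT-3 is not publicly available; GPT-2 stands in here only to show the
# prompt structure, not the quality of the completions.
generator = pipeline("text-generation", model="gpt2")
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])
```

The point of the sketch is the interface, not the output: the "training signal" is delivered entirely through the prompt at inference time, which is the setting the GPT-3 paper studies at scale.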
While the paper describing GPT-3 has its merits, reading a large portion of its 72 pages can be tiresome. In this blog post, I'll highlight the parts I find interesting for people familiar with LMs who merely wish to know (most of) the important points of this work.
Read more below…