GPT-3, a giant step for Deep Learning and NLP

A few days ago, OpenAI announced GPT-3, the successor to their previous Language Model (LM), GPT-2. It is the largest model trained so far, with 175 billion parameters.

Instead of fine-tuning GPT-3 for each downstream task, the paper evaluates it in zero-, one- and few-shot settings, motivated by the drawbacks of fine-tuning with task-specific datasets:

  • Getting these datasets is difficult.
  • Fine-tuning allows the model to exploit spurious correlations in the training data, which leads to poor out-of-distribution performance.
  • A brief directive in natural language is usually enough for humans to understand a new task. This adaptability is a desired property of NLP systems as well (see the sketch after this list).
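
To make the contrast with fine-tuning concrete, here is a minimal sketch of the prompting pattern GPT-3 relies on, in the spirit of the paper's translation example: the task is stated in natural language together with a few demonstrations, and the model is simply asked to continue the text. The snippet assumes the Hugging Face transformers library and uses GPT-2 only as a small, publicly available stand-in, since GPT-3 itself cannot be downloaded.

```python
# A minimal sketch of in-context ("few-shot") prompting, assuming the
# Hugging Face transformers library. GPT-2 is used purely as a small,
# publicly available stand-in for GPT-3.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The task description and a few examples live entirely inside the prompt;
# no gradient updates and no task-specific dataset are involved.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

# The model just continues the text, conditioned on the prompt.
completion = generator(prompt, max_new_tokens=5, do_sample=False)
print(completion[0]["generated_text"])
```

A model as small as GPT-2 will rarely complete such a prompt correctly; the paper's central claim is that, at 175 billion parameters, this kind of in-context learning starts to rival fine-tuned models on many tasks.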

While the paper describing GPT-3 and its training has its merits, reading all 72 pages can be tiresome. In this blog post, I'll highlight the parts I find interesting for people who are familiar with LMs and merely wish to know (most of) the important points of this work.

Read more in the full post: "Can intelligence emerge simply by training a big enough language model using lots of data? OpenAI tries to do so, using 175 billion parameters." (Another Datum, Yoel Zeldes)

For further reading:
@AI News – Deep Learning getting simplified
@AI News – A survey of Deep Learning for scientific discovery
