# Transformers


Author: Chelsea Zaloumis

Last update: 4/15/2021

A lecture-style exploration of transformers following Jay Alammar's post *The Illustrated Transformer*. Includes breakout questions and motivating examples.

Lecture objectives:

1. Motivation for Transformers
2. Define Transformers
3. Define Self-Attention
   1. Self-Attention with vectors
   2. Self-Attention with matrices
4. Define Multi-Head Attention (objectives 3-4 are sketched in code after this list)
5. Define the Encoder-Decoder Attention layer
6. Final Linear & Softmax Layers
7. Loss Function (objectives 6-7 are sketched in the second code block below)
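
As a concrete companion to objectives 3 and 4, here is a minimal NumPy sketch of self-attention in matrix form, extended to multiple heads. All names (`self_attention`, `multi_head_attention`, the toy shapes) are illustrative assumptions, not code from the lecture.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a whole sequence at once.

    X           : (seq_len, d_model) input embeddings
    W_q/W_k/W_v : (d_model, d_k) learned projection matrices (assumed shapes)
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarity scores
    # row-wise softmax so each position's attention weights sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # weighted sum of value vectors

def multi_head_attention(X, heads, W_o):
    """Run several attention heads in parallel, concatenate, and project back."""
    outputs = [self_attention(X, W_q, W_k, W_v) for W_q, W_k, W_v in heads]
    return np.concatenate(outputs, axis=-1) @ W_o

# toy example: 3 tokens, d_model = 4, two heads with d_k = 2
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
heads = [tuple(rng.normal(size=(4, 2)) for _ in range(3)) for _ in range(2)]
W_o = rng.normal(size=(2 * 2, 4))
print(multi_head_attention(X, heads, W_o).shape)  # (3, 4)
```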
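
And for objectives 6 and 7, a sketch of the decoder's final linear and softmax layers together with the cross-entropy loss. Again, the function and variable names are assumptions made for illustration.

```python
import numpy as np

def output_layer_loss(decoder_out, W_vocab, b_vocab, target_ids):
    """Project decoder states to vocabulary logits, softmax, then cross-entropy.

    decoder_out : (seq_len, d_model) final decoder hidden states
    W_vocab     : (d_model, vocab_size) final linear layer
    target_ids  : (seq_len,) index of the correct token at each position
    """
    logits = decoder_out @ W_vocab + b_vocab      # (seq_len, vocab_size)
    logits -= logits.max(axis=-1, keepdims=True)  # for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)    # softmax over the vocabulary
    # cross-entropy: mean negative log-probability of the correct tokens
    return -np.log(probs[np.arange(len(target_ids)), target_ids]).mean()

rng = np.random.default_rng(1)
loss = output_layer_loss(
    decoder_out=rng.normal(size=(3, 4)),  # 3 positions, d_model = 4
    W_vocab=rng.normal(size=(4, 10)),     # vocabulary of 10 tokens
    b_vocab=np.zeros(10),
    target_ids=np.array([2, 7, 5]),
)
print(loss)
```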

## References/Resources

- Jay Alammar, [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/)

## Further Work

1. Coding a basic transformer for natural language processing.
2. Coding a not-so-basic transformer for an application still TBD.