Word Embeddings in NLP: From Word2Vec to Transformers (Lecture 10)
In this lecture, we will explore Word Embeddings, a fundamental concept in Natural Language Processing (NLP) that lets machines represent words as vectors. Instead of treating words as discrete symbols, embeddings capture semantic meaning by placing similar words close together in a vector space.
Table of Contents
{% toc %}
1) What Are Word Embeddings?
Word embeddings are numerical representations of words in a continuous vector space.
- Similar words are closer in the space
- Capture semantic and syntactic relationships
- Improve machine learning performance for NLP tasks
Example:
- “king” - “man” + “woman” ≈ “queen”
This property is known as word vector arithmetic.
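As a quick sanity check of this idea, here is a toy sketch using hand-picked 3-dimensional vectors (illustrative values only, not real trained embeddings) and cosine similarity:

```python
import numpy as np

# Toy 3-dimensional "embeddings" with hand-picked values (not trained vectors)
vectors = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "man":   np.array([0.70, 0.10, 0.05]),
    "woman": np.array([0.72, 0.15, 0.60]),
    "queen": np.array([0.82, 0.70, 0.65]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "king" - "man" + "woman" should land near "queen"
result = vectors["king"] - vectors["man"] + vectors["woman"]
print(cosine(result, vectors["queen"]))  # ~1.0 for these hand-picked values
```

With real embeddings the same arithmetic works in hundreds of dimensions; Section 4 repeats the experiment with pretrained GloVe vectors.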
2) Popular Word Embedding Techniques
2.1 Word2Vec
- Introduced by Google (Mikolov et al., 2013)
- Two models: CBOW (predict word from context) and Skip-gram (predict context from word)
- Learns embeddings efficiently from large corpora
2.2 GloVe
- Developed by Stanford (Pennington et al., 2014)
- Stands for “Global Vectors for Word Representation”
- Uses co-occurrence matrix factorization
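Concretely, the GloVe paper learns word vectors $w_i$ and context vectors $\tilde{w}_j$ by fitting their dot products to the logarithm of the co-occurrence counts $X_{ij}$, minimizing the weighted least-squares objective

$$
J = \sum_{i,j=1}^{V} f(X_{ij})\left(w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\right)^2
$$

where $V$ is the vocabulary size, $b_i$ and $\tilde{b}_j$ are bias terms, and $f$ is a weighting function that down-weights rare co-occurrences.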
2.3 Transformer-based Embeddings
- Modern models like BERT or GPT
- Contextual embeddings (word meaning changes depending on sentence context)
- Outperform static embeddings in most NLP tasks
3) Implementing Word2Vec with Gensim
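Below is a minimal sketch of training Word2Vec with Gensim; the tiny corpus and the hyperparameter values are illustrative placeholders, and in practice you would train on a much larger tokenized corpus.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (use a large corpus in practice)
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "in", "the", "park"],
    ["the", "woman", "walks", "in", "the", "park"],
]

model = Word2Vec(
    sentences,
    vector_size=100,  # dimensionality of the embeddings
    window=5,         # context window size
    min_count=1,      # keep every word (only sensible for a toy corpus)
    sg=1,             # 1 = Skip-gram, 0 = CBOW
    epochs=50,
)

# The learned embedding for "king" is a 100-dimensional NumPy array
print(model.wv["king"].shape)

# Nearest neighbors of "king" in the learned space
print(model.wv.most_similar("king", topn=3))
```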
Expected Output (example):
The first print shows the vector shape `(100,)`; the second prints a list of `(word, similarity)` tuples for the nearest neighbors of “king”. The exact neighbors and scores depend on the corpus, hyperparameters, and random seed.
4) Using Pretrained Embeddings (GloVe)
Download pretrained vectors: https://nlp.stanford.edu/projects/glove/
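With Gensim (version 4.0 or later), the downloaded text files can be loaded directly as `KeyedVectors`; the file name below assumes the 100-dimensional vectors from the `glove.6B` archive, so adjust the path to whichever file you downloaded.

```python
from gensim.models import KeyedVectors

# Path to an unzipped GloVe file (example: 100-dimensional 6B vectors)
glove_path = "glove.6B.100d.txt"

# GloVe files have no header line, so tell Gensim not to expect one
glove = KeyedVectors.load_word2vec_format(glove_path, binary=False, no_header=True)

# Nearest neighbors of a word in the pretrained space
print(glove.most_similar("computer", topn=5))

# The word vector arithmetic from Section 1: king - man + woman ≈ queen
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```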
5) Contextual Embeddings with Transformers
Using Hugging Face Transformers:
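The sketch below uses `bert-base-uncased` as an example model and a short example sentence; any encoder model from the Hugging Face Hub works the same way.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load a pretrained encoder and its tokenizer (bert-base-uncased as an example)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "I love natural language"
inputs = tokenizer(sentence, return_tensors="pt")  # adds [CLS] and [SEP]

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token: shape (batch_size, num_tokens, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```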
Expected Output (example):
`torch.Size([1, 6, 768])`
(1 sentence, 6 tokens, embedding dimension 768; the token count includes the special [CLS] and [SEP] markers and depends on how the tokenizer splits the input)
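To make the “contextual” part concrete, the following sketch (again assuming `bert-base-uncased`) compares the vectors that the same word, “bank”, receives in two different sentences:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence, word):
    """Return the contextual vector of `word` inside `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

# The same surface word "bank" in two different contexts
river_bank = embed_word("she sat on the bank of the river", "bank")
money_bank = embed_word("he deposited money at the bank", "bank")

# The similarity is below 1: each occurrence gets its own context-dependent vector
print(torch.nn.functional.cosine_similarity(river_bank, money_bank, dim=0).item())
```

A static embedding such as Word2Vec or GloVe would return the identical vector for “bank” in both sentences.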
6) Applications of Word Embeddings
- Sentiment analysis
- Machine translation
- Chatbots and Q&A systems
- Document classification
- Semantic search
7) Key Takeaways
- Word embeddings transform words into vectors that capture semantic meaning.
- Word2Vec and GloVe are static embeddings (same vector regardless of context).
- Transformers (BERT, GPT) provide contextual embeddings (different vectors based on sentence usage).
- Embeddings are the backbone of modern NLP systems.
8) What’s Next?
In Lecture 11, we will explore RNNs (Recurrent Neural Networks) and how they process sequential data like text.