Welcome to Royfactory

The latest posts on development, AI, Kubernetes, and backend technologies.

Transformer Applications: Summarization and Translation (Lecture 18)

In this lecture, we’ll explore two of the most practical applications of Transformers: text summarization and machine translation. Transformers excel at both tasks by leveraging their self-attention mechanism, which captures long-range dependencies and contextual meaning far better than RNN-based models. 1) Text Summarization: Text summarization comes in two main forms. Extractive summarization selects key sentences directly from the original text (example: picking the 2–3 most important sentences from a news article). ...

August 27, 2025 · 2 min · 388 words · Roy

GPT Basics: Generative Pretrained Transformer Explained (Lecture 17)

In this lecture, we’ll explore GPT (Generative Pretrained Transformer), a Transformer-based model introduced by OpenAI in 2018. While BERT excels at understanding text (encoder-based), GPT specializes in generating text (decoder-based). GPT has since evolved into the foundation of ChatGPT and GPT-4. 1) Why GPT? GPT is designed to predict the next token in a sequence (autoregressive modeling). This makes it excellent at generating coherent, human-like text. ...

August 26, 2025 · 2 min · 361 words · Roy

BERT Architecture and Pretraining: From MLM to NSP (Lecture 16)

In this lecture, we’ll explore BERT (Bidirectional Encoder Representations from Transformers), a groundbreaking model introduced by Google in 2018. BERT significantly advanced NLP by introducing bidirectional context learning and a pretraining + fine-tuning framework, becoming the foundation for many state-of-the-art models. 1) Why BERT? Previous language models read text in only one direction (left-to-right or right-to-left). BERT, however, learns context from both directions simultaneously, making it far better at understanding word meaning in context. ...

August 25, 2025 · 2 min · 396 words · Roy

Transformer Architecture Basics: From Attention to Modern AI (Lecture 15)

In this lecture, we’ll introduce the Transformer architecture, which has become the foundation of modern AI models like GPT and BERT. Unlike RNNs or LSTMs that process sequences step by step, Transformers rely entirely on attention mechanisms and allow parallel processing, making them both faster and more effective. 1) Why Transformers? Traditional sequence models like RNNs and LSTMs process data sequentially, making training slow and prone to long-term dependency issues. ...

August 24, 2025 · 3 min · 473 words · Roy

Attention Mechanism Basics: Understanding Query, Key, and Value (Lecture 14)

In this lecture, we’ll explore the Attention Mechanism, one of the most impactful innovations in deep learning and Natural Language Processing (NLP). The key idea is simple: instead of treating all words equally, the model focuses on the most relevant words to improve context understanding. 1) Why Attention Matters: Traditional sequence models like RNNs, LSTMs, and GRUs struggle with long sentences, often forgetting earlier information. Example: ...

August 22, 2025 · 3 min · 427 words · Roy

GRU Basics: Simplifying Recurrent Neural Networks (Lecture 13)

In this lecture, we introduce GRU (Gated Recurrent Unit) networks, a simpler and faster variant of LSTM. You will learn the theory behind GRU gates, compare GRUs with RNNs and LSTMs, and implement a sentiment analysis model on the IMDB dataset using TensorFlow/Keras. 1) Why GRU? Traditional RNNs suffer from the vanishing gradient problem, making it difficult to learn long-term dependencies. LSTMs solve this with a more complex structure but at the cost of slower training. ...

August 21, 2025 · 2 min · 424 words · Roy

LSTM Basics: Understanding Long Short-Term Memory Networks (Lecture 12)

In this lecture, we will explore LSTM (Long Short-Term Memory) networks. Unlike simple RNNs that struggle with long-term dependencies, LSTMs use special gates to remember or forget information, making them powerful for NLP, speech recognition, and time-series prediction. 1) Why Do We Need LSTM? Traditional RNNs suffer from the vanishing gradient problem, making it difficult to capture long-term context. For example: ...

August 20, 2025 · 3 min · 519 words · Roy

Neural Network Basics: Build a Simple Image Classification Model (Lecture 8)

In this lecture, we’ll introduce Neural Networks, explain their core components, and build a simple image classification model using the MNIST handwritten digits dataset with TensorFlow/Keras. 1) What Is a Neural Network? A Neural Network is made up of interconnected units called neurons, organized in layers. Data flows through Input → Hidden Layers → Output, with weights and activations applied at each step. ...

August 17, 2025 · 3 min · 454 words · Roy

Recurrent Neural Network (RNN) Basics: Theory and PyTorch Implementation (Lecture 11)

In this lecture, we’ll explore Recurrent Neural Networks (RNNs), one of the fundamental architectures for handling sequential data. We’ll cover the theory behind RNNs, their mathematical formulation, and their limitations, then implement simple RNNs in PyTorch for both text and time-series prediction. 1) What is an RNN? Unlike feedforward networks that treat each input independently, RNNs are designed to remember previous states and use them in predicting future outputs. This makes RNNs highly effective for tasks where context and sequence order matter, such as: ...

August 16, 2025 · 3 min · 557 words · Roy

Word Embeddings in NLP: From Word2Vec to Transformers (Lecture 10)

In this lecture, we will explore Word Embeddings, a fundamental concept in Natural Language Processing (NLP) that lets machines represent words as vectors. Instead of treating words as discrete symbols, embeddings capture semantic meaning by placing similar words closer together in a vector space. 1) What Are Word Embeddings? Word embeddings are numerical representations of words in a continuous vector space. ...

August 16, 2025 · 3 min · 482 words · Roy