Word Vector Relationships

To navigate the landscape of text representation, it's essential to grasp how words relate to each other in vector form. In the upcoming video, Anup Surendran dives into the history of word vectors and takes a closer look at Google's groundbreaking Word2Vec project. Why are vector relationships so critical, and what biases do they bring?

Let's find out!

Here, Anup traces the evolution of word vectors, emphasizing the milestone that is Google's Word2Vec project. One of the standout features of Word2Vec is vector arithmetic, allowing us to reason about words mathematically.

For example, the well-known relationship "King - Man + Woman ≈ Queen" showcases this property: subtracting the "Man" vector from "King" and adding the "Woman" vector yields a point whose nearest neighbor is the vector for "Queen."
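
If you'd like to reproduce this arithmetic yourself, here is a minimal sketch using the gensim library (an illustration, not part of the course video). It assumes the small pretrained "glove-wiki-gigaword-50" vectors can be downloaded; the original Word2Vec vectors ("word2vec-google-news-300") expose the same interface but are a much larger download.

```python
# Minimal sketch of word-vector arithmetic with gensim.
# Assumes internet access to download the pretrained vectors (~66 MB).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # returns a KeyedVectors object

# king - man + woman ~= ?
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # "queen" typically appears as the closest match
```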

The video also explores the role of word vector relationships in similarity search, a key capability in large language models. Anup then turns to an equally important topic: the biases that inherently exist in these models and the technologies built on them.
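
Under the hood, similarity search amounts to comparing vectors with a similarity measure, most commonly cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration (real embeddings have hundreds or thousands of dimensions):

```python
# Toy similarity search over word vectors using cosine similarity.
# The 3-dimensional vectors below are invented for illustration only.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

vocab = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.4]),
}
query = np.array([0.85, 0.15, 0.05])  # pretend this embeds the word "kitten"

# Rank the vocabulary by similarity to the query; nearest words come first.
ranked = sorted(vocab, key=lambda w: cosine_similarity(query, vocab[w]), reverse=True)
print(ranked)  # ['cat', 'dog', 'car']
```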

Understanding these relationships and their implications not only deepens our grasp of Large Language Models but also equips us to use them more responsibly. 🌐

💡 A practical insight

You’ll often hear the terms “vector embeddings” and “word vectors” used interchangeably in the context of LLMs. These vector embeddings are stored in vector indexes: specialized data structures engineered to ensure rapid and relevant data access using the embeddings.
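
To make the idea of a vector index concrete, here is a small sketch using FAISS, one of several libraries built for fast nearest-neighbor lookup over embeddings. The random vectors stand in for real embeddings.

```python
# Sketch of a vector index with FAISS; random vectors stand in for
# real embeddings. Assumes: pip install faiss-cpu numpy
import faiss
import numpy as np

dim = 384                                             # illustrative embedding size
stored = np.random.rand(1000, dim).astype("float32")  # pretend: 1,000 stored embeddings

index = faiss.IndexFlatL2(dim)  # exact L2 index; FAISS also offers approximate variants
index.add(stored)               # load the embeddings into the index

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)  # ids of the 5 nearest stored vectors
print(ids[0])
```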

How to Choose the Right Vector Embeddings Model

Selecting the appropriate model for generating embeddings is an intriguing topic in its own right. It's essential to recognize that there isn't a one-size-fits-all solution in this domain. A glance at the MTEB Leaderboard on Hugging Face reveals a variety of embedding models, each tailored for specific applications. Currently, OpenAI's text-embedding-ada-002 stands out as the go-to model for producing efficient vector embeddings from diverse data, be it structured or unstructured. We'll explore how to use it in the tutorials toward the end of this course.
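
As a preview of what calling this model looks like, here is a hedged sketch using the official openai Python package (v1-style client); it assumes an OPENAI_API_KEY environment variable is set.

```python
# Sketch: generating a vector embedding with OpenAI's text-embedding-ada-002.
# Assumes `pip install openai` (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Word vectors capture relationships between words.",
)

embedding = response.data[0].embedding  # a list of 1,536 floats for ada-002
print(len(embedding))
```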
