Wednesday, March 27, 2024

Embeddings and Vector Databases - Creating a Long-Term Memory




What Are Vector Databases? - The Intelligent Memory of GenAI


While traditional databases store data in rows and columns, a vector database stores data as mathematical vectors. Each piece of data is represented as a point in a high-dimensional space, with hundreds or thousands of dimensions. This allows very sophisticated relationships between data points to be captured.


Searching and analyzing a vector database relies on vector mathematics and similarity calculations. By comparing vector positions, highly relevant results can be returned even when there are no exact keyword matches.
Vector databases index and store vector embeddings for fast retrieval and similarity search at interactive speeds, while also offering capabilities such as CRUD operations (create, read, update, and delete), horizontal scaling, and serverless deployment.
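To make the idea concrete, here is a toy in-memory vector store sketching the two things described above: CRUD operations plus similarity search by comparing vector positions. This is an illustrative sketch only; real vector DBs like Pinecone or Milvus use approximate-nearest-neighbor indexes rather than the brute-force scan shown here.

```python
import math

class TinyVectorStore:
    """A toy in-memory vector store supporting CRUD plus similarity search."""

    def __init__(self):
        self.vectors = {}  # id -> list of floats

    # Create / Update
    def upsert(self, item_id, vector):
        self.vectors[item_id] = vector

    # Read
    def get(self, item_id):
        return self.vectors.get(item_id)

    # Delete
    def delete(self, item_id):
        self.vectors.pop(item_id, None)

    def query(self, vector, top_k=1):
        """Return the top_k stored ids ranked by cosine similarity."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norms
        ranked = sorted(self.vectors.items(),
                        key=lambda kv: cosine(vector, kv[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:top_k]]

store = TinyVectorStore()
store.upsert("puppy", [0.3, 0.5, 0.9])
store.upsert("car", [0.9, 0.1, 0.0])
print(store.query([0.2, 0.6, 0.8]))  # the query vector is closest to "puppy"
```

A production store does the same conceptual work, but indexes millions of vectors so that the ranking step does not scan every entry.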

Why Are Vector Databases Important for AI?


Vector databases are ideal for managing and extracting insights from the enormous datasets required to train and run modern AI models.

In the midst of the GenAI revolution, efficient data processing is crucial not only for generation but also for semantic search. Both rely on vector embeddings: a data representation that carries the semantic information an AI needs to gain understanding and to maintain a long-term memory it can draw upon when executing complex tasks.

Embeddings

Embedding models produce vectors with many attributes, or features, linked to each other. Each dimension captures some aspect of the patterns and relationships in the data, and that high dimensionality makes the representation challenging to manage.

That is why we need a specialized database to handle this data type. Vector databases like Pinecone meet this need by offering optimized storage and querying for embeddings. They combine the capabilities of a traditional database (absent in standalone vector indexes) with the specialization of handling vector embeddings (which traditional scalar-based databases lack).

Embeddings are arrays of numbers representing data: words and images transformed into numerical vectors that capture their essence. For example, "puppy" and "dog" will have similar embeddings, with vectors close to each other. These embeddings are stored in the vector DB.
Puppy = [0.3, 0.5, 0.9, 0.8, 0.4, ...]
Dog = [0.1, 0.51, 0.6, 0.2, 0.8, ...]
The actual numbers depend on the ML algorithm and model.

Once you can convert a text, sentence, or image into a vector, you can compare items, detect patterns, and find the closest matches using measures such as cosine similarity.
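Cosine similarity, the most common of these measures, is just the dot product of two vectors divided by the product of their lengths. Here is a minimal sketch using toy 5-dimensional vectors in the spirit of the puppy/dog example above (made-up values; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors for illustration (not real model output)
puppy = [0.3, 0.5, 0.9, 0.8, 0.4]
dog   = [0.1, 0.51, 0.6, 0.2, 0.8]
car   = [0.9, 0.05, 0.1, 0.1, 0.05]

print(cosine_similarity(puppy, dog))  # ~0.82, semantically close
print(cosine_similarity(puppy, car))  # ~0.38, much less related
```

The key property: related concepts score close to 1.0, unrelated ones score lower, with no keyword overlap required.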

OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:

  • Search (where results are ranked by relevance to a query string)
  • Clustering (where text strings are grouped by similarity)
  • Recommendations (where items with related text strings are recommended)
  • Anomaly detection (where outliers with little relatedness are identified)
  • Diversity measurement (where similarity distributions are analyzed)
  • Classification (where text strings are classified by their most similar label)
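As a sketch of the last use case, classification can be done by embedding each candidate label and picking the label whose vector is most similar to the input's vector. The label vectors below are hypothetical toy values, not real model output:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical label embeddings (toy values for illustration)
label_vectors = {
    "animal":  [0.2, 0.6, 0.8],
    "vehicle": [0.9, 0.1, 0.1],
}

def classify(text_vector):
    """Assign the label whose embedding is most similar to the input vector."""
    return max(label_vectors, key=lambda lbl: cosine(text_vector, label_vectors[lbl]))

print(classify([0.3, 0.5, 0.9]))  # "animal", e.g. for the embedding of "puppy"
```

The other use cases follow the same pattern: search ranks documents by similarity to a query vector, clustering groups vectors that are mutually close, and anomaly detection flags vectors far from everything else.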

Embedding models: GloVe, Word2Vec, OpenAI's embedding models
Vector DBs: Pinecone, Milvus, pgvector, Weaviate

Here is how to create an embedding for the text "food" via an OpenAI model:

curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "food",
    "model": "text-embedding-ada-002",
    "encoding_format": "float"
  }'
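The same call can be sketched in Python with only the standard library. The endpoint and payload mirror the curl example above, and the response shape (`data[0].embedding`) follows OpenAI's published API reference; the key is read from an environment variable rather than hard-coded:

```python
import json
import os
import urllib.request

# Same request body as the curl call above
payload = {
    "input": "food",
    "model": "text-embedding-ada-002",
    "encoding_format": "float",
}

api_key = os.environ.get("OPENAI_API_KEY")  # never hard-code the key
if api_key:
    req = urllib.request.Request(
        "https://api.openai.com/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    vector = body["data"][0]["embedding"]  # the embedding as a list of floats
    print(len(vector))  # text-embedding-ada-002 returns 1536 dimensions
```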


More details (credits):

1. https://platform.openai.com/docs/api-reference/embeddings

2. A good video course that explains the theory as well as setting up a vector DB: https://www.youtube.com/watch?v=ySus5ZS0b94

Right-size the vector DB:

Setting up vector stores introduces new challenges. For example, correctly partitioning large datasets that cannot fit entirely in RAM is not easy in vector stores like Milvus.
- Under-partitioning can cause some queries to consume too much RAM and bring the service down.
- RAG responsiveness depends heavily on reducing the number of probes required to find relevant documents, so avoid over-partitioning as well.
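A back-of-envelope RAM estimate helps with this sizing decision: raw float32 storage is simply vectors x dimensions x 4 bytes, and graph-based indexes such as HNSW add overhead on top. The 1.5x overhead factor below is a rough assumption for illustration, not a Milvus-specific figure:

```python
def estimate_index_ram_gb(num_vectors, dimensions, overhead=1.5):
    """Rough RAM estimate for a float32 vector index, in GiB."""
    raw_bytes = num_vectors * dimensions * 4  # 4 bytes per float32
    return raw_bytes * overhead / (1024 ** 3)

# 10 million 1536-dimension embeddings (e.g. text-embedding-ada-002 output)
print(round(estimate_index_ram_gb(10_000_000, 1536), 1))  # ~85.8 GiB
```

If the estimate exceeds the RAM of one node, partition the data, but only as much as needed so queries still touch few partitions.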

The Road Ahead

As GenAI moves into mainstream applications, vector databases' role will only grow. Their ability to organize and structure knowledge in a format tailored for AI aligns with the needs of next-gen generative models. 


Combining vector databases and transformers allows GenAI to understand the meaning of language rather than just keywords. This next-generation capability, powered by vector math, is what makes natural, intelligent conversations possible.






