A vector database is a specialized database designed to store, index, and query data as vectors, i.e. as multidimensional numerical representations. Vector databases are optimized for applications that require fast similarity searches over very large datasets.
A vector database stores data as high-dimensional vectors. Structured or unstructured data is first converted into a multidimensional numerical representation called a vector (also called a vector embedding). Embeddings are constructed so that similar items map to vectors that lie close together, which makes it possible to find similar entries in a large collection of data points by comparing distances between vectors.
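To make the idea of "similar entries" concrete, here is a minimal sketch that compares vectors with cosine similarity, one common similarity measure. The three-dimensional vectors and the `most_similar` helper are made up for illustration; real embeddings come from an embedding model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, chosen by hand for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "truck": [0.1, 0.2, 0.95],
}

def most_similar(word):
    """Return the other entry whose vector points in the most similar direction."""
    query = embeddings[word]
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(query, embeddings[w]))

print(most_similar("cat"))  # kitten
```

This brute-force comparison checks every stored vector, which is fine for three entries but becomes slow as the collection grows.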
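The vectors are organized using a vector index, a data structure that arranges them so that similar vectors can be found without comparing a query against every stored entry.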
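As a rough illustration of what a vector index does, the sketch below groups vectors into buckets using random hyperplanes (a simple form of locality-sensitive hashing). This is only one of several indexing techniques and is not necessarily what any particular vector database uses; the `HyperplaneIndex` class and its parameters are hypothetical names for this example.

```python
import random
from collections import defaultdict

class HyperplaneIndex:
    """Toy approximate index: vectors on the same side of every
    random hyperplane land in the same bucket."""

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the example reproducible
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_planes)]
        self.buckets = defaultdict(list)

    def _key(self, vector):
        # One boolean per hyperplane: which side of the plane the vector is on.
        return tuple(
            sum(p * v for p, v in zip(plane, vector)) >= 0
            for plane in self.planes
        )

    def add(self, item_id, vector):
        self.buckets[self._key(vector)].append((item_id, vector))

    def candidates(self, query):
        """Return ids from the query's bucket only, not the whole collection."""
        return [item_id for item_id, _ in self.buckets.get(self._key(query), [])]

index = HyperplaneIndex(dim=3)
index.add("a", [1.0, 2.0, 3.0])
print(index.candidates([2.0, 4.0, 6.0]))  # same direction, so same bucket as "a"
```

A query only inspects one bucket, which trades some accuracy (near neighbors can fall into a different bucket) for speed. Production indexes typically refine the candidates with an exact distance check.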
A vector database differs from a regular database in that it stores data as mathematical vectors, whereas standard databases typically store structured or unstructured data in a tabular format. Regular databases generally support structured queries (such as SQL), while vector databases are optimized for similarity queries.
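The difference between the two query styles can be sketched in a few lines. The records, the two-dimensional vectors, and the function names below are invented for illustration, and Euclidean distance is used as an assumed similarity measure.

```python
import heapq
import math

# Hypothetical product records, each with a category and a made-up 2-d embedding.
products = [
    {"name": "espresso machine", "category": "kitchen", "vector": [0.9, 0.1]},
    {"name": "coffee grinder", "category": "kitchen", "vector": [0.8, 0.2]},
    {"name": "trail shoes", "category": "sports", "vector": [0.1, 0.9]},
]

def structured_query(category):
    """Exact-match filter, analogous to a SQL WHERE clause on category."""
    return [p["name"] for p in products if p["category"] == category]

def similarity_query(query_vector, k=2):
    """Names of the k records nearest to the query vector, closest first."""
    nearest = heapq.nsmallest(
        k, products, key=lambda p: math.dist(query_vector, p["vector"])
    )
    return [p["name"] for p in nearest]

print(structured_query("sports"))     # ['trail shoes']
print(similarity_query([0.88, 0.1]))  # ['espresso machine', 'coffee grinder']
```

The structured query returns rows that match a condition exactly, while the similarity query ranks every row by how close it is to the query and returns the best few.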
Consider using a vector database when your main goal is to perform fast similarity searches over a very large dataset, for example when you are building a recommendation system or an LLM-based interactive chatbot.
If your primary data processing method is to run SQL-like queries, consider using a regular database instead.
In some cases, you may also be able to avoid using a vector database entirely by using Pathway. Our "Building LLM Apps without a Vector Database" tutorial uses a vector index under the hood. Try it out to get a feel for the performance that vector indexing can deliver.