This post has some more technical info: https://www.pinecone.io/blog/integrated-inference/
Makes a lot of sense to me to combine embedding, retrieval, and reranking. I can imagine this being a way for them to differentiate themselves from the popular databases that have added vector search support.
Can someone please explain how this works?
I assumed that a specific flavour of LLM was needed, an “embedding model”, to generate the vectors. Is this announcement saying that Pinecone is adding their own?
For example, is it better or worse than the models listed here? https://ollama.com/search?c=embedding
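For context, my current mental model of the embedding step is roughly this (a sketch only; all-MiniLM-L6-v2 is just an example open-source model, not whatever Pinecone hosts):

```python
# Sketch of what an "embedding model" does: text in, fixed-size vector out.
# all-MiniLM-L6-v2 is just an example open-source model, not Pinecone's hosted one.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

vec = model.encode("Pinecone announces integrated inference")
print(vec.shape)  # (384,): a fixed-size vector the database then indexes for similarity search
```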
Nothing new; Marqo has been doing this for a while now with their all-in-one platform to train, embed, retrieve, and evaluate.
I've played around with Weaviate & Astra DB, but Marqo is the best and easiest solution imo.
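Roughly what that looks like with the Marqo Python client, from memory (the index and model names are just examples; check their docs for the current signatures):

```python
# Marqo handles embedding at add/search time, so you only ever pass raw text.
# (Written from memory of the Marqo Python client; names here are illustrative.)
import marqo

mq = marqo.Client(url="http://localhost:8882")

mq.create_index("articles", model="hf/e5-base-v2")

mq.index("articles").add_documents(
    [{"Title": "Pinecone announces integrated inference",
      "Description": "Embedding, retrieval and reranking behind one API."}],
    tensor_fields=["Description"],  # the fields Marqo vectorizes
)

results = mq.index("articles").search("vector database with built-in embedding")
```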
txtai (https://github.com/neuml/txtai) has had inline vectorization since 2020. It supports Transformers, llama.cpp and LLM API services. It also has inline LLM integration and a built-in RAG pipeline.
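For example (the model name is just an illustration; any Transformers, llama.cpp or API-hosted embedding model can be plugged in):

```python
# txtai vectorizes inline: indexing and search take raw text, no separate embed step.
from txtai.embeddings import Embeddings

# "path" picks the embedding model; "content" stores the original text alongside vectors.
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2", "content": True})

data = [
    "Pinecone announces integrated inference",
    "Weaviate launches hosted embeddings",
]

# (id, text, tags) tuples; txtai embeds while indexing.
embeddings.index([(i, text, None) for i, text in enumerate(data)])

# Search is vectorized inline too; returns id, text and score.
print(embeddings.search("vector database with built-in embedding", 1))
```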
This title was a little misleading IMO because (maybe a skill issue on my part) I associated "inferencing" with "generation".
After reading the article, it seems Pinecone now supports in-DB vectorization (sketched after the list below), a feature already shared by:
- DataStax Astra DB: https://www.datastax.com/blog/simplifying-vector-embedding-g... (since May 2024)
- Weaviate: https://weaviate.io/blog/introducing-weaviate-embeddings (as of yesterday)
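To make the distinction concrete: "in-DB vectorization" just means the store owns the embedding step, so clients write and query with raw text. A toy, purely illustrative sketch (the class and model names below are made up, not any vendor's actual API):

```python
# Toy illustration of in-DB vectorization: the index embeds on upsert and query,
# so callers never compute vectors themselves. Hypothetical class; real products
# run this server-side with their own hosted models.
import numpy as np
from sentence_transformers import SentenceTransformer

class VectorizingIndex:
    def __init__(self, model_name="all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)  # embedding model lives with the store
        self.ids, self.vectors = [], []

    def upsert(self, doc_id, text):
        # Raw text in; the index vectorizes it internally.
        self.ids.append(doc_id)
        self.vectors.append(self.model.encode(text, normalize_embeddings=True))

    def query(self, text, top_k=3):
        q = self.model.encode(text, normalize_embeddings=True)
        scores = np.stack(self.vectors) @ q
        order = np.argsort(-scores)[:top_k]
        return [(self.ids[i], float(scores[i])) for i in order]

index = VectorizingIndex()
index.upsert("a", "Pinecone announces integrated inference")
index.upsert("b", "How to cook pasta")
print(index.query("vector database with built-in embedding"))
```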