I don't understand how to retrieve the parent documents using LangChain's ParentDocumentRetriever when using Pinecone. The following code works for creating the embeddings and inserting them into Pinecone:
const pinecone = new Pinecone();
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX);
const docstore = new InMemoryStore();
const vectorstore = await PineconeStore.fromExistingIndex(
  new OpenAIEmbeddings(),
  { pineconeIndex }
);
const retriever = new ParentDocumentRetriever({
  vectorstore,
  docstore,
  childSplitter: new HTMLSplitter(),
  parentK: 5,
});
// We must add the parent documents via the retriever's addDocuments method
await retriever.addDocuments(docs);
const retrievedDocs = await retriever.getRelevantDocuments("What is emptiness?");
console.log(retrievedDocs);
The retrievedDocs array contains a few parent documents, as expected.
Now that my index is created, I would like to run the same retrieval in a later session, but without the await retriever.addDocuments(docs) call:
const pinecone = new Pinecone();
const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX);
const docstore = new InMemoryStore();
const vectorstore = await PineconeStore.fromExistingIndex(
  new OpenAIEmbeddings(),
  { pineconeIndex }
);
const retriever = new ParentDocumentRetriever({
  vectorstore,
  docstore,
  childSplitter: new HTMLSplitter(),
  parentK: 5,
});
const retrievedDocs = await retriever.getRelevantDocuments("What is emptiness?");
console.log(retrievedDocs);
This yields no results. The documentation is rather unclear on this: am I expected to implement my own document store containing all of the parent documents with their accompanying IDs? Can I save the InMemoryStore to the filesystem, or use the LocalFileStore? Does this document store pertain just to the parent documents?
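My working assumption about why the second run returns nothing, sketched with stand-in types only: the child chunks live in the Pinecone index and carry a parent ID in their metadata, while the parents themselves are looked up in the docstore, which is empty when freshly constructed. The names Parent, ChildHit, resolveParents and the doc_id metadata key are hypothetical illustrations here, not LangChain's actual classes or code.

```typescript
// Hypothetical stand-ins; none of these are LangChain's real classes.
type Parent = { pageContent: string };
type ChildHit = { pageContent: string; metadata: { doc_id: string } };

// Mimics the lookup step as I understand it: collect parent IDs from
// the child hits, then fetch each parent from the docstore by ID.
function resolveParents(hits: ChildHit[], docstore: Map<string, Parent>): Parent[] {
  const ids = Array.from(new Set(hits.map((h) => h.metadata.doc_id)));
  return ids
    .map((id) => docstore.get(id))
    .filter((p): p is Parent => p !== undefined);
}

// Child chunks stored in the vector index survive across runs...
const hits: ChildHit[] = [
  { pageContent: "chunk about emptiness", metadata: { doc_id: "parent-1" } },
];

// ...but a freshly constructed in-memory store starts out empty.
const freshStore = new Map<string, Parent>();
console.log(resolveParents(hits, freshStore).length); // 0

// With the parent present under the same ID, the lookup succeeds.
const populatedStore = new Map<string, Parent>([
  ["parent-1", { pageContent: "full parent text" }],
]);
console.log(resolveParents(hits, populatedStore).length); // 1
```

If that picture is right, the docstore has to outlive the process in the same way the Pinecone index does, which is what the questions above are getting at.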
I am not sure how to use LocalFileStore, since dropping it in as a replacement makes my IDE complain about the types: it extends BaseStore<string, Uint8Array>, whereas InMemoryStore extends BaseStore<string, T>.
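To make the mismatch concrete, here is a minimal sketch of the kind of adapter I imagine would be needed: something that stores documents as JSON-encoded bytes so that a byte-valued store can hold them. ByteStore, Doc, MemoryByteStore and JsonDocStore are hypothetical stand-ins I made up for illustration, not LangChain classes, and I have not verified that this is the intended approach.

```typescript
// Hypothetical shape of a store whose values are raw bytes.
interface ByteStore {
  mget(keys: string[]): Promise<(Uint8Array | undefined)[]>;
  mset(pairs: [string, Uint8Array][]): Promise<void>;
}

// Hypothetical document shape (text plus metadata).
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// In-memory byte store, standing in for a file-backed one.
class MemoryByteStore implements ByteStore {
  private data = new Map<string, Uint8Array>();

  async mget(keys: string[]): Promise<(Uint8Array | undefined)[]> {
    return keys.map((k) => this.data.get(k));
  }

  async mset(pairs: [string, Uint8Array][]): Promise<void> {
    for (const [k, v] of pairs) this.data.set(k, v);
  }
}

// Adapter: exposes Doc values on top of a byte store via JSON encoding.
class JsonDocStore {
  private encoder = new TextEncoder();
  private decoder = new TextDecoder();

  constructor(private inner: ByteStore) {}

  async mset(pairs: [string, Doc][]): Promise<void> {
    const encoded = pairs.map(
      ([k, d]): [string, Uint8Array] => [k, this.encoder.encode(JSON.stringify(d))]
    );
    await this.inner.mset(encoded);
  }

  async mget(keys: string[]): Promise<(Doc | undefined)[]> {
    const raw = await this.inner.mget(keys);
    return raw.map((bytes) =>
      bytes === undefined ? undefined : (JSON.parse(this.decoder.decode(bytes)) as Doc)
    );
  }
}
```

The idea would be to pass something like JsonDocStore as the docstore, wrapping whatever persistent byte store is available, but whether the retriever accepts such a wrapper is exactly what I am unsure about.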
In summary, how would I use Pinecone as a vector store in combination with ParentDocumentRetriever? What document store do I use?
It seems to me that this would be a pretty common use case; where might I find an example?