ElasticsearchEmbeddingsCache
This guide helps you get started with Elasticsearch key-value stores. For detailed documentation of all ElasticsearchEmbeddingsCache features and configurations, head to the API reference.
Overview
The ElasticsearchEmbeddingsCache is a ByteStore implementation that uses your Elasticsearch instance for efficient storage and retrieval of embeddings.
Integration details
Class | Package |
---|---|
ElasticsearchEmbeddingsCache | langchain_elasticsearch |
Setup
To create an ElasticsearchEmbeddingsCache byte store, you'll need an Elasticsearch cluster. You can set one up locally or create an Elastic account.
Installation
The LangChain ElasticsearchEmbeddingsCache integration lives in the langchain_elasticsearch package:
%pip install -qU langchain_elasticsearch
Instantiation
Now we can instantiate our byte store:
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

# Example config for a locally running Elasticsearch instance
kv_store = ElasticsearchEmbeddingsCache(
    es_url="https://localhost:9200",
    index_name="llm-chat-cache",
    metadata={"project": "my_chatgpt_project"},
    namespace="my_chatgpt_project",
    es_user="elastic",
    es_password="<GENERATED PASSWORD>",
    es_params={
        "ca_certs": "~/http_ca.crt",
    },
)
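Hard-coding credentials is fine for a local trial, but a small helper can keep them out of the source. The sketch below is one possible approach and is not part of langchain_elasticsearch: the function name build_cache_kwargs and the ES_* variable names are assumptions made for illustration. It also expands the ~ in the certificate path with os.path.expanduser, on the assumption that the underlying client will not expand it for you.

```python
import os
from typing import Any, Dict, Mapping


def build_cache_kwargs(env: Mapping[str, str]) -> Dict[str, Any]:
    """Assemble keyword arguments for ElasticsearchEmbeddingsCache.

    The ES_* names are illustrative assumptions; the library does not
    read them on its own.
    """
    return {
        "es_url": env.get("ES_URL", "https://localhost:9200"),
        "index_name": env.get("ES_INDEX", "llm-chat-cache"),
        "es_user": env.get("ES_USER", "elastic"),
        # Raise KeyError if the password is missing instead of defaulting it.
        "es_password": env["ES_PASSWORD"],
        "es_params": {
            # Expand "~" explicitly so the certificate path is absolute.
            "ca_certs": os.path.expanduser(env.get("ES_CA_CERTS", "~/http_ca.crt")),
        },
    }


# A literal dict keeps the example self-contained; os.environ also works.
kwargs = build_cache_kwargs({"ES_PASSWORD": "example-only"})
# kv_store = ElasticsearchEmbeddingsCache(**kwargs)  # needs a running cluster
```

In practice you would pass os.environ instead of the literal dict and add namespace or metadata entries as needed.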
Usage
You can set data under keys using the mset method and read it back with mget:
# Store two values under their keys
kv_store.mset(
    [
        ("key1", b"value1"),
        ("key2", b"value2"),
    ]
)

# Read the values back in the same order as the requested keys
kv_store.mget(
    [
        "key1",
        "key2",
    ]
)
[b'value1', b'value2']
You can delete data using the mdelete method:
kv_store.mdelete(
    [
        "key1",
        "key2",
    ]
)

# Deleted keys now come back as None
kv_store.mget(
    [
        "key1",
        "key2",
    ]
)
[None, None]
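To try the same call pattern without a running cluster, for example in unit tests, an in-memory stand-in can be handy. The class below is a hypothetical sketch, not something provided by langchain_elasticsearch; it only mimics the mset, mget, and mdelete behaviour shown above, including returning None for keys that are absent.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class DictByteStore:
    """Hypothetical in-memory stand-in mimicking mset/mget/mdelete."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mset(self, key_value_pairs: Sequence[Tuple[str, bytes]]) -> None:
        # Later pairs overwrite earlier ones that share a key.
        for key, value in key_value_pairs:
            self._data[key] = value

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        # Missing keys yield None, preserving the order of the request.
        return [self._data.get(key) for key in keys]

    def mdelete(self, keys: Sequence[str]) -> None:
        # Deleting an absent key is silently ignored.
        for key in keys:
            self._data.pop(key, None)


store = DictByteStore()
store.mset([("key1", b"value1"), ("key2", b"value2")])
found = store.mget(["key1", "missing"])
store.mdelete(["key1", "key2"])
after_delete = store.mget(["key1", "key2"])
```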
Use as an embeddings cache
Like other ByteStores, you can use an ElasticsearchEmbeddingsCache instance for persistent caching in document ingestion for RAG.
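The idea behind such a cache can be sketched with the standard library alone: derive a key from the input text, look it up, and call the embedding model only on a miss. Everything here is an assumption for illustration, including the function names, the SHA-256 key scheme, and the JSON serialization; it is not how ElasticsearchEmbeddingsCache stores data internally. A plain dict stands in for the byte store.

```python
import hashlib
import json
from typing import Callable, Dict, List


def cache_key(namespace: str, text: str) -> str:
    """Derive a deterministic key from the namespace and the input text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{namespace}:{digest}"


def embed_with_cache(
    texts: List[str],
    embed: Callable[[str], List[float]],
    store: Dict[str, bytes],
    namespace: str,
) -> List[List[float]]:
    """Return one vector per text, computing only those not yet stored."""
    vectors: List[List[float]] = []
    for text in texts:
        key = cache_key(namespace, text)
        raw = store.get(key)
        if raw is None:
            # Cache miss: compute the vector and persist it as bytes.
            vector = embed(text)
            store[key] = json.dumps(vector).encode("utf-8")
        else:
            # Cache hit: decode the stored bytes back into a vector.
            vector = json.loads(raw.decode("utf-8"))
        vectors.append(vector)
    return vectors


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    """Toy embedding function that records each call it receives."""
    calls.append(text)
    return [float(len(text)), 1.0]


cache: Dict[str, bytes] = {}
first = embed_with_cache(["hello", "hi"], fake_embed, cache, "my_chatgpt_project")
second = embed_with_cache(["hello", "hi"], fake_embed, cache, "my_chatgpt_project")
```

The second call returns the same vectors without invoking fake_embed again, which is the saving a persistent cache provides during repeated ingestion runs.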
However, cached vectors are not searchable by default. You can customize how the Elasticsearch document is built in order to add an indexed vector field.
This can be done by subclassing and overriding methods:
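from typing import Any, Dict, List

from langchain_elasticsearch import ElasticsearchEmbeddingsCache


class SearchableElasticsearchStore(ElasticsearchEmbeddingsCache):
    @property
    def mapping(self) -> Dict[str, Any]:
        # Start from the base mapping and add an indexed dense_vector field.
        mapping = super().mapping
        mapping["mappings"]["properties"]["vector"] = {
            "type": "dense_vector",
            "dims": 1536,  # must match the embedding model's output size
            "index": True,
            "similarity": "dot_product",
        }
        return mapping

    def build_document(self, llm_input: str, vector: List[float]) -> Dict[str, Any]:
        # Keep the base document and also store the vector in the indexed field.
        body = super().build_document(llm_input, vector)
        body["vector"] = vector
        return body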
When overriding the mapping and the document building, please only make additive modifications, keeping the base mapping intact.
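One way to guard against accidentally dropping or changing base fields is a small check that the extended mapping still contains everything the base mapping had. The helper below is a hedged sketch: is_additive is a hypothetical name, and base_mapping is a made-up placeholder whose field names may differ from the real mapping of ElasticsearchEmbeddingsCache.

```python
import copy
from typing import Any, Dict


def is_additive(base: Dict[str, Any], extended: Dict[str, Any]) -> bool:
    """Return True if every key and value in base survives in extended."""
    for key, base_value in base.items():
        if key not in extended:
            return False
        new_value = extended[key]
        if isinstance(base_value, dict) and isinstance(new_value, dict):
            # Recurse into nested sections such as "properties".
            if not is_additive(base_value, new_value):
                return False
        elif base_value != new_value:
            return False
    return True


# Hypothetical base mapping; the real field names may differ.
base_mapping = {
    "mappings": {"properties": {"text_input": {"type": "text", "index": False}}}
}

# Additive change: a new field next to the existing ones.
extended = copy.deepcopy(base_mapping)
extended["mappings"]["properties"]["vector"] = {"type": "dense_vector", "dims": 1536}

# Non-additive change: an existing field was removed.
broken = copy.deepcopy(base_mapping)
del broken["mappings"]["properties"]["text_input"]

additive_ok = is_additive(base_mapping, extended)
removal_detected = not is_additive(base_mapping, broken)
```

A check like this could run in a test against the subclass's mapping property before the index is created.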
API reference
For detailed documentation of all ElasticsearchEmbeddingsCache features and configurations, head to the API reference: https://api.python.langchain.com/en/latest/cache/langchain_elasticsearch.cache.ElasticsearchEmbeddingsCache.html