diff --git a/README.md b/README.md
index 518dae4..30edbed 100644
--- a/README.md
+++ b/README.md
@@ -105,10 +105,10 @@ import litellm
 litellm.api_base = "http://localhost:8080"
 ```

-### Cache-aside
-Install with:
+### Cache-aside
+Install with:
 ```bash
 pip install semcache
 ```

@@ -127,6 +127,28 @@ response = client.get("Tell me France's capital city.")
 print(response)  # "Paris"
 ```

+Or in Node.js:
+
+Install with:
+```bash
+npm install semcache
+```
+Use the SDK in your service:
+
+```javascript
+const SemcacheClient = require('semcache');
+
+const client = new SemcacheClient('http://localhost:8080');
+
+(async () => {
+  await client.put('What is the capital of France?', 'Paris');
+
+  const result = await client.get('What is the capital of France?');
+  console.log(result); // => 'Paris'
+})();
+```
+
 ## Configuration

 Configure via environment variables or `config.yaml`:
diff --git a/docs/semcache/docs/getting-started.md b/docs/semcache/docs/getting-started.md
index 81a509f..5a9744b 100644
--- a/docs/semcache/docs/getting-started.md
+++ b/docs/semcache/docs/getting-started.md
@@ -19,7 +19,7 @@ docker run -p 8080:8080 semcache/semcache:latest

 Semcache will start on `http://localhost:8080` and is ready to proxy LLM requests.

-## Setting Up The Client
+## Setting up the proxy client

 Semcache acts as a drop-in replacement for LLM APIs. Point your existing SDK to Semcache instead of the provider's endpoint:

@@ -105,7 +105,7 @@ This request will:
 3. The provider responds with the answer
 4. Semcache caches the response and returns it to you

-## Testing Semantic Similarity
+### Testing Semantic Similarity

 Now try a semantically similar but differently worded question:

@@ -153,7 +153,7 @@ Now try a semantically similar but differently worded question:

 Even though the wording is different, Semcache recognizes the semantic similarity and returns the cached response instantly - no API call to the upstream provider!

-## Checking Cache Status
+### Checking Cache Status

 You can verify cache hits by checking the response headers. If there is a cache hit the `X-Cache-Status` header will be set to `hit`:

@@ -249,6 +249,50 @@ You can verify cache hits by checking the response headers. If there is a cache
+
+## Setting up a cache-aside instance
+
+Install with:
+
+```bash
+pip install semcache
+```
+
+```python
+from semcache import Semcache
+
+# Initialize the client
+client = Semcache(base_url="http://localhost:8080")
+
+# Store a key-data pair
+client.put("What is the capital of France?", "Paris")
+
+# Retrieve data by semantic similarity
+response = client.get("Tell me France's capital city.")
+print(response)  # "Paris"
+```
+
+Or in Node.js, install with:
+
+```bash
+npm install semcache
+```
+
+```javascript
+const SemcacheClient = require('semcache');
+
+const client = new SemcacheClient('http://localhost:8080');
+
+(async () => {
+  await client.put('What is the capital of France?', 'Paris');
+
+  const result = await client.get('What is the capital of France?');
+  console.log(result); // => 'Paris'
+})();
+```
+
 ## Monitor Your Cache

 Visit the built-in admin dashboard at `http://localhost:8080/admin` to monitor:

@@ -263,4 +307,4 @@ The process is identical across all providers - Semcache automatically detects t

 - **[LLM Providers & Tools](./llm-providers-tools.md)** - Configure additional providers like DeepSeek, Mistral, and custom LLMs
 - **[Configuration](./configuration/cache-settings.md)** - Adjust similarity thresholds and cache behavior
-- **[Monitoring](./monitoring/metrics.md)** - Set up production monitoring with Prometheus and Grafana
\ No newline at end of file
+- **[Monitoring](./monitoring/metrics.md)** - Set up production monitoring with Prometheus and Grafana