---
library_name: efficient-context
language: python
tags:
- context-optimization
- llm
- cpu-optimization
- resource-constrained
- memory-management
license: mit
datasets:
- None
---
# efficient-context
A Python library for optimizing LLM context handling in CPU-constrained environments.
## Model / Library Description
`efficient-context` addresses the challenge of working with large language models (LLMs) on CPU-only and memory-limited systems by providing efficient context management strategies. The library focuses on making LLMs more usable when computational resources are limited.
## Intended Use
This library is designed for:
- Deploying LLMs in resource-constrained environments
- Optimizing context handling for edge devices
- Creating applications that need to run on standard hardware
- Reducing memory usage when working with large documents
## Features
### Context Compression
- Semantic deduplication to remove redundant information
- Importance-based pruning that keeps critical information
- Automatic summarization of less relevant sections
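The deduplication idea can be pictured with a minimal sketch. This is an illustrative toy that uses bag-of-words cosine similarity, not the library's actual implementation, which relies on embedding models:

```python
import re
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def deduplicate(sentences, threshold=0.85):
    """Keep each sentence only if it is not too similar to any kept one."""
    kept, vectors = [], []
    for s in sentences:
        vec = Counter(re.findall(r"\w+", s.lower()))
        if all(cosine(vec, v) < threshold for v in vectors):
            kept.append(s)
            vectors.append(vec)
    return kept

docs = [
    "Solar power reduces carbon emissions.",
    "Solar power reduces carbon emissions significantly.",  # near-duplicate
    "Wind turbines generate electricity from moving air.",
]
print(deduplicate(docs))  # the near-duplicate second sentence is dropped
```

Raising the threshold keeps more borderline sentences; lowering it prunes more aggressively.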
### Advanced Chunking
- Semantic chunking that preserves logical units
- Adaptive chunk sizing based on content complexity
- Chunk relationships mapping for coherent retrieval
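The core of sentence-preserving chunking can be sketched as follows. This is a simplified illustration with a fixed word budget; the library's adaptive sizing and relationship mapping are not shown:

```python
import re

def semantic_chunks(text: str, chunk_size: int = 25):
    """Group whole sentences into chunks of at most ~chunk_size words,
    so no sentence is ever split across a chunk boundary."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        if current and count + words > chunk_size:
            chunks.append(" ".join(current))  # close the current chunk
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "One two three. Four five six seven. Eight nine ten eleven twelve."
print(semantic_chunks(text, chunk_size=7))
```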
### Retrieval Optimization
- Lightweight embedding models optimized for CPU
- Tiered retrieval strategies (local vs. remote)
- Query-aware context assembly
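Query-aware context assembly boils down to scoring chunks against the query and keeping the best ones in document order. A toy sketch, using bag-of-words cosine scores in place of the real embedding models:

```python
import re
from collections import Counter
from math import sqrt

def _vec(text):
    return Counter(re.findall(r"\w+", text.lower()))

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def assemble_context(query, chunks, top_k=2):
    """Score every chunk against the query, keep the top_k best,
    and emit them in their original document order."""
    q = _vec(query)
    ranked = sorted(range(len(chunks)),
                    key=lambda i: _cosine(q, _vec(chunks[i])),
                    reverse=True)
    best = sorted(ranked[:top_k])  # restore document order
    return "\n".join(chunks[i] for i in best)

chunks = [
    "Wind and solar power cut emissions.",
    "The stock market closed higher today.",
    "Renewable energy reduces climate impact.",
]
print(assemble_context("climate impact of renewable energy", chunks))
```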
### Memory Management
- Progressive loading/unloading of context
- Streaming context processing
- Memory-aware caching strategies
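Memory-aware caching can be illustrated with a small LRU cache that evicts the least recently used chunks once a byte budget is exceeded. This is a hypothetical sketch of the general technique, not the library's actual cache:

```python
from collections import OrderedDict

class ChunkCache:
    """Memory-aware LRU cache: evicts the least recently used chunks
    once the total cached size exceeds a byte budget."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._store = OrderedDict()

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        return None

    def put(self, key, chunk: str):
        if key in self._store:
            self.used -= len(self._store.pop(key))
        self._store[key] = chunk
        self.used += len(chunk)
        while self.used > self.max_bytes:  # evict oldest until within budget
            _, evicted = self._store.popitem(last=False)
            self.used -= len(evicted)

cache = ChunkCache(max_bytes=20)
cache.put("a", "x" * 10)
cache.put("b", "y" * 10)
cache.get("a")            # touch "a" so "b" becomes the eviction candidate
cache.put("c", "z" * 10)  # budget exceeded: "b" is evicted
```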
## Installation
```bash
pip install efficient-context
```
## Usage
```python
from efficient_context import ContextManager
from efficient_context.compression import SemanticDeduplicator
from efficient_context.chunking import SemanticChunker
from efficient_context.retrieval import CPUOptimizedRetriever
# Initialize a context manager with custom strategies
context_manager = ContextManager(
    compressor=SemanticDeduplicator(threshold=0.85),
    chunker=SemanticChunker(chunk_size=256),
    retriever=CPUOptimizedRetriever(embedding_model="lightweight")
)

# Add documents to your context (`documents` is a list you supply)
context_manager.add_documents(documents)

# Generate optimized context for a query
optimized_context = context_manager.generate_context(
    query="Tell me about the climate impact of renewable energy"
)

# Use the optimized context with your LLM (`your_llm_model` and `prompt`
# are placeholders for your own model and prompt)
response = your_llm_model.generate(prompt=prompt, context=optimized_context)
```
## Performance and Benchmarks
In internal benchmarks on repetitive content, the library showed:
- A 57.5% reduction in token count at a deduplication threshold of 0.7
- Processing times of 0.13-0.84 seconds for a 426-word document
- Query times of 0.08-0.14 seconds
## Limitations
- Designed primarily for text data
- Performance depends on the quality of embedding models
- Semantic deduplication may occasionally drop content that looks similar to retained text but differs in subtle, meaningful ways
## Maintainer
This project is maintained by [Biswanath Roul](https://github.com/biswanathroul)
## License
MIT