---
library_name: efficient-context
language: python
tags:
  - context-optimization
  - llm
  - cpu-optimization
  - resource-constrained
  - memory-management
license: mit
---

# efficient-context

A Python library for optimizing LLM context handling in CPU-constrained environments.

## Model / Library Description

`efficient-context` addresses the challenge of working with large language models (LLMs) on CPU-only and memory-limited systems by providing efficient context management strategies. The library focuses on making LLMs more usable when computational resources are limited.

## Intended Use

This library is designed for:
- Deploying LLMs in resource-constrained environments
- Optimizing context handling for edge devices
- Creating applications that need to run on standard hardware
- Reducing memory usage when working with large documents

## Features

### Context Compression
- Semantic deduplication to remove redundant information
- Importance-based pruning that keeps critical information
- Automatic summarization of less relevant sections
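The semantic deduplication idea can be illustrated with a minimal sketch. This is not the library's implementation: it uses a toy bag-of-words embedding in place of a real sentence encoder, and the `embed`, `cosine`, and `deduplicate` names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a real pipeline would use a sentence encoder."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def deduplicate(sentences, threshold=0.85):
    """Keep a sentence only if it is not a near-duplicate of one already kept."""
    kept = []
    for s in sentences:
        if all(cosine(embed(s), embed(k)) < threshold for k in kept):
            kept.append(s)
    return kept

sents = [
    "Solar power reduces carbon emissions.",
    "Solar power reduces carbon emissions significantly.",
    "Wind turbines generate electricity offshore.",
]
deduped = deduplicate(sents, threshold=0.8)
```

Here the second sentence is dropped as redundant with the first, while the unrelated third sentence survives; lowering the threshold makes deduplication more aggressive.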

### Advanced Chunking
- Semantic chunking that preserves logical units
- Adaptive chunk sizing based on content complexity
- Chunk relationships mapping for coherent retrieval
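A simple way to picture sentence-preserving chunking is a greedy packer that never splits a sentence across chunks. This sketch is illustrative only (the `semantic_chunks` function is an assumption, not the library's algorithm), and it uses a word budget where the library may count tokens.

```python
import re

def semantic_chunks(text, chunk_size=256):
    """Greedily pack whole sentences into chunks of at most chunk_size words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sent in sentences:
        words = len(sent.split())
        if current and count + words > chunk_size:
            # Budget exceeded: close the current chunk and start a new one
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sent)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks

text = ("Solar panels convert sunlight. They need little maintenance. "
        "Wind farms complement them well.")
chunks = semantic_chunks(text, chunk_size=8)
```

With an 8-word budget, the first two sentences share a chunk and the third starts a new one, so no sentence is ever cut mid-thought.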

### Retrieval Optimization
- Lightweight embedding models optimized for CPU
- Tiered retrieval strategies (local vs. remote)
- Query-aware context assembly
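Query-aware assembly boils down to scoring candidate chunks against the query and keeping the best. The sketch below shows the idea with a toy bag-of-words scorer; the `bow` and `retrieve` names are hypothetical and stand in for the library's CPU-optimized embeddings.

```python
import math
import re
from collections import Counter

def bow(text):
    """Toy bag-of-words vector standing in for a lightweight embedding."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, passages, top_k=2):
    """Return the passages most similar to the query, best first."""
    q = bow(query)
    return sorted(passages, key=lambda p: cosine(q, bow(p)), reverse=True)[:top_k]

passages = [
    "Renewable energy lowers climate impact.",
    "Pasta recipes need salted water.",
    "Climate policy shapes energy markets.",
]
top = retrieve("climate impact of renewable energy", passages, top_k=1)
```

The off-topic passage scores near zero and is never assembled into the context, which is what keeps the final prompt small on constrained hardware.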

### Memory Management
- Progressive loading/unloading of context
- Streaming context processing
- Memory-aware caching strategies
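Memory-aware caching can be sketched as an LRU cache bounded by approximate byte usage rather than entry count. The `ContextCache` class below is an illustration of the concept, not a class exported by the library.

```python
from collections import OrderedDict

class ContextCache:
    """LRU cache bounded by approximate memory use, not entry count."""

    def __init__(self, max_bytes=1_000_000):
        self.max_bytes = max_bytes
        self.used = 0
        self._store = OrderedDict()  # key -> (text, size_in_bytes)

    def put(self, key, text):
        size = len(text.encode("utf-8"))
        if key in self._store:
            self.used -= self._store.pop(key)[1]
        self._store[key] = (text, size)
        self.used += size
        # Evict least-recently-used entries until back under budget
        while self.used > self.max_bytes and self._store:
            _, (_, freed) = self._store.popitem(last=False)
            self.used -= freed

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as recently used
        return self._store[key][0]
```

Bounding by bytes instead of entries matters on memory-limited machines, where a handful of large chunks can cost more than hundreds of small ones.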

## Installation

```bash
pip install efficient-context
```

## Usage

```python
from efficient_context import ContextManager
from efficient_context.compression import SemanticDeduplicator
from efficient_context.chunking import SemanticChunker
from efficient_context.retrieval import CPUOptimizedRetriever

# Initialize a context manager with custom strategies
context_manager = ContextManager(
    compressor=SemanticDeduplicator(threshold=0.85),
    chunker=SemanticChunker(chunk_size=256),
    retriever=CPUOptimizedRetriever(embedding_model="lightweight")
)

# Add documents to your context (e.g. a list of strings)
context_manager.add_documents(documents)

# Generate optimized context for a query
query = "Tell me about the climate impact of renewable energy"
optimized_context = context_manager.generate_context(query=query)

# Use the optimized context with your LLM
response = your_llm_model.generate(prompt=query, context=optimized_context)
```

## Performance and Benchmarks

In internal benchmarks on repetitive content, the library showed:
- A 57.5% reduction in token count with a deduplication threshold of 0.7
- Processing times of 0.13-0.84 seconds for a 426-word document
- Query times of 0.08-0.14 seconds

## Limitations

- Designed primarily for text data
- Performance depends on the quality of embedding models
- Semantic deduplication may occasionally remove content that appears similar but has subtle differences

## Maintainer

This project is maintained by [Biswanath Roul](https://github.com/biswanathroul).

## License

MIT