# AnkiGen Agent System
A sophisticated multi-agent system for generating high-quality flashcards using specialized AI agents.
## Overview
The AnkiGen Agent System replaces the traditional single-LLM approach with a pipeline of specialized agents:
- Generator Agents: Create cards with domain expertise
- Judge Agents: Assess quality using multiple criteria
- Enhancement Agents: Improve and enrich card content
- Coordinators: Orchestrate workflows and handoffs
## Quick Start

### 1. Installation

```bash
pip install openai-agents pyyaml
```

### 2. Environment Configuration

Create a `.env` file or set environment variables:

```bash
# Basic agent mode
export ANKIGEN_AGENT_MODE=hybrid

# Enable specific agents
export ANKIGEN_ENABLE_SUBJECT_EXPERT=true
export ANKIGEN_ENABLE_CONTENT_JUDGE=true
export ANKIGEN_ENABLE_CLARITY_JUDGE=true

# Performance settings
export ANKIGEN_AGENT_TIMEOUT=30.0
export ANKIGEN_MIN_JUDGE_CONSENSUS=0.6
```
### 3. Usage

```python
import asyncio

from ankigen_core.agents.integration import AgentOrchestrator
from ankigen_core.llm_interface import OpenAIClientManager

async def main():
    # Initialize the orchestrator with an OpenAI client
    client_manager = OpenAIClientManager()
    orchestrator = AgentOrchestrator(client_manager)
    await orchestrator.initialize("your-openai-api-key")

    # Generate cards with agents
    cards, metadata = await orchestrator.generate_cards_with_agents(
        topic="Python Functions",
        subject="programming",
        num_cards=5,
        difficulty="intermediate",
    )

asyncio.run(main())
```
## Agent Types

### Generation Agents

#### SubjectExpertAgent
- Purpose: Domain-specific card generation
- Specializes in: Technical accuracy, terminology, real-world applications
- Configuration: `ANKIGEN_ENABLE_SUBJECT_EXPERT=true`

#### PedagogicalAgent
- Purpose: Educational effectiveness review
- Specializes in: Bloom's taxonomy, cognitive load, learning objectives
- Configuration: `ANKIGEN_ENABLE_PEDAGOGICAL_AGENT=true`

#### ContentStructuringAgent
- Purpose: Consistent formatting and organization
- Specializes in: Metadata enrichment, standardization
- Configuration: `ANKIGEN_ENABLE_CONTENT_STRUCTURING=true`

#### GenerationCoordinator
- Purpose: Orchestrates multi-agent generation workflows
- Configuration: `ANKIGEN_ENABLE_GENERATION_COORDINATOR=true`
### Judge Agents

#### ContentAccuracyJudge
- Evaluates: Factual correctness, terminology, misconceptions
- Model: GPT-4o (high accuracy needed)
- Configuration: `ANKIGEN_ENABLE_CONTENT_JUDGE=true`

#### PedagogicalJudge
- Evaluates: Educational effectiveness, cognitive levels
- Model: GPT-4o
- Configuration: `ANKIGEN_ENABLE_PEDAGOGICAL_JUDGE=true`

#### ClarityJudge
- Evaluates: Communication clarity, readability
- Model: GPT-4o-mini (cost-effective)
- Configuration: `ANKIGEN_ENABLE_CLARITY_JUDGE=true`

#### TechnicalJudge
- Evaluates: Code syntax, best practices (technical content only)
- Model: GPT-4o
- Configuration: `ANKIGEN_ENABLE_TECHNICAL_JUDGE=true`

#### CompletenessJudge
- Evaluates: Required fields, metadata, quality standards
- Model: GPT-4o-mini
- Configuration: `ANKIGEN_ENABLE_COMPLETENESS_JUDGE=true`
### Enhancement Agents

#### RevisionAgent
- Purpose: Improves rejected cards based on judge feedback
- Configuration: `ANKIGEN_ENABLE_REVISION_AGENT=true`

#### EnhancementAgent
- Purpose: Adds missing content and enriches metadata
- Configuration: `ANKIGEN_ENABLE_ENHANCEMENT_AGENT=true`
## Operating Modes

### Legacy Mode
```bash
export ANKIGEN_AGENT_MODE=legacy
```
Uses the original single-LLM approach.

### Agent-Only Mode
```bash
export ANKIGEN_AGENT_MODE=agent_only
```
Forces use of the agent system for all generation.

### Hybrid Mode
```bash
export ANKIGEN_AGENT_MODE=hybrid
```
Uses agents when enabled via feature flags, and falls back to legacy otherwise.

### A/B Testing Mode
```bash
export ANKIGEN_AGENT_MODE=a_b_test
export ANKIGEN_AB_TEST_RATIO=0.5
```
Randomly assigns users to agent or legacy generation for comparison.
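
Across these modes, request dispatch can be pictured with the following minimal sketch. The helper and the specific flags it checks are illustrative assumptions, not AnkiGen's actual implementation:

```python
import os
import random

def choose_generation_path() -> str:
    """Return 'agents' or 'legacy' based on ANKIGEN_AGENT_MODE (sketch only)."""
    mode = os.getenv("ANKIGEN_AGENT_MODE", "hybrid")
    if mode == "legacy":
        return "legacy"
    if mode == "agent_only":
        return "agents"
    if mode == "a_b_test":
        # Assign this request to the agent arm with probability ANKIGEN_AB_TEST_RATIO.
        ratio = float(os.getenv("ANKIGEN_AB_TEST_RATIO", "0.5"))
        return "agents" if random.random() < ratio else "legacy"
    # Hybrid: use agents only when at least one agent feature flag is enabled
    # (the two flags checked here are just examples).
    enabled = any(
        os.getenv(flag, "false").lower() == "true"
        for flag in ("ANKIGEN_ENABLE_SUBJECT_EXPERT", "ANKIGEN_ENABLE_CONTENT_JUDGE")
    )
    return "agents" if enabled else "legacy"
```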
## Configuration

### Agent Configuration Files

Agents can be configured via YAML files in `config/agents/`:

```yaml
# config/agents/defaults/generators.yaml
agents:
  subject_expert:
    instructions: "You are a world-class expert in {subject}..."
    model: "gpt-4o"
    temperature: 0.7
    timeout: 45.0
    custom_prompts:
      math: "Focus on problem-solving strategies"
      science: "Emphasize experimental design"
```
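
As a hedged sketch, such a file can be read with PyYAML (installed above); the loader below and its use of the `{subject}` placeholder are illustrative assumptions, not AnkiGen's actual config loader:

```python
import yaml  # provided by the pyyaml dependency

def load_agent_settings(path: str = "config/agents/defaults/generators.yaml") -> dict:
    """Read the YAML file and return the subject_expert settings."""
    with open(path) as f:
        config = yaml.safe_load(f)
    return config["agents"]["subject_expert"]

# Fill the {subject} placeholder in the instructions template.
settings = load_agent_settings()
instructions = settings["instructions"].format(subject="programming")
```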
### Environment Variables

#### Agent Control
- `ANKIGEN_AGENT_MODE`: Operating mode (`legacy`/`agent_only`/`hybrid`/`a_b_test`)
- `ANKIGEN_ENABLE_*`: Enable specific agents (`true`/`false`)

#### Performance
- `ANKIGEN_AGENT_TIMEOUT`: Agent execution timeout (seconds)
- `ANKIGEN_MAX_AGENT_RETRIES`: Maximum retry attempts
- `ANKIGEN_ENABLE_AGENT_CACHING`: Enable response caching

#### Quality Control
- `ANKIGEN_MIN_JUDGE_CONSENSUS`: Minimum agreement between judges (0.0-1.0)
- `ANKIGEN_MAX_REVISION_ITERATIONS`: Maximum revision attempts
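
A minimal sketch of reading these variables with typed defaults (the default values shown are illustrative, not necessarily the shipped defaults):

```python
import os

AGENT_TIMEOUT = float(os.getenv("ANKIGEN_AGENT_TIMEOUT", "30.0"))       # seconds
MAX_RETRIES = int(os.getenv("ANKIGEN_MAX_AGENT_RETRIES", "3"))
CACHING_ENABLED = os.getenv("ANKIGEN_ENABLE_AGENT_CACHING", "true").lower() == "true"
MIN_CONSENSUS = float(os.getenv("ANKIGEN_MIN_JUDGE_CONSENSUS", "0.6"))  # 0.0-1.0
MAX_REVISIONS = int(os.getenv("ANKIGEN_MAX_REVISION_ITERATIONS", "2"))
```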
## Monitoring & Metrics

### Built-in Metrics
The system automatically tracks:
- Agent execution times and success rates
- Quality approval/rejection rates
- Token usage and costs
- Judge consensus scores
### Performance Dashboard

```python
orchestrator = AgentOrchestrator(client_manager)
metrics = orchestrator.get_performance_metrics()

print(f"24h Performance: {metrics['agent_performance']}")
print(f"Quality Metrics: {metrics['quality_metrics']}")
```
### Tracing

The OpenAI Agents SDK provides a built-in tracing UI for debugging workflows.
## Quality Pipeline

### Phase 1: Generation
- Route to the appropriate subject expert
- Generate initial cards
- Optional pedagogical review
- Optional content structuring

### Phase 2: Quality Assessment
- Route cards to relevant judges
- Parallel evaluation by multiple specialists
- Calculate consensus scores (see the sketch after this list)
- Approve/reject based on thresholds

### Phase 3: Improvement
- Revise rejected cards using judge feedback
- Re-evaluate revised cards
- Enhance approved cards with additional content
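
The consensus step in Phase 2 can be pictured with this minimal sketch; the boolean-vote model is an assumption, since the real judges return richer verdicts:

```python
def judge_consensus(votes: list[bool]) -> float:
    """Fraction of judges that approved a card."""
    return sum(votes) / len(votes) if votes else 0.0

# Three judges vote; with the default 0.6 threshold the card is approved:
votes = [True, True, False]               # e.g. accuracy, clarity, pedagogy
approved = judge_consensus(votes) >= 0.6  # 2/3 ≈ 0.67 -> approved
```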
## Cost Optimization

### Model Selection
- Generation: GPT-4o for accuracy
- Simple Judges: GPT-4o-mini for cost efficiency
- Critical Judges: GPT-4o for quality
### Caching Strategy
- Response caching at the agent level
- Shared cache across similar requests
- Configurable cache TTL (a minimal sketch follows)
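
A hedged sketch of what an agent-level TTL cache could look like; this is not the shipped cache implementation:

```python
from __future__ import annotations

import time

class TTLCache:
    """Tiny response cache keyed by (agent, prompt), with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._store: dict[tuple[str, str], tuple[float, str]] = {}

    def get(self, agent: str, prompt: str) -> str | None:
        entry = self._store.get((agent, prompt))
        if entry and time.monotonic() - entry[0] < self._ttl:
            return entry[1]  # still fresh
        return None

    def set(self, agent: str, prompt: str, response: str) -> None:
        self._store[(agent, prompt)] = (time.monotonic(), response)
```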
### Parallel Processing
- Judge agents run in parallel (see the sketch below)
- Batch processing for multiple cards
- Async execution throughout
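
The parallel fan-out can be sketched with `asyncio.gather`; the `judge.evaluate(card)` coroutine signature is an assumption for illustration:

```python
import asyncio

async def run_judges(card, judges):
    """Evaluate one card with every judge concurrently."""
    return await asyncio.gather(*(judge.evaluate(card) for judge in judges))
```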
## Migration Strategy

### Gradual Rollout
- Start with single judge agent
- Enable A/B testing
- Gradually enable more agents
- Monitor quality improvements
### Rollback Plan
- Keep legacy system as fallback
- Feature flags for quick disable
- Performance comparison dashboards
### Success Metrics
- 20%+ improvement in card quality scores
- Reduced manual editing needs
- Better user satisfaction ratings
- Maintained or improved generation speed
## Troubleshooting

### Common Issues

#### Agents Not Initializing
- Check OpenAI API key validity
- Verify agent mode configuration
- Check feature flag settings
#### Poor Quality Results
- Adjust judge consensus thresholds
- Enable more specialized judges
- Review agent configuration prompts
#### Performance Issues
- Enable caching
- Use parallel processing
- Optimize model selection
### Debug Mode

```bash
export ANKIGEN_ENABLE_AGENT_TRACING=true
```

Enables detailed logging and the tracing UI for workflow debugging.
## Examples

### Basic Usage

```python
# Simple generation with agents
cards, metadata = await orchestrator.generate_cards_with_agents(
    topic="Machine Learning",
    subject="data_science",
    num_cards=10,
)
```
### Advanced Configuration

```python
# Custom enhancement targets
cards = await enhancement_agent.enhance_card_batch(
    cards=cards,
    enhancement_targets=["prerequisites", "learning_outcomes", "examples"],
)
```
### Quality Pipeline

```python
# Manual quality assessment
judge_results = await judge_coordinator.coordinate_judgment(
    cards=cards,
    enable_parallel=True,
    min_consensus=0.8,
)
```
## Contributing

### Adding New Agents
- Inherit from `BaseAgentWrapper` (see the skeleton below)
- Add configuration in YAML files
- Update feature flags
- Add to coordinator workflows
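
A hedged skeleton of a new agent; the import path, the `execute` method, and the `_call_llm` helper shown here are assumptions about the base class, not its documented API:

```python
from ankigen_core.agents.base import BaseAgentWrapper  # import path is an assumption

class MnemonicAgent(BaseAgentWrapper):
    """Hypothetical agent that adds memory aids to approved cards."""

    async def execute(self, cards):
        # Delegate to the wrapped LLM call assumed to be provided by the base class.
        return await self._call_llm(
            prompt="Suggest a short mnemonic for each card.",
            payload=cards,
        )
```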
### Testing

```bash
python -m pytest tests/unit/test_agents/
python -m pytest tests/integration/test_agent_workflows.py
```
## Support
For issues and questions:
- Check the troubleshooting guide
- Review agent tracing logs
- Monitor performance metrics
- Enable debug mode for detailed logging