brickfrog committed on

Commit 0333a17 · verified · 1 Parent(s): efea6b6

Upload folder using huggingface_hub

README.md CHANGED

@@ -5,12 +5,12 @@ app_file: app.py
 requirements: requirements.txt
 python: 3.10
 sdk: gradio
-sdk_version: 5.34.2
+sdk_version: 5.38.1
 ---
 
 # AnkiGen - Anki Card Generator
 
-AnkiGen is a Gradio-based web application that generates high-quality Anki-compatible CSV and `.apkg` deck files using an advanced multi-agent system powered by OpenAI Agents. The system employs specialized generator agents, quality assessment judges, and enhancement agents to create superior flashcards.
+AnkiGen is a Gradio-based web application that generates high-quality Anki-compatible CSV and `.apkg` deck files using the OpenAI Agents SDK. The system leans on a specialized subject expert agent plus a lightweight self-review step to create solid flashcards without an expensive multi-agent cascade.
 
 ## Features

@@ -113,12 +113,10 @@ The codebase uses a sophisticated multi-agent architecture powered by the OpenAI
 
 - `app.py`: Main Gradio application interface and event handling.
 - `ankigen_core/`: Directory containing the core logic modules:
-  - `agents/`: **OpenAI Agents system implementation**:
-    - `base.py`: Base agent wrapper and configuration classes
-    - `generators.py`: Specialized generator agents (SubjectExpertAgent, PedagogicalAgent, ContentStructuringAgent)
-    - `judges.py`: Quality assessment agents (ContentAccuracyJudge, PedagogicalJudge, ClarityJudge, etc.)
-    - `enhancers.py`: Revision and enhancement agents for card improvement
-    - `integration.py`: AgentOrchestrator for coordinating the entire agent system
+  - `agents/`: **OpenAI Agents system implementation**:
+    - `base.py`: Base agent wrapper and configuration classes
+    - `generators.py`: SubjectExpertAgent for primary card creation
+    - `integration.py`: AgentOrchestrator for orchestrating generation + self-review
   - `config.py`: Agent configuration management
   - `schemas.py`: Pydantic schemas for structured agent outputs
   - `templates/`: Jinja2 templates for agent prompts

@@ -139,26 +137,11 @@ The codebase uses a sophisticated multi-agent architecture powered by the OpenAI
 
 AnkiGen employs a sophisticated multi-agent system built on the OpenAI Agents SDK that ensures high-quality flashcard generation through specialized roles and quality control:
 
-### Generator Agents
-- **SubjectExpertAgent**: Provides domain-specific expertise for accurate content creation
-- **PedagogicalAgent**: Ensures cards follow effective learning principles and memory techniques
-- **ContentStructuringAgent**: Optimizes card structure, formatting, and information hierarchy
-
-### Quality Assurance Judges
-- **ContentAccuracyJudge**: Verifies factual correctness and subject matter accuracy
-- **PedagogicalJudge**: Evaluates learning effectiveness and educational value
-- **ClarityJudge**: Assesses readability, comprehension, and clear communication
-- **TechnicalJudge**: Reviews technical accuracy for specialized subjects
-- **CompletenessJudge**: Ensures comprehensive coverage without information gaps
-
-### Enhancement Agents
-- **RevisionAgent**: Identifies areas for improvement based on judge feedback
-- **EnhancementAgent**: Implements refinements and optimizations to failed cards
+### Generator Agent
+- **SubjectExpertAgent**: Provides domain-specific expertise for accurate content creation, followed by a single lightweight quality review that can revise or drop weak cards.
 
 ### Orchestration
-- **GenerationCoordinator**: Manages the card generation workflow and agent handoffs
-- **JudgeCoordinator**: Coordinates quality assessment across all judge agents
-- **AgentOrchestrator**: Main system controller that initializes and manages the entire agent ecosystem
+- **AgentOrchestrator**: Main system controller that initializes the simplified agent pipeline and runs self-review before returning cards.
 
 This architecture ensures that every generated flashcard undergoes rigorous quality control and iterative improvement, resulting in superior learning materials.
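
For orientation while reading the diffs below, here is a minimal sketch of how the simplified pipeline is driven end to end. The orchestrator calls match the signatures changed in this commit; the no-argument `OpenAIClientManager()` constructor and the placeholder API key are assumptions.

```python
import asyncio

from ankigen_core.llm_interface import OpenAIClientManager
from ankigen_core.agents.integration import AgentOrchestrator


async def main() -> None:
    client_manager = OpenAIClientManager()  # assumed no-arg constructor
    orchestrator = AgentOrchestrator(client_manager)
    await orchestrator.initialize(api_key="sk-...")  # placeholder key

    # Runs the SubjectExpertAgent, then the single self-review pass.
    cards, metadata = await orchestrator.generate_cards_with_agents(
        topic="Python decorators",
        subject="programming",
        num_cards=5,
    )
    print(f"Got {len(cards)} cards in {metadata['generation_time']:.1f}s")


asyncio.run(main())
```
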
 
ankigen_core/agents/__init__.py CHANGED

@@ -1,37 +1,12 @@
 # Agent system for AnkiGen agentic workflows
 
 from .base import BaseAgentWrapper, AgentConfig
-from .generators import (
-    SubjectExpertAgent,
-    PedagogicalAgent,
-    ContentStructuringAgent,
-    GenerationCoordinator,
-)
-from .judges import (
-    ContentAccuracyJudge,
-    PedagogicalJudge,
-    ClarityJudge,
-    TechnicalJudge,
-    CompletenessJudge,
-    JudgeCoordinator,
-)
-from .enhancers import RevisionAgent, EnhancementAgent
+from .generators import SubjectExpertAgent
 from .config import AgentConfigManager
 
 __all__ = [
     "BaseAgentWrapper",
     "AgentConfig",
     "SubjectExpertAgent",
-    "PedagogicalAgent",
-    "ContentStructuringAgent",
-    "GenerationCoordinator",
-    "ContentAccuracyJudge",
-    "PedagogicalJudge",
-    "ClarityJudge",
-    "TechnicalJudge",
-    "CompletenessJudge",
-    "JudgeCoordinator",
-    "RevisionAgent",
-    "EnhancementAgent",
     "AgentConfigManager",
 ]
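
Downstream code that still imports the removed judge/enhancer names will now fail at import time. A small sketch of the surviving public surface (the commented import illustrates the failure mode):

```python
# The simplified public surface of ankigen_core.agents after this commit.
from ankigen_core.agents import (
    AgentConfig,
    AgentConfigManager,
    BaseAgentWrapper,
    SubjectExpertAgent,
)

# Removed names now raise at import time, e.g.:
# from ankigen_core.agents import JudgeCoordinator  # ImportError
```
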
ankigen_core/agents/base.py CHANGED

@@ -62,6 +62,11 @@ class BaseAgentWrapper:
     async def initialize(self):
         """Initialize the OpenAI agent with structured output support"""
         try:
+            # Set the default OpenAI client for the agents SDK
+            from agents import set_default_openai_client
+
+            set_default_openai_client(self.openai_client, use_for_tracing=False)
+
             # Create model settings with temperature
             model_settings = ModelSettings(temperature=self.config.temperature)
 

@@ -102,30 +107,51 @@ class BaseAgentWrapper:
         if not self.agent:
             await self.initialize()
 
+        # Add context to the user input if provided
+        enhanced_input = user_input
+        if context is not None:
+            context_str = "\n".join([f"{k}: {v}" for k, v in context.items()])
+            enhanced_input = f"{user_input}\n\nContext:\n{context_str}"
+
+        # Execute the agent using Runner.run() with retry logic
+        if self.agent is None:
+            raise ValueError("Agent not initialized")
+
+        logger.info(f"🤖 EXECUTING AGENT: {self.config.name}")
+        logger.info(f"📝 INPUT: {enhanced_input[:200]}...")
+
+        import time
+
+        start_time = time.time()
+
+        for attempt in range(self.config.retry_attempts):
+            try:
+                result = await asyncio.wait_for(
+                    Runner.run(
+                        starting_agent=self.agent,
+                        input=enhanced_input,
+                    ),
+                    timeout=self.config.timeout,
+                )
+                break
+            except asyncio.TimeoutError:
+                if attempt < self.config.retry_attempts - 1:
+                    logger.warning(
+                        f"Agent {self.config.name} timed out (attempt {attempt + 1}/{self.config.retry_attempts}), retrying..."
+                    )
+                    continue
+                else:
+                    logger.error(
+                        f"Agent {self.config.name} timed out after {self.config.retry_attempts} attempts"
+                    )
+                    raise
+
         try:
-            # Add context to the user input if provided
-            enhanced_input = user_input
-            if context is not None:
-                context_str = "\n".join([f"{k}: {v}" for k, v in context.items()])
-                enhanced_input = f"{user_input}\n\nContext:\n{context_str}"
-
-            # Execute the agent using Runner.run()
-            if self.agent is None:
-                raise ValueError("Agent not initialized")
-
-            logger.info(f"🤖 EXECUTING AGENT: {self.config.name}")
-            logger.info(f"📝 INPUT: {enhanced_input[:200]}...")
-
-            result = await asyncio.wait_for(
-                Runner.run(
-                    starting_agent=self.agent,
-                    input=enhanced_input,
-                ),
-                timeout=self.config.timeout,
+            execution_time = time.time() - start_time
+            logger.info(
+                f"Agent {self.config.name} executed successfully in {execution_time:.2f}s"
             )
 
-            logger.info(f"Agent {self.config.name} executed successfully")
-
             # Extract usage information from raw_responses
             total_usage = {
                 "input_tokens": 0,
ankigen_core/agents/config.py CHANGED

@@ -54,7 +54,6 @@ class AgentConfigManager:
         self.configs: Dict[str, AgentConfig] = {}
         self.prompt_templates: Dict[str, AgentPromptTemplate] = {}
 
-        # Set up Jinja2 environment with templates directory
         template_dir = Path(__file__).parent / "templates"
         self.jinja_env = Environment(loader=FileSystemLoader(template_dir))
         self._load_default_configs()

@@ -66,18 +65,15 @@ class AgentConfigManager:
         logger.info(f"Updated model overrides: {model_overrides}")
 
     def update_template_vars(self, template_vars: Dict[str, Any]):
-        """Update template variables and regenerate configs"""
-        self.template_vars = template_vars
-        self._load_default_configs()
-        logger.info(f"Updated template variables: {template_vars}")
+        logger.info(
+            "Template vars are no longer used in the simplified agent pipeline."
+        )
 
     def _load_default_configs(self):
         """Load all default configurations from Jinja templates"""
         try:
             self._load_configs_from_template("generators.j2")
-            self._load_configs_from_template("judges.j2")
-            self._load_configs_from_template("enhancers.j2")
-            self._load_prompt_templates_from_template("prompts.j2")
+            self.prompt_templates.clear()
             logger.info(
                 f"Loaded {len(self.configs)} agent configurations from Jinja templates"
             )

@@ -96,17 +92,6 @@ class AgentConfigManager:
         # Default models for each agent type
         default_models = {
             "subject_expert_model": "gpt-4.1",
-            "pedagogical_agent_model": "gpt-4.1-nano",
-            "content_structuring_model": "gpt-4.1-nano",
-            "generation_coordinator_model": "gpt-4.1",
-            "content_accuracy_judge_model": "gpt-4.1-nano",
-            "pedagogical_judge_model": "gpt-4.1-nano",
-            "clarity_judge_model": "gpt-4.1-nano",
-            "technical_judge_model": "gpt-4.1-nano",
-            "completeness_judge_model": "gpt-4.1-nano",
-            "judge_coordinator_model": "gpt-4.1",
-            "revision_agent_model": "gpt-4.1",
-            "enhancement_agent_model": "gpt-4.1",
         }
 
         # Simple mapping: agent_name -> agent_name_model

@@ -140,29 +125,6 @@ class AgentConfigManager:
         except Exception as e:
             logger.error(f"Failed to load configs from template {template_name}: {e}")
 
-    def _load_prompt_templates_from_template(self, template_name: str):
-        """Load prompt templates from a Jinja template"""
-        try:
-            template = self.jinja_env.get_template(template_name)
-
-            # Render with current template variables
-            rendered_json = template.render(**self.template_vars)
-            template_data = json.loads(rendered_json)
-
-            # Create AgentPromptTemplate objects
-            for template_name, template_info in template_data.items():
-                prompt_template = AgentPromptTemplate(
-                    system_prompt=template_info.get("system_prompt", ""),
-                    user_prompt_template=template_info.get("user_prompt_template", ""),
-                    variables=template_info.get("variables", {}),
-                )
-                self.prompt_templates[template_name] = prompt_template
-
-        except Exception as e:
-            logger.error(
-                f"Failed to load prompt templates from template {template_name}: {e}"
-            )
-
     def get_agent_config(self, agent_name: str) -> Optional[AgentConfig]:
         """Get configuration for a specific agent"""
         return self.configs.get(agent_name)
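
With only one model key left, overriding it is the whole configuration story. A hedged sketch (the override key mirrors `default_models` above; whether `update_models` re-renders configs immediately is an assumption based on its log message, and the override value is illustrative):

```python
from ankigen_core.agents.config import get_config_manager

config_manager = get_config_manager()

# Only the subject expert key survives the cleanup; other *_model keys are gone.
config_manager.update_models({"subject_expert_model": "gpt-4.1-mini"})  # value illustrative

config = config_manager.get_agent_config("subject_expert")
if config:
    # Expected "gpt-4.1-mini", assuming update_models re-renders the configs.
    print(config.model)
```
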
ankigen_core/agents/generators.py CHANGED

@@ -1,18 +1,60 @@
 # Specialized generator agents for card generation
 
 import json
-from typing import List, Dict, Any, Optional
-from datetime import datetime
+from typing import List, Dict, Any, Optional, Tuple
 
 from openai import AsyncOpenAI
 
 from ankigen_core.logging import logger
 from ankigen_core.models import Card, CardFront, CardBack
-from .base import BaseAgentWrapper
+from .base import BaseAgentWrapper, AgentConfig
 from .config import get_config_manager
 from .schemas import CardsGenerationSchema
 
 
+def card_dict_to_card(
+    card_data: Dict[str, Any],
+    default_topic: str,
+    default_subject: str,
+) -> Card:
+    """Convert a dictionary representation of a card into a Card object."""
+
+    if not isinstance(card_data, dict):
+        raise ValueError("Card payload must be a dictionary")
+
+    front_data = card_data.get("front")
+    back_data = card_data.get("back")
+
+    if not isinstance(front_data, dict) or "question" not in front_data:
+        raise ValueError("Card front must include a question field")
+    if not isinstance(back_data, dict) or "answer" not in back_data:
+        raise ValueError("Card back must include an answer field")
+
+    metadata = card_data.get("metadata", {}) or {}
+    if not isinstance(metadata, dict):
+        metadata = {}
+
+    subject = metadata.get("subject") or default_subject or "general"
+    topic = metadata.get("topic") or default_topic or "General Concepts"
+
+    card = Card(
+        card_type=str(card_data.get("card_type", "basic")),
+        front=CardFront(question=str(front_data.get("question", ""))),
+        back=CardBack(
+            answer=str(back_data.get("answer", "")),
+            explanation=str(back_data.get("explanation", "")),
+            example=str(back_data.get("example", "")),
+        ),
+        metadata=metadata,
+    )
+
+    if card.metadata is not None:
+        card.metadata.setdefault("subject", subject)
+        card.metadata.setdefault("topic", topic)
+
+    return card
+
+
 class SubjectExpertAgent(BaseAgentWrapper):
     """Subject matter expert agent for domain-specific card generation"""
 

@@ -42,21 +84,95 @@ class SubjectExpertAgent(BaseAgentWrapper):
     async def generate_cards(
         self, topic: str, num_cards: int = 5, context: Optional[Dict[str, Any]] = None
     ) -> List[Card]:
-        """Generate flashcards for a given topic"""
+        """Generate flashcards for a given topic with automatic batching for large requests"""
         try:
-            user_input = f"Generate {num_cards} flashcards for the topic: {topic}"
-            if context:
-                user_input += f"\n\nAdditional context: {context}"
+            # Use batching for large numbers of cards to avoid LLM limitations
+            batch_size = 10  # Generate max 10 cards per batch
+            all_cards = []
+            total_usage = {"total_tokens": 0, "input_tokens": 0, "output_tokens": 0}
+
+            cards_remaining = num_cards
+            batch_num = 1
+
+            logger.info(
+                f"Generating {num_cards} cards for topic '{topic}' using {((num_cards - 1) // batch_size) + 1} batches"
+            )
+
+            # Track card topics from previous batches to avoid duplication
+            previous_card_topics = []
+
+            while cards_remaining > 0:
+                cards_in_this_batch = min(batch_size, cards_remaining)
+
+                logger.info(
+                    f"Generating batch {batch_num}: {cards_in_this_batch} cards"
+                )
+
+                # Reset agent for each batch to avoid conversation history accumulation
+                self.agent = None
+                await self.initialize()
 
-            response, usage = await self.execute(user_input, context)
+                user_input = (
+                    f"Generate {cards_in_this_batch} flashcards for the topic: {topic}"
+                )
+                if context:
+                    user_input += f"\n\nAdditional context: {context}"
+
+                # Add previous topics to avoid repetition instead of full conversation history
+                if previous_card_topics:
+                    topics_summary = ", ".join(
+                        previous_card_topics[-20:]
+                    )  # Last 20 topics to keep it manageable
+                    user_input += f"\n\nAvoid creating cards about these already covered topics: {topics_summary}"
+
+                if batch_num > 1:
+                    user_input += f"\n\nThis is batch {batch_num} of cards. Ensure these cards cover different aspects of the topic."
+
+                response, usage = await self.execute(user_input, context)
+
+                # Accumulate usage information
+                if usage:
+                    for key in total_usage:
+                        total_usage[key] += usage.get(key, 0)
+
+                batch_cards = self._parse_cards_response(response, topic)
+                all_cards.extend(batch_cards)
+
+                # Extract topics from generated cards to avoid duplication in next batch
+                for card in batch_cards:
+                    if hasattr(card, "front") and card.front and card.front.question:
+                        # Extract key terms from the question for deduplication
+                        question_words = card.front.question.lower().split()
+                        key_terms = [word for word in question_words if len(word) > 3][
+                            :3
+                        ]  # First 3 meaningful words
+                        if key_terms:
+                            previous_card_topics.append(" ".join(key_terms))
+
+                cards_remaining -= len(batch_cards)
+                batch_num += 1
+
+                logger.info(
+                    f"Batch {batch_num-1} generated {len(batch_cards)} cards. {cards_remaining} cards remaining."
+                )
+
+                # Safety check to prevent infinite loops
+                if len(batch_cards) == 0:
+                    logger.warning(
+                        f"No cards generated in batch {batch_num-1}, stopping generation"
+                    )
+                    break
 
-            # Log usage information
-            if usage and usage.get("total_tokens", 0) > 0:
+            # Log final usage information
+            if total_usage.get("total_tokens", 0) > 0:
                 logger.info(
-                    f"💰 Token Usage: {usage['total_tokens']} tokens (Input: {usage['input_tokens']}, Output: {usage['output_tokens']})"
+                    f"💰 Total Token Usage: {total_usage['total_tokens']} tokens (Input: {total_usage['input_tokens']}, Output: {total_usage['output_tokens']})"
                 )
 
-            return self._parse_cards_response(response, topic)
+            logger.info(
+                f"✅ Generated {len(all_cards)} cards total across {batch_num-1} batches for topic '{topic}'"
+            )
+            return all_cards
 
         except Exception as e:
             logger.error(f"Card generation failed: {e}")

@@ -148,65 +264,17 @@ Return your response as a JSON object with this structure:
         cards = []
         for i, card_data in enumerate(card_data_list):
             try:
-                # Handle both Pydantic models and dictionaries
-                if hasattr(card_data, "front"):
-                    # Pydantic model
-                    front_data = card_data.front
-                    back_data = card_data.back
-                    metadata = card_data.metadata
-                    card_type = card_data.card_type
-                else:
-                    # Dictionary
-                    if "front" not in card_data or "back" not in card_data:
-                        logger.warning(f"Skipping card {i}: missing front or back")
-                        continue
-                    front_data = card_data["front"]
-                    back_data = card_data["back"]
-                    metadata = card_data.get("metadata", {})
-                    card_type = card_data.get("card_type", "basic")
-
-                # Extract question and answer
-                if hasattr(front_data, "question"):
-                    question = front_data.question
-                else:
-                    question = front_data.get("question", "")
-
-                if hasattr(back_data, "answer"):
-                    answer = back_data.answer
-                    explanation = back_data.explanation
-                    example = back_data.example
+                if hasattr(card_data, "dict"):
+                    payload = card_data.dict()
+                elif isinstance(card_data, dict):
+                    payload = card_data
                 else:
-                    answer = back_data.get("answer", "")
-                    explanation = back_data.get("explanation", "")
-                    example = back_data.get("example", "")
-
-                if not question or not answer:
-                    logger.warning(f"Skipping card {i}: missing question or answer")
+                    logger.warning(
+                        f"Skipping card {i}: unsupported payload type {type(card_data)}"
+                    )
                     continue
 
-                # Create Card object
-                card = Card(
-                    card_type=card_type,
-                    front=CardFront(question=question),
-                    back=CardBack(
-                        answer=answer,
-                        explanation=explanation,
-                        example=example,
-                    ),
-                    metadata=metadata
-                    if isinstance(metadata, dict)
-                    else metadata.dict()
-                    if hasattr(metadata, "dict")
-                    else {},
-                )
-
-                # Ensure metadata includes subject and topic
-                if card.metadata is not None:
-                    if "subject" not in card.metadata:
-                        card.metadata["subject"] = self.subject
-                    if "topic" not in card.metadata:
-                        card.metadata["topic"] = topic
-
+                card = card_dict_to_card(payload, topic, self.subject)
                 cards.append(card)
 
             except Exception as e:

@@ -234,357 +302,85 @@ Return your response as a JSON object with this structure:
             raise
 
 
-class PedagogicalAgent(BaseAgentWrapper):
-    """Pedagogical specialist for educational effectiveness"""
-
-    def __init__(self, openai_client: AsyncOpenAI):
-        config_manager = get_config_manager()
-        base_config = config_manager.get_agent_config("pedagogical")
-
-        if not base_config:
-            raise ValueError(
-                "pedagogical configuration not found - agent system not properly initialized"
-            )
-
-        super().__init__(base_config, openai_client)
-
-    async def review_cards(self, cards: List[Card]) -> List[Dict[str, Any]]:
-        """Review cards for pedagogical effectiveness"""
-        datetime.now()
-
-        try:
-            reviews = []
-
-            for i, card in enumerate(cards):
-                user_input = self._build_review_prompt(card, i)
-                response, usage = await self.execute(user_input)
-
-                try:
-                    review_data = (
-                        json.loads(response) if isinstance(response, str) else response
-                    )
-                    reviews.append(review_data)
-                except Exception as e:
-                    logger.warning(f"Failed to parse review for card {i}: {e}")
-                    reviews.append(
-                        {
-                            "approved": True,
-                            "feedback": f"Review parsing failed: {e}",
-                            "improvements": [],
-                        }
-                    )
-
-            # Record successful execution
-
-            return reviews
-
-        except Exception as e:
-            logger.error(f"PedagogicalAgent review failed: {e}")
-            raise
-
-    def _parse_review_response(self, response) -> Dict[str, Any]:
-        """Parse the review response into a dictionary"""
-        try:
-            if isinstance(response, str):
-                data = json.loads(response)
-            else:
-                data = response
-
-            # Validate required fields
-            required_fields = [
-                "pedagogical_quality",
-                "clarity",
-                "learning_effectiveness",
-            ]
-            if not all(field in data for field in required_fields):
-                raise ValueError("Missing required review fields")
-
-            return data
-
-        except json.JSONDecodeError as e:
-            logger.error(f"Failed to parse review response as JSON: {e}")
-            raise ValueError(f"Invalid review response: {e}")
-        except Exception as e:
-            logger.error(f"Failed to parse review response: {e}")
-            raise ValueError(f"Invalid review response: {e}")
-
-    def _build_review_prompt(self, card: Card, index: int) -> str:
-        """Build the review prompt for a single card"""
-        return f"""Review this flashcard for pedagogical effectiveness:
-
-Card {index + 1}:
-Question: {card.front.question}
-Answer: {card.back.answer}
-Explanation: {card.back.explanation}
-Example: {card.back.example}
-Metadata: {json.dumps(card.metadata, indent=2)}
-
-Evaluate the card based on:
-1. Learning Objectives: Does it have clear, measurable learning goals?
-2. Bloom's Taxonomy: What cognitive level does it target? Is it appropriate?
-3. Cognitive Load: Is the information manageable for learners?
-4. Difficulty Progression: Is the difficulty appropriate for the target level?
-5. Educational Value: Does it promote deep learning vs. memorization?
-
-Return your assessment as JSON:
-{{
-    "approved": true/false,
-    "cognitive_level": "remember|understand|apply|analyze|evaluate|create",
-    "difficulty_rating": 1-5,
-    "cognitive_load": "low|medium|high",
-    "educational_value": 1-5,
-    "feedback": "Detailed pedagogical assessment",
-    "improvements": ["specific improvement suggestion 1", "suggestion 2"],
-    "learning_objectives": ["clear learning objective 1", "objective 2"]
-}}"""
-
-
-class ContentStructuringAgent(BaseAgentWrapper):
-    """Content organization and formatting specialist"""
-
-    def __init__(self, openai_client: AsyncOpenAI):
-        config_manager = get_config_manager()
-        base_config = config_manager.get_agent_config("content_structuring")
-
-        if not base_config:
-            raise ValueError(
-                "content_structuring configuration not found - agent system not properly initialized"
-            )
-
-        super().__init__(base_config, openai_client)
-
-    async def structure_cards(self, cards: List[Card]) -> List[Card]:
-        """Structure and format cards for consistency"""
-        datetime.now()
-
-        try:
-            structured_cards = []
-
-            for i, card in enumerate(cards):
-                user_input = self._build_structuring_prompt(card, i)
-                response, usage = await self.execute(user_input)
-
-                try:
-                    structured_data = (
-                        json.loads(response) if isinstance(response, str) else response
-                    )
-                    structured_card = self._parse_structured_card(structured_data, card)
-                    structured_cards.append(structured_card)
-                except Exception as e:
-                    logger.warning(f"Failed to structure card {i}: {e}")
-                    structured_cards.append(card)  # Keep original on failure
-
-            return structured_cards
-
-        except Exception as e:
-            logger.error(f"ContentStructuringAgent failed: {e}")
-            raise
-
-    def _build_structuring_prompt(self, card: Card, index: int) -> str:
-        """Build the structuring prompt for a single card"""
-        return f"""Structure and format this flashcard for optimal learning:
-
-Original Card {index + 1}:
-Question: {card.front.question}
-Answer: {card.back.answer}
-Explanation: {card.back.explanation}
-Example: {card.back.example}
-Type: {card.card_type}
-Metadata: {json.dumps(card.metadata, indent=2)}
-
-Improve the card's structure and formatting:
-1. Ensure clear, concise, unambiguous question
-2. Provide complete, well-structured answer
-3. Add comprehensive explanation with reasoning
-4. Include relevant, practical example
-5. Enhance metadata with appropriate tags and categorization
-6. Maintain consistent formatting and style
-
-Return the improved card as JSON:
-{{
-    "card_type": "basic|cloze",
-    "front": {{
-        "question": "Improved, clear question"
-    }},
-    "back": {{
-        "answer": "Complete, well-structured answer",
-        "explanation": "Comprehensive explanation with reasoning",
-        "example": "Relevant, practical example"
-    }},
-    "metadata": {{
-        "topic": "specific topic",
-        "subject": "subject area",
-        "difficulty": "beginner|intermediate|advanced",
-        "tags": ["tag1", "tag2", "tag3"],
-        "learning_outcomes": ["outcome1", "outcome2"],
-        "prerequisites": ["prereq1", "prereq2"],
-        "estimated_time": "time in minutes",
-        "category": "category name"
-    }}
-}}"""
-
-    def _parse_structured_card(
-        self, structured_data: Dict[str, Any], original_card: Card
-    ) -> Card:
-        """Parse structured card data into Card object"""
-        try:
-            return Card(
-                card_type=structured_data.get("card_type", original_card.card_type),
-                front=CardFront(question=structured_data["front"]["question"]),
-                back=CardBack(
-                    answer=structured_data["back"]["answer"],
-                    explanation=structured_data["back"].get("explanation", ""),
-                    example=structured_data["back"].get("example", ""),
-                ),
-                metadata=structured_data.get("metadata", original_card.metadata),
-            )
-        except Exception as e:
-            logger.warning(f"Failed to parse structured card: {e}")
-            return original_card
-
-
-class GenerationCoordinator(BaseAgentWrapper):
-    """Coordinates the multi-agent card generation workflow"""
-
-    def __init__(self, openai_client: AsyncOpenAI):
-        config_manager = get_config_manager()
-        base_config = config_manager.get_agent_config("generation_coordinator")
-
-        if not base_config:
-            raise ValueError(
-                "generation_coordinator configuration not found - agent system not properly initialized"
-            )
-
-        super().__init__(base_config, openai_client)
-
-        # Initialize specialized agents
-        self.subject_expert = None
-        self.pedagogical = PedagogicalAgent(openai_client)
-        self.content_structuring = ContentStructuringAgent(openai_client)
-
-    async def coordinate_generation(
-        self,
-        topic: str,
-        subject: str = "general",
-        num_cards: int = 5,
-        difficulty: str = "intermediate",
-        enable_review: bool = True,
-        enable_structuring: bool = True,
-        context: Dict[str, Any] = None,
-    ) -> List[Card]:
-        """Coordinate the full card generation pipeline"""
-        datetime.now()
-
-        try:
-            # Initialize subject expert for the specific subject
-            if not self.subject_expert or self.subject_expert.subject != subject:
-                self.subject_expert = SubjectExpertAgent(self.openai_client, subject)
-
-            logger.info(f"Starting coordinated generation: {topic} ({subject})")
-
-            # Step 1: Generate initial cards
-            cards = await self.subject_expert.generate_cards(
-                topic=topic, num_cards=num_cards, context=context
-            )
-
-            # Step 2: Pedagogical review (optional)
-            if enable_review and cards:
-                logger.info("Performing pedagogical review...")
-                reviews = await self.pedagogical.review_cards(cards)
-
-                # Filter or flag cards based on reviews
-                approved_cards = []
-                for card, review in zip(cards, reviews):
-                    if review.get("approved", True):
-                        approved_cards.append(card)
-                    else:
-                        logger.info(
-                            f"Card flagged for revision: {card.front.question[:50]}..."
-                        )
-
-                cards = approved_cards
-
-            # Step 3: Content structuring (optional)
-            if enable_structuring and cards:
-                logger.info("Performing content structuring...")
-                cards = await self.content_structuring.structure_cards(cards)
-
-            # Record successful coordination
-
-            logger.info(f"Generation coordination complete: {len(cards)} cards")
-            return cards
-
-        except Exception as e:
-            logger.error(f"Generation coordination failed: {e}")
-            raise
-
-    async def generate_structured_cards(
-        self,
-        topic: str,
-        num_cards: int = 5,
-        difficulty: str = "intermediate",
-        context: Optional[Dict[str, Any]] = None,
-    ) -> List[Card]:
-        """Generate structured flashcards with enhanced metadata"""
-        try:
-            user_input = f"""Generate {num_cards} structured flashcards for: {topic}
-
-Difficulty: {difficulty}
-Requirements:
-- Include detailed metadata
-- Add learning outcomes
-- Specify prerequisites
-- Include related concepts
-- Estimate study time"""
-
-            response, usage = await self.execute(user_input)
-
-            # Log usage information
-            if usage and usage.get("total_tokens", 0) > 0:
-                logger.info(
-                    f"💰 Token Usage: {usage['total_tokens']} tokens (Input: {usage['input_tokens']}, Output: {usage['output_tokens']})"
-                )
-
-            # Parse the structured response directly since it should be a CardsGenerationSchema
-            if hasattr(response, "cards") and response.cards:
-                return response.cards
-            else:
-                logger.warning("No cards found in structured response")
-                return []
-
-        except Exception as e:
-            logger.error(f"Structured card generation failed: {e}")
-            raise
-
-    async def generate_adaptive_cards(
-        self,
-        topic: str,
-        learning_style: str = "visual",
-        num_cards: int = 5,
-        context: Optional[Dict[str, Any]] = None,
-    ) -> List[Card]:
-        """Generate cards adapted to specific learning styles"""
-        try:
-            user_input = f"""Generate {num_cards} flashcards for: {topic}
-
-Learning Style: {learning_style}
-Adapt the content format and presentation to match this learning style."""
-
-            response, usage = await self.execute(user_input)
-
-            # Log usage information
-            if usage and usage.get("total_tokens", 0) > 0:
-                logger.info(
-                    f"💰 Token Usage: {usage['total_tokens']} tokens (Input: {usage['input_tokens']}, Output: {usage['output_tokens']})"
-                )
-
-            # Parse the adaptive response directly since it should be a CardsGenerationSchema
-            if hasattr(response, "cards") and response.cards:
-                return response.cards
-            else:
-                logger.warning("No cards found in adaptive response")
-                return []
-
-        except Exception as e:
-            logger.error(f"Adaptive card generation failed: {e}")
-            raise
+class QualityReviewAgent(BaseAgentWrapper):
+    """Single-pass quality review agent for lightweight validation and fixes."""
+
+    def __init__(self, openai_client: AsyncOpenAI, model: str):
+        config = AgentConfig(
+            name="quality_reviewer",
+            instructions=(
+                "You are a meticulous flashcard reviewer. Review each card for factual accuracy, clarity,"
+                " atomic scope, and answer quality. When needed, revise the card while keeping it concise and"
+                " faithful to the original intent. Always respond with a JSON object containing:"
+                ' {"approved": bool, "reason": string, "revised_card": object or null}.'
+                " The revised card must follow the input schema with fields card_type, front.question,"
+                " back.answer/explanation/example, and metadata."
+            ),
+            model=model,
+            temperature=0.2,
+            timeout=45.0,
+            retry_attempts=2,
+            enable_tracing=False,
+        )
+        super().__init__(config, openai_client)
+
+    async def review_card(self, card: Card) -> Tuple[Optional[Card], bool, str]:
+        """Review a card and optionally return a revised version."""
+
+        card_payload = {
+            "card_type": card.card_type,
+            "front": {"question": card.front.question if card.front else ""},
+            "back": {
+                "answer": card.back.answer if card.back else "",
+                "explanation": card.back.explanation if card.back else "",
+                "example": card.back.example if card.back else "",
+            },
+            "metadata": card.metadata or {},
+        }
+
+        user_input = (
+            "Review the following flashcard. Approve it if it is accurate, clear, and atomic."
+            " If improvements are needed, provide a revised_card with the corrections applied.\n\n"
+            "Flashcard JSON:\n"
+            f"{json.dumps(card_payload, ensure_ascii=False)}\n\n"
+            "Respond with JSON matching this schema:\n"
+            '{\n  "approved": true | false,\n  "reason": "short explanation",\n'
+            '  "revised_card": { ... } | null\n}'
+        )
+
+        try:
+            response, _ = await self.execute(user_input)
+        except Exception as e:
+            logger.error(f"Quality review failed to execute: {e}")
+            return card, True, "Review failed; keeping original card"
+
+        try:
+            parsed = json.loads(response) if isinstance(response, str) else response
+        except Exception as e:
+            logger.warning(f"Failed to parse review response as JSON: {e}")
+            return card, True, "Reviewer returned invalid JSON; keeping original"
+
+        approved = bool(parsed.get("approved", True))
+        reason = str(parsed.get("reason", ""))
+        revised_payload = parsed.get("revised_card")
+
+        revised_card: Optional[Card] = None
+        if isinstance(revised_payload, dict):
+            try:
+                metadata = revised_payload.get("metadata", {}) or {}
+                revised_subject = metadata.get("subject") or (card.metadata or {}).get(
+                    "subject",
+                    "general",
+                )
+                revised_topic = metadata.get("topic") or (card.metadata or {}).get(
+                    "topic",
+                    "General Concepts",
+                )
+                revised_card = card_dict_to_card(
+                    revised_payload, revised_topic, revised_subject
+                )
+            except Exception as e:
+                logger.warning(f"Failed to build revised card from review payload: {e}")
+                revised_card = None
+
+        return revised_card or card, approved, reason or ""
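
A quick usage sketch of the new `card_dict_to_card` helper with a hand-written payload (all payload values are illustrative):

```python
from ankigen_core.agents.generators import card_dict_to_card

# Illustrative payload matching the schema the helper validates.
payload = {
    "card_type": "basic",
    "front": {"question": "GIL?"},
    "back": {
        "answer": "Global Interpreter Lock",
        "explanation": "CPython's mutex around bytecode execution.",
        "example": "Threads don't speed up CPU-bound pure-Python code.",
    },
    "metadata": {"difficulty": "beginner"},
}

card = card_dict_to_card(
    payload, default_topic="Python internals", default_subject="programming"
)
# Missing subject/topic are backfilled from the defaults via setdefault().
assert card.metadata["subject"] == "programming"
assert card.metadata["topic"] == "Python internals"
```
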
ankigen_core/agents/integration.py CHANGED

@@ -1,16 +1,16 @@
 # Main integration module for AnkiGen agent system
 
-from typing import List, Dict, Any, Tuple
+from typing import List, Dict, Any, Tuple, Optional
 from datetime import datetime
 
 
 from ankigen_core.logging import logger
 from ankigen_core.models import Card
 from ankigen_core.llm_interface import OpenAIClientManager
+from ankigen_core.context7 import Context7Client
 
-from .generators import GenerationCoordinator, SubjectExpertAgent
-from .judges import JudgeCoordinator
-from .enhancers import RevisionAgent, EnhancementAgent
+from .generators import SubjectExpertAgent, QualityReviewAgent
+from ankigen_core.agents.config import get_config_manager
 
 
 class AgentOrchestrator:

@@ -20,14 +20,8 @@ class AgentOrchestrator:
         self.client_manager = client_manager
         self.openai_client = None
 
-        # Initialize coordinators
-        self.generation_coordinator = None
-        self.judge_coordinator = None
-        self.revision_agent = None
-        self.enhancement_agent = None
-
-        # All agents enabled by default
-        self.all_agents_enabled = True
+        self.subject_expert = None
+        self.quality_reviewer = None
 
     async def initialize(self, api_key: str, model_overrides: Dict[str, str] = None):
         """Initialize the agent system"""

@@ -44,13 +38,7 @@ class AgentOrchestrator:
             config_manager.update_models(model_overrides)
             logger.info(f"Applied model overrides: {model_overrides}")
 
-            # Initialize all agents
-            self.generation_coordinator = GenerationCoordinator(self.openai_client)
-            self.judge_coordinator = JudgeCoordinator(self.openai_client)
-            self.revision_agent = RevisionAgent(self.openai_client)
-            self.enhancement_agent = EnhancementAgent(self.openai_client)
-
-            logger.info("Agent system initialized successfully")
+            logger.info("Agent system initialized successfully (simplified pipeline)")
 
         except Exception as e:
             logger.error(f"Failed to initialize agent system: {e}")

@@ -64,45 +52,66 @@ class AgentOrchestrator:
         difficulty: str = "intermediate",
         enable_quality_pipeline: bool = True,
         context: Dict[str, Any] = None,
+        library_name: Optional[str] = None,
+        library_topic: Optional[str] = None,
     ) -> Tuple[List[Card], Dict[str, Any]]:
         """Generate cards using the agent system"""
         start_time = datetime.now()
 
         try:
-            # Agents are always enabled now
-
             if not self.openai_client:
                 raise ValueError("Agent system not initialized")
 
             logger.info(f"Starting agent-based card generation: {topic} ({subject})")
 
-            # Phase 1: Generation
+            # Enhance context with library documentation if requested
+            enhanced_context = context or {}
+            library_docs = None
+
+            if library_name:
+                logger.info(f"Fetching library documentation for: {library_name}")
+                try:
+                    context7_client = Context7Client()
+                    library_docs = await context7_client.fetch_library_documentation(
+                        library_name, topic=library_topic, tokens=5000
+                    )
+
+                    if library_docs:
+                        enhanced_context["library_documentation"] = library_docs
+                        enhanced_context["library_name"] = library_name
+                        logger.info(
+                            f"Added {len(library_docs)} chars of {library_name} documentation to context"
+                        )
+                    else:
+                        logger.warning(
+                            f"Could not fetch documentation for library: {library_name}"
+                        )
+                except Exception as e:
+                    logger.error(f"Error fetching library documentation: {e}")
+
             cards = await self._generation_phase(
                 topic=topic,
                 subject=subject,
                 num_cards=num_cards,
                 difficulty=difficulty,
-                context=context,
+                context=enhanced_context,
             )
 
-            # Phase 2: Quality Assessment
-            quality_results = {}
-            if enable_quality_pipeline and self.judge_coordinator:
-                cards, quality_results = await self._quality_phase(cards)
-
-            # Phase 3: Enhancement
-            if self.enhancement_agent:
-                cards = await self._enhancement_phase(cards)
+            review_results = {}
+            if enable_quality_pipeline:
+                cards, review_results = await self._quality_review_phase(cards)
 
             # Collect metadata
             metadata = {
                 "generation_method": "agent_system",
                 "generation_time": (datetime.now() - start_time).total_seconds(),
                 "cards_generated": len(cards),
-                "quality_results": quality_results,
+                "review_results": review_results,
                 "topic": topic,
                 "subject": subject,
                 "difficulty": difficulty,
+                "library_name": library_name if library_name else None,
+                "library_docs_used": bool(library_docs),
             }
 
             logger.info(

@@ -124,116 +133,72 @@ class AgentOrchestrator:
     ) -> List[Card]:
         """Execute the card generation phase"""
 
-        if self.generation_coordinator:
-            # Use coordinated multi-agent generation
-            cards = await self.generation_coordinator.coordinate_generation(
-                topic=topic,
-                subject=subject,
-                num_cards=num_cards,
-                difficulty=difficulty,
-                enable_review=True,
-                enable_structuring=True,
-                context=context,
-            )
-        else:
-            # Use subject expert agent directly
-            subject_expert = SubjectExpertAgent(self.openai_client, subject)
-            cards = await subject_expert.generate_cards(
-                topic=topic, num_cards=num_cards, difficulty=difficulty, context=context
-            )
+        if not self.subject_expert or self.subject_expert.subject != subject:
+            self.subject_expert = SubjectExpertAgent(self.openai_client, subject)
+
+        # Add difficulty to context if needed
+        if context is None:
+            context = {}
+        context["difficulty"] = difficulty
+
+        cards = await self.subject_expert.generate_cards(
+            topic=topic, num_cards=num_cards, context=context
+        )
 
         logger.info(f"Generation phase complete: {len(cards)} cards generated")
         return cards
 
-    async def _quality_phase(
+    async def _quality_review_phase(
         self, cards: List[Card]
     ) -> Tuple[List[Card], Dict[str, Any]]:
-        """Execute the quality assessment and improvement phase"""
+        """Perform a single quality-review pass with optional fixes."""
 
-        if not self.judge_coordinator:
-            return cards, {"message": "Judge coordinator not available"}
+        if not cards:
+            return cards, {"message": "No cards to review"}
 
-        logger.info(f"Starting quality assessment for {len(cards)} cards")
+        logger.info(f"Performing quality review for {len(cards)} cards")
 
-        # Judge all cards
-        judge_results = await self.judge_coordinator.coordinate_judgment(
-            cards=cards,
-            enable_parallel=True,
-            min_consensus=0.6,
-        )
+        if not self.quality_reviewer:
+            # Use the same model as the subject expert by default.
+            subject_config = get_config_manager().get_agent_config("subject_expert")
+            reviewer_model = subject_config.model if subject_config else "gpt-4.1"
+            self.quality_reviewer = QualityReviewAgent(
+                self.openai_client, reviewer_model
+            )
 
-        # Separate approved and rejected cards
-        approved_cards = []
-        rejected_cards = []
+        reviewed_cards: List[Card] = []
+        approvals: List[Dict[str, Any]] = []
 
-        for card, decisions, approved in judge_results:
+        for card in cards:
+            reviewed_card, approved, reason = await self.quality_reviewer.review_card(
+                card
+            )
             if approved:
-                approved_cards.append(card)
+                reviewed_cards.append(reviewed_card)
             else:
-                rejected_cards.append((card, decisions))
-
-        # Attempt to revise rejected cards
-        revised_cards = []
-        if self.revision_agent and rejected_cards:
-            logger.info(f"Attempting to revise {len(rejected_cards)} rejected cards")
-
-            for card, decisions in rejected_cards:
-                try:
-                    revised_card = await self.revision_agent.revise_card(
-                        card=card,
-                        judge_decisions=decisions,
-                        max_iterations=2,
-                    )
-
-                    # Re-judge the revised card
-                    revision_results = await self.judge_coordinator.coordinate_judgment(
-                        cards=[revised_card],
-                        enable_parallel=False,  # Single card, no need for parallel
-                        min_consensus=0.6,
-                    )
-
-                    if revision_results and revision_results[0][2]:  # If approved
-                        revised_cards.append(revised_card)
-                    else:
-                        logger.warning(
-                            f"Revised card still rejected: {card.front.question[:50]}..."
-                        )
-
-                except Exception as e:
-                    logger.error(f"Failed to revise card: {e}")
-
-        # Combine approved and successfully revised cards
-        final_cards = approved_cards + revised_cards
-
-        # Prepare quality results
-        quality_results = {
-            "total_cards_judged": len(cards),
-            "initially_approved": len(approved_cards),
-            "initially_rejected": len(rejected_cards),
-            "successfully_revised": len(revised_cards),
-            "final_approval_rate": len(final_cards) / len(cards) if cards else 0,
-            "judge_decisions": len(judge_results),
+                approvals.append(
+                    {
+                        "question": card.front.question if card.front else "",
+                        "reason": reason,
+                    }
+                )
+
+        review_results = {
+            "total_cards_reviewed": len(cards),
+            "approved_cards": len(reviewed_cards),
+            "rejected_cards": approvals,
         }
 
-        logger.info(
-            f"Quality phase complete: {len(final_cards)}/{len(cards)} cards approved"
-        )
-        return final_cards, quality_results
-
-    async def _enhancement_phase(self, cards: List[Card]) -> List[Card]:
-        """Execute the enhancement phase"""
-
-        if not self.enhancement_agent:
-            return cards
-
-        logger.info(f"Starting enhancement for {len(cards)} cards")
-
-        enhanced_cards = await self.enhancement_agent.enhance_card_batch(
-            cards=cards, enhancement_targets=["explanation", "example", "metadata"]
-        )
+        if approvals:
+            logger.warning(
+                "Quality review rejected cards: %s",
+                "; ".join(
+                    f"{entry['question'][:50]}… ({entry['reason']})"
+                    for entry in approvals
+                ),
+            )
 
-        logger.info(f"Enhancement phase complete: {len(enhanced_cards)} cards enhanced")
-        return enhanced_cards
+        return reviewed_cards, review_results
 
     def get_performance_metrics(self) -> Dict[str, Any]:
         """Get performance metrics for the agent system"""
ankigen_core/agents/templates/generators.j2 CHANGED
@@ -5,33 +5,12 @@
     "instructions": "You are a world-class expert in {{ subject | default('the subject area') }} with deep pedagogical knowledge. \nYour role is to generate high-quality flashcards that demonstrate mastery of {{ subject | default('the subject') }} concepts.\n\nKey responsibilities:\n- Create ATOMIC cards: extremely short (1-9 words on back), break complex info into multiple simple cards\n- Use standardized, bland prompts without fancy formatting or unusual words\n- Design prompts that match real-life recall situations\n- Put ALL to-be-learned information on the BACK of cards, never in prompts\n- Ensure technical accuracy and depth appropriate for the target level\n- Use domain-specific terminology correctly\n- Connect concepts to prerequisite knowledge\n\nPrioritize atomic simplicity over comprehensive single cards. Generate cards that test understanding through simple, direct recall.",
     "model": "{{ subject_expert_model }}",
     "temperature": 0.7,
-    "timeout": 45.0,
+    "timeout": 120.0,
     "custom_prompts": {
       "math": "Focus on problem-solving strategies and mathematical reasoning",
       "science": "Emphasize experimental design and scientific method",
       "history": "Connect events to broader historical patterns and causation",
       "programming": "Include executable examples and best practices"
     }
-  },
-  "pedagogical": {
-    "name": "pedagogical",
-    "instructions": "You are an educational specialist focused on learning theory and instructional design.\nYour role is to ensure all flashcards follow educational best practices.\n\nApply these frameworks:\n- Bloom's Taxonomy: Ensure questions target appropriate cognitive levels\n- Spaced Repetition: Design cards for optimal retention\n- Cognitive Load Theory: Avoid overwhelming learners\n- Active Learning: Encourage engagement and application\n\nReview cards for:\n- Clear learning objectives\n- Appropriate difficulty progression\n- Effective use of examples and analogies\n- Prerequisite knowledge alignment",
-    "model": "{{ pedagogical_agent_model }}",
-    "temperature": 0.6,
-    "timeout": 30.0
-  },
-  "content_structuring": {
-    "name": "content_structuring",
-    "instructions": "You are a content organization specialist focused on atomic card design and optimal learning structure.\nYour role is to format and organize flashcard content following proven Anki principles.\n\nEnsure all cards follow atomic principles:\n- Break down any card longer than 9 words into multiple cards\n- Use standardized, bland prompt formats (no fancy words, formatting, or visual cues)\n- Consistent question structures (e.g., 'T [concept]' for terminology, '[concept] definition' for definitions)\n- Put ALL information to be learned on the BACK, never in prompts\n- Create 'handles' (>references) to connect related cards without making individual cards long\n- Use multiple difficulty levels for the same concept when appropriate\n\nPrioritize simplicity and consistency over comprehensive single cards.",
-    "model": "{{ content_structuring_model }}",
-    "temperature": 0.5,
-    "timeout": 25.0
-  },
-  "generation_coordinator": {
-    "name": "generation_coordinator",
-    "instructions": "You are the generation workflow coordinator. \nYour role is to orchestrate the card generation process and manage handoffs between specialized agents.\n\nResponsibilities:\n- Route requests to appropriate specialist agents\n- Coordinate parallel generation tasks\n- Manage workflow state and progress\n- Handle errors and fallback strategies\n- Optimize generation pipelines\n\nMake decisions based on content type, user preferences, and system load.",
-    "model": "{{ generation_coordinator_model }}",
-    "temperature": 0.3,
-    "timeout": 20.0
   }
 }
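Since the template is plain JSON carrying Jinja2 placeholders, a minimal sketch of how it could be rendered and parsed (assuming the surviving top-level key is `subject_expert`, consistent with its `subject_expert_model` placeholder; the loading code here is illustrative, not the project's actual config loader):

```python
import json
from jinja2 import Template  # pip install jinja2

with open("ankigen_core/agents/templates/generators.j2") as f:
    rendered = Template(f.read()).render(
        subject="programming",           # fills {{ subject | default(...) }}
        subject_expert_model="gpt-4.1",  # fills {{ subject_expert_model }}
    )

config = json.loads(rendered)
agent = config["subject_expert"]  # key name assumed, see lead-in
print(agent["model"], agent["temperature"], agent["timeout"])  # gpt-4.1 0.7 120.0
```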
ankigen_core/card_generator.py CHANGED
@@ -3,21 +3,16 @@
 import gradio as gr
 import pandas as pd
 from typing import List, Dict, Any
-import asyncio
-from urllib.parse import urlparse

 # Imports from our core modules
 from ankigen_core.utils import (
     get_logger,
     ResponseCache,
-    fetch_webpage_text,
     strip_html_tags,
 )
-from ankigen_core.llm_interface import OpenAIClientManager, structured_output_completion
+from ankigen_core.llm_interface import OpenAIClientManager
 from ankigen_core.models import (
     Card,
-    CardFront,
-    CardBack,
 )  # Import necessary Pydantic models

 # Import agent system - required
@@ -72,170 +67,7 @@ GENERATION_MODES = [
 # --- Core Functions --- (Moved and adapted from app.py)


-async def generate_cards_batch(
-    openai_client,  # Renamed from client to openai_client for clarity
-    cache: ResponseCache,  # Added cache parameter
-    model: str,
-    topic: str,
-    num_cards: int,
-    system_prompt: str,
-    generate_cloze: bool = False,
-    batch_size: int = 3,  # Keep batch_size, though not explicitly used in this version
-):
-    """Generate a batch of cards for a topic, potentially including cloze deletions"""
-
-    cloze_instruction = ""
-    if generate_cloze:
-        cloze_instruction = """
-        Where appropriate, generate Cloze deletion cards.
-        - For Cloze cards, set "card_type" to "cloze".
-        - Format the question field using Anki's cloze syntax (e.g., "The capital of France is {{c1::Paris}}.").
-        - The "answer" field should contain the full, non-cloze text or specific context for the cloze.
-        - For standard question/answer cards, set "card_type" to "basic".
-        """
-
-    cards_prompt = f"""
-    Generate {num_cards} ATOMIC flashcards for the topic: {topic}
-
-    Follow these ATOMIC principles:
-    - Each answer should be 1-9 words maximum
-    - Use bland, standardized questions (no fancy formatting)
-    - Break complex concepts into multiple simple cards
-    - Put ALL learning content in answers, never in questions
-    - Use handles (>references) to connect related cards
-    - Design questions to match real-life recall situations
-
-    {cloze_instruction}
-    Return your response as a JSON object with the following structure:
-    {{
-        "cards": [
-            {{
-                "card_type": "basic or cloze",
-                "front": {{
-                    "question": "question text (potentially with {{{{c1::cloze syntax}}}})"
-                }},
-                "back": {{
-                    "answer": "concise answer or full text for cloze",
-                    "explanation": "detailed explanation",
-                    "example": "practical example"
-                }},
-                "metadata": {{
-                    "prerequisites": ["list", "of", "prerequisites"],
-                    "learning_outcomes": ["list", "of", "outcomes"],
-                    "difficulty": "beginner/intermediate/advanced"
-                }}
-            }}
-            // ... more cards
-        ]
-    }}
-    """
-
-    try:
-        logger.info(
-            f"Generating card batch for {topic}, Cloze enabled: {generate_cloze}"
-        )
-        # Call the imported structured_output_completion, passing client and cache
-        response = await structured_output_completion(
-            openai_client=openai_client,
-            model=model,
-            response_format={"type": "json_object"},
-            system_prompt=system_prompt,
-            user_prompt=cards_prompt,
-            cache=cache,  # Pass the cache instance
-        )
-
-        if not response or "cards" not in response:
-            logger.error("Invalid cards response format")
-            raise ValueError("Failed to generate cards. Please try again.")
-
-        cards_list = []
-        for card_data in response["cards"]:
-            if "front" not in card_data or "back" not in card_data:
-                logger.warning(
-                    f"Skipping card due to missing front/back data: {card_data}"
-                )
-                continue
-            if "question" not in card_data["front"]:
-                logger.warning(f"Skipping card due to missing question: {card_data}")
-                continue
-            if (
-                "answer" not in card_data["back"]
-                or "explanation" not in card_data["back"]
-                or "example" not in card_data["back"]
-            ):
-                logger.warning(
-                    f"Skipping card due to missing answer/explanation/example: {card_data}"
-                )
-                continue
-
-            # Use imported Pydantic models
-            card = Card(
-                card_type=card_data.get("card_type", "basic"),
-                front=CardFront(
-                    question=strip_html_tags(card_data["front"].get("question", ""))
-                ),
-                back=CardBack(
-                    answer=strip_html_tags(card_data["back"].get("answer", "")),
-                    explanation=strip_html_tags(
-                        card_data["back"].get("explanation", "")
-                    ),
-                    example=strip_html_tags(card_data["back"].get("example", "")),
-                ),
-                metadata=card_data.get("metadata", {}),
-            )
-            cards_list.append(card)
-
-        return cards_list
-
-    except Exception as e:
-        logger.error(
-            f"Failed to generate cards batch for {topic}: {str(e)}", exc_info=True
-        )
-        raise  # Re-raise for the main function to handle
-
-
-async def judge_card(
-    openai_client,
-    cache: ResponseCache,
-    model: str,
-    card: Card,
-) -> bool:
-    """Use an LLM to validate a single card."""
-    system_prompt = (
-        "You review flashcards and decide if the question is clear and useful. "
-        'Respond with a JSON object like {"is_valid": true}.'
-    )
-    user_prompt = f"Question: {card.front.question}\nAnswer: {card.back.answer}"
-    try:
-        result = await structured_output_completion(
-            openai_client=openai_client,
-            model=model,
-            response_format={"type": "json_object"},
-            system_prompt=system_prompt,
-            user_prompt=user_prompt,
-            cache=cache,
-        )
-        if isinstance(result, dict):
-            return bool(result.get("is_valid", True))
-    except Exception as e:  # pragma: no cover - network or parse errors
-        logger.warning(f"LLM judge failed for card '{card.front.question}': {e}")
-    return True
-
-
-async def judge_cards(
-    openai_client,
-    cache: ResponseCache,
-    model: str,
-    cards: List[Card],
-) -> List[Card]:
-    """Filter cards using the LLM judge."""
-    validated: List[Card] = []
-    for card in cards:
-        if await judge_card(openai_client, cache, model, card):
-            validated.append(card)
-        else:
-            logger.info(f"Card rejected by judge: {card.front.question}")
-    return validated
+# Legacy functions removed - all card generation now handled by agent system


 async def orchestrate_card_generation(  # MODIFIED: Added async
@@ -253,6 +85,8 @@ async def orchestrate_card_generation(  # MODIFIED: Added async
     preference_prompt: str,
     generate_cloze: bool,
     use_llm_judge: bool = False,
+    library_name: str = None,
+    library_topic: str = None,
 ):
     """Orchestrates the card generation process based on UI inputs."""

@@ -265,34 +99,14 @@ async def orchestrate_card_generation(  # MODIFIED: Added async
     if AGENTS_AVAILABLE:
         logger.info("🤖 Using agent system for card generation")
         try:
-            # Initialize token tracker
             from ankigen_core.agents.token_tracker import get_token_tracker

             token_tracker = get_token_tracker()

-            # Initialize agent orchestrator with the actual model from UI
-            # Initialize orchestrator with model overrides
             orchestrator = AgentOrchestrator(client_manager)

-            # Set model overrides for all agents
-            logger.info(f"Overriding all agent models to use: {model_name}")
-            model_overrides = {
-                "generation_coordinator": model_name,
-                "subject_expert": model_name,
-                "pedagogical_agent": model_name,
-                "content_structuring": model_name,
-                "enhancement_agent": model_name,
-                "revision_agent": model_name,
-                "content_accuracy_judge": model_name,
-                "pedagogical_judge": model_name,
-                "clarity_judge": model_name,
-                "technical_judge": model_name,
-                "completeness_judge": model_name,
-                "judge_coordinator": model_name,
-            }
-
-            # Initialize with model overrides
-            await orchestrator.initialize(api_key_input, model_overrides)
+            logger.info(f"Using {model_name} for SubjectExpertAgent")
+            await orchestrator.initialize(api_key_input, {"subject_expert": model_name})

             # Map generation mode to subject
             agent_subject = "general"
@@ -303,22 +117,21 @@ async def orchestrate_card_generation(  # MODIFIED: Added async
             elif generation_mode == "text":
                 agent_subject = "content_analysis"

-            # Calculate total cards needed
             total_cards_needed = topic_number * cards_per_topic

-            # Prepare context for text mode
             context = {}
             if generation_mode == "text" and source_text:
                 context["source_text"] = source_text

-            # Generate cards with agents using the actual model from UI
             agent_cards, agent_metadata = await orchestrator.generate_cards_with_agents(
                 topic=subject if subject else "Mixed Topics",
                 subject=agent_subject,
                 num_cards=total_cards_needed,
-                difficulty="intermediate",  # Could be made configurable
+                difficulty="intermediate",
                 enable_quality_pipeline=True,
                 context=context,
+                library_name=library_name,
+                library_topic=library_topic,
             )

             # Get token usage from session
@@ -340,16 +153,14 @@ async def orchestrate_card_generation(  # MODIFIED: Added async
             if agent_cards:
                 formatted_cards = format_cards_for_dataframe(
                     agent_cards,
-                    topic_name=f"Agent Generated - {subject}"
-                    if subject
-                    else "Agent Generated",
+                    topic_name=subject if subject else "General",
                     start_index=1,
                 )

                 output_df = pd.DataFrame(
                     formatted_cards, columns=get_dataframe_columns()
                 )
-                total_cards_message = f"<div><b>🤖 Agent Generated Cards:</b> <span id='total-cards-count'>{len(output_df)}</span></div>"
+                total_cards_message = f"<div><b>Cards Generated:</b> <span id='total-cards-count'>{len(output_df)}</span></div>"

                 logger.info(
                     f"Agent system generated {len(output_df)} cards successfully"
@@ -373,539 +184,17 @@ async def orchestrate_card_generation(  # MODIFIED: Added async
                 "",
             )

-    # This should never be reached since agents are required
-    logger.error("Agent system not available but required")
-    if not api_key_input:
-        logger.warning("No API key provided to orchestrator")
-        gr.Error("OpenAI API key is required")
-        return pd.DataFrame(columns=get_dataframe_columns()), "API key is required.", 0
-    # Re-initialize client via manager if API key changes or not initialized
-    # This logic might need refinement depending on how API key state is managed in UI
-    try:
-        # Attempt to initialize (will raise error if key is invalid)
-        await client_manager.initialize_client(api_key_input)
-        openai_client = client_manager.get_client()
-    except (ValueError, RuntimeError, Exception) as e:
-        logger.error(f"Client initialization failed in orchestrator: {e}")
-        gr.Error(f"OpenAI Client Error: {e}")
-        return (
-            pd.DataFrame(columns=get_dataframe_columns()),
-            f"OpenAI Client Error: {e}",
-            0,
-        )
-
-    model = model_name
-    flattened_data = []
-    total_cards_generated = 0
-    # Use track_tqdm=True in the calling Gradio handler if desired
-    # progress_tracker = gr.Progress(track_tqdm=True)
-
-    # -------------------------------------
-
-    try:
-        # page_text_for_generation = ""  # No longer needed here
-
-        # --- Web Mode (Crawler) is now handled by crawl_and_generate in ui_logic.py ---
-        # The 'web' case for orchestrate_card_generation is removed as it's a separate flow.
-        # This function now handles 'subject', 'path', and 'text' (where text can be a URL).
-
-        # --- Subject Mode ---
-        if generation_mode == "subject":
-            logger.info("Orchestrator: Subject Mode")
-            if not subject or not subject.strip():
-                gr.Error("Subject is required for 'Single Subject' mode.")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    "Subject is required.",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-            system_prompt = f"""You are an expert in {subject} and an experienced educator. {preference_prompt}"""
-            # Split subjects if multiple are comma-separated
-            individual_subjects = [s.strip() for s in subject.split(",") if s.strip()]
-            if (
-                not individual_subjects
-            ):  # Handle case where subject might be just commas or whitespace
-                gr.Error("Valid subject(s) required.")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    "Valid subject(s) required.",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-
-            topics_for_generation = []
-            # max(1, topic_number // len(individual_subjects))  # Distribute topic_number
-
-            for ind_subject in individual_subjects:
-                # For single/multiple subjects, we might generate sub-topics or just use the subject as a topic
-                # For simplicity, let's assume each subject passed is a "topic" for now,
-                # and cards_per_topic applies to each.
-                # Or, if topic_number > 1, we could try to make LLM break down ind_subject into num_topics_per_subject.
-                # Current UI has "Number of Topics" and "Cards per Topic".
-                # If "Number of Topics" is meant per subject provided, then this logic needs care.
-                # Let's assume "Number of Topics" is total, and we divide it.
-                # If "Single Subject" mode, topic_number might represent sub-topics of that single subject.
-
-                # For now, let's simplify: treat each provided subject as a high-level topic.
-                # And generate 'cards_per_topic' for each. 'topic_number' might be less relevant here or define sub-breakdown.
-                # To align with UI (topic_number and cards_per_topic), if multiple subjects,
-                # we could make `topic_number` apply to how many sub-topics to generate for EACH subject,
-                # and `cards_per_topic` for each of those sub-topics.
-                # Or, if len(individual_subjects) > 1, `topic_number` is ignored and we use `cards_per_topic` for each subject.
-
-                # Simpler: if 1 subject, topic_number is subtopics. If multiple, each is a topic.
-                if len(individual_subjects) == 1:
-                    # If it's a single subject, we might want to break it down into `topic_number` sub-topics.
-                    # This would require an LLM call to get sub-topics first.
-                    # For now, let's treat the single subject as one topic, and `topic_number` is ignored.
-                    # Or, let's assume `topic_number` means we want `topic_number` variations or aspects of this subject.
-                    # The prompt for generate_cards_batch takes a "topic".
-                    # Let's create `topic_number` "topics" that are just slight variations or aspects of the main subject.
-                    if topic_number == 1:
-                        topics_for_generation.append(
-                            {"name": ind_subject, "num_cards": cards_per_topic}
-                        )
-                    else:
-                        # This is a placeholder for a more sophisticated sub-topic generation
-                        # For now, just make `topic_number` distinct calls for the same subject if user wants more "topics"
-                        # gr.Info(f"Generating for {topic_number} aspects/sub-sections of '{ind_subject}'.")
-                        for i in range(topic_number):
-                            topics_for_generation.append(
-                                {
-                                    "name": f"{ind_subject} - Aspect {i + 1}",
-                                    "num_cards": cards_per_topic,
-                                }
-                            )
-                else:  # Multiple subjects provided
-                    topics_for_generation.append(
-                        {"name": ind_subject, "num_cards": cards_per_topic}
-                    )
-
-        # --- Learning Path Mode ---
-        elif generation_mode == "path":
-            logger.info("Orchestrator: Learning Path Mode")
-            # In path mode, 'subject' contains the pre-analyzed subjects, comma-separated.
-            # 'description' (the learning goal) was used by analyze_learning_path, not directly here for card gen.
-            if (
-                not subject or not subject.strip()
-            ):  # 'subject' here comes from the anki_cards_data_df after analysis
-                gr.Error("No subjects provided from learning path analysis.")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    "No subjects from path analysis.",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-
-            system_prompt = f"""You are an expert in curriculum design and an experienced educator. {preference_prompt}"""
-            analyzed_subjects = [s.strip() for s in subject.split(",") if s.strip()]
-            if not analyzed_subjects:
-                gr.Error("No valid subjects parsed from learning path.")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    "No valid subjects from path.",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-
-            # topic_number might be interpreted as how many cards to generate for EACH analyzed subject,
-            # or how many sub-topics to break each analyzed subject into.
-            # Given "Cards per Topic" slider, it's more likely each analyzed subject is a "topic".
-            topics_for_generation = [
-                {"name": subj, "num_cards": cards_per_topic}
-                for subj in analyzed_subjects
-            ]
-
-        # --- Text Mode / Single Web Page from Text Mode ---
-        elif generation_mode == "text":
-            logger.info("Orchestrator: Text Mode")
-            actual_text_to_process = source_text
-
-            if (
-                not actual_text_to_process or not actual_text_to_process.strip()
-            ):  # Check after potential fetch
-                gr.Error("Text input is empty.")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    "Text input is empty.",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-
-            # Check if source_text is a URL
-            # Use a more robust check for URL (e.g., regex or urllib.parse)
-            is_url = False
-            if isinstance(source_text, str) and source_text.strip().lower().startswith(
-                ("http://", "https://")
-            ):
-                try:
-                    # A more robust check could involve trying to parse it
-                    result = urlparse(source_text.strip())
-                    if all([result.scheme, result.netloc]):
-                        is_url = True
-                except ImportError:  # Fallback if urlparse not available (should be)
-                    pass  # is_url remains False
-
-            if is_url:
-                url_to_fetch = source_text.strip()
-                logger.info(f"Text mode identified URL: {url_to_fetch}")
-                gr.Info(f"🕸️ Fetching content from URL in text field: {url_to_fetch}...")
-                try:
-                    page_content = await asyncio.to_thread(
-                        fetch_webpage_text, url_to_fetch
-                    )  # Ensure fetch_webpage_text is thread-safe or run in executor
-                    if not page_content or not page_content.strip():
-                        gr.Warning(
-                            f"Could not extract meaningful text from URL: {url_to_fetch}. Please check the URL or page content."
-                        )
-                        return (
-                            pd.DataFrame(columns=get_dataframe_columns()),
-                            "No meaningful text extracted from URL.",
-                            gr.update(
-                                value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                                visible=False,
-                            ),
-                        )
-                    actual_text_to_process = page_content
-                    source_text_display_name = f"Content from {url_to_fetch}"
-                    gr.Info(
-                        f"✅ Successfully fetched text from URL (approx. {len(actual_text_to_process)} chars)."
-                    )
-                except Exception as e:
-                    logger.error(
-                        f"Failed to fetch or process URL {url_to_fetch} in text mode: {e}",
-                        exc_info=True,
-                    )
-                    gr.Error(f"Failed to fetch content from URL: {str(e)}")
-                    return (
-                        pd.DataFrame(columns=get_dataframe_columns()),
-                        f"URL fetch error: {str(e)}",
-                        gr.update(
-                            value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                            visible=False,
-                        ),
-                    )
-            else:  # Not a URL, or failed to parse as one
-                if (
-                    not source_text or not source_text.strip()
-                ):  # Re-check original source_text if not a URL
-                    gr.Error("Text input is empty.")
-                    return (
-                        pd.DataFrame(columns=get_dataframe_columns()),
-                        "Text input is empty.",
-                        gr.update(
-                            value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                            visible=False,
-                        ),
-                    )
-                actual_text_to_process = source_text  # Use as is
-                source_text_display_name = "Content from Provided Text"
-                logger.info("Text mode: Processing provided text directly.")
-
-            # For text mode (either direct text or fetched from URL), generate cards from this content.
-            # The LLM will need the text. We can pass it via the system prompt or a specialized user prompt.
-            # For now, let's use a system prompt that tells it to base cards on the provided text.
-            # And we'll create one "topic" for all cards.
-
-            system_prompt = f"""You are an expert in distilling information and creating flashcards from text. {preference_prompt}
-            Base your flashcards STRICTLY on the following text content provided by the user in their next message.
-            Do not use external knowledge unless explicitly asked to clarify something from the text.
-            The user will provide the text content that needs to be turned into flashcards."""  # System prompt now expects text in user prompt.
-
-            # The user_prompt in generate_cards_batch will need to include actual_text_to_process.
-            # Let's adapt generate_cards_batch or how it's called for this.
-            # For now, let's assume generate_cards_batch's `cards_prompt` will be wrapped or modified
-            # to include `actual_text_to_process` when `generation_mode` is "text".
-
-            # This requires a change in how `generate_cards_batch` constructs its `cards_prompt` if text is primary.
-            # Alternative: pass `actual_text_to_process` as part of the user_prompt to `structured_output_completion`
-            # directly from here, bypassing `generate_cards_batch`'s topic-based prompt for "text" mode.
-            # This seems cleaner.
-
-            # Let's make a direct call to structured_output_completion for "text" mode.
-            text_mode_user_prompt = f"""
-            Please generate {cards_per_topic * topic_number} ATOMIC flashcards based on the following text content.
-
-            Follow these ATOMIC principles:
-            - Each answer should be 1-9 words maximum
-            - Use bland, standardized questions (no fancy formatting)
-            - Break complex concepts into multiple simple cards
-            - Put ALL learning content in answers, never in questions
-            - Use handles (>references) to connect related cards
-            - Design questions to match real-life recall situations
-
-            Ensure the flashcards cover diverse aspects of the text.
-            {get_cloze_instruction(generate_cloze)}
-            Return your response as a JSON object with the following structure:
-            {get_card_json_structure_prompt()}
-
-            Text Content to process:
-            ---
-            {actual_text_to_process[:15000]}
-            ---
-            """  # Truncate to avoid excessive length, system prompt already set context.
-
-            gr.Info(f"Generating cards from: {source_text_display_name}...")
-            try:
-                response = await structured_output_completion(
-                    openai_client=openai_client,
-                    model=model,
-                    response_format={"type": "json_object"},
-                    system_prompt=system_prompt,  # System prompt instructs to use text from user prompt
-                    user_prompt=text_mode_user_prompt,  # User prompt contains the text
-                    cache=cache,
-                )
-                raw_cards = []  # Default if response is None
-                if response:
-                    raw_cards = response.get("cards", [])
-                else:
-                    logger.warning(
-                        "structured_output_completion returned None, defaulting to empty card list for text mode."
-                    )
-                processed_cards = process_raw_cards_data(raw_cards)
-                if use_llm_judge and processed_cards:
-                    processed_cards = await judge_cards(
-                        openai_client, cache, model, processed_cards
-                    )
-                formatted_cards = format_cards_for_dataframe(
-                    processed_cards, topic_name=source_text_display_name, start_index=1
-                )
-                flattened_data.extend(formatted_cards)
-                total_cards_generated += len(formatted_cards)
-
-                # Skip topics_for_generation loop for text mode as cards are generated directly.
-                topics_for_generation = []  # Ensure it's empty
-
-            except Exception as e:
-                logger.error(
-                    f"Error during 'From Text' card generation: {e}", exc_info=True
-                )
-                gr.Error(f"Error generating cards from text: {str(e)}")
-                return (
-                    pd.DataFrame(columns=get_dataframe_columns()),
-                    f"Text Gen Error: {str(e)}",
-                    gr.update(
-                        value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                        visible=False,
-                    ),
-                )
-
-        else:  # Should not happen if generation_mode is validated, but as a fallback
-            logger.error(f"Unknown generation mode: {generation_mode}")
-            gr.Error(f"Unknown generation mode: {generation_mode}")
-            return (
-                pd.DataFrame(columns=get_dataframe_columns()),
-                "Unknown mode.",
-                gr.update(
-                    value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                    visible=False,
-                ),
-            )
-
-        # --- Batch Generation Loop (for subject and path modes) ---
-        # progress_total_batches = len(topics_for_generation)
-        # current_batch_num = 0
-
-        for topic_info in (
-            topics_for_generation
-        ):  # This loop will be skipped if text_mode populated flattened_data directly
-            # current_batch_num += 1
-            # progress_tracker.progress(current_batch_num / progress_total_batches, desc=f"Generating for topic: {topic_info['name']}")
-            # logger.info(f"Progress: {current_batch_num}/{progress_total_batches} - Topic: {topic_info['name']}")
-            gr.Info(
-                f"Generating cards for topic: {topic_info['name']}..."
-            )  # UI feedback
-
-            try:
-                # System prompt is already set based on mode (subject/path)
-                # generate_cards_batch will use this system_prompt
-                batch_cards = await generate_cards_batch(
-                    openai_client,
-                    cache,
-                    model,
-                    topic_info["name"],
-                    topic_info["num_cards"],
-                    system_prompt,  # System prompt defined above based on mode
-                    generate_cloze,
-                )
-                if use_llm_judge and batch_cards:
-                    batch_cards = await judge_cards(
-                        openai_client, cache, model, batch_cards
-                    )
-                # Assign topic name to cards before formatting for DataFrame
-                formatted_batch = format_cards_for_dataframe(
-                    batch_cards,
-                    topic_name=topic_info["name"],
-                    start_index=total_cards_generated + 1,
-                )
-                flattened_data.extend(formatted_batch)
-                total_cards_generated += len(formatted_batch)
-                logger.info(
-                    f"Generated {len(formatted_batch)} cards for topic {topic_info['name']}"
-                )
-
-            except Exception as e:
-                logger.error(
-                    f"Error generating cards for topic {topic_info['name']}: {e}",
-                    exc_info=True,
-                )
-                # Optionally, decide if one topic failing should stop all, or just skip
-                gr.Warning(
-                    f"Could not generate cards for topic '{topic_info['name']}': {str(e)}. Skipping."
-                )
-                continue  # Continue to next topic
-
-        # --- Final Processing ---
-        if not flattened_data:
-            gr.Info(
-                "No cards were generated."
-            )  # More informative than just empty table
-            # Return empty dataframe with correct columns
-            return (
-                pd.DataFrame(columns=get_dataframe_columns()),
-                "No cards generated.",
-                gr.update(
-                    value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                    visible=False,
-                ),
-            )
-
-        # Deduplication (if needed, and if it makes sense across different topics)
-        # For now, deduplication logic might be too aggressive if topics are meant to have overlapping concepts from different angles.
-        # final_cards_data = deduplicate_cards(flattened_data)  # Assuming deduplicate_cards expects list of dicts
-        final_cards_data = (
-            flattened_data  # Skipping deduplication for now to preserve topic structure
-        )
-
-        # Re-index cards if deduplication changed the count or if start_index logic wasn't perfect
-        # For now, format_cards_for_dataframe handles indexing.
-
-        output_df = pd.DataFrame(final_cards_data, columns=get_dataframe_columns())
-
-        total_cards_message = f"<div><b>💡 Legacy Generated Cards:</b> <span id='total-cards-count'>{len(output_df)}</span></div>"
-
-        logger.info(f"Legacy orchestration complete. Total cards: {len(output_df)}")
-        return output_df, total_cards_message
-
-    except Exception as e:
-        logger.error(
-            f"Critical error in orchestrate_card_generation: {e}", exc_info=True
-        )
-        gr.Error(f"An unexpected error occurred: {str(e)}")
-        return (
-            pd.DataFrame(columns=get_dataframe_columns()),
-            f"Unexpected error: {str(e)}",
-            gr.update(
-                value="<div><b>Total Cards Generated:</b> <span id='total-cards-count'>0</span></div>",
-                visible=False,
-            ),
-        )
-    finally:
-        # Placeholder if any cleanup is needed
-        pass
-
-
-# Helper function to get Cloze instruction string
-def get_cloze_instruction(generate_cloze: bool) -> str:
-    if generate_cloze:
-        return """
-        Where appropriate, generate Cloze deletion cards.
-        - For Cloze cards, set "card_type" to "cloze".
-        - Format the question field using Anki's cloze syntax (e.g., "The capital of France is {{c1::Paris}}.").
-        - The "answer" field should contain the full, non-cloze text or specific context for the cloze.
-        - For standard question/answer cards, set "card_type" to "basic".
-        """
-    return ""
-
-
-# Helper function to get JSON structure prompt for cards
-def get_card_json_structure_prompt() -> str:
-    return """
-    {
-        "cards": [
-            {
-                "card_type": "basic or cloze",
-                "front": {
-                    "question": "question text (potentially with {{{{c1::cloze syntax}}}})"
-                },
-                "back": {
-                    "answer": "concise answer or full text for cloze",
-                    "explanation": "detailed explanation",
-                    "example": "practical example"
-                },
-                "metadata": {
-                    "prerequisites": ["list", "of", "prerequisites"],
-                    "learning_outcomes": ["list", "of", "outcomes"],
-                    "difficulty": "beginner/intermediate/advanced"
-                }
-            }
-            // ... more cards
-        ]
-    }
-    """
-
-
-# Helper function to process raw card data from LLM into Card Pydantic models
-def process_raw_cards_data(cards_data: list) -> list[Card]:
-    cards_list = []
-    if not isinstance(cards_data, list):
-        logger.warning(
-            f"Expected a list of cards, got {type(cards_data)}. Raw data: {cards_data}"
-        )
-        return cards_list
-
-    for card_item in cards_data:
-        if not isinstance(card_item, dict):
-            logger.warning(
-                f"Expected card item to be a dict, got {type(card_item)}. Item: {card_item}"
-            )
-            continue
-        try:
-            # Basic validation for essential fields
-            if (
-                not all(k in card_item for k in ["front", "back"])
-                or not isinstance(card_item["front"], dict)
-                or not isinstance(card_item["back"], dict)
-                or "question" not in card_item["front"]
-                or "answer" not in card_item["back"]
-            ):
-                logger.warning(
-                    f"Skipping card due to missing essential fields: {card_item}"
-                )
-                continue
-
-            card = Card(
-                card_type=card_item.get("card_type", "basic"),
-                front=CardFront(
-                    question=strip_html_tags(card_item["front"].get("question", ""))
-                ),
-                back=CardBack(
-                    answer=strip_html_tags(card_item["back"].get("answer", "")),
-                    explanation=strip_html_tags(
-                        card_item["back"].get("explanation", "")
-                    ),
-                    example=strip_html_tags(card_item["back"].get("example", "")),
-                ),
-                metadata=card_item.get("metadata", {}),
-            )
-            cards_list.append(card)
-        except Exception as e:  # Catch Pydantic validation errors or others
-            logger.error(
-                f"Error processing card data item: {card_item}. Error: {e}",
-                exc_info=True,
-            )
-    return cards_list
+    # Agent system is required and should never fail to be available
+    logger.error("Agent system failed but is required - this should not happen")
+    gr.Error("Agent system is required but not available")
+    return (
+        pd.DataFrame(columns=get_dataframe_columns()),
+        "Agent system error",
+        "",
+    )
+
+
+# Legacy helper functions removed - all processing now handled by agent system


 # --- Formatting and Utility Functions --- (Moved and adapted)
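With the legacy path gone, the whole generation flow reduces to the orchestrator calls visible above. A condensed usage sketch (the import path for `AgentOrchestrator` and the argument values are illustrative, and error handling is omitted):

```python
from ankigen_core.llm_interface import OpenAIClientManager
from ankigen_core.agents.integration import AgentOrchestrator  # path assumed

async def generate_cards(api_key: str, model_name: str = "gpt-4.1"):
    orchestrator = AgentOrchestrator(OpenAIClientManager())

    # Only the subject expert's model is overridden now.
    await orchestrator.initialize(api_key, {"subject_expert": model_name})

    cards, metadata = await orchestrator.generate_cards_with_agents(
        topic="Python asyncio",
        subject="programming",
        num_cards=10,
        difficulty="intermediate",
        enable_quality_pipeline=True,
        context={},
        library_name="fastapi",  # optional Context7 lookup
        library_topic="dependency injection",
    )
    return cards, metadata
```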
ankigen_core/context7.py ADDED
@@ -0,0 +1,177 @@
+"""Context7 integration for library documentation"""
+
+import asyncio
+import subprocess
+import json
+from typing import Optional, Dict, Any
+from ankigen_core.logging import logger
+
+
+class Context7Client:
+    """Context7 MCP client for fetching library documentation"""
+
+    def __init__(self):
+        self.server_process = None
+
+    async def call_context7_tool(
+        self, tool_name: str, args: Dict[str, Any]
+    ) -> Optional[Dict[str, Any]]:
+        """Call a Context7 tool via direct JSONRPC"""
+        try:
+            # Build the JSONRPC request
+            request = {
+                "jsonrpc": "2.0",
+                "id": 1,
+                "method": "tools/call",
+                "params": {"name": tool_name, "arguments": args},
+            }
+
+            # Call the Context7 server
+            process = await asyncio.create_subprocess_exec(
+                "npx",
+                "@upstash/context7-mcp",
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.PIPE,
+            )
+
+            # Send initialization first
+            init_request = {
+                "jsonrpc": "2.0",
+                "id": 0,
+                "method": "initialize",
+                "params": {
+                    "protocolVersion": "2025-06-18",
+                    "capabilities": {},
+                    "clientInfo": {"name": "ankigen", "version": "1.0.0"},
+                },
+            }
+
+            # Send both requests
+            input_data = json.dumps(init_request) + "\n" + json.dumps(request) + "\n"
+            stdout, stderr = await process.communicate(input=input_data.encode())
+
+            # Parse responses
+            responses = stdout.decode().strip().split("\n")
+            if len(responses) >= 2:
+                # Skip init response, get tool response
+                tool_response = json.loads(responses[1])
+
+                if "result" in tool_response:
+                    result = tool_response["result"]
+                    # Extract content from the result
+                    if "content" in result and result["content"]:
+                        content_item = result["content"][0]
+                        if "text" in content_item:
+                            return {"text": content_item["text"], "success": True}
+                        elif "type" in content_item and content_item["type"] == "text":
+                            return {
+                                "text": content_item.get("text", ""),
+                                "success": True,
+                            }
+                    return {"error": "No content in response", "success": False}
+                elif "error" in tool_response:
+                    return {"error": tool_response["error"], "success": False}
+
+            return {"error": "Invalid response format", "success": False}
+
+        except Exception as e:
+            logger.error(f"Error calling Context7 tool {tool_name}: {e}")
+            return {"error": str(e), "success": False}
+
+    async def resolve_library_id(self, library_name: str) -> Optional[str]:
+        """Resolve a library name to a Context7-compatible ID"""
+        logger.info(f"Resolving library ID for: {library_name}")
+
+        result = await self.call_context7_tool(
+            "resolve-library-id", {"libraryName": library_name}
+        )
+
+        if result and result.get("success") and result.get("text"):
+            # Parse the text to extract library ID
+            text = result["text"]
+            import re
+
+            # First, look for specific Context7-compatible library ID mentions
+            lines = text.split("\n")
+            for line in lines:
+                if "Context7-compatible library ID:" in line:
+                    # Extract the ID after the colon
+                    parts = line.split("Context7-compatible library ID:")
+                    if len(parts) > 1:
+                        library_id = parts[1].strip()
+                        if library_id.startswith("/"):
+                            logger.info(
+                                f"Resolved '{library_name}' to ID: {library_id}"
+                            )
+                            return library_id
+
+            # Fallback: Look for library ID pattern but be more specific
+            # Must have actual library names, not generic /org/project
+            matches = re.findall(r"/[\w-]+/[\w.-]+(?:/[\w.-]+)?", text)
+            for match in matches:
+                # Filter out generic placeholders
+                if match != "/org/project" and "example" not in match.lower():
+                    logger.info(f"Resolved '{library_name}' to ID: {match}")
+                    return match
+
+        logger.warning(f"Could not resolve library ID for '{library_name}'")
+        return None
+
+    async def get_library_docs(
+        self, library_id: str, topic: Optional[str] = None, tokens: int = 5000
+    ) -> Optional[str]:
+        """Get documentation for a library"""
+        logger.info(
+            f"Fetching docs for: {library_id}" + (f" (topic: {topic})" if topic else "")
+        )
+
+        args = {"context7CompatibleLibraryID": library_id, "tokens": tokens}
+        if topic:
+            args["topic"] = topic
+
+        result = await self.call_context7_tool("get-library-docs", args)
+
+        if result and result.get("success") and result.get("text"):
+            docs = result["text"]
+            logger.info(f"Retrieved {len(docs)} characters of documentation")
+            return docs
+
+        logger.warning(f"Could not fetch docs for '{library_id}'")
+        return None
+
+    async def fetch_library_documentation(
+        self, library_name: str, topic: Optional[str] = None, tokens: int = 5000
+    ) -> Optional[str]:
+        """Convenience method to resolve and fetch docs in one call"""
+        library_id = await self.resolve_library_id(library_name)
+        if not library_id:
+            return None
+
+        return await self.get_library_docs(library_id, topic, tokens)
+
+
+async def test_context7():
+    """Test the Context7 integration"""
+    client = Context7Client()
+
+    print("Testing Context7 integration...")
+
+    # Test resolving a library
+    library_id = await client.resolve_library_id("react")
+    if library_id:
+        print(f"✓ Resolved 'react' to ID: {library_id}")
+
+        # Test fetching docs
+        docs = await client.get_library_docs(library_id, topic="hooks", tokens=2000)
+        if docs:
+            print(f"✓ Fetched {len(docs)} characters of documentation")
+            print(f"Preview: {docs[:300]}...")
+        else:
+            print("✗ Failed to fetch documentation")
+    else:
+        print("✗ Failed to resolve library ID")
+
+
+if __name__ == "__main__":
+    asyncio.run(test_context7())
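Besides the two-step self-test above, the module also exposes a one-call convenience path; a small usage sketch (the library name and topic are illustrative):

```python
import asyncio
from ankigen_core.context7 import Context7Client

async def main():
    # resolve-library-id + get-library-docs in a single call
    docs = await Context7Client().fetch_library_documentation(
        "pandas", topic="dataframes", tokens=2000
    )
    print(docs[:200] if docs else "no docs found")

asyncio.run(main())
```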
ankigen_core/utils.py CHANGED
@@ -9,7 +9,6 @@ from bs4 import BeautifulSoup
 from functools import lru_cache
 from typing import Any, Optional
 import time
-import re

 # --- Logging Setup ---
 _logger_instance = None
@@ -196,11 +195,12 @@ class RateLimiter:
 # def some_other_util_function():
 # pass

-HTML_TAG_REGEX = re.compile(r"<[^>]*>")
-

 def strip_html_tags(text: str) -> str:
-    """Removes HTML tags from a string."""
+    """Removes HTML tags from a string using a safe, non-regex approach."""
     if not isinstance(text, str):
         return str(text)  # Ensure it's a string, or return as is if not coercible
-    return HTML_TAG_REGEX.sub("", text).strip()
+
+    # Use BeautifulSoup for safe HTML parsing
+    soup = BeautifulSoup(text, "html.parser")
+    return soup.get_text().strip()
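The parser swap is not just cosmetic: the old pattern cut each tag at the first `>`, so a quoted attribute value containing `>` leaked into the output. A quick comparison, with the old one-liner reproduced inline:

```python
import re
from bs4 import BeautifulSoup

html = '<a title="a>b">link text</a>'

# Old behavior: the naive regex stops at the first ">", leaving debris.
print(re.sub(r"<[^>]*>", "", html).strip())  # -> b">link text

# New behavior: a real parser tracks attribute quoting.
print(BeautifulSoup(html, "html.parser").get_text().strip())  # -> link text
```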
app.py CHANGED
@@ -256,6 +256,21 @@ def create_ankigen_interface():
256
  info="Your key is used solely for processing your requests.",
257
  elem_id="api-key-textbox",
258
  )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
259
  with gr.Column(scale=1):
260
  with gr.Accordion("Advanced Settings", open=False):
261
  model_choices_ui = [
@@ -302,161 +317,9 @@ def create_ankigen_interface():
302
  label="Generate Cloze Cards (Experimental)",
303
  value=False,
304
  )
305
-
306
- # Agent System Controls (simplified since we're agent-only)
307
- if AGENTS_AVAILABLE_APP:
308
- # Hidden dropdown for compatibility - always set to agent_only
309
- agent_mode_dropdown = gr.Dropdown(
310
- choices=[("Agent Only", "agent_only")],
311
- value="agent_only",
312
- label="Agent Mode",
313
- visible=False,
314
- )
315
-
316
- with gr.Accordion("Agent Configuration", open=False):
317
- gr.Markdown("**Core Generation Pipeline**")
318
- enable_subject_expert = gr.Checkbox(
319
- label="Subject Expert Agent",
320
- value=True,
321
- info="Domain-specific expertise",
322
- )
323
- enable_generation_coordinator = gr.Checkbox(
324
- label="Generation Coordinator",
325
- value=True,
326
- info="Orchestrates multi-agent generation",
327
- )
328
-
329
- gr.Markdown("**Quality Assurance**")
330
- enable_content_judge = gr.Checkbox(
331
- label="Content Accuracy Judge",
332
- value=True,
333
- info="Factual correctness validation",
334
- )
335
- enable_clarity_judge = gr.Checkbox(
336
- label="Clarity Judge",
337
- value=True,
338
- info="Language clarity and comprehension",
339
- )
340
-
341
- gr.Markdown("**Optional Enhancements**")
342
- enable_pedagogical_agent = gr.Checkbox(
343
- label="Pedagogical Agent",
344
- value=False,
345
- info="Educational effectiveness review",
346
- )
347
- enable_pedagogical_judge = gr.Checkbox(
348
- label="Pedagogical Judge",
349
- value=False,
350
- info="Learning theory compliance",
351
- )
352
- enable_enhancement_agent = gr.Checkbox(
353
- label="Enhancement Agent",
354
- value=False,
355
- info="Content enrichment and metadata",
356
- )
357
-
358
- with gr.Accordion(
359
- "🛠️ Agent Model Selection", open=False
360
- ):
361
- gr.Markdown("**Individual Agent Models**")
362
-
363
- # Generator models
364
- subject_expert_model = gr.Dropdown(
365
- choices=model_choices_ui,
366
- value="gpt-4.1",
367
- label="Subject Expert Model",
368
- info="Model for domain expertise",
369
- allow_custom_value=True,
370
- )
371
- generation_coordinator_model = gr.Dropdown(
372
- choices=model_choices_ui,
373
- value="gpt-4.1-nano",
374
- label="Generation Coordinator Model",
375
- info="Model for orchestration",
376
- allow_custom_value=True,
377
- )
378
-
379
- # Judge models
380
- content_judge_model = gr.Dropdown(
381
- choices=model_choices_ui,
382
- value="gpt-4.1",
383
- label="Content Accuracy Judge Model",
384
- info="Model for fact-checking",
385
- allow_custom_value=True,
386
- )
387
- clarity_judge_model = gr.Dropdown(
388
- choices=model_choices_ui,
389
- value="gpt-4.1-nano",
390
- label="Clarity Judge Model",
391
- info="Model for language clarity",
392
- allow_custom_value=True,
393
- )
394
-
395
- # Enhancement models
396
- pedagogical_agent_model = gr.Dropdown(
397
- choices=model_choices_ui,
398
- value="gpt-4.1",
399
- label="Pedagogical Agent Model",
400
- info="Model for educational theory",
401
- allow_custom_value=True,
402
- )
403
- enhancement_agent_model = gr.Dropdown(
404
- choices=model_choices_ui,
405
- value="gpt-4.1",
406
- label="Enhancement Agent Model",
407
- info="Model for content enrichment",
408
- allow_custom_value=True,
409
- )
410
- else:
411
- # Placeholder when agents not available
412
- agent_mode_dropdown = gr.Dropdown(
413
- choices=[("Legacy Only", "legacy")],
414
- value="legacy",
415
- label="Agent Mode",
416
- info="Agent system not available",
417
- interactive=False,
418
- )
419
- enable_subject_expert = gr.Checkbox(
420
- value=False, visible=False
421
- )
422
- enable_generation_coordinator = gr.Checkbox(
423
- value=False, visible=False
424
- )
425
- enable_pedagogical_agent = gr.Checkbox(
426
- value=False, visible=False
427
- )
428
- enable_content_judge = gr.Checkbox(
429
- value=False, visible=False
430
- )
431
- enable_clarity_judge = gr.Checkbox(
432
- value=False, visible=False
433
- )
434
- enable_pedagogical_judge = gr.Checkbox(
435
- value=False, visible=False
436
- )
437
- enable_enhancement_agent = gr.Checkbox(
438
- value=False, visible=False
439
- )
440
-
441
- # Hidden model dropdowns for non-agent mode
442
- subject_expert_model = gr.Dropdown(
443
- value="gpt-4.1", visible=False
444
- )
445
- generation_coordinator_model = gr.Dropdown(
446
- value="gpt-4.1-nano", visible=False
447
- )
448
- content_judge_model = gr.Dropdown(
449
- value="gpt-4.1", visible=False
450
- )
451
- clarity_judge_model = gr.Dropdown(
452
- value="gpt-4.1-nano", visible=False
453
- )
454
- pedagogical_agent_model = gr.Dropdown(
455
- value="gpt-4.1", visible=False
456
- )
457
- enhancement_agent_model = gr.Dropdown(
458
- value="gpt-4.1", visible=False
459
- )
460
 
461
  generate_button = gr.Button("Generate Cards", variant="primary")
462
 
@@ -655,96 +518,10 @@ def create_ankigen_interface():
655
  cards_per_topic_val,
656
  preference_prompt_val,
657
  generate_cloze_checkbox_val,
658
- agent_mode_val,
659
- enable_subject_expert_val,
660
- enable_generation_coordinator_val,
661
- enable_pedagogical_agent_val,
662
- enable_content_judge_val,
663
- enable_clarity_judge_val,
664
- enable_pedagogical_judge_val,
665
- enable_enhancement_agent_val,
666
- subject_expert_model_val,
667
- generation_coordinator_model_val,
668
- content_judge_model_val,
669
- clarity_judge_model_val,
670
- pedagogical_agent_model_val,
671
- enhancement_agent_model_val,
672
  progress=gr.Progress(track_tqdm=True), # Added progress tracker
673
  ):
674
- # Apply agent settings if agents are available
675
- if AGENTS_AVAILABLE_APP:
676
- import os
677
-
678
- # Set agent mode
679
- os.environ["ANKIGEN_AGENT_MODE"] = agent_mode_val
680
-
681
- # Set individual agent flags (using correct environment variable names)
682
- os.environ["ANKIGEN_ENABLE_SUBJECT_EXPERT"] = str(
683
- enable_subject_expert_val
684
- ).lower()
685
- os.environ["ANKIGEN_ENABLE_GENERATION_COORDINATOR"] = str(
686
- enable_generation_coordinator_val
687
- ).lower()
688
- os.environ["ANKIGEN_ENABLE_PEDAGOGICAL_AGENT"] = str(
689
- enable_pedagogical_agent_val
690
- ).lower()
691
- os.environ["ANKIGEN_ENABLE_CONTENT_JUDGE"] = str(
692
- enable_content_judge_val
693
- ).lower()
694
- os.environ["ANKIGEN_ENABLE_CLARITY_JUDGE"] = str(
695
- enable_clarity_judge_val
696
- ).lower()
697
- os.environ["ANKIGEN_ENABLE_PEDAGOGICAL_JUDGE"] = str(
698
- enable_pedagogical_judge_val
699
- ).lower()
700
- os.environ["ANKIGEN_ENABLE_ENHANCEMENT_AGENT"] = str(
701
- enable_enhancement_agent_val
702
- ).lower()
703
-
704
- # Enable additional required flags for proper agent coordination
705
- os.environ["ANKIGEN_ENABLE_JUDGE_COORDINATOR"] = (
706
- "true" # Required for judge coordination
707
- )
708
- os.environ["ANKIGEN_ENABLE_PARALLEL_JUDGING"] = (
709
- "true" # Enable parallel judging for performance
710
- )
711
-
712
- # Configure agent models from UI selections
713
- model_overrides = {
714
- "subject_expert": subject_expert_model_val,
715
- "generation_coordinator": generation_coordinator_model_val,
716
- "content_accuracy_judge": content_judge_model_val,
717
- "clarity_judge": clarity_judge_model_val,
718
- "pedagogical_agent": pedagogical_agent_model_val,
719
- "enhancement_agent": enhancement_agent_model_val,
720
- }
721
-
722
- # Template variables for Jinja rendering
723
- template_vars = {
724
- "subject": subject_val or "general studies",
725
- "difficulty": "intermediate", # Could be made configurable
726
- "topic": subject_val or "general concepts",
727
- }
728
-
729
- # Initialize config manager with model overrides and template variables
730
- from ankigen_core.agents.config import get_config_manager
731
-
732
- get_config_manager(model_overrides, template_vars)
733
-
734
- # Log the agent configuration
735
- logger.info(f"Agent mode set to: {agent_mode_val}")
736
- logger.info(f"Model overrides: {model_overrides}")
737
- logger.info(
738
- f"Active agents: Subject Expert={enable_subject_expert_val}, Generation Coordinator={enable_generation_coordinator_val}, Content Judge={enable_content_judge_val}, Clarity Judge={enable_clarity_judge_val}"
739
- )
740
-
741
- # Reload feature flags to pick up the new environment variables
742
- try:
743
- # Agent system is available
744
- logger.info("Agent system enabled")
745
- except Exception as e:
746
- logger.warning(f"Failed to reload feature flags: {e}")
747
-
748
  # Recreate the partial function call, but now it can be awaited
749
  # The actual orchestrate_card_generation is already partially applied with client_manager and response_cache
750
  # So, we need to get that specific partial object if it's stored, or redefine the partial logic here.
@@ -762,6 +539,8 @@ def create_ankigen_interface():
762
  cards_per_topic_val,
763
  preference_prompt_val,
764
  generate_cloze_checkbox_val,
 
 
765
  )
766
  # Expect 3-tuple return (dataframe, total_cards_html, token_usage_html)
767
 
@@ -778,20 +557,8 @@ def create_ankigen_interface():
778
  cards_per_topic,
779
  preference_prompt,
780
  generate_cloze_checkbox,
781
- agent_mode_dropdown,
782
- enable_subject_expert,
783
- enable_generation_coordinator,
784
- enable_pedagogical_agent,
785
- enable_content_judge,
786
- enable_clarity_judge,
787
- enable_pedagogical_judge,
788
- enable_enhancement_agent,
789
- subject_expert_model,
790
- generation_coordinator_model,
791
- content_judge_model,
792
- clarity_judge_model,
793
- pedagogical_agent_model,
794
- enhancement_agent_model,
795
  ],
796
  outputs=[output, total_cards_html, token_usage_html],
797
  show_progress="full",
 
                 info="Your key is used solely for processing your requests.",
                 elem_id="api-key-textbox",
             )
+
+        # Context7 Library Documentation
+        with gr.Accordion(
+            "Library Documentation (optional)", open=False
+        ):
+            library_name_input = gr.Textbox(
+                label="Library Name",
+                placeholder="e.g., 'react', 'tensorflow', 'pandas'",
+                info="Fetch up-to-date documentation for this library",
+            )
+            library_topic_input = gr.Textbox(
+                label="Documentation Focus (optional)",
+                placeholder="e.g., 'hooks', 'data loading', 'transforms'",
+                info="Specific topic within the library to focus on",
+            )
     with gr.Column(scale=1):
         with gr.Accordion("Advanced Settings", open=False):
             model_choices_ui = [
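These new inputs, together with the `fastmcp` dependency added in pyproject.toml below, suggest the app fetches docs through a Context7 MCP server. A rough, hypothetical sketch of how the two textbox values could drive such a fetch — the server URL, tool name, and argument keys are all assumptions, not code from this commit:

```python
# Hypothetical sketch: fetching library docs from a Context7 MCP server
# via fastmcp. URL, tool name, argument keys, and result shape are assumed.
import asyncio

from fastmcp import Client


async def fetch_library_docs(library_name: str, topic: str | None = None) -> str:
    # Assumed public Context7 MCP endpoint; substitute the real server config.
    async with Client("https://mcp.context7.com/mcp") as client:
        result = await client.call_tool(
            "get-library-docs",  # assumed tool name
            {"context7CompatibleLibraryID": library_name, "topic": topic or ""},
        )
        # Assumed result shape: first content block carries the docs text.
        return result.content[0].text


if __name__ == "__main__":
    print(asyncio.run(fetch_library_docs("react", "hooks")))
```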
 
                 label="Generate Cloze Cards (Experimental)",
                 value=False,
             )
+            gr.Markdown(
+                "*Cards are generated by the subject expert agent with a quick self-review to catch obvious gaps.*"
+            )
 
         generate_button = gr.Button("Generate Cards", variant="primary")
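Taken together, the UI hunks reduce to a fairly small wiring pattern. Below is a self-contained sketch assembled from the names visible in this diff; `handle_generate` is a hypothetical stand-in for the app's partially applied `orchestrate_card_generation`, and the component types are plausible guesses rather than the app's exact choices:

```python
# Rough sketch of the simplified Gradio wiring implied by this diff.
# handle_generate is a hypothetical placeholder for the real handler;
# component names are taken from the surrounding hunks.
import gradio as gr


def handle_generate(subject, cards_per_topic, preference_prompt,
                    generate_cloze, library_name, library_topic):
    # Placeholder: the real app awaits orchestrate_card_generation here.
    return [], "<b>0 cards</b>", "<i>0 tokens</i>"


with gr.Blocks() as demo:
    subject = gr.Textbox(label="Subject")
    cards_per_topic = gr.Slider(1, 20, value=5, label="Cards per topic")
    preference_prompt = gr.Textbox(label="Preferences")
    generate_cloze_checkbox = gr.Checkbox(label="Generate Cloze Cards (Experimental)")
    library_name_input = gr.Textbox(label="Library Name")
    library_topic_input = gr.Textbox(label="Documentation Focus (optional)")
    generate_button = gr.Button("Generate Cards", variant="primary")
    output = gr.Dataframe()
    total_cards_html = gr.HTML()
    token_usage_html = gr.HTML()

    generate_button.click(
        fn=handle_generate,
        inputs=[subject, cards_per_topic, preference_prompt,
                generate_cloze_checkbox, library_name_input, library_topic_input],
        outputs=[output, total_cards_html, token_usage_html],
        show_progress="full",
    )

if __name__ == "__main__":
    demo.launch()
```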
 
 
             cards_per_topic_val,
             preference_prompt_val,
             generate_cloze_checkbox_val,
+            library_name_val,
+            library_topic_val,
             progress=gr.Progress(track_tqdm=True),  # Added progress tracker
         ):
             # Recreate the partial function call, but now it can be awaited
             # The actual orchestrate_card_generation is already partially applied with client_manager and response_cache
             # So, we need to get that specific partial object if it's stored, or redefine the partial logic here.
 
pyproject.toml CHANGED
@@ -12,26 +12,27 @@ authors = [
 readme = "README.md"
 requires-python = ">=3.10"
 dependencies = [
-    "openai>=1.91.0",
-    "openai-agents>=0.1.0",
-    "gradio>=5.34.2",
+    "openai>=1.109.1",
+    "openai-agents>=0.3.2",
+    "gradio>=5.47.0",
     "tenacity>=9.1.2",
     "genanki>=0.13.1",
-    "pydantic==2.10.6",
-    "pandas==2.2.3",
-    "beautifulsoup4==4.12.3",
-    "lxml==5.2.2",
-    "tiktoken>=0.9.0",
+    "pydantic==2.11.9",
+    "pandas>=2.3.2",
+    "beautifulsoup4==4.13.5",
+    "lxml>=6.0.2",
+    "tiktoken>=0.11.0",
+    "fastmcp>=2.12.3",
 ]
 
 [project.optional-dependencies]
 dev = [
-    "pytest>=8.4.1",
-    "pytest-cov>=6.2.1",
-    "pytest-mock>=3.14.1",
-    "ruff>=0.12.0",
-    "black>=25.1.0",
-    "pre-commit>=4.2.0",
+    "pytest>=8.4.2",
+    "pytest-cov>=7.0.0",
+    "pytest-mock>=3.15.1",
+    "ruff>=0.13.1",
+    "black>=25.9.0",
+    "pre-commit>=4.3.0",
     "pytest-anyio>=0.0.0",
 ]
 
requirements.txt CHANGED
@@ -1,249 +1,107 @@
-# This file was autogenerated by uv via the following command:
-#    uv pip compile pyproject.toml --python-version 3.10 -o requirements.txt
 aiofiles==24.1.0
-    # via gradio
 annotated-types==0.7.0
-    # via pydantic
 anyio==4.9.0
-    # via
-    #   gradio
-    #   httpx
-    #   mcp
-    #   openai
-    #   sse-starlette
-    #   starlette
 attrs==25.3.0
-    # via
-    #   jsonschema
-    #   referencing
-beautifulsoup4==4.12.3
-    # via ankigen (pyproject.toml)
+authlib==1.6.4
+beautifulsoup4==4.13.5
+brotli==1.1.0
 cached-property==2.0.1
-    # via genanki
 certifi==2025.6.15
-    # via
-    #   httpcore
-    #   httpx
-    #   requests
+cffi==2.0.0
 charset-normalizer==3.4.2
-    # via requests
 chevron==0.14.0
-    # via genanki
 click==8.2.1
-    # via
-    #   typer
-    #   uvicorn
 colorama==0.4.6
-    # via griffe
+cryptography==46.0.1
+cyclopts==3.24.0
 distro==1.9.0
-    # via openai
+dnspython==2.8.0
+docstring-parser==0.17.0
+docutils==0.22.2
+email-validator==2.3.0
 exceptiongroup==1.3.0
-    # via anyio
 fastapi==0.115.13
-    # via gradio
+fastmcp==2.12.3
 ffmpy==0.6.0
-    # via gradio
 filelock==3.18.0
-    # via huggingface-hub
 frozendict==2.4.6
-    # via genanki
 fsspec==2025.5.1
-    # via
-    #   gradio-client
-    #   huggingface-hub
 genanki==0.13.1
-    # via ankigen (pyproject.toml)
-gradio==5.34.2
-    # via ankigen (pyproject.toml)
-gradio-client==1.10.3
-    # via gradio
+gradio==5.47.0
+gradio-client==1.13.2
 griffe==1.7.3
-    # via openai-agents
 groovy==0.1.2
-    # via gradio
 h11==0.16.0
-    # via
-    #   httpcore
-    #   uvicorn
 hf-xet==1.1.5
-    # via huggingface-hub
 httpcore==1.0.9
-    # via httpx
 httpx==0.28.1
-    # via
-    #   gradio
-    #   gradio-client
-    #   mcp
-    #   openai
-    #   safehttpx
 httpx-sse==0.4.1
-    # via mcp
-huggingface-hub==0.33.1
-    # via
-    #   gradio
-    #   gradio-client
+huggingface-hub==0.35.1
 idna==3.10
-    # via
-    #   anyio
-    #   httpx
-    #   requests
+isodate==0.7.2
 jinja2==3.1.6
-    # via gradio
 jiter==0.10.0
-    # via openai
 jsonschema==4.24.0
-    # via mcp
+jsonschema-path==0.3.4
 jsonschema-specifications==2025.4.1
-    # via jsonschema
-lxml==5.2.2
-    # via ankigen (pyproject.toml)
+lazy-object-proxy==1.12.0
+lxml==6.0.2
 markdown-it-py==3.0.0
-    # via rich
 markupsafe==3.0.2
-    # via
-    #   gradio
-    #   jinja2
-mcp==1.10.1
-    # via openai-agents
+mcp==1.14.1
 mdurl==0.1.2
-    # via markdown-it-py
-numpy==1.26.4
-    # via
-    #   gradio
-    #   pandas
-openai==1.91.0
-    # via
-    #   ankigen (pyproject.toml)
-    #   openai-agents
-openai-agents==0.1.0
-    # via ankigen (pyproject.toml)
+more-itertools==10.8.0
+numpy==2.3.1
+openai==1.109.1
+openai-agents==0.3.2
+openapi-core==0.19.5
+openapi-pydantic==0.5.1
+openapi-schema-validator==0.6.3
+openapi-spec-validator==0.7.2
 orjson==3.10.18
-    # via gradio
 packaging==25.0
-    # via
-    #   gradio
-    #   gradio-client
-    #   huggingface-hub
-pandas==2.2.3
-    # via
-    #   ankigen (pyproject.toml)
-    #   gradio
-pillow==11.3.0
-    # via gradio
-pydantic==2.10.6
-    # via
-    #   ankigen (pyproject.toml)
-    #   fastapi
-    #   gradio
-    #   mcp
-    #   openai
-    #   openai-agents
-    #   pydantic-settings
-pydantic-core==2.27.2
-    # via pydantic
+pandas==2.3.2
+parse==1.20.2
+pathable==0.4.4
+pillow==11.2.1
+pycparser==2.23
+pydantic==2.11.9
+pydantic-core==2.33.2
 pydantic-settings==2.10.1
-    # via mcp
 pydub==0.25.1
-    # via gradio
 pygments==2.19.2
-    # via rich
+pyperclip==1.10.0
 python-dateutil==2.9.0.post0
-    # via pandas
 python-dotenv==1.1.1
-    # via pydantic-settings
 python-multipart==0.0.20
-    # via
-    #   gradio
-    #   mcp
 pytz==2025.2
-    # via pandas
 pyyaml==6.0.2
-    # via
-    #   genanki
-    #   gradio
-    #   huggingface-hub
 referencing==0.36.2
-    # via
-    #   jsonschema
-    #   jsonschema-specifications
 regex==2024.11.6
-    # via tiktoken
 requests==2.32.4
-    # via
-    #   huggingface-hub
-    #   openai-agents
-    #   tiktoken
+rfc3339-validator==0.1.4
 rich==14.0.0
-    # via typer
+rich-rst==1.3.1
 rpds-py==0.26.0
-    # via
-    #   jsonschema
-    #   referencing
-ruff==0.12.0
-    # via gradio
+ruff==0.13.1
 safehttpx==0.1.6
-    # via gradio
 semantic-version==2.10.0
-    # via gradio
 shellingham==1.5.4
-    # via typer
 six==1.17.0
-    # via python-dateutil
 sniffio==1.3.1
-    # via
-    #   anyio
-    #   openai
 soupsieve==2.7
-    # via beautifulsoup4
 sse-starlette==2.3.6
-    # via mcp
 starlette==0.46.2
-    # via
-    #   fastapi
-    #   gradio
-    #   mcp
 tenacity==9.1.2
-    # via ankigen (pyproject.toml)
-tiktoken==0.9.0
-    # via ankigen (pyproject.toml)
+tiktoken==0.11.0
 tomlkit==0.13.3
-    # via gradio
 tqdm==4.67.1
-    # via
-    #   huggingface-hub
-    #   openai
 typer==0.16.0
-    # via gradio
 types-requests==2.32.4.20250611
-    # via openai-agents
 typing-extensions==4.14.0
-    # via
-    #   anyio
-    #   exceptiongroup
-    #   fastapi
-    #   gradio
-    #   gradio-client
-    #   huggingface-hub
-    #   openai
-    #   openai-agents
-    #   pydantic
-    #   pydantic-core
-    #   referencing
-    #   rich
-    #   typer
-    #   typing-inspection
-    #   uvicorn
 typing-inspection==0.4.1
-    # via pydantic-settings
 tzdata==2025.2
-    # via pandas
 urllib3==2.5.0
-    # via
-    #   requests
-    #   types-requests
 uvicorn==0.34.3
-    # via
-    #   gradio
-    #   mcp
 websockets==15.0.1
-    # via gradio-client
+werkzeug==3.1.1
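Note that the regenerated requirements.txt drops uv's autogenerated header and all of the `# via` provenance comments. The removed header recorded how the old pins were produced — `uv pip compile pyproject.toml --python-version 3.10 -o requirements.txt` — so rerunning that command (plausibly with annotations disabled, judging by the comment-free output) after editing pyproject.toml is how the two files stay in sync.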
uv.lock CHANGED
The diff for this file is too large to render. See raw diff