Spaces: milwright/cloze-reader

milwright committed on Jun 23

Commit 7b3ba65 (1 parent: eeb2654)

Add filtering for excessive dashes in passage extraction

- Detect sequences of 3+ consecutive dashes/hyphens
- Calculate dash ratio relative to total words
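- Add quality-score penalties: 3 points per dash sequence, plus 25 × the dash ratio when that ratio exceeds 2%
- Reject passages once the total quality score exceeds 3, so dash penalties count alongside the existing checks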
- Prevents selection of passages with formatting separators

Files changed (2)

LEADERBOARD_ROADMAP.md +171 -0
src/clozeGameEngine.js +7 -0

LEADERBOARD_ROADMAP.md ADDED

@@ -0,0 +1,171 @@

+# Cloze Reader Leaderboard Implementation Roadmap
+## Overview
+This document outlines the implementation plan for adding a competitive leaderboard system to the Cloze Reader game, where players can submit their scores using 3-letter acronyms.
+## Phase 1: Core Infrastructure (Week 1-2)
+### 1.1 Database Schema
+- Create leaderboard table structure:
+  ```sql
+  CREATE TABLE leaderboard (
+    id UUID,
+    acronym VARCHAR(3),
+    score INTEGER,
+    level_reached INTEGER,
+    total_time INTEGER,      -- seconds
+    created_at TIMESTAMP,
+    ip_hash VARCHAR(64)      -- for rate limiting
+  );
+  ```
+### 1.2 API Endpoints
+- `POST /api/leaderboard/submit` - Submit new score
+- `GET /api/leaderboard/top/{period}` - Get top scores (daily/weekly/all-time)
+- `GET /api/leaderboard/check-acronym/{acronym}` - Validate acronym availability
+### 1.3 Score Calculation
+- Base score = (correct_answers * 100) * level_multiplier
+- Time bonus = max(0, 1000 - seconds_per_round)
+- Streak bonus = consecutive_correct * 50
+## Phase 2: Frontend Integration (Week 2-3)
+### 2.1 UI Components
+- **Leaderboard Modal** (`leaderboardModal.js`)
+  - Top 10 display with rank, acronym, score, level
+  - Period toggle (Today/Week/All-Time)
+  - Personal best highlight
+### 2.2 Score Submission Flow
+- End-of-game prompt for acronym entry
+- 3-letter validation (A-Z only)
+- Profanity filter implementation
+- Success/error feedback
+### 2.3 Visual Elements
+- Trophy icons for top 3 positions
+- Animated score counter
+- Level badges display
+## Phase 3: Security & Performance (Week 3-4)
+### 3.1 Anti-Cheat Measures
+- Server-side score validation
+- Rate limiting (1 submission per 5 minutes per IP)
+- Score feasibility checks (max possible score per level)
+- Request signing with session tokens
+### 3.2 Caching Strategy
+- Redis cache for top 100 scores
+- 5-minute TTL for leaderboard queries
+- Real-time updates for top 10 changes
+### 3.3 Data Persistence
+- PostgreSQL for primary storage
+- Daily backups of leaderboard data
+- Archived monthly snapshots
+## Phase 4: Advanced Features (Week 4-5)
+### 4.1 Achievement System
+- "First Timer" - First leaderboard entry
+- "Vocabulary Master" - 10+ correct in a row
+- "Speed Reader" - Complete round < 30 seconds
+- "Persistent Scholar" - Play 7 days straight
+### 4.2 Social Features
+- Share score to social media
+- Challenge link generation
+- Friend acronym tracking
+### 4.3 Analytics Dashboard
+- Player retention metrics
+- Popular acronym analysis
+- Score distribution graphs
+## Technical Implementation Details
+### Backend Changes Required
+1. **FastAPI Endpoints** (`app.py`):
+   ```python
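+   @app.post("/api/leaderboard/submit")
+   async def submit_score(score_data: ScoreSubmission):
+       ...  # validate and store the submission
+
+   @app.get("/api/leaderboard/top/{period}")
+   async def get_leaderboard(period: str, limit: int = 10):
+       ...  # return the top `limit` entries for the period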
+   ```
+2. **Database Models** (`models.py` - new file):
+   ```python
+   class LeaderboardEntry(Base):
+       __tablename__ = "leaderboard"
+       # Schema implementation
+   ```
+3. **Validation Service** (`validation.py` - new file):
+   - Acronym format validation
+   - Profanity checking
+   - Score feasibility verification
+### Frontend Changes Required
+1. **Game Engine Integration** (`clozeGameEngine.js`):
+   - Track game metrics for scoring
+   - Call submission API on game end
+   - Store session data for validation
+2. **UI Updates** (`app.js`):
+   - Add leaderboard button to main menu
+   - Integrate submission modal
+   - Handle API responses
+3. **New Modules**:
+   - `leaderboardService.js` - API communication
+   - `scoreCalculator.js` - Client-side scoring logic
+   - `leaderboardUI.js` - UI component management
+## Deployment Considerations
+### Infrastructure Requirements
+- Database: PostgreSQL 14+
+- Cache: Redis 6+
+- API rate limiting: nginx or API Gateway
+- SSL certificate for secure submissions
+### Environment Variables
+```
+DATABASE_URL=postgresql://...
+REDIS_URL=redis://...
+LEADERBOARD_SECRET=... # For request signing
+PROFANITY_API_KEY=... # Optional external service
+```
+### Migration Strategy
+1. Deploy database schema
+2. Enable API endpoints (feature flagged)
+3. Gradual UI rollout (A/B testing)
+4. Full launch with announcement
+## Success Metrics
+- **Engagement**: 30% of players submit scores
+- **Retention**: 15% return to beat their score
+- **Performance**: <100ms leaderboard load time
+- **Security**: Zero validated cheating incidents
+## Timeline Summary
+- **Week 1-2**: Backend infrastructure
+- **Week 2-3**: Frontend integration
+- **Week 3-4**: Security hardening
+- **Week 4-5**: Advanced features
+- **Week 6**: Testing & deployment
+## Open Questions
+1. Should we allow Unicode characters in acronyms?
+2. Reset frequency for periodic leaderboards?
+3. Maximum entries per player per day?
+4. Prize/reward system for top performers?
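
The scoring rules in section 1.3 and the acronym rule in section 2.2 of the roadmap can be sketched in JavaScript, the language of the planned `scoreCalculator.js`. This is a minimal sketch, not the project's implementation. The function names `calculateScore` and `isValidAcronym` are hypothetical. The roadmap does not say how the three score components combine, so they are summed here, and `levelMultiplier` is taken as an input because the roadmap does not define it. The acronym check assumes uppercase letters only.

```javascript
// Hypothetical helper; parameter names mirror section 1.3 of the roadmap.
function calculateScore({ correctAnswers, levelMultiplier, secondsPerRound, consecutiveCorrect }) {
  // Base score = (correct_answers * 100) * level_multiplier
  const baseScore = correctAnswers * 100 * levelMultiplier;
  // Time bonus = max(0, 1000 - seconds_per_round)
  const timeBonus = Math.max(0, 1000 - secondsPerRound);
  // Streak bonus = consecutive_correct * 50
  const streakBonus = consecutiveCorrect * 50;
  // Assumption: the roadmap does not state how the parts combine; this sketch sums them.
  return { baseScore, timeBonus, streakBonus, total: baseScore + timeBonus + streakBonus };
}

// Section 2.2: exactly three letters, A-Z only.
// Assumption: lowercase input is rejected; a caller could uppercase it first.
function isValidAcronym(acronym) {
  return typeof acronym === 'string' && /^[A-Z]{3}$/.test(acronym);
}

const example = calculateScore({
  correctAnswers: 5,
  levelMultiplier: 2,
  secondsPerRound: 40,
  consecutiveCorrect: 3,
});
console.log(example.total);          // 2110 (1000 base + 960 time + 150 streak)
console.log(isValidAcronym('ABC'));  // true
console.log(isValidAcronym('AB1'));  // false
```

Returning the components separately, rather than only the total, would let a server-side feasibility check (section 3.1) recompute each part from the submitted metrics.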

src/clozeGameEngine.js CHANGED

@@ -164,6 +164,10 @@ class ClozeGame {
       const sentenceList = passage.split(/[.!?]+/).filter(s => s.trim().length > 10);
       const lines = passage.split('\n').filter(l => l.trim());
+      // Count excessive dashes (n-dashes, m-dashes, hyphens in sequence)
+      const dashSequences = (passage.match(/[-—–]{3,}/g) || []).length;
+      const totalDashes = (passage.match(/[-—–]/g) || []).length;
       // Check for repetitive patterns (common in indexes/TOCs)
       const repeatedPhrases = ['CONTENTS', 'CHAPTER', 'Volume', 'Vol.', 'Part', 'Book'];
       const repetitionCount = repeatedPhrases.reduce((count, phrase) =>
@@ -182,6 +186,7 @@ class ClozeGame {
       const avgWordsPerSentence = totalWords / Math.max(1, sentenceList.length);
       const repetitionRatio = repetitionCount / totalWords;
       const titleLineRatio = titleLines / Math.max(1, lines.length);
+      const dashRatio = totalDashes / totalWords;
       // Stricter thresholds for higher levels
       const capsThreshold = this.currentLevel >= 3 ? 0.03 : 0.05;
@@ -198,6 +203,8 @@ class ClozeGame {
       if (shortWordRatio < 0.3) { qualityScore += 2; issues.push(`short-words: ${Math.round(shortWordRatio * 100)}%`); }
       if (repetitionRatio > 0.02) { qualityScore += repetitionRatio * 50; issues.push(`repetitive: ${Math.round(repetitionRatio * 100)}%`); }
       if (titleLineRatio > 0.2) { qualityScore += 5; issues.push(`title-lines: ${Math.round(titleLineRatio * 100)}%`); }
+      if (dashSequences > 0) { qualityScore += dashSequences * 3; issues.push(`dash-sequences: ${dashSequences}`); }
+      if (dashRatio > 0.02) { qualityScore += dashRatio * 25; issues.push(`dashes: ${Math.round(dashRatio * 100)}%`); }
       // Reject if quality score indicates technical/non-narrative content
       if (qualityScore > 3) {
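
To see how the new dash checks behave in isolation, the added lines can be lifted into a standalone function. This is a sketch under stated assumptions, not the engine's code: `dashPenalty` is a hypothetical name, and because the diff does not show how `totalWords` is computed, this version counts whitespace-separated tokens and guards against an empty passage.

```javascript
// Hypothetical standalone version of the dash checks added in this commit.
function dashPenalty(passage) {
  // Assumption: words are whitespace-separated tokens; the engine's own
  // totalWords calculation is outside the hunks shown in the diff.
  const totalWords = Math.max(1, passage.split(/\s+/).filter(Boolean).length);

  // Same patterns as the diff: runs of 3+ hyphens, em dashes, or en dashes,
  // and the count of all such characters.
  const dashSequences = (passage.match(/[-—–]{3,}/g) || []).length;
  const totalDashes = (passage.match(/[-—–]/g) || []).length;
  const dashRatio = totalDashes / totalWords;

  // Same penalty weights as the diff.
  let penalty = 0;
  if (dashSequences > 0) penalty += dashSequences * 3;
  if (dashRatio > 0.02) penalty += dashRatio * 25;

  return { dashSequences, totalDashes, dashRatio, penalty };
}

// A separator line inside a 100-word passage: one sequence, five dashes.
const withSeparator = 'Title\n-----\n' + Array(98).fill('word').join(' ');
console.log(dashPenalty(withSeparator));

// A lone "---" in a 1000-word passage: one sequence, ratio below 2%.
const loneRule = '--- ' + Array(999).fill('word').join(' ');
console.log(dashPenalty(loneRule).penalty); // 3
```

In the second case the penalty is exactly 3. Since the rejection test in the diff is `qualityScore > 3`, a single dash sequence in a long passage would not trigger rejection by itself in this sketch; it needs either a second sequence, a dash ratio above 2%, or a penalty from one of the other checks.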