Open-GAMMA

openfree committed on 29 days ago

Commit

5e39079

1 Parent(s): 2549300

Update README.md

Files changed (1)

README.md +108 -0

@@ -8,4 +8,112 @@ sdk_version: 5.33.0
 app_file: app.py
 pinned: false
 short_description: Reasoning + Deep Research + API(NVIDIA H100 GPU)
+models:
+ - VIDraft/Gemma-3-R1984-27B
 ---
+FACTS Grounding Leaderboard - Medical AI Evaluation
+🏥 Overview
+FACTS Grounding is a benchmark developed by Google DeepMind that checks whether an AI model's responses are grounded solely in the documents it is given. This matters most in healthcare, where inaccurate information can be life-threatening.
+🎯 Key Features
+Evaluation Methodology
+Long Medical Document Input (~32,000 tokens ≈ 40 A4 pages)
+AI Response Generation Based on Documents
+Dual-Criteria Assessment
+✅ Quality Check: Does the AI accurately understand the question?
+✅ Grounding Check: Is every claim in the response based on the provided documents?
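The dual-criteria assessment above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's actual scoring code: the `Judgment` class, its field names, and the rule that a response counts only when it passes both checks are assumptions made for the example, since the README does not state how the two checks are aggregated.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """Hypothetical per-response verdicts for the two checks (names are assumptions)."""
    passes_quality: bool    # Quality Check
    passes_grounding: bool  # Grounding Check


def is_accepted(j: Judgment) -> bool:
    # Assumed rule: a response counts only if it passes BOTH checks.
    return j.passes_quality and j.passes_grounding


def overall_score(judgments: list[Judgment]) -> float:
    """Fraction of responses that pass both checks."""
    if not judgments:
        raise ValueError("no judgments to score")
    return sum(is_accepted(j) for j in judgments) / len(judgments)


# Toy verdicts for illustration only, not real evaluation data.
example = [
    Judgment(True, True),
    Judgment(True, False),   # answers the question but is not grounded
    Judgment(False, True),   # grounded but does not answer the question
    Judgment(True, True),
]
print(f"{overall_score(example):.2f}")
```

Under this assumed rule, a well-written answer that adds facts from outside the document scores the same as a wrong one, which is the behavior a grounding benchmark is meant to enforce.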
+Medical-Focused Version
+236 medical cases selected from 860 total problems
+Strict evaluation criteria reflecting healthcare field requirements
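To make the subset size concrete, here is a small sketch of selecting medical cases and computing their share of the full set. The record layout and the `domain` field are hypothetical; the README does not describe how the 236 cases were chosen, so the filter below only illustrates the idea.

```python
TOTAL_PROBLEMS = 860   # total problems, from the README
MEDICAL_CASES = 236    # medical cases, from the README


def select_medical(problems: list[dict]) -> list[dict]:
    """Keep records whose (hypothetical) 'domain' field is 'medical'."""
    return [p for p in problems if p.get("domain") == "medical"]


def medical_share(medical: int, total: int) -> float:
    """Fraction of the full problem set covered by the medical subset."""
    return medical / total


# Toy records for illustration only.
toy = [
    {"id": 1, "domain": "medical"},
    {"id": 2, "domain": "legal"},
    {"id": 3, "domain": "medical"},
]
print([p["id"] for p in select_medical(toy)])
print(f"{medical_share(MEDICAL_CASES, TOTAL_PROBLEMS):.1%}")
```

The medical subset is a little over a quarter of the full problem set.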
+🏆 Current Leaderboard Rankings (June 5, 2025)
+Overall Score TOP 5
+1. deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+2. VIDraft/Gemma-3-R1984-27B
+3. meta-llama/Llama-3.3-70B-Instruct
+4. Qwen/Qwen3-30B-A3B
+5. Qwen/Qwen3-4B
+💡 Why FACTS Matters for Medical AI
+1. Patient Safety Assurance
+General AI Error: "The capital of France is London" → Simple mistake
+Medical AI Error: "This medication is safe for pregnant women" → Life-threatening!
+2. Building Healthcare Provider Trust
+Ensures all responses are grounded in medical literature
+Transparent scoring enables AI system reliability verification
+3. Regulatory Compliance & Standardization
+Emerging as a global medical AI standard
+Potential reference for FDA, CE, and other regulatory approvals
+🌍 Advancing Global Medical AI
+Core Values of FACTS
+Accuracy: Provides only evidence-based medical responses
+Transparency: All evaluation processes and data are public
+Accessibility: Global participation in evaluation
+Practicality: Reflects real-world healthcare scenarios
+Medical-Specific Features
+Understanding complex medical terminology and drug interactions
+Recognition of diverse symptom descriptions
+Verification of clinical guideline and protocol adherence
+Consideration of medical ethics and patient privacy
+🚀 The Future of Medical AI
+FACTS Grounding is establishing itself as the 'quality certification system' for medical AI.
+Expected Impact
+Global Standardization: Unified criteria for worldwide medical AI evaluation
+Quality Improvement: Continuous model enhancement through benchmarking
+Accelerated Clinical Adoption: Rapid deployment of validated AI systems
+Enhanced Patient Care: More accurate and safer healthcare services
+🤝 How to Participate
+The leaderboard operates with complete transparency, allowing anyone to submit and evaluate their models.
+1. Download the FACTS Grounding dataset
+2. Train and optimize your model
+3. Submit evaluation results
+4. Check rankings on the leaderboard
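The last two steps, submitting results and checking rankings, amount to ordering submissions by score. The sketch below assumes each submission reduces to a single overall score between 0 and 1; the function name, the tie-breaking rule, and the placeholder model names and scores are all hypothetical and are not taken from the leaderboard.

```python
def rank_submissions(scores: dict[str, float], top_n: int = 5) -> list[tuple[int, str, float]]:
    """Return (rank, model, score) tuples, highest score first.

    Ties are broken alphabetically by model name (an assumption for this sketch).
    """
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, name, score) for rank, (name, score) in enumerate(ordered[:top_n], start=1)]


# Placeholder names and scores for illustration only, not real results.
submissions = {"model-a": 0.61, "model-b": 0.74, "model-c": 0.58}
for rank, name, score in rank_submissions(submissions):
    print(rank, name, score)
```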
+📊 Utilizing Evaluation Results
+Healthcare Institutions
+Objective performance metrics for AI adoption decisions
+Selection criteria among multiple AI systems
+AI Developers
+Benchmarking model performance and identifying directions for improvement
+Material for marketing and for demonstrating reliability
+Regulatory Bodies
+Reference material for AI medical device approval processes
+Supplementary indicator for safety assessments
+📌 Key Takeaways
+FACTS Grounding = Global standard for verifying AI medical accuracy
+Medical-Specific Evaluation = Testing based on 236 real healthcare scenarios
+Transparent Operation = Open system accessible to all participants
+Real-World Impact = Promoting safer and more reliable medical AI development
+"What cannot be measured cannot be improved" - FACTS Grounding objectively measures medical AI reliability, contributing to global healthcare AI advancement.
+We look forward to more research teams and developers joining this challenge to create better AI for human health! 🌍