BigLAM: BigScience Libraries, Archives and Museums

non-profit

https://github.com/bigscience-workshop/lam


AI & ML interests

A 🤗 Hugging Face x 🌸 BigScience initiative to create open-source community resources for LAMs.



📚 BigLAM: Machine Learning for Libraries, Archives, and Museums

BigLAM is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for Libraries, Archives, and Museums (LAMs).

We aim to:

🗃️ Share machine-learning-ready datasets from LAMs via the Hugging Face Hub
🤖 Train and release open-source models for LAM-relevant tasks
🛠️ Develop tools and approaches tailored to LAM use cases

✨ Background

BigLAM began as a datasets hackathon within the BigScience 🌸 project, a large-scale, open NLP collaboration.

Our goal: make LAM datasets easier to discover and use, in support of researchers, institutions, and ML practitioners working with cultural heritage data.

📂 What You'll Find

The BigLAM organization hosts:

Datasets: image, text, and tabular data from and about libraries, archives, and museums
Models: fine-tuned for tasks such as:
- Art/historical image classification
- Document layout analysis and OCR
- Metadata quality assessment
- Named entity recognition in heritage texts
Spaces: tools for interactive exploration and demonstration

🧩 Get Involved

We welcome contributions! You can:

Use our datasets and models
Join the discussion on GitHub
Contribute your own tools or data
Share your work using BigLAM resources
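
As a starting point for using the datasets, here is a minimal sketch of how one might list what the organization hosts on the Hugging Face Hub. It assumes the `huggingface_hub` package is installed and that the organization's Hub namespace is `biglam`, which this page does not state explicitly. The helper names `format_repo_listing` and `fetch_org_dataset_ids` are hypothetical and used for illustration only.

```python
from typing import Iterable, List


def format_repo_listing(repo_ids: Iterable[str], org: str = "biglam") -> List[str]:
    """Keep only dataset repos owned by `org` and return their Hub URLs, sorted."""
    prefix = f"{org}/"
    return sorted(
        f"https://huggingface.co/datasets/{repo_id}"
        for repo_id in repo_ids
        if repo_id.startswith(prefix)
    )


def fetch_org_dataset_ids(org: str = "biglam", limit: int = 10) -> List[str]:
    """Query the Hub for up to `limit` dataset repo IDs under `org` (needs network)."""
    # Imported lazily so the formatting helper works without the dependency.
    from huggingface_hub import list_datasets

    # "biglam" as the namespace is an assumption; adjust if the org uses another name.
    return [info.id for info in list_datasets(author=org, limit=limit)]


if __name__ == "__main__":
    for url in format_repo_listing(fetch_org_dataset_ids()):
        print(url)
```

Any repo ID returned this way can then be passed to `datasets.load_dataset` to download the data, subject to each dataset's own license and access terms.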

🌍 Why It Matters

Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:

Supporting inclusive and responsible AI
Helping institutions experiment with ML for access, discovery, and preservation
Ensuring that ML systems reflect diverse human knowledge and expression
Developing tools and methods that work well with the unique formats, values, and needs of LAMs