---
title: README
emoji: 📚
colorFrom: pink
colorTo: gray
sdk: static
pinned: false
---
# 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums
**BigLAM** is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.
We aim to:
- 🗃️ Share machine-learning-ready datasets from LAMs via the [Hugging Face Hub](https://huggingface.co/biglam)
- 🤖 Train and release open-source models for LAM-relevant tasks
- 🛠️ Develop tools and approaches tailored to LAM use cases
---
<details>
<summary><strong>✨ Background</strong></summary>
BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, a large-scale, open NLP collaboration.
Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data.
</details>
<details>
<summary><strong>📂 What You'll Find</strong></summary>
The [BigLAM organization](https://huggingface.co/biglam) hosts:
- **Datasets**: image, text, and tabular data from and about libraries, archives, and museums
- **Models**: fine-tuned for tasks like:
  - Art/historical image classification
  - Document layout analysis and OCR
  - Metadata quality assessment
  - Named entity recognition in heritage texts
- **Spaces**: tools for interactive exploration and demonstration
</details>
<details>
<summary><strong>🧩 Get Involved</strong></summary>
We welcome contributions! You can:
- Use our [datasets and models](https://huggingface.co/biglam)
- Join the discussion on [GitHub](https://github.com/bigscience-workshop/lam/discussions)
- Contribute your own tools or data
- Share your work using BigLAM resources
</details>
## 🌍 Why It Matters
Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:
- Supporting inclusive and responsible AI
- Helping institutions experiment with ML for access, discovery, and preservation
- Ensuring that ML systems reflect diverse human knowledge and expression
- Developing tools and methods that work well with the unique formats, values, and needs of LAMs