---
title: README
emoji: 📚
colorFrom: pink
colorTo: gray
sdk: static
pinned: false
---
# 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums

**BigLAM** is a community-driven effort to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.

We aim to make cultural heritage data more accessible and usable for machine learning by:

- 🗃️ **Curating and sharing LAM datasets** with potential for ML applications, hosted openly on the [Hugging Face Hub](https://huggingface.co/biglam).
- 🤖 **Training and releasing open-source models** tailored to LAM-relevant tasks, including classification, generation, and object detection.

---
## ✨ Origins and Purpose

BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, an open scientific collaboration involving over 600 researchers from 50 countries and 250 institutions.

Our initial goal was to make LAM data more discoverable and usable on the Hugging Face Hub. We're continuing this work with the broader aim of:

- Helping LAM data reach new audiences.
- Supporting researchers and practitioners working at the intersection of AI and cultural heritage.
- Ensuring that machine learning datasets reflect the diversity and richness of human culture.

---
## 📂 What You'll Find Here

The [BigLAM organization on Hugging Face](https://huggingface.co/biglam) hosts:

- 🧠 **Datasets** from and about libraries, archives, and museums, including image, text, and tabular formats.
- ⚙️ **Models** fine-tuned for LAM tasks, such as:
  - Art and historical image classification
  - OCR and document understanding
  - Metadata quality assessment
- 🧪 **Spaces and tools** for exploring datasets and running models interactively.

---
## 🧩 Get Involved

We welcome contributions and collaborations!

You can:

- Explore our [datasets and models](https://huggingface.co/biglam).
- Join the conversation by opening a [New Discussion](https://huggingface.co/spaces/biglam/README/discussions/new) on the BigLAM Space.
- Submit datasets, models, or tools that support AI for cultural heritage.
- Use our datasets in your own research or projects, and share what you build!
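
If you would rather browse the organization from code than from the web page, something like the following may help. It is a minimal sketch, not an official BigLAM tool: it assumes the `huggingface_hub` package is installed and that you have network access, and the function names `short_names` and `list_org_datasets` are made up for this example.

```python
from typing import Iterable, List


def short_names(repo_ids: Iterable[str], org: str = "biglam") -> List[str]:
    """Keep only repo IDs owned by `org` and return their names without the prefix, sorted."""
    prefix = f"{org}/"
    return sorted(rid[len(prefix):] for rid in repo_ids if rid.startswith(prefix))


def list_org_datasets(org: str = "biglam", limit: int = 20) -> List[str]:
    """Ask the Hub for public datasets owned by `org` (requires network access)."""
    # Imported lazily so that short_names() works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    # list_datasets yields DatasetInfo objects; `.id` is the "owner/name" repo ID.
    return [info.id for info in api.list_datasets(author=org, limit=limit)]
```

Calling `short_names(list_org_datasets())` should give a sorted list of dataset names that you can then pass, with the `biglam/` prefix restored, to whichever loading tool you prefer. The `limit` argument only caps how many results are requested; adjust it as needed.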

---

## 🌍 Why It Matters

Cultural heritage data is too often underrepresented in machine learning. By making LAM data more visible and usable:

- We support the responsible and inclusive development of AI.
- We help cultural institutions explore new forms of access and interpretation.
- We ensure that machine learning models learn from the full range of human knowledge, not just what's convenient to crawl.
- We develop tools and approaches that are tailored to the specific formats, challenges, and goals of libraries, archives, and museums, supporting long-term reuse and alignment with professional practices.