
davanstrien HF Staff

Update README.md

2435f5a verified 11 days ago


	---
	title: README
	emoji: 📚
	colorFrom: pink
	colorTo: gray
	sdk: static
	pinned: false
	---

	# 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums

	BigLAM is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for Libraries, Archives, and Museums (LAMs).

	We aim to:

	- 🗃️ Share machine-learning-ready datasets from LAMs via the [Hugging Face Hub](https://huggingface.co/biglam)
	- 🤖 Train and release open-source models for LAM-relevant tasks
	- 🛠️ Develop tools and approaches tailored to LAM use cases

	---

	<details>
	<summary><strong>✨ Background</strong></summary>

	BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, a large-scale, open NLP collaboration.

	Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data.
	</details>


	<details>
	<summary><strong>📂 What You'll Find</strong></summary>

	The [BigLAM organization](https://huggingface.co/biglam) hosts:

	- Datasets: image, text, and tabular data from and about libraries, archives, and museums
	- Models: fine-tuned for tasks like:
	  - Art/historical image classification
	  - Document layout analysis and OCR
	  - Metadata quality assessment
	  - Named entity recognition in heritage texts
	- Spaces: tools for interactive exploration and demonstration
	</details>

	<details>
	<summary><strong>🧩 Get Involved</strong></summary>

	We welcome contributions! You can:

	- Use our [datasets and models](https://huggingface.co/biglam)
	- Join the discussion on [GitHub](https://github.com/bigscience-workshop/lam/discussions)
	- Contribute your own tools or data
	- Share your work using BigLAM resources
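
As a starting point for using the resources above, here is a minimal sketch of browsing the organization's datasets programmatically. It assumes the `huggingface_hub` Python package is installed and that network access is available; the helper names `dataset_url` and `list_org_datasets` are hypothetical and chosen only for illustration.

```python
ORG = "biglam"  # organization name, taken from the links in this README


def dataset_url(dataset_id: str) -> str:
    """Build the Hub page URL for a dataset id such as 'biglam/<name>'."""
    return f"https://huggingface.co/datasets/{dataset_id}"


def list_org_datasets(org: str = ORG, limit: int = 5) -> list[str]:
    """Return up to `limit` dataset ids published under `org` (needs network)."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return [info.id for info in HfApi().list_datasets(author=org, limit=limit)]
```

Calling `list_org_datasets()` and passing each id to `dataset_url` gives links you can open in a browser. Any id it returns can then typically be loaded with `load_dataset` from the `datasets` library, although individual datasets may need extra arguments described on their dataset cards.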
	</details>

	## 🌍 Why It Matters

	Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:

	- Supporting inclusive and responsible AI
	- Helping institutions experiment with ML for access, discovery, and preservation
	- Ensuring that ML systems reflect diverse human knowledge and expression
	- Developing tools and methods that work well with the unique formats, values, and needs of LAMs