Michael Anthony
PRO
MikeDoes
104 followers · 48 following
http://www.ai4privacy.com
MikeDoesDo
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
Reacted to their post with ❤️ 3 days ago
How do you prove a new AI privacy tool actually works? You test it against a world-class benchmark.
That's why we're proud our data played a key role in the research for "Rescriber," a new browser extension for user-led anonymization. To objectively measure their tool's performance against other methods, the researchers needed a diverse and challenging evaluation set. They built their benchmark using 240 samples from the Ai4Privacy open dataset.
This is a win-win for the ecosystem: our open-source data helps researchers validate their innovative solutions, and in turn, their work pushes the entire field of privacy-preserving AI forward. The "Rescriber" tool is a fantastic step towards on-device, user-controlled privacy.
🔗 Learn more about their data-driven findings in the full paper: https://arxiv.org/pdf/2410.11876
🚀 Stay updated on the latest in privacy-preserving AI. Follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#DataPrivacy #AI #OpenSource #Anonymization #MachineLearning #HealthcareAI #Ai4Privacy
Reacted to their post with 👀 3 days ago
State-of-the-art AI doesn't start with a model. It starts with the data.
Achieving near-perfect accuracy for PII & PHI anonymization is one of the toughest challenges in NLP. A model is only as good as the data it learns from, and providing this foundational layer is central to our mission. The ai4privacy/pii-masking-400k dataset was built for this exact purpose: to serve as a robust, large-scale, open-source training ground for building high-precision privacy tools.
To see the direct impact of this data-first approach, look at the ner_deid_aipii model for Healthcare NLP by John Snow Labs. By training on our 400,000 labeled examples, the model achieved incredible performance:
100% F1-score on EMAIL detection.
99% F1-score on PHONE detection.
97% F1-score on NAME detection.
This is the result of combining a cutting-edge architecture with a comprehensive, high-quality dataset. We provide the open-source foundation so developers can build better, safer solutions.
Explore the dataset that helps power these next-generation privacy tools: https://huggingface.co/datasets/ai4privacy/pii-masking-400k
🚀 Stay updated on the latest in privacy-preserving AI. Follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#DataPrivacy #AI #OpenSource #Anonymization #MachineLearning #HealthcareAI #Ai4Privacy
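For readers unfamiliar with the metric, a per-label F1-score like those quoted above is the harmonic mean of precision and recall for one entity type. The following is a minimal Python sketch, assuming entities are scored as exact (start, end, label) span matches. The function name and the sample spans are hypothetical, and this is not the evaluation procedure behind the reported numbers.

```python
from collections import Counter


def per_label_f1(gold, predicted):
    """Return {label: F1} for exact-match entity spans.

    gold, predicted: sets of (start, end, label) tuples (illustrative format).
    """
    # A span counts as a true positive only if start, end, and label all match.
    tp = Counter(label for (_, _, label) in gold & predicted)
    fp = Counter(label for (_, _, label) in predicted - gold)
    fn = Counter(label for (_, _, label) in gold - predicted)

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Made-up spans for illustration only (not taken from any dataset).
gold = {(0, 5, "NAME"), (10, 28, "EMAIL"), (40, 52, "PHONE"), (60, 66, "NAME")}
predicted = {(0, 5, "NAME"), (10, 28, "EMAIL"), (40, 52, "PHONE"), (70, 75, "NAME")}

scores = per_label_f1(gold, predicted)
for label in sorted(scores):
    print(f"{label}: {scores[label]:.2f}")
```

In this toy input one NAME span is missed and one spurious NAME span is predicted, so NAME scores lower than EMAIL and PHONE, which are matched exactly.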
MikeDoes's Spaces (2)
Terminal Visualiser 💻 (Running): Create and download styled terminal screenshots
TKG Visualiser 🌍 (Running): Visualize workflows from TSV data