Learning Machine

company

https://learning-machine.ai/

Learning-Machine-Inc

Recent Activity

ExplorerFreda updated a collection 6 days ago

ExplorerFreda updated a collection 18 days ago

updated a collection 6 days ago

ICL Evaluation

Data for ICL Evaluation • 7 items • Updated 6 days ago

updated a collection 18 days ago

ICL Evaluation

Data for ICL Evaluation • 7 items • Updated 6 days ago

updated a collection 27 days ago

ICL Evaluation

Data for ICL Evaluation • 7 items • Updated 6 days ago

updated a collection about 1 month ago

ICL Evaluation

Data for ICL Evaluation • 7 items • Updated 6 days ago

authored 2 papers about 1 year ago

FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

Paper • 2505.22758 • Published May 28, 2025 • 1

PaTH Attention: Position Encoding via Accumulating Householder Transformations

Paper • 2505.16381 • Published May 22, 2025

authored 3 papers over 1 year ago

Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence

Paper • 2502.09927 • Published Feb 14, 2025 • 1

Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping

Paper • 2501.06589 • Published Jan 11, 2025

Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models

Paper • 2409.04787 • Published Sep 7, 2024 • 1

authored a paper almost 2 years ago

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published Sep 11, 2024 • 20

authored a paper almost 2 years ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

authored 4 papers almost 2 years ago

The infrastructure powering IBM's Gen AI model development

Paper • 2407.05467 • Published Jul 7, 2024 • 3

Scaling Granite Code Models to 128K Context

Paper • 2407.13739 • Published Jul 18, 2024 • 21

FlexAttention for Efficient High-Resolution Vision-Language Models

Paper • 2407.20228 • Published Jul 29, 2024 • 1

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

authored 2 papers almost 2 years ago

Enhancing Training Efficiency Using Packing with Flash Attention

Paper • 2407.09105 • Published Jul 12, 2024 • 17

Scaling Granite Code Models to 128K Context

Paper • 2407.13739 • Published Jul 18, 2024 • 21