Improving Hugging Face Training Efficiency Through Packing with Flash Attention
Mamba is now available in transformers. Thanks to @tridao and @albertgu for this brilliant model 🚀 and the amazing mamba-ssm kernels powering it!
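
A minimal sketch of loading the new Mamba integration through the standard transformers API, assuming transformers >= 4.39 and the `state-spaces/mamba-130m-hf` checkpoint (the checkpoint name and prompt are illustrative, not taken from the announcement):

```python
# Sketch: load a Mamba checkpoint via the transformers Mamba integration
# and run a short generation. Requires transformers >= 4.39; installing
# mamba-ssm and causal-conv1d enables the optimized CUDA kernels,
# otherwise a slower pure-PyTorch path is used.
from transformers import AutoTokenizer, MambaForCausalLM

checkpoint = "state-spaces/mamba-130m-hf"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MambaForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Mamba is a selective state-space model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```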