Ali El Filali

alielfilali01

AI & ML interests

NLP (mainly for Arabic), Reinforcement Learning and Cognitive science

Articles

Introducing the Open Arabic LLM Leaderboard

Organizations

Posts 23

Post

275

Why is nobody talking about the new training corpus released by MBZUAI today?

TxT360 is a 15+ trillion-token corpus that outperforms FineWeb on several metrics. Ablation studies were run at up to 1T tokens.

Read the blog here: LLM360/TxT360
Dataset: LLM360/TxT360

Post

2222

Don't you think we should add an "Evaluation" tag for datasets that are meant to be benchmarks and not for training?

At the very least, someone collecting a group of datasets from an organization, or even from the whole Hub, could filter on that tag and reduce the risk of contaminating their "training" data.
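
To make the idea concrete, here is a minimal sketch of what that filtering could look like. It assumes the proposed tag exists and is spelled "evaluation"; the `DatasetInfo` class, the `drop_evaluation_datasets` helper, and the `example-org/...` repo names are all hypothetical stand-ins for illustration, not part of any Hub API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetInfo:
    """Hypothetical, simplified stand-in for a dataset's hub metadata."""
    repo_id: str
    tags: List[str] = field(default_factory=list)


def drop_evaluation_datasets(datasets, eval_tag="evaluation"):
    """Keep only datasets that do NOT carry the (proposed) evaluation tag.

    The comparison is case-insensitive, so "Evaluation" and "evaluation"
    are treated as the same tag.
    """
    wanted = eval_tag.lower()
    return [d for d in datasets if wanted not in [t.lower() for t in d.tags]]


# Made-up candidates, as if collected from an organization or the whole hub.
candidates = [
    DatasetInfo("example-org/web-corpus", ["text", "pretraining"]),
    DatasetInfo("example-org/qa-benchmark", ["Evaluation", "question-answering"]),
    DatasetInfo("example-org/untagged"),
]

training_pool = drop_evaluation_datasets(candidates)
print([d.repo_id for d in training_pool])
```

Note that an untagged dataset passes the filter, so a tag like this would only help to the extent that benchmark authors actually apply it.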

Collections 4

Papers 1

arxiv:2404.00565

Spaces 15

Jupyter Lab

LLM Training Cost Calculator

aya-101

jais-13b-chat

SambaLingo-Arabic-Chat

AceGPT-7B-chat

Models 41

alielfilali01/PG7BB

Text Generation • Updated Jun 24 • 2.38k

alielfilali01/Q2AW1M-1001

Text Generation • Updated Jun 21 • 2.39k

alielfilali01/Q2AW1M-1111

Text Generation • Updated Jun 21 • 2.38k

alielfilali01/Q2AW1M-0000

Text Generation • Updated Jun 21 • 2.38k

alielfilali01/Q2AW1M-1000

Text Generation • Updated Jun 21 • 2.38k

alielfilali01/Q2AW1M-0100

Text Generation • Updated Jun 21 • 2.37k

alielfilali01/Q2AW1M-1100

Text Generation • Updated Jun 21 • 2.35k

alielfilali01/Q2AW1M-0010

Text Generation • Updated Jun 21 • 2.37k

alielfilali01/Q2AW1M-1010

Text Generation • Updated Jun 21 • 2.33k

alielfilali01/Q2AW1M-0001

Text Generation • Updated Jun 21 • 2.37k

Datasets 26

alielfilali01/Bactrian-X-ar-SFT

Viewer • Updated Jun 24 • 67k • 2

alielfilali01/wikipedia-20231101.ar-100k

Viewer • Updated May 20 • 100k • 2

alielfilali01/MA-Culture-Vision-v0.2

Viewer • Updated May 18 • 93 • 2

alielfilali01/MA-Culture-Vision-v0.1

Viewer • Updated May 18 • 120 • 2 • 2

alielfilali01/ary-wikipedia-20231101-MT-PC

Viewer • Updated May 5 • 8k • 2

alielfilali01/AOT

Viewer • Updated Mar 19 • 6.58M • 3

alielfilali01/buisness_corpora

Viewer • Updated Mar 17 • 1 • 2

alielfilali01/TARJAMAT-UNPC-EN-ZH

Viewer • Updated Mar 16 • 37.4M • 3

alielfilali01/Arabic-AYA

Viewer • Updated Mar 14 • 26.1M • 4

alielfilali01/linuxscout__aghlat

Viewer • Updated Feb 22 • 85.2k • 2