Hugging Face
Datasets
229
Sort: Trending
Active filters: mr
allenai/c4 • Viewer • Updated Jan 9 • 10.4B • 314k • 287
wikimedia/wikipedia • Viewer • Updated Jan 9 • 61.6M • 51.1k • 546
unimelb-nlp/wikiann • Viewer • Updated Feb 22 • 2M • 807 • 96
uonlp/CulturaX • Viewer • Updated Jul 23 • 7.18B • 12k • 469
mozilla-foundation/common_voice_17_0 • Viewer • Updated Jun 16 • 13M • 52.1k • 150
haoranxu/X-ALMA-Preference • Viewer • Updated 3 days ago • 772k • 4 • 2
statmt/cc100 • Updated Mar 5 • 763 • 71
ai4bharat/indic_glue • Viewer • Updated Jan 4 • 887k • 1.11k • 10
Helsinki-NLP/opus-100 • Viewer • Updated Feb 28 • 55.1M • 3.14k • 140
oscar-corpus/oscar • Updated Mar 21 • 399 • 173
Helsinki-NLP/tatoeba • Updated Jan 18 • 35 • 37
legacy-datasets/wikipedia • Updated Mar 11 • 976 • 548
google/xtreme • Viewer • Updated Feb 22 • 2.77M • 2.72k • 88
ai4bharat/samanantar • Updated Dec 7, 2022 • 207 • 21
csebuetnlp/xlsum • Viewer • Updated Apr 18, 2023 • 1.35M • 7.37k • 106
oscar-corpus/OSCAR-2201 • Updated May 30, 2023 • 173 • 113
wikimedia/wit_base • Viewer • Updated Nov 4, 2022 • 108k • 433 • 51
sil-ai/bloom-speech • Updated Feb 15, 2023 • 86 • 21
CohereForAI/xP3x • Updated Apr 10 • 1.08k • 67
anujsahani01/English-Marathi • Viewer • Updated Jun 29, 2023 • 3.52M • 6 • 5
facebook/belebele • Viewer • Updated Aug 12 • 110k • 67.3k • 94
saillab/taco-datasets • Viewer • Updated Dec 1, 2023 • 3.2M • 1.97k • 15
ayymen/Pontoon-Translations • Viewer • Updated Jan 19 • 3.56M • 507 • 9
mozilla-foundation/common_voice_16_0 • Viewer • Updated Dec 21, 2023 • 8.2M • 2.38k • 65
alexandrainst/m_arc • Viewer • Updated Jan 15 • 87.4k • 126k • 4
alexandrainst/m_mmlu • Viewer • Updated Mar 11 • 488k • 326k • 13
TrainingDataPro/llm-dataset • Viewer • Updated Apr 25 • 1.17k • 4 • 4
google/IndicGenBench_xquad_in • Updated May 4 • 239 • 3
google/IndicGenBench_flores_in • Updated May 4 • 1.1k • 5
google/IndicGenBench_crosssum_in • Updated May 4 • 218 • 4