Clem 🤗 PRO
clem's activity
The paper is here: https://ai.meta.com/static-resource/movie-gen-research-paper
The reasons are not just altruistic: sharing your science and your models also pushes you to build AI faster (which is key in a fast-moving domain like AI), attracts the best scientists & engineers, and generates much more visibility, usage and community contributions than if you were 100% closed-source. The same applies to big tech companies, as we're seeing with Meta and Google!
More startups and companies should release research & open-source AI. It's not just good for the world but also increases their probability of success!
congrats!
Well done
@martinigoyanes @rafa-hernandez @Vidusharma @frisokingma @hannahwright @jeanmarcs @antonioramos & the whole https://huggingface.co/adyen team. Could be useful to cross-post here: https://huggingface.co/blog/community
I guess https://huggingface.co/docs/huggingface_hub/v0.5.1/en/package_reference/hf_api? I think @Wauplin is the expert on the topic
great video!
It depends what you want to do but you can embed gradio/spaces (https://huggingface.co/docs/hub/en/spaces-sdks-gradio#embed-gradio-spaces-on-other-webpages), enable sign in with hf (https://huggingface.co/docs/hub/en/oauth) or just redirect to your org page (or any HF page)
http://hf.co/datasets
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2211.05100)
bigscience/bloom
bigscience/bloomz
Especially noteworthy at a time when most AI startups wouldn’t survive a year or two without VC money. Yay!
nice!
very cool!
very cool!
Beautiful team work!
very cool! feel free to create the HF for Legal org and share about it and we can amplify!
Beautiful work!
omg would be sick!
Despite getting much less $$, recognition & visibility than entrepreneurs, the scientists who publish their groundbreaking research openly are the cornerstone of technological progress & massively contribute to making the world a better place!
very cool! cc @clefourrier
any info on when it's going to be released though?
do you have a link?
congrats!
very cool, thanks for sharing!
Interesting update! They can open-source GPT4 now haha
congratulations! well deserved!
gotta catch them all!
you should create an org on HF for it
Spotted by @jack-kumar
Everyone should fine-tune their own models for their use-cases, languages, industry, infra constraints,...
10,000 llama3 variants by the end of next week?
Thank you! You should tweet it mentioning @elonmuskceo !
This is the explanation that @WizardLM communicated a few hours ago: https://huggingface.co/posts/WizardLM/329547800484476#661e0d17bca1a6038b60503e
We apologize for the inconvenience & are trying to get in touch with the author & Microsoft in order to try to find a good resolution for community members. Let us know if you have any questions!
Anthropic/persuasion
It stands as the largest and most diverse synthetic Text-to-SQL dataset available to date.
The dataset includes:
- 105,851 records partitioned into 100,000 train and 5,851 test records
- ~23M total tokens, including ~12M SQL tokens
- Coverage across 100 distinct domains/verticals
- Comprehensive array of SQL tasks: data definition, retrieval, manipulation, analytics & reporting
- Wide range of SQL complexity levels, including subqueries, single joins, multiple joins, aggregations, window functions, set operations
- Database context, including table and view create statements
- Natural language explanations of what the SQL query is doing
- Contextual tags to optimize model training
Blogpost: https://gretel.ai/blog/synthetic-text-to-sql-dataset
Dataset: gretelai/synthetic_text_to_sql
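For anyone who wants to check the split sizes listed above, here is a minimal Python sketch. It assumes the `datasets` library is installed and that the Hub repo exposes splits named `train` and `test`, as the post describes. The helper names (`check_split_sizes`, `load_row_counts`) are made up for illustration and are not part of the dataset or of any library.

```python
# Split sizes as stated in the post (not fetched from the Hub).
EXPECTED_ROWS = {"train": 100_000, "test": 5_851}


def check_split_sizes(actual_rows, expected_rows=EXPECTED_ROWS):
    """Return {split: (actual, expected)} for every split whose size differs."""
    mismatches = {}
    for split, expected in expected_rows.items():
        actual = actual_rows.get(split)  # None if the split is missing
        if actual != expected:
            mismatches[split] = (actual, expected)
    return mismatches


def load_row_counts(repo_id="gretelai/synthetic_text_to_sql"):
    """Download the dataset and count rows per split (needs network access)."""
    # Imported lazily so the checker above works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(repo_id)
    return {split: len(rows) for split, rows in dataset.items()}
```

With network access, `check_split_sizes(load_row_counts())` should return an empty dict if the published splits still match the numbers in the post.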
Thanks for sharing!
Welcome @josefprusa !
Unpopular opinion: this is the most impactful release of the day (because open)!
would be cool to have some integration with the HF hub
This is awesome!
very cool!
🇫🇷🇫🇷🇫🇷
Very cool!
very useful! This is the link to the leaderboard btw: https://huggingface.co/spaces/PatronusAI/enterprise_scenarios_leaderboard
very cool!
@vikhyatk (especially the last answer 😝😝😝).
Open multi-modal models have come a long way!
Model: vikhyatk/moondream1
Let's go!
Excited to see this dataset release in French by @Pclanglais @carbonbasedLLM @anastasiastasenko :
PleIAs/French-PD-Newspapers
"To give you an idea of the size, the full French Wikipedia is about 2 billion words. This is 40 times larger."
Very cool!
https://huggingface.co/blog/gcp-partnership
https://finance.yahoo.com/video/google-hugging-face-alliance-spur-173016882.html
https://www.theverge.com/2024/1/25/24050445/google-cloud-hugging-face-ai-developer-access
https://www.bloomberg.com/news/articles/2024-01-25/google-to-team-up-with-startup-hugging-face-to-host-ai-software
https://www.reuters.com/technology/google-cloud-partners-with-hugging-face-attract-ai-developers-2024-01-25/