All HF Hub posts
Post
489
NSFW Erotic Novel AI Generation
-NSFW Text (Data) Generator for Detecting 'NSFW' Text: Multilingual Experience
The multilingual NSFW text (data) auto-generator is a tool designed to automatically generate and analyze adult content in various languages. This service uses AI-based text generation to produce various types of NSFW content, which can then be used as training data to build effective filtering models. It supports multiple languages, including English, and allows users to input the desired language through the system prompt in the on-screen options to generate content in the specified language. Users can create datasets from the generated data, train machine learning models, and improve the accuracy of text analysis systems. Furthermore, content generation can be customized according to user specifications, allowing for the creation of tailored data. This maximizes the performance of NSFW text detection models.
URL: https://fantaxy-erotica.hf.space
Usage Warnings and Notices: This tool is intended for research and development purposes only, and the generated NSFW content must adhere to appropriate legal and ethical guidelines. Proper monitoring is required to prevent the misuse of inappropriate content, and legal responsibility lies with the user. Users must comply with local laws and regulations when using the data, and the service provider is not liable for any issues arising from the misuse of the data.
Post
795
The Nobel Prize background for Hopfield and Hinton's work on neural networks is pure gold. It's a masterclass in explaining AI basics.
Key takeaways from the conclusion:
- ML applications are expanding rapidly. We're still figuring out which will stick.
- Ethical discussions are crucial as the tech develops.
- Physics and AI: A two-way street of innovation.
Some mind-blowing AI applications in physics:
- Discovering the Higgs particle
- Cleaning up gravitational wave data
- Hunting exoplanets
- Predicting molecular structures
- Designing better solar cells
We're just scratching the surface. The interplay between AI and physics is reshaping both fields.
Bonus: The illustrations accompanying the background document are really neat. (Credit: Johan Jarnestad/The Royal Swedish Academy of Sciences)
#AI #MachineLearning #Physics #Ethics #Innovation
MonsterMMORPG
posted an update
2 days ago
Post
3400
Huge news for Kohya GUI: you can now fully fine-tune / DreamBooth FLUX Dev on GPUs with as little as 6 GB of VRAM, with no quality loss compared to 48 GB GPUs. Moreover, fine-tuning yields better results than any LoRA training could.
Config Files
I published all configs here : https://www.patreon.com/posts/112099700
Tutorials
Fine tuning tutorial in production
Windows FLUX LoRA training (fine tuning is same just config changes) : https://youtu.be/nySGu12Y05k
Cloud FLUX LoRA training (RunPod and Massed Compute ultra cheap) : https://youtu.be/-uhL2nW7Ddw
LoRA Extraction
Checkpoints are 23.8 GB each, but you can extract a LoRA from them with almost no quality loss. I researched this and published a public article / guide as well.
The guide for extracting a LoRA from a fine-tuned checkpoint is here : https://www.patreon.com/posts/112335162
Info
This is just mind-blowing. The recent improvements Kohya made to block swapping are amazing.
Speeds are also amazing, as you can see in image 2. Of course, those values are based on my researched config and were tested on an RTX A6000, which is almost the same speed as an RTX 3090.
Also, all training experiments were run at 1024x1024 px. A lower resolution will use less VRAM and train faster.
VRAM usage will change according to your own configuration, and speed likely will as well.
Moreover, Fine Tuning / DreamBooth yields better results than any LoRA could
Installers
1-Kohya GUI accurate branch and Windows Torch 2.5 Installers and test prompts shared here : https://www.patreon.com/posts/110879657
The link of Kohya GUI with accurate branch : https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1
Post
991
L-Mul: Addition-Only Multiplication can slash computational costs by 80%!
Researchers at BitEnergy AI dropped a groundbreaking technique that could slash the energy used by transformer computations: their novel "linear-complexity multiplication" (L-Mul) algorithm approximates floating-point multiplication using energy-efficient integer addition instead of costly multiplications.
Quick reminder on how floats are encoded in 8 bits (FP8):
In the e4m3 FP8 standard, you encode a number as:
Sign (1 bit) | Exponent (4 bits) | Mantissa (3 bits)
Example: 0 (positive) | 1000 (8) | 101 (1/2 + 1/8 = 0.625)
Calculation: you add one to the mantissa and multiply the result by 2 raised to the power (exponent - bias), where the bias is 7 for e4m3:
You get (1 + 0.625) × 2^(8-7) = 3.25
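The decoding rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `decode_e4m3` is made up for this example, and the handling of subnormals and of the all-ones NaN pattern assumes the commonly used E4M3 variant, which the post does not spell out.

```python
import math


def decode_e4m3(bits: int) -> float:
    """Decode one FP8 e4m3 bit pattern (0..255) into a Python float."""
    if not 0 <= bits <= 0xFF:
        raise ValueError("expected an 8-bit value")
    sign = (bits >> 7) & 0x1       # 1 sign bit
    exponent = (bits >> 3) & 0xF   # 4 exponent bits
    mantissa = bits & 0x7          # 3 mantissa bits
    bias = 7

    if exponent == 0xF and mantissa == 0x7:
        # Assumption: the all-ones pattern encodes NaN in this variant.
        return math.nan
    if exponent == 0:
        # Assumption: subnormals have no implicit leading 1.
        value = (mantissa / 8) * 2.0 ** (1 - bias)
    else:
        # Normal numbers: (1 + mantissa fraction) * 2^(exponent - bias)
        value = (1 + mantissa / 8) * 2.0 ** (exponent - bias)
    return -value if sign else value


# The worked example from the post: 0 | 1000 | 101
print(decode_e4m3(0b0_1000_101))  # 3.25
```

The example bit pattern reproduces the 3.25 computed by hand above.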
Now back to the paper. Key insights:
- Multiplication is extremely energy-intensive compared to addition. For 32-bit operations, a floating-point multiplication (3.7 pJ) uses 37x more energy than an integer addition (0.1 pJ)!
- Traditional floating-point multiplication goes like this (writing xm for the mantissa and xe for the exponent): Mul(x,y) = (1 + xm) · 2^xe · (1 + ym) · 2^ye = (1 + xm + ym + xm · ym) · 2^(xe+ye)
- L-Mul cleverly approximates this as: L-Mul(x,y) = (1 + xm + ym + 2^-l(m)) · 2^(xe+ye), eliminating the costly xm · ym term
- The l(m) term is set adaptively based on the mantissa size for optimal accuracy
- Benchmarks on the Llama-3.1-8B-Instruct model show L-Mul preserves precision across various NLP tasks, with performance nearly identical to full BFloat16 precision
- Authors claim: "We can achieve the same model inference performance while reducing the energy cost of attention computations by 80%."
This breakthrough is still theoretical and would need implementation on dedicated hardware to confirm real-world gains, but it's a really exciting path for more sustainable AI!
Read the paper here: Addition is All You Need for Energy-efficient Language Models (2410.00907)
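To get a feel for the size of the approximation error, here is a rough numerical sketch of the L-Mul formula. It is not the paper's implementation: the real algorithm works with integer additions on the bit patterns, whereas this sketch uses ordinary Python floats just to evaluate the formula. The helper names (`l_of_m`, `split`, `l_mul`) are invented for this example, mantissas are truncated rather than rounded, and the piecewise definition of l(m) is written from my recollection of the paper, so treat it as an assumption.

```python
import math


def l_of_m(mantissa_bits: int) -> int:
    # Assumption: piecewise offset l(m) as recalled from the paper.
    if mantissa_bits <= 3:
        return mantissa_bits
    if mantissa_bits == 4:
        return 3
    return 4


def split(x: float, mantissa_bits: int):
    """Return (sign, exponent, fraction) with |x| ~= (1 + fraction) * 2**exponent."""
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e with 0.5 <= m < 1
    fraction = 2 * m - 1           # rewrite as (1 + fraction) * 2**(e - 1)
    scale = 1 << mantissa_bits
    fraction = math.floor(fraction * scale) / scale  # keep only mantissa_bits bits
    return (-1 if x < 0 else 1), e - 1, fraction


def l_mul(x: float, y: float, mantissa_bits: int = 3) -> float:
    """Approximate x * y by replacing the xm * ym term with 2**-l(m)."""
    if x == 0 or y == 0:
        return 0.0
    sx, xe, xm = split(x, mantissa_bits)
    sy, ye, ym = split(y, mantissa_bits)
    offset = 2.0 ** -l_of_m(mantissa_bits)
    return sx * sy * (1 + xm + ym + offset) * 2.0 ** (xe + ye)


print(l_mul(3.25, 1.5))  # 4.5, versus the exact product 4.875
```

With 3 mantissa bits the dropped term xm · ym = 0.625 · 0.5 = 0.3125 is replaced by 2^-3 = 0.125, which accounts for the gap between 4.5 and 4.875 in this example.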
Post
1495
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV
Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation and more; all models have open weights and demos
All models have their demos and even TorchScript checkpoints!
A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art model for consistent 3D generation from images
Model: facebook/vfusion3d
Demo: facebook/VFusion3D
- CoTracker is the state-of-the-art point (pixel) tracking model
Demo: facebook/cotracker
Model: facebook/cotracker
alielfilali01
posted an update
2 days ago
Post
2202
Don't you think we should add an "Evaluation" tag for datasets that are meant to be benchmarks and not for training?
At least then, when someone is collecting a group of datasets from an organization, or let's say from the whole Hub, they can filter on that tag and avoid contaminating their "training" data.
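As a rough sketch of what such a filter could look like on the collecting side: the `DatasetCard` class, the `training_candidates` function, and the "evaluation" tag below are all hypothetical, since the tag is only being proposed here and this does not use any real Hub API.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetCard:
    """Hypothetical stand-in for a dataset's metadata (not a real Hub class)."""
    repo_id: str
    tags: set = field(default_factory=set)


def training_candidates(cards, eval_tag="evaluation"):
    """Keep only datasets that are NOT tagged as evaluation benchmarks."""
    return [
        card for card in cards
        if eval_tag not in {tag.lower() for tag in card.tags}
    ]


cards = [
    DatasetCard("some-org/web-corpus", {"text"}),
    DatasetCard("some-org/reasoning-benchmark", {"text", "Evaluation"}),
]
print([card.repo_id for card in training_candidates(cards)])
# ['some-org/web-corpus']
```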
TuringsSolutions
posted an update
1 day ago
Post
1022
Neural Network Chaos Monkey: randomly shuts off parts of the neural network during training. The Chaos Monkey is super present at Epoch 1 and is gone by the final Epoch. My hypothesis was that this would either increase the robustness of the model or make the outputs totally worse. You can 100% reproduce my results: chaos wins again.
https://youtu.be/bWA9unotJ7k
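The idea reads a lot like dropout with a drop rate that decays over training. Here is a minimal sketch under that assumption. The linear schedule, the 0.5 starting rate, and the function names `chaos_rate` and `chaos_monkey` are my own choices for illustration, not taken from the video, and epochs are counted from 0.

```python
import random


def chaos_rate(epoch: int, total_epochs: int, start_rate: float = 0.5) -> float:
    """Drop probability: start_rate at the first epoch, 0 at the last one."""
    if total_epochs <= 1:
        return 0.0
    return start_rate * (1 - epoch / (total_epochs - 1))


def chaos_monkey(activations, rate: float, rng: random.Random):
    """Zero each activation with probability `rate`, rescaling the survivors."""
    if rate <= 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Dividing by `keep` preserves the expected activation (inverted dropout).
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)
layer_output = [0.2, 1.3, -0.7, 0.9]
for epoch in range(5):
    rate = chaos_rate(epoch, total_epochs=5)
    print(epoch, rate, chaos_monkey(layer_output, rate, rng))
```

In the last epoch the rate is 0, so the layer output passes through unchanged, matching the "gone by the final Epoch" behaviour described above.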
Post
1622
Evaluating Long Context #1: Long Range Arena (LRA)
Accurately evaluating how well language models handle long contexts is crucial, but it's also quite challenging to do well. In this series of posts, we're going to examine the various benchmarks that were proposed to assess long context understanding, starting with Long Range Arena (LRA).
Introduced in 2020, Long Range Arena (LRA) is one of the earliest benchmarks designed to tackle the challenge of long context evaluation.
Key Features of LRA
1. Diverse Tasks: The LRA benchmark consists of a suite of tasks designed to evaluate model performance on long sequences ranging from 1,000 to 16,000 tokens. These tasks encompass different data types and modalities: Text, Natural and Synthetic Images, and Mathematical Expressions.
2. Synthetic and Real-world Tasks: LRA comprises both synthetic probing tasks and real-world tasks.
3. Open-Source and Extensible: Implemented in Python using Jax and Flax, the LRA benchmark code is publicly available, making it easy to extend.
Tasks
1. Long ListOps
2. Byte-level Text Classification and Document Retrieval
3. Image Classification
4. Pathfinder and Pathfinder-X (Long-range spatial dependency)
Long Range Arena (LRA) GitHub Repository: https://github.com/google-research/long-range-arena
Long Range Arena (LRA) paper: Long Range Arena: A Benchmark for Efficient Transformers (2011.04006)
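To make the "Mathematical Expressions" task more concrete, here is a small sketch of what a ListOps-style input asks a model to compute. It is only an illustration, not the benchmark's code: the function name `eval_listops` is made up, the operator names (MAX, MIN, MED, SM for sum modulo 10) follow my understanding of the ListOps format, and taking the lower median for even-length lists is an assumption.

```python
import statistics

# Operators as I understand the ListOps format; SM is "sum modulo 10".
OPS = {
    "MAX": max,
    "MIN": min,
    "MED": statistics.median_low,  # assumption: lower median on ties
    "SM": lambda xs: sum(xs) % 10,
}


def eval_listops(expr: str) -> int:
    """Evaluate a nested expression such as '[MAX 4 3 [MIN 2 3 ] 1 0 ]'."""
    tokens = expr.replace("[", " [ ").replace("]", " ] ").split()
    stack = []  # one frame per open bracket: [operator, operand, operand, ...]
    it = iter(tokens)
    for tok in it:
        if tok == "[":
            stack.append([next(it)])   # the operator name follows the bracket
        elif tok == "]":
            op, *args = stack.pop()
            value = OPS[op](args)
            if not stack:
                return value           # closed the outermost bracket
            stack[-1].append(value)    # feed the result to the enclosing operator
        else:
            stack[-1].append(int(tok))
    raise ValueError("unbalanced expression")


print(eval_listops("[MAX 4 3 [MIN 2 3 ] 1 0 [MED 1 5 8 9 2 ] ]"))  # 5
```

The long-context difficulty comes from the nesting: to answer correctly, a model has to resolve inner brackets that may sit far apart in a sequence of thousands of tokens.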
Post
331
On-device AI framework ecosystem is blooming these days:
1. llama.cpp - All things Whisper, LLMs & VLMs - run across Metal, CUDA and other backends (AMD/ NPU etc)
https://github.com/ggerganov/llama.cpp
2. MLC - Deploy LLMs across platforms especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm
3. MLX - Arguably the fastest general purpose framework (Mac only) - Supports all major Image Generation (Flux, SDXL, etc), Transcription (Whisper), LLMs
https://github.com/ml-explore/mlx-examples
4. Candle - Cross-platform general purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle
Honorable mentions:
1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web
https://github.com/xenova/transformers.js
2. Mistral rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs
3. Ratchet - Cross-platform, Rust-based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet
4. Zml - Cross platform, Zig based ML framework
https://github.com/zml/zml
Looking forward to seeing how the ecosystem looks a year from now. Quite bullish on the top 4 atm, but the open source ecosystem changes quite a bit!
Also, which frameworks did I miss?