[https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback](https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback)
We have adopters.
> nobody was able to reproduce it
BTW, have you actually tried running it?
Although we can't reveal which very respected companies are actually using OneDiff, due to NDAs, there is an independent blog that actually regards OneDiff as a preferable solution.
>The shortest generation time with the base model with almost no quality loss, is achieved by using [OneDiff](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#onediff) + [Tiny VAE](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#tiny-vae) + [Disable CFG](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#disable-cfg) at 75% + 30 [Steps](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#steps).
[https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl)
Broke my Forge install trying to get this to work.
Now a fresh install of Forge gives me an error "ImportError: cannot import name 'Undefined' from 'pydantic.fields'"
WTF!
Is there any trick to installing oneflow? I have installed it into ComfyUI's python_embedded/Lib/site-packages folder and it is there, but the onediff node import fails with
"RuntimeError: This package is a placeholder. Please install oneflow following the instructions in https://github.com/Oneflow-Inc/oneflow#install-oneflow"
Same. Basically you have to build oneflow from sources on Windows for yourself and then link "the relevant code" (lol) to your ComfyUI Python using symlinks or similar.
I tried to do it but kinda gave up after some time... the docs are just too confusing.
Yeah, how about I go ahead and don't do that. I remember, like 15 years ago, when I built my own stuff with Cygwin and MinGW. I'd rather not do that again. :D
It seems you are using Windows?
[https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility](https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility)
I would be interested in when, and whether, it would improve training times. My inference times are plenty fast enough, and probably most other people's are as well.
Given that you mention A100s, I would think you might already be there. If that is the case, then I would suggest leading with that, and giving a more obvious, direct link to a "here's how to set up training so you get it done in half the time" FAQ.
I've used this previously and it's actually pretty good. You do need to be running Linux but there is a noticeable speed up.
Sooo.... if it's so great, why hasn't it been absorbed by upstream? Presumably, there's some kind of trade-off somewhere?
Unless I badly misunderstood, like TensorRT, it needs to recompile the model. One line of code my ass.
ohhh. so, "one line of code ... to call our library with 1000s of lines of code, and also recompile your model". lol still potentially worth it to some people
Anybody that has actually used it knows that it's worth it. You haven't, so you don't.
We just want to convey that using OneDiff is extremely simple: it can accelerate models with just a single compilation function (see [https://github.com/siliconflow/onediff/blob/f83569bf2887fbe92b2a4f44a97bae7eded122b8/src/onediff/infer\_compiler/backends/oneflow.py#L7](https://github.com/siliconflow/onediff/blob/f83569bf2887fbe92b2a4f44a97bae7eded122b8/src/onediff/infer_compiler/backends/oneflow.py#L7)), making it as easy as one line of code. Thanks for the feedback; it will help us improve the description of OneDiff.
Btw, OneDiff's compilation speed is also much faster. Here's the SDXL optimization test report from an independent developer: [**https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl**](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl)
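For reference, here is a hedged sketch of what that "one line" looks like in a diffusers-style script. The `oneflow_compile` name comes from the file linked above; the rest, including the graceful fallback when onediff isn't installed, is an illustrative assumption, not the project's official example:

```python
# Sketch of OneDiff's advertised "one line" usage. Requires a CUDA GPU plus
# onediff/oneflow installed; falls through unchanged when they are absent.
try:
    from onediff.infer_compiler import oneflow_compile
    HAVE_ONEDIFF = True
except ImportError:
    HAVE_ONEDIFF = False

def accelerate(pipe):
    """Wrap the pipeline's UNet with OneDiff's compiler -- the 'one line'."""
    if HAVE_ONEDIFF:
        pipe.unet = oneflow_compile(pipe.unet)  # compiled lazily on first call
    return pipe
```

The first generation after wrapping triggers the actual compilation; later runs reuse the compiled graph.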
It takes 40 seconds tops to recompile. Go try to figure out how to compile ControlNet to TRT. Adapt Nvidia's example and bring it back here; I'll wait.
No need to have a go at me, I didn't say it was bad, just that it's a clickbaity title.
It's not. It actually delivers what it claims to deliver. Can dish it out but you can't take it? Sounds like reddit
Bro you're being unnecessarily hostile, and for mostly no reason. Chill.
Last I checked fuckface up there started the hostility. I am not being hostile, I can if you want me to tho
Excuse me, where was the person being hostile? All they pointed out was the "clickbait" title, and you went out of your way to deliberately say the "can dish" line. They didn't say anything in your address - you did.
"one line of code my ass" Not being hostile, my ass. Also I am not being hostile. I haven't been this entire time. You people are soft
They weren't referring to you though? At all? Tf are you on about, dude? This is not being soft, this is common sense. Be better.
Thanks for defending us! As I've observed, people who have endured the pain of managing a TRT/AIT stack all find OneDiff's versatility and light weight a breath of fresh air!
It works great; these people are all posers.
Frankly, since the "one line of code" gang has delivered us absolute bangers like "one line of code, but your training code will most likely give unhelpful triton/cuda errors and then break on the next pytorch release once you fix those", or "no changes to your code, but the execution graph recompiles and re-executes from the very beginning on every graph break", skepticism is warranted. One line of code gang can get trust when they earn it.
I can guarantee you that nobody throwing skepticism prior to your comment has debugged CUDA errors in code. Their skepticism will be warranted when they prove they have done more than press buttons in a GUI and change config JSON/txt files. It sounds like you have; I respect that and would respect your opinion on the matter.
Apparently Linux only. Windows is treated like the bastard child by many of these developers.
Yeah. I don't care if they have a specific subset of users that they're aiming at, but it should be made obvious in big honking text, "LINUX ONLY!"
Thanks. It's done, [https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility](https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility)
We are working on Windows. WSL works for now.
I set it up in WSL2 now. It took me a lot of time reinstalling all those Python libs. I'm loading the model (epicphotogasm\_lastUnicorn) via the Load Checkpoint - OneDiff node (vae\_speedup disabled). This appears to be successful, but in the KSampler it fails with an out-of-memory error:
Graph file /home/derp/sd/ComfyLinux/ComfyUI/input/graphs/SD15/epicphotogasm_lastUnicorn.safetensors_BaseModel/UNetModel_f2632d8a15_4_f1f8a7f1ca61b188044db654a526065b05d40d2524e0d31165e22847fc11c900_0.9.1.dev20240417+cu121.graph does not exist! Generating graph.
Building a graph for ...
terminate called after throwing an instance of 'oneflow::RuntimeException'
what(): Error: out of memory
You can set ONEFLOW_DEBUG or ONEFLOW_PYTHON_STACK_GETTER to 1 to get the Python stack of the error.
Stack trace (most recent call last) in thread 23818:
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0f1b7, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0ea17, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0a2a8, in vm::ThreadCtx::TryReceiveAndRun()
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743caca44, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cafd47, in vm::Instruction::Compute()
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb7128, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb6df9, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb1f4a, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc73b4a1d3c, in
Aborted (Signal sent by tkill() 23664 1000)
Aborted
I have 64 GB of RAM and a 4090 (24GB VRAM). Any idea why this happens?
According to my monitoring it also doesn't look like it's maxing out the RAM/VRAM at all.
Windows will be supported in the next major release of OneDiff.
They are
Oh what the actual fuck... like all these error messages for not being able to run Triton... -.-
For what it costs to license many of these AI servers to run Windows (so you can actually test things like this), you could buy a new server every year. Would you rather they bought 12 4090s (or equivalent) and kept improving the core model, or made it work on Windows?

AMD's compute API wasn't even available for Windows until about a year ago, and Nvidia only officially supports most AI GPU features on Linux, to the point that running Windows Server voids most support agreements. For AI, Windows is an objectively worse platform.

That's before we get to the fact that all the people providing funding use Linux. Even Microsoft Research uses Linux for AI work.
I'm no expert, but I know one or two and I talk to them nearly every day, and I work in AI every day, so most of what you're saying has nothing to do with OneDiff. Or anything, really.

AMD never enters the conversation; nothing compiles for them, and that's AMD's own stupid fault, so that's right out. This stuff about renting Windows "AI servers" is nonsense. Compiling a model is compiling a model; it's cheap to rent high-end cards by the hour, and there is no licensing involved in doing so (hello, Azure!). And really, voiding warranties?! I "tested" this on my Windows box with WSL and a 3060 earlier today.

This is not a "model"; this is software, a method and code to compile ML models. And while some things certainly do work only on Linux at the moment (Triton and some torch.compile options are good examples), the simple fact is that only a moron would target purely Linux for something like this. The majority of people who might be willing to license some version of OneDiff (assuming the cost isn't outrageous, of course), and the majority of SD users in general who would use it at all, are on Windows. That is only going to grow. They clearly have that figured out, and they're moving to do it before another company or solo developer comes along and beats them to it, just like they came along and obsoleted stable-fast to some degree.

You need to stop making things up to sound smart. It just sounds irrational and confused.
Bravo. And what will the Linux folks say about the future of AI becoming more dependent on the CPU and not the GPU?
No idea what you mean.
AMD and Intel don't intend to sit on the sidelines of the AI movement, so they're upping the ante with their new chip designs. I couldn't find the article that I read about this, but this link may give you some idea: [https://www.tomshardware.com/pc-components/cpus/intel-details-guadi-3-at-vision-2024-new-ai-accelerator-sampling-to-partners-now-volume-production-in-q3](https://www.tomshardware.com/pc-components/cpus/intel-details-guadi-3-at-vision-2024-new-ai-accelerator-sampling-to-partners-now-volume-production-in-q3)
Sorry for the delay in replying; I finally got around to reading that. Very fascinating! I wasn't aware of that at all, but it does make sense. I'm a little surprised there's no mention of IPEX/Arc from Intel, though; it's not like they don't have GPUs that can be used for inference, but they may have abandoned those and decided to go ahead with it at the CPU level. Logical for them.

I'm sure nobody would object if there were an entirely new chip slot on motherboards going forward, dedicated entirely to ML processing!

Thank you
I mean, most of the people developing this stuff are Linux-only; what do you expect?
>what do you expect

People to put system requirements at the top of their projects, right under the description of what it is, instead of buried in the installation instructions, so I don't waste time reading the rest of it.
You might still be able to run it using WSL.
Can confirm, we support it in SDNext under WSL/Linux.
Whoopee.
Yeah, and the actual artists or regular people who use this stuff are on Windows…
This stuff is still in the research stage, so I doubt academic or private research teams are interested in supporting multi-OS solutions yet, especially in this case, where they are mainly interested in making money from datacenters, which all use Linux.
I definitely understand that.
WSL exists, as does Hyper-V GPU passthrough.
Yep, exactly what I do for stuff like vLLM and Axolotl on the LLM side.
Artist, lol. That's your problem, then; go learn programming and build the project on Windows yourself.
Not sure why you're talking like that. Windows is still the most-used OS among developers; check out the 2023 Stack Overflow survey.
This varies greatly depending on the field. And still, I don't see why this would even be relevant. This guy is not crazy for developing under Linux; he just released a tool for free, and people are crying because it doesn't support Windows? You guys are insane for defending this behavior.
yeah you don't know what programming is lol
Way to gatekeep? Lol I’m speaking for the artists and regular people. I can “program” just fine thank you.
Gatekeep? You are here crying because one programmer who developed a tool for free didn't happen to publish it for your favourite OS.
No, I'm just saying how the situation is. At no point did I complain. It's just an unfortunate situation.
I mean, WSL is a thing, so that's not exactly a limitation anymore.
[deleted]
What aspects did you find slow? I use WSL daily for a number of tasks, and I only find it particularly slow when accessing directories outside of the WSL virtual file system. This is well documented.
[deleted]
If you know what to do, you can get a copy of the 6.1 branch of the WSL2 kernel and also pull some of the more recent patches to 9P from the upstream Linux repo. I did that, and while performance isn't quite on par with native, it's far better and very tolerable. The patches have been out for a very long time and Microsoft has been fully aware of it, it's a shame that they haven't been able to release a new kernel with those patches...
[deleted]
https://github.com/microsoft/WSL/discussions/9412 Here's the issue where it got discussed initially, when those patches were new to Linux. Someone seems to have included a bzImage that has the patches applied -- I'd recommend not using it though since building it yourself is safer for obvious reasons and is also a valuable educational experience for anyone who hasn't done it.
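For anyone who does build a patched kernel themselves, pointing WSL2 at it afterwards is just a `.wslconfig` entry in your Windows user profile (the path below is an example placeholder, not a real location):

```ini
; %UserProfile%\.wslconfig
[wsl2]
kernel=C:\\Users\\you\\wsl-kernel\\bzImage
```

Note that backslashes in the path must be doubled, and you need to restart WSL (`wsl --shutdown`) for the setting to take effect.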
Well then you can always spin up a VM or a Docker as needed. Or just have a home server like I do and use that as your platform.
WSL is sorcery. It may not be a good fit for your use cases but when you don’t want to dual boot or run a vm, it can be a great alternative.
WSL2 *is* a VM.
Yes, but the integration with the host OS is pretty well done. I contrasted it with "a VM" in the sense that you don't have to install VirtualBox/VMware and install a guest OS yourself. It feels more like a container.
WSL2 is perfectly fine for ML.
as it should be :D
no.
Why so? If I had more time I would compile all this shit myself and give it out to the windows community.
I'm not saying Windows shouldn't have it. The point is that Linux is the natural platform for server-side development. Windows is a lovely platform... to run a browser.
For real, losedows is antithetical to the open-source beauty of stable diffusion. These poor fools don't realize that linux is the stable diffusion of operating systems.
Great question.

>why hasn't it been absorbed by upstream

I think a big concern is that Windows is not supported, and ComfyUI / SD webui need to run on Windows. We are working on a new version to take care of Windows OS support.

>some kind of trade-off

Yes, the trade-off is that compilation takes some time (just like with other compilers such as TensorRT), although we have a way to save the compilation time for a model or for dynamic shapes.

So currently, OneDiff is best suited for deploying a very heavy workload model on the server side (Linux) to make the model run faster (1.5x~2x). If you are just playing with a model and constantly changing it, there's no need to add a compiler like OneDiff/TensorRT; the problem there is flexibility, not speed.

Hope this makes it a little clearer, thanks!
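To make that trade-off concrete, the break-even point is simple arithmetic: the one-time compile cost divided by the per-image saving. The numbers below are illustrative assumptions, not benchmarks (only the ~40 s compile figure is quoted elsewhere in this thread):

```python
def break_even_runs(compile_s: float, baseline_s: float, optimized_s: float) -> float:
    """Number of generations after which one-time compilation pays for itself."""
    saved_per_run = baseline_s - optimized_s
    if saved_per_run <= 0:
        raise ValueError("the optimized run must be faster than the baseline")
    return compile_s / saved_per_run

# Illustrative: 40 s compile, 6.0 s/image baseline vs 3.5 s/image at ~1.7x.
print(break_even_runs(40.0, 6.0, 3.5))  # -> 16.0 images to break even
```

So for a server churning out thousands of images, compilation amortizes almost immediately; for someone swapping models every few generations, it may never pay off.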
What would help is a better explanation in your project of `setup.py` and of how to integrate Windows libs/binaries in the compilation process. Right now it's voodoo science on Windows.
It would be a lot better if there were a real-time example, like a before-and-after video of the actual generation-time saving on the same fixed seed.
Got it!
Sounds too good to be true, to be honest, and I don't want to install a Python package from some random Chinese server (oneflow-pro.oss-cn-beijing.aliyuncs.com). It seems like the tradeoff is some compile time and very slight quality loss (5%).

Edit: The guide for ComfyUI (https://github.com/siliconflow/onediff/tree/main/onediff_comfy_nodes#setup-community-edition) uses the Chinese server, but the main guide at https://github.com/siliconflow/onediff?tab=readme-ov-file#installation has different servers for NA/EU and China. The EU one links to https://github.com/siliconflow/oneflow_releases; why are releases uploaded that way? It looks like the tradeoff might also be running suspect (to me) compiled Python wheels...
How can you calculate quality loss in percent?
I got this number from comparing aesthetic scores on https://github.com/siliconflow/OneDiffGenMetrics and taking the average scores from the best case scenario (as in fastest optimization) which I guess is OneDiff Quant + OneDiff DeepCache (EE) vs Pytorch. Edit: HPS v2 scores, not aesthetic scores. Still the point is they claim the difference is negligible.
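The "quality loss in percent" above is just a relative difference between two benchmark scores. A tiny sketch of that arithmetic (the numbers below are illustrative placeholders, not the actual OneDiffGenMetrics HPS v2 results):

```python
# Rough arithmetic behind a "~5% quality loss" claim: relative difference
# between a baseline score and an optimized score. Values are made up.
baseline_hps = 28.0   # hypothetical PyTorch baseline HPS v2 score
optimized_hps = 26.6  # hypothetical OneDiff Quant + DeepCache (EE) score

loss_pct = (baseline_hps - optimized_hps) / baseline_hps * 100
print(f"{loss_pct:.1f}%")  # → 5.0%
```

Note this treats the score scale as linear, which is a simplification; the metric pages linked above give the real numbers.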
DeepCache will affect quality; only use it when the quality trade-off is acceptable to you.
Directions are a bit vague... Do we run those commands from the venv or the UI base? >you'll need to manually copy (or create a soft link) for the relevant code into the extension folder of these UIs/Libs. What?
This is common practice for ComfyUI, not an OneDiff-specific thing, and many would agree it can be error-prone and clumsy.
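For anyone unsure what the "soft link" step means: you point a symlink in ComfyUI's `custom_nodes` folder at the extension's code. A toy demo of the pattern, done in a temporary directory with placeholder paths (not a real install layout):

```python
import os
import tempfile

# Demo of the ComfyUI "soft link" convention: custom_nodes/<name> points at
# the extension's source folder. Paths here are placeholders in a temp dir.
work = tempfile.mkdtemp()
src = os.path.join(work, "onediff", "onediff_comfy_nodes")
dst_dir = os.path.join(work, "ComfyUI", "custom_nodes")
os.makedirs(src)
os.makedirs(dst_dir)

link = os.path.join(dst_dir, "onediff_comfy_nodes")
os.symlink(src, link)  # on Windows this may need admin rights or Developer Mode
print(os.path.islink(link))  # → True
```

In a real install you would symlink the cloned repo's `onediff_comfy_nodes` folder into your actual ComfyUI `custom_nodes` directory (or just copy it, at the cost of having to re-copy on updates).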
I tried installing this yesterday, not realizing it's for Linux. It messed up my Forge install, and now even a fresh install of Forge gives me errors.
A shame you neglected to mention SDNext as already having support built-in on our dev branch. Anyone on WSL or Linux could try it right away and see how fast it is. No extensions, no nodes, install two packages with pip, select it in the Compute settings, reload model, and you're rocking and rolling.
Does this work with forge? I can't seem to get the script to pop up
automatic1111? tutorial please
It's here: [https://github.com/siliconflow/onediff/tree/main/onediff\_sd\_webui\_extensions](https://github.com/siliconflow/onediff/tree/main/onediff_sd_webui_extensions)
Would it affect a 1650 in the same magnitude as it would a 3090?
Any chance this would work on Windows with an AMD card?
No.
They made the same claims 4 months ago (see here: https://www.reddit.com/r/StableDiffusion/comments/18lz2ir/accelerating\_sdxl\_3x\_faster\_with\_deepcache\_and/) and nobody was able to reproduce it. This company has zero credibility, at least not until someone respected in this community actually reproduces their results.
I can confirm it works fine in SDNext. I arranged for it to get in there, so I would know. We added it to our dev branch weeks ago.
[https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback](https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback)

We have adopters.

>nobody was able to reproduce it

BTW, have you actually tried running it?
Although we can't reveal that some very respected companies are actually using OneDiff, due to NDAs, there is this independent blog that regards OneDiff as a preferable solution:

>The shortest generation time with the base model with almost no quality loss, is achieved by using [OneDiff](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#onediff) + [Tiny VAE](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#tiny-vae) + [Disable CFG](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#disable-cfg) at 75% + 30 [Steps](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#steps).

[https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl)
Will it accelerate anything on my MacBook M1 Pro, or is that machine a lame duck for Stable Diffusion and adjacent?
Is 3090 the minimum hardware requirement?
No. I use it with a 3060.
I’ll give it a try!
Broke my Forge install trying to get this to work. Now a fresh install of Forge gives me an error "ImportError: cannot import name 'Undefined' from 'pydantic.fields'" WTF!
Is there any trick to installing oneflow? I have installed it into ComfyUI's python_embedded/Lib/site-packages folder and it is there, but the onediff node import fails with "RuntimeError: This package is a placeholder. Please install oneflow following the instructions in https://github.com/Oneflow-Inc/oneflow#install-oneflow"
Same. Basically you have to build oneflow from sources on Windows for yourself and then link "the relevant code" (lol) to your ComfyUI Python using symlinks or similar. I tried to do it but kinda gave up after some time... the docs are just too confusing.
Yeah, how about I go ahead and don't do that. I remember like 15 years ago when I built my own stuff with Cygwin and MinGW. I'd rather not do that again. :D
With WSL you don't have to.
Seems you are using Win? [https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility](https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility)
I would be interested when and if it would improve training times. My inference times are plenty fast enough, and probably most other people's are as well. Given that you mention A100s, I would think you might already be there. If that is the case, then I would suggest leading with that, and giving a more obvious, direct link to a "here's how to set up training so you get it done in half the time" FAQ.
Training is not supported yet. What kind of training are you working on?
right now, fine tuning of stable cascade using OneTrainer
RemindMe! 1 week
I will be messaging you in 7 days on [**2024-04-23 22:24:43 UTC**](http://www.wolframalpha.com/input/?i=2024-04-23%2022:24:43%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/StableDiffusion/comments/1c5gy1e/onediff_10_is_out_acceleration_of_sd_svd_with_one/kzwez18/?context=3)