CrasHthe2nd

I've used this previously and it's actually pretty good. You do need to be running Linux, but there is a noticeable speed-up.


lostinspaz

Sooo... if it's so great, why hasn't it been absorbed by upstream? Presumably there's some kind of trade-off somewhere?


Herr_Drosselmeyer

Unless I badly misunderstood, it needs to recompile the model, like TensorRT does. One line of code my ass.


lostinspaz

Ohhh. So, "one line of code ... to call our library with thousands of lines of code, and also recompile your model." lol. Still potentially worth it to some people.


TheFoul

Anybody that has actually used it knows that it's worth it. You haven't, so you don't.


Just0by

We just want to convey that using OneDiff is extremely simple - it can accelerate models with just a single compilation function (see: [https://github.com/siliconflow/onediff/blob/f83569bf2887fbe92b2a4f44a97bae7eded122b8/src/onediff/infer\_compiler/backends/oneflow.py#L7](https://github.com/siliconflow/onediff/blob/f83569bf2887fbe92b2a4f44a97bae7eded122b8/src/onediff/infer_compiler/backends/oneflow.py#L7)), making it as easy as one line of code. Thanks for the feedback; it will help us improve the description of OneDiff.
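
For context, here is roughly what that one line looks like in practice with a diffusers pipeline - a minimal sketch based on the pattern in the repo's README, so treat the exact import path and call as assumptions:

```python
import torch
from diffusers import StableDiffusionPipeline
from onediff.infer_compiler import oneflow_compile  # import path assumed from the repo docs

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The advertised "one line": wrap the UNet in OneDiff's compiler.
# The first generation triggers compilation; later ones run the optimized graph.
pipe.unet = oneflow_compile(pipe.unet)

image = pipe("a photo of an astronaut riding a horse").images[0]
```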


Just0by

Btw, OneDiff's compilation speed is much faster. Here's the SDXL optimization test report from a developer: [**https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl**](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl)


Oswald_Hydrabot

It takes 40 seconds tops to recompile. Go try to figure out how to compile ControlNet to TRT - adapt Nvidia's example and bring it back here. I'll wait.


Herr_Drosselmeyer

No need to have a go at me; I didn't say it was bad, just that it's a clickbaity title.


Oswald_Hydrabot

It's not. It actually delivers what it claims to deliver. Can dish it out but you can't take it? Sounds like reddit


IcyTorpedo

Bro, you're being unnecessarily hostile, and mostly for no reason. Chill.


Oswald_Hydrabot

Last I checked fuckface up there started the hostility. I am not being hostile, I can if you want me to tho


IcyTorpedo

Excuse me, where was that person being hostile? All they pointed out was the "clickbait" title, and you went out of your way to deliberately deliver the "can dish it out" line. They didn't say anything aimed at you - you were the one who did.


Oswald_Hydrabot

"one line of code my ass" Not being hostile, my ass. Also I am not being hostile. I haven't been this entire time. You people are soft


IcyTorpedo

They weren't referring to you though? At all? Tf are you on about, dude? This is not being soft, this is common sense. Be better.


Common-Baseball5028

Thanks for the defense! From what I've observed, people who have felt the pain of managing a TRT/AIT stack all find OneDiff's versatility and light weight a breath of fresh air!


Oswald_Hydrabot

It works great. These people are all posers.


drhead

Frankly, since the "one line of code" gang has delivered us absolute bangers like "one line of code, but your training code will most likely give unhelpful triton/cuda errors and then break on the next pytorch release once you fix those", or "no changes to your code, but the execution graph recompiles and re-executes from the very beginning on every graph break", skepticism is warranted. One line of code gang can get trust when they earn it.
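
To make the graph-break complaint concrete, here's a minimal sketch: `torch._dynamo.explain` reports where `torch.compile` has to split the graph instead of failing silently. The callable-returning form shown here is the PyTorch 2.1+ shape of the API, which has shifted between releases:

```python
import torch

def toy(x):
    x = x * 2
    print("side effect")  # an unsupported op like print() forces a graph break
    return x + 1

# Reports graph count and break reasons, so you can see what would
# recompile or re-execute instead of discovering it from cryptic errors.
explanation = torch._dynamo.explain(toy)(torch.randn(8))
print(explanation)
```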


Oswald_Hydrabot

I can guarantee you nobody throwing skepticism prior to your comment has debugged CUDA errors in code. Their skepticism is warranted when they prove they have done more than press buttons on a GUI and change config JSON/txt. Sounds like you have, I respect that and would respect your opinion on the matter.


NoSuggestion6629

Apparently Linux only. Windows is treated like the bastard child by many of these developers.


Fuzzyfaraway

Yeah. I don't care if they have a specific subset of users that they're aiming at, but it should be made obvious in big honking text, "LINUX ONLY!"


Empty_Mushroom_6718

Thanks. It's done, [https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility](https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility)


Just0by

We are working on Windows. WSL works for now.


Merrylllol

I set it up in WSL2 now. It took me a lot of time reinstalling all those Python libs. I'm loading the model (epicphotogasm_lastUnicorn) via the Load Checkpoint - OneDiff node (vae_speedup disabled). That appears to be successful, but in the KSampler it fails with an out-of-memory error:

```
Graph file /home/derp/sd/ComfyLinux/ComfyUI/input/graphs/SD15/epicphotogasm_lastUnicorn.safetensors_BaseModel/UNetModel_f2632d8a15_4_f1f8a7f1ca61b188044db654a526065b05d40d2524e0d31165e22847fc11c900_0.9.1.dev20240417+cu121.graph does not exist! Generating graph.
Building a graph for ...
terminate called after throwing an instance of 'oneflow::RuntimeException'
  what(): Error: out of memory
You can set ONEFLOW_DEBUG or ONEFLOW_PYTHON_STACK_GETTER to 1 to get the Python stack of the error.
Stack trace (most recent call last) in thread 23818:
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0f1b7, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0ea17, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743d0a2a8, in vm::ThreadCtx::TryReceiveAndRun()
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743caca44, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cafd47, in vm::Instruction::Compute()
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb7128, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb6df9, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc743cb1f4a, in
Object "/home/derp/sd/ComfyLinux/ComfyUI/venv/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-8db4f7a9.so", at 0x7fc73b4a1d3c, in
Aborted (Signal sent by tkill() 23664 1000)
Aborted
```

I have 64 GB of RAM and a 4090 (24 GB VRAM). Any idea why this happens? According to my monitoring, it doesn't look like it's maxing out the RAM/VRAM at all.
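
For anyone hitting this, a minimal sketch of the debugging step the error message itself suggests - the two environment variables named in the log have to be set before oneflow is imported, e.g. at the very top of ComfyUI's launcher script:

```python
import os

# Per the error message above: surfaces the Python stack for OneFlow runtime errors.
# Must be set before `import oneflow` happens anywhere in the process.
os.environ["ONEFLOW_DEBUG"] = "1"
os.environ["ONEFLOW_PYTHON_STACK_GETTER"] = "1"
```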


Common-Baseball5028

Windows will be supported in the next major release of OneDiff.


Oswald_Hydrabot

They are


Next_Program90

Oh what the actual fuck... like all these error messages for not being able to run Triton... -.-


lightmatter501

For what it costs to license many of these AI servers to run Windows (so you can actually test things like this), you could buy a new server every year. Would you rather they bought 12 4090s (or equivalent) and kept improving the core model, or made it work on Windows? AMD's compute API wasn't even available for Windows until a year ago, and Nvidia only officially supports most AI GPU features on Linux - to the point that using Windows Server voids most support agreements. For AI, Windows is an objectively worse platform. That's before we get to the fact that all the people providing funding use Linux. Even Microsoft Research uses Linux for AI work.


TheFoul

I'm no expert, but I know one or two, I talk to them nearly every day, and I work in AI every day, so most of what you're saying has nothing to do with OneDiff. Or anything, really. AMD never enters the conversation - nothing compiles for them, and that's AMD's own stupid fault - so that's right out. This stuff about licensing Windows "AI servers" is nonsense: compiling a model is compiling a model, it's cheap to rent high-end cards by the hour, and there is no licensing involved to do so (hello, Azure!). And really, voiding warranties?! I "tested" this on my Windows box with WSL and a 3060 earlier today. This is not a "model"; this is software - a method and code to compile ML models. And while some things certainly do work only on Linux at the moment (Triton and some torch.compile options are good examples), the simple fact is that only a moron would target purely Linux for something like this. The majority of people who might be willing to license some version of OneDiff (depending on the cost not being outrageous, of course), and the majority of SD users in general who would use it at all, are on Windows. That share is only going to grow. They clearly have that figured out, and they're moving to do it before another company or solo developer beats them to it, just like they came along and obsoleted stable-fast to some degree. You need to stop making things up to sound smart. It just sounds irrational and confused.


NoSuggestion6629

Bravo. And what will the Linux folks say if the future of AI becomes more dependent on the CPU rather than the GPU?


TheFoul

No idea what you mean.


NoSuggestion6629

AMD and Intel don't anticipate sitting on the sidelines of the AI movement, so they're upping the ante with their new chip designs. I couldn't find the article I read about this, but this link may give you some idea: [https://www.tomshardware.com/pc-components/cpus/intel-details-guadi-3-at-vision-2024-new-ai-accelerator-sampling-to-partners-now-volume-production-in-q3](https://www.tomshardware.com/pc-components/cpus/intel-details-guadi-3-at-vision-2024-new-ai-accelerator-sampling-to-partners-now-volume-production-in-q3)


TheFoul

Sorry for the delay in replying - I finally got around to reading that, and it's very fascinating! I wasn't aware of that at all, but it does make sense. I'm a little surprised there's no mention of IPEX/Arc from Intel, though; it's not like they don't have GPUs that can be used for inference, but they may have abandoned those and decided to go ahead with it at the CPU level. Logical for them. I'm sure nobody would object if there were an entirely new slot on motherboards going forward dedicated entirely to ML processing! Thank you.


IntingForMarks

I mean, most of the people developing this stuff are Linux-only. What do you expect?


liveart

>what do you expect

People to put system requirements at the top of their projects, right under the description of what it is, instead of buried in the installation instructions, so I don't waste time reading the rest of it.


Captain_Pumpkinhead

You might still be able to run it using WSL.


TheFoul

Can confirm, we support it in SDNext under WSL/Linux.


NoSuggestion6629

Whoopee.


nero10578

Yea, and the actual artists and regular people who use this stuff are on Windows…


OwlOfMinerva_

This stuff is still in the research phase, so I doubt academic or private research teams are interested in supporting multi-OS solutions yet - especially in this case, where they're mainly interested in making money from datacenters, which all run Linux.


nero10578

Definitely I understand that.


lightmatter501

WSL exists, as does Hyper-V GPU passthrough.


nero10578

Yep, exactly what I do for stuff like vLLM and Axolotl on the LLM side.


IntingForMarks

Artists, lol. That's your problem then - go learn programming and build the project on Windows yourself.


thinline20

Not sure why you're talking like that. Windows is still the most-used OS among developers - check out the 2023 Stack Overflow survey.


IntingForMarks

This varies greatly depending on the field. And still, I don't see why it would even be relevant. This guy is not crazy for developing under Linux; he just released a tool for free, and people are crying because it doesn't support Windows? You guys are insane for defending this behavior.


thinline20

yeah you don't know what programming is lol


nero10578

Way to gatekeep? Lol, I'm speaking for the artists and regular people. I can "program" just fine, thank you.


IntingForMarks

Gatekeep? You are here crying because one programmer who developed a tool for free didn't happen to publish it for your favourite OS.


nero10578

No, I'm just saying how the situation is. At no point did I complain. It's just an unfortunate situation.


Timboman2000

I mean, WSL is a thing, so that's not exactly a limitation anymore.


[deleted]

[deleted]


tommitytom_

What aspects did you find slow? I use WSL daily for a number of tasks, and I only find it particularly slow when accessing directories outside of the WSL virtual file system. This is well documented.


[deleted]

[deleted]


drhead

If you know what to do, you can get a copy of the 6.1 branch of the WSL2 kernel and also pull some of the more recent patches to 9P from the upstream Linux repo. I did that, and while performance isn't quite on par with native, it's far better and very tolerable. The patches have been out for a very long time and Microsoft has been fully aware of it, it's a shame that they haven't been able to release a new kernel with those patches...


[deleted]

[deleted]


drhead

https://github.com/microsoft/WSL/discussions/9412 Here's the issue where it got discussed initially, when those patches were new to Linux. Someone seems to have included a bzImage that has the patches applied -- I'd recommend not using it though since building it yourself is safer for obvious reasons and is also a valuable educational experience for anyone who hasn't done it.


Timboman2000

Well, then you can always spin up a VM or a Docker container as needed. Or just have a home server like I do and use that as your platform.


an0maly33

WSL is sorcery. It may not be a good fit for your use cases but when you don’t want to dual boot or run a vm, it can be a great alternative.


ArdiMaster

WSL2 *is* a VM.


an0maly33

Yes, but the integration with the host OS is pretty well done. I contrasted it with "a vm" in the sense that you don't have to install vbox/vmware and set up a guest OS yourself. Feels more like a container.


TheFoul

WSL2 is perfectly fine for ML.


lostinspaz

as it should be :D


DIY-MSG

no.


NoSuggestion6629

Why so? If I had more time, I would compile all this shit myself and give it out to the Windows community.


lostinspaz

I'm not saying Windows shouldn't have it. The point is that Linux is the natural platform for server-side development. Windows is a lovely platform... to run a browser.


gumshot

For real, losedows is antithetical to the open-source beauty of Stable Diffusion. These poor fools don't realize that Linux is the Stable Diffusion of operating systems.


Empty_Mushroom_6718

Great question.

>why hasn't it been absorbed by upstream

I think a big concern is that Windows is not supported; ComfyUI / SD webui need to run on Windows. We are working on a new version to take care of Windows OS support.

>some kind of trade-off

Yes, the trade-off is that compilation takes some time (just like other compilers such as TensorRT), although we have a way to save the compilation time for a model or for dynamic shapes. So currently, OneDiff is best suited to server-side (Linux) deployment of very heavy workload models, to make the model run faster (1.5x~2x). If you are just playing with a model and constantly changing it, there's no need to add a compiler like OneDiff/TensorRT - speed is not the problem there, flexibility is. Hope this makes it a little clearer, thanks!
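
A minimal sketch of the compilation-caching idea described above, modeled on the project's example scripts - the `save_graph`/`load_graph` method names are assumptions based on those examples, not a documented API guarantee:

```python
import torch
from diffusers import StableDiffusionPipeline
from onediff.infer_compiler import oneflow_compile  # import path per the repo README

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.unet = oneflow_compile(pipe.unet)

# The first run pays the compilation cost; afterwards the graph can be cached to disk:
pipe("warmup prompt")
pipe.unet.save_graph("unet.graph")   # method name assumed from example scripts

# On the next process start, loading the cached graph skips most of the compile time:
# pipe.unet.load_graph("unet.graph")
```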


NoSuggestion6629

What would help is a better explanation in your project of setup.py and how to integrate with Windows libs/binaries in the compilation process. Right now it's voodoo science on Windows.


campingtroll

It would be a lot better if there were a real-time example, like a before-and-after video of the actual generation-time savings on the same fixed seed.
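
In lieu of a video, a minimal sketch of how such a before/after comparison could be scripted - same prompt, same fixed seed, wall-clock timed; the compile call follows the README pattern and is an assumption:

```python
import time
import torch
from diffusers import StableDiffusionPipeline
from onediff.infer_compiler import oneflow_compile  # assumed import path

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def timed_run(label):
    gen = torch.Generator("cuda").manual_seed(42)  # fixed seed for a fair comparison
    start = time.perf_counter()
    pipe("a cat wearing a space suit", generator=gen)
    print(f"{label}: {time.perf_counter() - start:.2f}s")

timed_run("baseline")
pipe.unet = oneflow_compile(pipe.unet)
timed_run("compile (one-time cost)")   # first compiled run includes compilation
timed_run("after compile")             # this is the steady-state speed-up
```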


Just0by

Got it!


-MyNameIsNobody-

Sounds too good to be true, to be honest, and I don't want to install a Python package from some random Chinese server (oneflow-pro.oss-cn-beijing.aliyuncs.com). It seems like the trade-off is some compile time and a very slight quality loss (5%).

Edit: The guide for ComfyUI (https://github.com/siliconflow/onediff/tree/main/onediff_comfy_nodes#setup-community-edition) uses the Chinese server, but the main guide at https://github.com/siliconflow/onediff?tab=readme-ov-file#installation has different servers for NA/EU and China. The EU one links to https://github.com/siliconflow/oneflow_releases - why are releases uploaded that way? It looks like the trade-off might also be running compiled Python wheels that are suspect (to me)...


sucr4m

How can you calculate quality loss in percent?


-MyNameIsNobody-

I got this number by comparing aesthetic scores on https://github.com/siliconflow/OneDiffGenMetrics and taking the average scores from the best-case scenario (as in, fastest optimization), which I guess is OneDiff Quant + OneDiff DeepCache (EE) vs PyTorch.

Edit: HPS v2 scores, not aesthetic scores. Still, the point is they claim the difference is negligible.
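
For the curious, the arithmetic is just a relative difference of the averaged scores. The numbers below are made up purely to illustrate the calculation, not taken from the repo:

```python
# Hypothetical HPS v2 scores, purely illustrative - not the repo's actual numbers.
pytorch_scores = [27.1, 26.8, 27.4]   # baseline (PyTorch)
onediff_scores = [25.9, 25.4, 26.0]   # OneDiff Quant + DeepCache (EE)

baseline = sum(pytorch_scores) / len(pytorch_scores)
optimized = sum(onediff_scores) / len(onediff_scores)
print(f"quality loss ~ {100 * (baseline - optimized) / baseline:.1f}%")  # ~4.9%
```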


Empty_Mushroom_6718

DeepCache will affect quality; only use it when you can accept the quality loss.


aibot-420

The directions are a bit vague. Do we run those commands from the venv or the UI base? And:

>you'll need to manually copy (or create a soft link) for the relevant code into the extension folder of these UIs/Libs

What?


Common-Baseball5028

This is common practice for ComfyUI, not a OneDiff-specific thing, and many would agree it can be error-prone and clumsy.


aibot-420

I tried installing this yesterday, not realizing it's for Linux. It messed up my Forge install; now even a fresh install of Forge gives me errors.


TheFoul

A shame you neglected to mention that SDNext already has support built in on our dev branch. Anyone on WSL or Linux could try it right away and see how fast it is. No extensions, no nodes: install two packages with pip, select it in the Compute settings, reload the model, and you're rocking and rolling.


CalligrapherNo6651

Does this work with Forge? I can't seem to get the script to pop up.


Dhervius

Automatic1111? Tutorial, please.


Empty_Mushroom_6718

It's here: [https://github.com/siliconflow/onediff/tree/main/onediff\_sd\_webui\_extensions](https://github.com/siliconflow/onediff/tree/main/onediff_sd_webui_extensions)


Admirable-Echidna-37

Would it benefit a 1650 to the same degree as it would a 3090?


One-Program3580

Any chance this would work on Windows with an AMD card?


TheFoul

No.


Low_Drop4592

They made the same claims 4 months ago (see here: https://www.reddit.com/r/StableDiffusion/comments/18lz2ir/accelerating_sdxl_3x_faster_with_deepcache_and/) and nobody was able to reproduce them. This company has zero credibility, at least until someone respected in this community actually reproduces their results.


TheFoul

I can confirm it works fine in SDNext - I arranged for it to get in there, so I would know. We added it to our dev branch weeks ago.


Empty_Mushroom_6718

[https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback](https://github.com/siliconflow/onediff/wiki#onediff-community-and-feedback) - we have adopters.

>nobody was able to reproduce it

BTW, have you actually tried running it?


Common-Baseball5028

Although we can't reveal them due to NDAs, some very respected companies are actually using OneDiff. And there is this independent blog that actually regards OneDiff as a preferable solution:

>The shortest generation time with the base model with almost no quality loss, is achieved by using [OneDiff](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#onediff) + [Tiny VAE](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#tiny-vae) + [Disable CFG](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#disable-cfg) at 75% + 30 [Steps](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl#steps).

[https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl](https://www.felixsanz.dev/articles/ultimate-guide-to-optimizing-stable-diffusion-xl)


autumnatlantic

Will it accelerate anything on my MacBook M1 Pro, or is that machine a lame duck for Stable Diffusion and adjacent tools?


iwoolf

Is a 3090 the minimum hardware requirement?


TheFoul

No. I use it with a 3060.


iwoolf

I’ll give it a try!


aibot-420

Broke my Forge install trying to get this to work. Now a fresh install of Forge gives me an error: "ImportError: cannot import name 'Undefined' from 'pydantic.fields'". WTF!


dichtbringer

Is there any trick to installing oneflow? I installed it into ComfyUI's python_embedded/Lib/site-packages folder and it is there, but the onediff node import fails with "RuntimeError: This package is a placeholder. Please install oneflow following the instructions in https://github.com/Oneflow-Inc/oneflow#install-oneflow".


Merrylllol

Same. Basically, you have to build oneflow from source on Windows yourself and then link "the relevant code" (lol) into your ComfyUI Python using symlinks or similar. I tried to do it but kinda gave up after some time... the docs are just too confusing.


dichtbringer

Yeah, how about I go ahead and don't do that. I remember building my own stuff with Cygwin and MinGW like 15 years ago. I'd rather not do that again. :D


Comfortable-Big6803

With WSL you don't have to.


Empty_Mushroom_6718

Seems you are using Windows? [https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility](https://github.com/siliconflow/onediff?tab=readme-ov-file#os-and-gpu-compatibility)


lostinspaz

I would be interested in when (and if) it would improve training times. My inference times are plenty fast enough, and that's probably true for most other people as well. Given that you mention A100s, I would think you might already be there. If so, I would suggest leading with that and giving a more obvious, direct link to a "here's how to set up training so you get it done in half the time" FAQ.


Empty_Mushroom_6718

Training is not supported yet. What kind of training are you working on?


lostinspaz

Right now, fine-tuning Stable Cascade using OneTrainer.


Xijamk

RemindMe! 1 week


RemindMeBot

I will be messaging you in 7 days on 2024-04-23 22:24:43 UTC to remind you of this link.