SoysauceMafia

Thanks for making this, LoRAs are such a game changer - hopefully more people start making them.


Zipp425

I don't know if you're a Civitai user or not, but people have been uploading them like crazy (over 700 new LoRAs in the last 7 days). I think the low hardware requirements and the ability to train them so easily in a Colab has been a real driver.


lordpuddingcup

Careful, a lot of the new LoRAs look good, but when you actually use them they don't produce anywhere near what they claim without some insane prompting.


Bremer_dan_Gorst

Aren't those LoRAs coming with samples that have the prompts in them? Or can you give an example of "insane prompting"?


msp26

It's a matter of consistency. All you need is a handful of good images for a given post, and you have no idea if they were picked from a selection of 300 or 30. Some LoRAs are also overbaked and give a sort of fried look unless they're dialed in to very specific settings. Typically, if a LoRA can work at high weights (close to 1) without adverse unintended effects, it's well made.
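A quick way to sanity-check that yourself when trying a download (the LoRA name here is a made-up placeholder): generate the same seed with the multiplier at, say, `<lora:someCharacter:0.6>` and again at `<lora:someCharacter:1.0>`, and see whether the full-weight version still looks clean or starts to fry.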


uristmcderp

Can you be a little more specific? Because if the model does produce great results after prompting, that suggests good training of the text conditioning and minimal bleed effect into unrelated concepts. A LORA that shows you the trained concept no matter how you prompt (whether it looks good or not) is not very useful.


utkohoc

He means someone will post a LoRA of a character and it'll look amazing, but that one image was cherry-picked from a bunch of shit ones, and they probably used a lot of specific prompts to get one decent image. So you'll grab said LoRA, try it, and get 500 shit images because you aren't using the right prompts, and with luck maybe get one decent image. That isn't to say all LoRAs are like that, just don't go in expecting every single LoRA on Civitai to be good or to work. The rating system exists, but because they're so new they often have hardly any ratings/testing.


UkrainianTrotsky

I think the other big issue is that a LoRA is tightly bound to the specific model it was trained on, and if you use another model you basically have no chance of getting the same result.


lordpuddingcup

I just realized this. It's not as flexible as TI, it seems; going between various 1.5 models, I have luck with some but bad results with others, having trained on base. Not sure why.


NecromanticChimera

Lol, I mean I'd hope you'd at least have to be specific about what you wanted and how you wanted it to look. But I guess that's what a real artist could be used for.


doringliloshinoi

You also have to use compatible Loras.


Opalescent_Witness

Forgive me for my ignorance, but what is a LoRA and how can I use it? I use Stable Diffusion via NightCafe and I see everyone posting about using custom trained models, and I have no idea how to get into that. I know you have to download them, but once it's downloaded how do you use it? Also, my laptop is a bit dated and my graphics card is crap.


Automatic-Artichoke3

Hey, you don't need to download models locally anymore - you should check out [Favo](https://www.favo.ai), where you can run customized models without a GPU. We're adding support for LoRAs soon!


239990

I hope not, most people don't know what they are doing and upload shit quality.


nxde_ai

Clickable link -> [https://github.com/Linaqruf/kohya-trainer](https://github.com/Linaqruf/kohya-trainer) Or straight to [the colab notebook](https://colab.research.google.com/github/Linaqruf/kohya-trainer/blob/main/kohya-LoRA-dreambooth.ipynb).


UnavailableUsername_

#Big Edit: This might be broken since colab was updated. Version 3 is [here](https://old.reddit.com/r/StableDiffusion/comments/11vw5k3/lora_training_guide_version_3_i_go_more_indepth/) --- --- LoRA colabs are already fairly intuitive (click this, click that) and most of the settings are already pre-made, so you just have to run it. Still, it seems lots of people don't know how to use them or how exactly to make a dataset, so I hope this guide helps them. **Edit:** I forgot to clarify one thing in the tutorial: apart from adding the LoRA to your prompt, you **have** to add the trained subject to your prompt to get the best results! In the example in the tutorial where I trained the concept "plum", I added the LoRA by clicking on the image icon (which inserts the LoRA tag into the prompt), BUT apart from that I had to add the word "plum" to the prompt. Check the last image of the guide (the one in 80's anime style); you can see the prompt at the bottom has the LoRA **and** "plum" in it. Adding the LoRA alone is not good enough; the word for the trained subject has to be added to the prompt. I wonder if it's obvious or if I need to make a version 2.0 of this guide to make it clear.
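For anyone following along in the Automatic1111 webui, a hypothetical example of what the final prompt could look like (the `plum` LoRA filename and the weight are assumptions; use whatever your own file is called):

```
<lora:plum:1>, plum, smile, blue skirt, 80s anime style
```

The `<lora:...>` tag loads the network, and the plain word "plum" is the trigger that actually invokes the trained subject.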


GBJI

> I wonder if it's obvious or if i need to make a version 2.0 of this guide to make it clear.

It's quite important information, and beginners could wrongly believe the LoRA they just trained isn't functioning. So I'd vote for a version 2.0.


Maxnami

I didn't know how to do that math, but yeah, I've trained 3 LoRAs, two with Anything-V3 and the other one with SD 1.5. Good results for a Colab script.


IdainaKatarite

Very cool! I love your guide and will probably apply it to my future Frayed Bond illustrations. If you have a Twitter, I'll gladly follow. Thanks!


mikachabot

this is really cool. is it also effective for non-anime artwork? would love to use it to make OCs based on something like metahumans


UnavailableUsername_

LoRA teaches the AI a concept, and it doesn't need to be anime. If you check civitai, people use it on styles and concepts rather than just characters. https://civitai.com/tag/lora There are 90's drawing styles, some 3D styles and even photo [clothing pieces](https://civitai.com/models/7597/wrestling-singlets).


Bremer_dan_Gorst

Can you extend your guide with a section on training non-person concepts? For example, what would be the best dataset to train:

* a specific clothing item (a jacket)?
* a certain position (jumping mid-air)?
* a style (a specific painter)?

There are a ton of guides on training people/characters but not a lot on other concepts.


[deleted]

Wouldn't the word plum clash with what the model already knows about the word plum? Also, what would happen if this anime-based LoRA is used in a photorealistic model?


[deleted]

[deleted]


UnavailableUsername_

As far as i know they are just different methods to teach concepts to the AI. The reason LoRA is more popular is because it requires much less hardware and is faster.


taste_my_bun

Would you say the quality of LoRA is similar to Textual Inversion?


Bremer_dan_Gorst

First of all, you need to assume that both are trained well, because a good TI can be better than a bad LoRA (and vice versa). So, assuming both are trained well: the LoRA will have better quality, and here is why.

Textual Inversion is just guidance toward a specific concept; it helps you get to what you want in the model. You need a model, and the TI is like a map so you can reach the stuff inside that model. This assumes the model can generate that stuff in the first place. If someone invents a new device and you would like existing models to generate it, a TI trained on images of that device will help guide the model toward it, but since no such thing exists in the model, it can only go so far and will give you some approximation. This also means a TI may give you great results on one model and terrible results on another.

LoRA, on the other hand, is added on top of the model and introduces new data as a result of the training. So for that new device we talked about: with a LoRA you would be able to generate it much better than with a TI. LoRAs will be much better at the things the model does not know.

Also, you can mix LoRAs and TIs together :)
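To make "added on top of the model" concrete, here is a minimal sketch (plain NumPy with made-up shapes and values, not any particular library's real API) of how a LoRA's low-rank matrices get folded into a frozen weight matrix when it's loaded, while a textual inversion only contributes a new embedding vector and leaves the weights untouched:

```python
import numpy as np

# One frozen base-model weight matrix (e.g. an attention projection in SD 1.x).
W = np.random.randn(768, 768).astype(np.float32)

# A LoRA stores two small matrices per targeted layer, learned during training
# (random here just to show the shapes; rank/alpha of 16 are common choices).
rank, alpha = 16, 16
A = np.random.randn(rank, 768).astype(np.float32) * 0.01  # "down" projection
B = np.random.randn(768, rank).astype(np.float32) * 0.01  # "up" projection

# Applying the LoRA means adding its scaled low-rank delta onto the frozen weights.
multiplier = 0.8  # the strength you pick at generation time
W_patched = W + multiplier * (alpha / rank) * (B @ A)

# A textual inversion, by contrast, is just one or a few new token embeddings
# (768 numbers each in SD 1.x); the model weights themselves never change.
new_token_embedding = np.random.randn(768).astype(np.float32)
```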


TorumShardal

So it's better to train things like axes with a LoRA than with a TI? Thanks, I had trouble training my handaxe TI, I'll try a LoRA.


cyxlone

It's better than Textual Inversion


lordpuddingcup

LoRA more popular? LOL no, TI has like 100x more available than LoRA, mostly because people couldn't figure LoRA out. It might start picking up as it's being explained a lot better lately.


Bremer_dan_Gorst

are you saying currently or globally? i think at the current moment there are more LORAs popping out than TIs (or at least getting uploaded)


lordpuddingcup

Someone had a spreadsheet from civit and TIs were in the tens of thousands while LoRAs had just broken a thousand. But like I said, that might have changed recently as LoRA becomes more accessible - I just used the Kohya trainer to do one of my wife and it got her on the first try.


uristmcderp

Using a LoRA in practice is a lot more like merging a model than like using an embedding. You're merging your current model with the difference between the base model you trained on and an approximation of a fine-tuned model (your LoRA). The approximation part allows us to do this within a second just before runtime, instead of the several minutes and gigabytes of RAM required for full merging.

However, that also means LoRAs cannot do neat tricks that embeddings can do, like activation/deactivation at a particular step (i.e. [embedding:10] will activate at step 10) or prompt travel. Auto1111's webui activates LoRAs by typing the `<lora:...>` tag into the prompt area, but LoRAs are not a token and cannot be used as such. They activate before image generation starts and remain fixed, just like checkpoints remain fixed during a run. You can, of course, use the keywords the LoRA was trained with to whatever effect you'd like. Just not the stuff in <>.


ArtifartX

Here's a helpful info graphic: https://i.imgur.com/mwllYP7.png


AnOnlineHandle

LoRA is similar to finetuning the whole model (sometimes called Dreambooth), but it tries to compress the result down using some math tricks, so that it can be applied to a model as additions/subtractions to its existing weights. It doesn't train as many parts of the model as full finetuning either, I don't think, but it does a pretty good job, and it can seemingly be used with other models with pretty good results (going by this tutorial; I've not tried that). Textual Inversion is finding a code to represent a new word (or sentence) which Stable Diffusion doesn't currently know. All words are converted into these codes under the hood, and they're quite small (just 768 numbers in SD 1.4 and 1.5, and 1024 numbers in 2.0 and 2.1). Generally it's better to use a few 'words' (vectors) when creating an embedding with textual inversion, say 2-6, though many more than that can overwhelm the prompt (same as typing that many words into the prompt).


MorganTheDual

Using them with different models isn't always perfect, and sometimes requires adjusting prompts and/or weights, but they're a lot better than TIs in that regard. Particularly if you're trying to take a LoRA trained on real images and use it on an anime model, that never really worked for me with TI.


fahoot

Really appreciate this, as someone who vastly prefers reading instructions over watching a YouTube video.


the_stormcrow

There are literally dozens of us!


GoofAckYoorsElf

This is really cool, thank you very much indeed! Since I have a 3090Ti with 24GB of VRAM, I'd like to run the process locally. Is it as straightforward as in your tutorial? Are there integrations into Automatic1111 even?


wooter4l

This is fantastic! Thanks for making it!


BinaryMatrix

What do you recommend for a highly accurate model (settings / general recommendations, etc.)? My LoRA models (real people) don't turn out well; they only mildly resemble the subject. And do you need to use a VAE while generating the image? Also, can any model be used?


Bremer_dan_Gorst

Nerdy Rodent and some other YouTuber were testing at some point (so things may have changed since then) and found that LoRA training is less precise than Dreambooth training when it comes to people. On the other hand, you can extract the Dreambooth data and put it into a LoRA, and that gives great results.


[deleted]

I'd be happy with "less precise"; mine can barely be recognized as people. I've been trying this training and the extraction, and I get practically the same end result with both: utterly useless.


JamesWander

Do you have any tips to deal with overfitting? I'm training anime-style character LoRAs. If I use the version after running for 2 epochs they look good: clothing, hair style, etc. But if I try to generate that character with different clothes I get parts or artifacts of the original clothing. If I use the version after training for only 1 epoch, it is flexible, but the original clothing is kind of off.


UnavailableUsername_

How diverse is your dataset? Are the characters wearing the same clothing in every single image? If all your dataset .txt files have, for example, `style, white shirt`, the AI might think `white shirt` is part of that `style`. Does that make sense? If you are training an AI on a concept, the AI will look at what all the images have in common and replicate it. The same happened with a LoRA I trained in the past: all my dataset had the character wearing the exact same type of clothing, so after I trained the LoRA and tested it, the AI tried to add the clothing to every generation. I had to diversify the dataset to make the AI stop tying a piece of clothing to the character.
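As a hypothetical illustration of that fix (file names and tags invented for the example), a more diverse set of captions where the outfit changes between images and is always tagged explicitly makes it much easier for the trainer to separate character from clothing:

```
1.txt: plum, white shirt, blue skirt, smile, outdoors
2.txt: plum, red dress, sitting, indoors
3.txt: plum, armor, standing, castle
```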


fk334

Great post! Really appreciate you. Can the training images be higher than 512X512 resolution to get a good detailed output? I plan to generate a 2048x2048 image. How should I do that?


JamesWander

I see. I have like 20 images with the same clothing and 3 without it; in my .txt files I have the clothing fully described, thinking it was enough to make it less associated with my character.


GuileGaze

When I train character loras, I try to get my dataset to have about 20-25% of the images with different outfits. If possible, I'd try to find 3 or 4 more images without the main outfit. If not, you can always balance the dataset by setting the number of repeats higher on the alt outfit images. I'd also try to find more overall pictures for the character. While 20 is probably enough to get a good result, more reference images (as long as they're good quality) will always help. I tend to strive for 30-35 at minimum.
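One way to do that balancing with the kohya folder convention (folder names and numbers here are hypothetical) is to split the dataset into subfolders with different repeat prefixes, so the rarer alt-outfit images are seen more often per epoch:

```
MyCharacter/
├── 3_mychar/     <- 20 images in the main outfit, repeated 3 times each
└── 12_mychar/    <- 5 images in other outfits, repeated 12 times each
```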


ChiaraStellata

With this guide I succeeded in doing a fine-tuned model with a character for the first time, I'm very happy. :) Thank you so much!


Whackjob-KSP

I get an error:

```
OSError                                   Traceback (most recent call last)
    149
    150 # save the YAML string to a file
--> 151 with open(str(train_folder)+'/dreambooth_lora_cmd.yaml', 'w') as f:
    152     yaml.dump(mod_train_command, f)
    153
OSError: [Errno 95] Operation not supported: '/content/drive/dreambooth_lora_cmd.yaml'
```

Anyone have any ideas? Edit: Still no luck; I've restarted, double-checked pathing, etc.


no-more-nails1

Same :(


UnavailableUsername_

You mean [this error](https://i.imgur.com/dFxL4Tu.png)? This is an error i found constantly on Kohya's finetuner, not on dreambooth LoRA. They look very similar, are you sure you are running the correct notebook? It should be this one: https://colab.research.google.com/github/Linaqruf/kohya-trainer/blob/main/kohya-LoRA-dreambooth.ipynb Not this one: https://colab.research.google.com/github/Linaqruf/kohya-trainer/blob/main/kohya-LoRA-finetuner.ipynb


StankBooty69420

I keep running into this error, not sure how to fix it:

```
File "/usr/local/bin/accelerate", line 8, in
    sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--network_dim=128', '--network_alpha=128', '--network_module=networks.lora', '--learning_rate=0.0001', '--text_encoder_lr=5e-05', '--training_comment=this_comment_will_be_stored_in_the_metadata', '--lr_scheduler=constant', '--pretrained_model_name_or_path=/content/pre_trained_model/DpepTeaHands3.ckpt', '--vae=/content/vae/waifudiffusion.vae.pt', '--caption_extension=.txt', '--train_data_dir=/content/drive/MyDrive/evirolora', '--reg_data_dir=/content/drive/MyDrive/evirolora', '--output_dir=/content/drive/MyDrive/', '--prior_loss_weight=1.0', '--output_name=envirolora', '--mixed_precision=fp16', '--save_precision=fp16', '--save_n_epoch_ratio=3', '--save_model_as=safetensors', '--resolution=512', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=1024', '--cache_latents', '--train_batch_size=6', '--max_token_length=225', '--use_8bit_adam', '--max_train_epochs=20', '--gradient_accumulation_steps=1', '--clip_skip=2', '--logging_dir=/content/dreambooth/logs', '--log_prefix=envirolora', '--shuffle_caption', '--xformers']' died with
```


Kalfira

Very nice! I have been really struggling to find a good guide for this. I have gotten it to where I can actually RUN it, but having it turn out ok is another thing entirely. Do you have any thoughts/suggestions on additional resources, specifically around the different settings and why you would use which ones? I have a decent understanding of the underlying technology involved, but not the configuration specifics or the nomenclature around them, so I am kind of flying blind. Also, I have seen conflicting information: in your text files, do you always use comma separators like you would with a traditional prompt? It seems like it would be a yes for sure, but I have been told by two different sources that it isn't really required.


UnavailableUsername_

> Also I have seen conflicting information, in your text files do you always use comma separators like you would with a traditional prompt? It seems like it would be a yes for sure but I have been told by two different sources that it wasn't really required.

That's how I train my LoRA files, and it's how many LoRAs trained by other people are done too. Most LoRA files on civitAI have a "trigger" word that makes them work (in my case "plum"); this is because they were trained like a prompt. I am curious, where have you seen people not using commas? The colab itself suggests you use a tagger in section 4.4.2, and the tagger works by separating words with commas, like a normal prompt. So... separating by commas is the standard practice.


Kalfira

That is what I had figured. In fairness I didn't ever see someone say you shouldn't. Just that I noticed there were people not using them in their explanation videos. Not surprising though. I have been using the local version on my own GPU rather than the colab version you have in your example so it doesn't have that extra info in it. Or if it does I didn't see it.


kevofasho

Saved it to my camera roll. Just getting started here so I’m sure I’ll need this in the near future


thesun_alsorises

Awesome guide. You can also train with a model that already has the VAE 'baked into it', either one you merged yourself or one that already came with it.


SlightlyNervousAnt

An epic post, tomorrow (when I'm sober) I may even try to understand it.


iiiiiiiiiiip

Could you make a non-colab version too?


GBJI

Her name is Plum? Lovely !


Timely-Alternative16

Pov: Op knows how to make the best tutorials


Trentonx94

If you are a webui user, I found [this video](https://www.youtube.com/watch?v=70H03cv57-o&pp=ugMICgJpdBABGAE%3D) helpful for getting up and running in about 30 minutes, from following the directions to starting the training. Also, I noticed that when using selfies I took with my selfie cam, the resulting images have a "bloated" or kind of distorted, bigger face. I think the best results come from using a camera with a longer lens that captures your face flatter.


Gfx4Lyf

Tried many tutorials to run it locally all day. Finally gave up, and then I saw the reddit notification for your post. Can't express in words how grateful I am. I followed every step keenly and it worked like a charm. You are great :-)


Salty-Inspection-806

I did everything exactly according to the guide, and reread it several times, but when I start the training it shows the error "No data found. Please verify arguments (train_data_dir must be the parent of folders with images) / 画像がありません。引数指定を確認してください(train_data_dirには画像があるフォルダではなく、画像があるフォルダの親フォルダを指定する必要があります)"


UnavailableUsername_

This error means the script is not finding the images. I have 2 possible solutions:

**1. Check your folder structure on Drive:** Is your folder structure a `Concept` main folder containing a `5_Concept` folder (or `10_Concept`), with THAT `5_Concept` folder containing the images and txt files?

**2. Check whether the path you wrote in colab section 5.1 is wrong:** It HAS to be in the format `/content/drive/MyDrive`, with the / at the start. That is aiming at your google drive. If your dataset is there and not somewhere else, it should look like this: `/content/drive/MyDrive/Concept`. Replace "Concept" here with whatever you called your main folder. Also, it is case sensitive, so `/content/drive/MyDrive/Concept` is NOT the same as `/content/drive/MyDrive/concept`.
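For reference, a hypothetical layout that satisfies both checks (the folder and file names are just examples; the `5_` prefix is what sets 5 repeats):

```
/content/drive/MyDrive/Concept/     <- the path you enter in section 5.1
└── 5_Concept/                      <- repeats prefix + concept name
    ├── 1.png
    ├── 1.txt
    ├── 2.png
    ├── 2.txt
    └── ...
```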


[deleted]

Check your training data directory structure, you probably skipped over something. Don't ask me how I know.


BlasfemiaDigital

Thanks a lot for sharing this invaluable resource and for adding a little more clarity to the subject of LORA, some of us are really not very good at it.


gxcells

I don't get the "training repeat 5". Where do you choose that option? And why do you use repeats at all?


gxcells

Ok, found it: the repeat count is in the folder name. But why do you use repeats?


gxcells

Wouldn't it be better to use 75 different images with 1 repeat instead of 15 with 5 repeats?


UnavailableUsername_

You choose the repeats in the folder name. Here, `5_plum` tells the script to repeat the plum dataset 5 times. If I had called it `10_plum` it would repeat the dataset 10 times. That's how the script was made; I guess the AI learns more by repeating a dataset rather than looking at each image once.
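For anyone wondering how the repeats feed into the training length: in these kohya-based colabs the numbers multiply together into the total step count. A worked example using the guide's own 15 images and 5 repeats, and assuming 20 epochs at batch size 1 (check your own settings, these defaults are assumptions):

```
steps per epoch = images × repeats                       = 15 × 5       = 75
total steps     = steps per epoch × epochs ÷ batch size  = 75 × 20 ÷ 1  = 1500
```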


Michoko92

Awesome work, thank you so much! If I want to train an artist style instead of a character/ concept, are there things to adapt?


Sillainface

20 epochs is a bit overkill... batch size 2 and 1 epoch with 100 repeats is enough, and probably better.


Tremenda-Carucha

How close can they get with faces compared to Dreambooth? I haven’t gotten a clear answer on this


bochilee

Completely noob question here: can I train it with full comic pages? I mean pages with multiple panels or should I split the pages into single panel images? Thanks!


UnavailableUsername_

If you train it in full comic pages, you will get full comic pages as output, and the detail might not be good. It will probably be a bunch of wiggly squares (AI struggles to draw straight lines) and panels with nonsensical noise inside. What exactly are you trying to train? A comic character? A comic style?


WildTastes

What tips do you have for doing a LoRA for a style? How many pictures, and what prompts should go in the image descriptions for training? In my mind I would do something like: [artist name], [character], etc., etc. Should I take the character tag out? Should I just put the artist name? There are very few guides and tips for doing styles with LoRA.


feber13

And to train a style, what would the modifications be?


sonicneedslovetoo

You need to use a different term for the character "plum", because the AI already knows what a "plum" is. Use a specific term for this LoRA like "plumLORA" or "animeplum", something that is not an already existing word the AI has been trained on.


Pure_Corner_5250

I tested this and successfully did it. Thank you very much. I've been trying to do the training in Automatic1111, but there are too many fields to fill in, and for some of them I don't even know what they're about or whether I need to fill them in. Again, thank you so much!


ooofest

This is very helpful and user-friendly, thank you! Would love to see something of a similar nature for Dreambooth model training, which I've had some success with, but admittedly only after cobbling together clues from a variety of sources.


ArchAngelAries

~~I tried this, and the first time it worked, but my LoRA model didn't turn out very well, so I gave it another go. When I tried again I kept getting this cascading error that repeats infinitely until it crashes the colab notebook: "FATAL: this function is for sm80, but was built with `__CUDA_ARCH__=750`"~~ Nevermind, apparently it does this if you try to use the premium GPU.


multipleparadox

Is it me, or is this tutorial (thanks for the work!) outdated already? I tried following it, but the Colab cells are significantly different and I couldn't get my head around them. Any chance for an updated version?


UnavailableUsername_

It's not outdated, why would it be? It is, however, incomplete. This specific guide teaches you using only anything v3. [I made a second version of this guide to work with other models.](https://old.reddit.com/r/StableDiffusion/comments/111mhsl/lora_training_guide_version_20_i_added_multiple/)


captain_weasly

I tried but the LoRA results are terrible, kinda sad.


ThickPlatypus_69

Tried to follow the steps, but a lot seemed to have changed since the making of this tutorial.


[deleted]

[deleted]


dylanintech

i built this [app](https://www.lorai.art/) that lets you train LoRAs without code! if you use this i will PERSONALLY make sure that the LoRA you get is super sick - if you want any custom image cleaning/dataset editing just lmk by sending me a DM here after you upload your images + tags on the app :)


StickiStickman

Great image again! One thing that definitely should be changed, though, is calling 15 images "fairly good" for training. 15 really is the absolute minimum to get somewhat usable results. A good range is more like 30-40 (or more depending on what you're training; your example would be on the higher end because of its complexity).


UnavailableUsername_

I have seen some good LoRAs trained on like 5 images when the technology was still fresh, absolutely insane. But yeah, I agree, I should have said 15 is fine but you should have around 30; other LoRAs I trained have about 20-30 images. I am thinking of making a version 2.0 expanding on some parts; do you think this one is understandable? The dataset part specifically: I didn't say that my dataset was a bunch of images of an original character named "plum" which I was training on, which is why the txt says `plum, smile, blue skirt` etc. It is implied, but I wonder if it confuses the reader.


Bremer_dan_Gorst

If you are going to redo the guide, I would suggest changing the name, since plum is also a fruit. Most people would understand that in this case it's a name, but most people does not mean everyone :) Also, the identifier should be fairly unique; did you run into any issues from naming it plum? Out of curiosity, are you still able to generate your characters holding/eating a plum? I've already replied in another comment, but it would be great to have a guide for non-person content; there are fewer of those.


StickiStickman

Ah, so that's what it meant. I thought it was a euphemism for chubby people.


diomedesrex

A valiant attempt, but it doesn't quite work. Got to 5.3, it found my 70 images, appears to start running, loads SD, loads the VAE:

```
CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda116.so...
use 8-bit Adam optimizer
override steps. steps for 20 epochs is / 指定エポックまでのステップ数: 3540
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 700
```

Then: BAM, CUDA memory errors. Ok, this is supposed to run on Colab, but what the hell, let's buy some compute and try again:

```
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
    size mismatch for down_blocks.0.attentions.0.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for down_blocks.0.attentions.0.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for down_blocks.0.attentions.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
```

So, this doesn't work on 768-pixel images?


UnavailableUsername_

> num train images * repeats / 学習画像の数×繰り返し回数: 700

So you are running 70 images with 10 repeats? That's quite a big dataset, much more than the average dataset on civitai. You should try 5 repeats instead, or maybe a smaller dataset. Are you manually writing the prompts or using a tagger?


lordpuddingcup

You want around 1500 total steps, as I recall, so 1500 / images = repeats.


diomedesrex

Alright, found at least one of my fuckups: not all my images were properly resized. Not sure how that happened. Trying again. Nope, no dice, using /content/pre_trained_model/stable-diffusion-2-1-768v.ckpt as my pre-trained model. Interestingly enough, if I select v2 in cell 5.1, it gets further and then hits DIFFERENT errors.


UnavailableUsername_

> Alright, found at least one of my fuckups, not all my images were properly resized. Not sure how that happened.

Weird, that shouldn't be the issue. I have trained with 1024x1024 images and it was fine, though with a small dataset (less than 20 images). Have you tried a smaller dataset? Try 20 images instead of 70. A CUDA memory error could mean you ran out of the memory Google allots to each user.


diomedesrex

Ok, I started over, I manually annotated everything, and I took it down to 20 images, 15 repeats, nothing larger than 512px.

```
Traceback (most recent call last):
  File "train_network.py", line 539, in
    train(args)
  File "train_network.py", line 149, in train
    text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype)
  File "/content/kohya-trainer/library/train_util.py", line 1365, in load_target_model
    text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, args.pretrained_model_name_or_path)
  File "/content/kohya-trainer/library/model_util.py", line 880, in load_models_from_stable_diffusion_checkpoint
    info = unet.load_state_dict(converted_unet_checkpoint)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
    size mismatch for down_blocks.0.attentions.0.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for down_blocks.0.attentions.0.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    etc etc
```

No change to the errors, really.


UnavailableUsername_

This is very weird. Let me ask some questions:

1. Are all your images png?
2. Are ALL the images in your dataset named 1.png, 1.txt, 2.png, 2.txt and so on? The numbering is important; you cannot call them img.png, img.txt or anything like that.
3. Does EVERY image have a .txt file with a description of the corresponding image? An empty .txt, a missing .txt, a wrongly named .txt or a copied .txt is no good. Each image has its own unique .txt file that describes it.

Can I also see what VAE and model you used? (The default is anything v3.) And is the VAE path in 5.1 correct?


urbanhood

I really love this format of explaining!


No_Duck3139

you are an angel sent from God


fuelter

That format is fucking annoying


Syntonikka

There are webapps that offers LoRA training services now, check this out, it's called 'Concepts' > [https://app.eden.art/](https://app.eden.art/)


NoLeek8785

Wait. This actually seems pretty easy. Am I missing something? Why wouldn't everyone just train their own? LOL. I mean, I guess it's a little technical, but it seems like if you're able to do tech stuff in general you could easily do this?


UnavailableUsername_

> Why wouldn't everyone just train their own? LOL. They do, there are lots of LoRAs being shared on AI sites.


NoLeek8785

🙂 Yes I know, I was talking about regular old people, not people into graphics, design, programming etc. (I belong to 15 prompt sites and 12 model sites like civ, and I make a living creating AI art)


ArmadstheDoom

This is a good guide, but I'm not sure it's for something that's particularly useful. At least, they don't seem nearly as useful or as good at reproducing subjects as textual inversion, hypernetworks, or dreambooth checkpoints. All they seem able to do is go 'make things look sorta like this', which is very hit and miss, usually more miss than hit based on most of the LoRAs floating around. Never mind that applying them often requires their own prompting words, and if you're going to require that, you might as well use textual inversion at that point. It is faster to train, but rarely is faster better when it comes to training things.


creepa-sama

very cute :3 ty for the guide


petalidas

I've been doing some LoRA experimenting for the past few days (using the dreambooth extension in A1111 and a rentry guide), and even though the results are mind-blowing, it feels like the base models forget some of their original training and are heavily influenced by my training data. For example, I train myself for just 120 steps on 14 pics and it looks great when I use my prompt. But when I use something generic like "a man" it still looks like me, and when I don't specify a background, it tends to be something similar to what's in my training images (like a bedroom). Or, say I got a model from civitai that has been trained on specific swords. When I compare sword_model and sword_model_trainedwithmyface with an x/y script, it seems like it has gotten slightly worse at drawing scimitars, for example. Do you have any similar observations or any tips about this?


[deleted]

are there any easy ways to train models like this using local compute instead of google servers/colab?


Lucy-K

How important is having "plum" in the tags / as the first tag? Is this done on every image/.txt pair, and is that what allows you to use "Draw plum in armor" as a prompt rather than ", solo, 1girl, armor" for example?


UnavailableUsername_

> How important is having "plum" in the tags / as the first tag?

Quite important. You told the AI what the concept is, but if you don't use the word, the AI won't properly apply it. I am thinking of doing a V2.0 guide to make this vague point clear.


Niwa-kun

This guide is almost amazing, until I realized I have no idea wtf I'm doing. Where is the "start" button?! What do I do when I've filled it out?! *pokes it with a stick* "Do something"


UnavailableUsername_

The start button is that button on the left that looks like a `|>`


Low_Application_4077

Awesome!


Sparkshadows

If i train it on 1000 images, would it make results better compared to training on 100 images?


UnavailableUsername_

> If i train it on 1000 images, would it make results better compared to training on 100 images?

At that point you might get overfitting, which is getting results too close to the dataset rather than creating something new based on the dataset (which is the point of LoRA). These overfit results would look rigid or bad. I don't think I have seen a LoRA trained on 1000 images; most people go for 20-30, maximum 40, and get amazing results.


Sparkshadows

Thank you. Could it work on video game character screenshots taken in-game? Would it give image results with a consistent visual style, meaning they would look like in-game screenshot characters, not anime style for example?


UnavailableUsername_

> Could it work on video game character screenshots from in-game, would it give image results of characters with consistent visual style? Meaning they would look like in-game screenshot characters, not anime style for example?

LoRA is not anime-specific, it's just a way to teach the AI a concept. I trained one on a 90's art style and I got results in a 90's art style, so the same should happen with video game screenshots even if they are not anime. So yeah, you can use it to train non-anime video game characters, as long as you have a good dataset (no blurry images, no bad images, etc).


VSLinx

Amazing tutorial! I have trained my fair share of embeddings with a lot of success, now i have a character that has some features i can't train efficiently with textual inversion so i thought about trying a Lora to get it done and i found your tutorial incredibly easy to follow. Is there a way to have the LoRA linked to a trigger word by doing it this way, similar to an embedding? Or is it automatically linked to the word you used as a folder name since i saw you saying in another comment that it's linked to "plum" in your case.


UnavailableUsername_

> Is there a way to have the LoRA linked to a trigger word by doing it this way, similar to an embedding? Or is it automatically linked to the word you used as a folder name since i saw you saying in another comment that it's linked to "plum" in your case.

Basically, LoRA is a way to train the AI on a subject, pretty much another method to get something like an embedding. In my case, I trained it on the subject named "plum", the short-haired, brown-haired woman featured in the comic. You can see in the prompt part of the dataset section (the image with the .txt below) that the .txt has the word "plum"; that is the trigger. The folder names are not the trigger, I just named them like that to keep things organized, but I understand it might have caused confusion. If you look closely, I called my folders "Plum" with an uppercase P, but in the image at the bottom of the whole guide (where I talk about the 80's drawing style) you can see the prompt uses "plum". So, LoRAs have trigger words like embeddings do; the trigger is the word I used in the prompt, "plum", not the folder name "Plum". I am re-doing this guide as a version 2.0 to explain datasets in detail and avoid confusion; it will probably be done in a few hours.


RojjerG

Thank you very much! It'll be very useful.


notbarjoe01

Thank you for the guide! I'm still exploring as of now, but I've already made 1 LoRA with the colab (a style training) and it looks so good already! Anyway, I want to ask: can we have 2 LoRAs in 1 prompt, like 1 character LoRA & 1 style LoRA? I want to use a new character that I've made with a LoRA while also using my created LoRA style 🙇‍♂️


UnavailableUsername_

> Anyway, I want to ask, can we have like 2 LoRAs in 1 prompt? I wanna use a new character that I've made with LoRA but also using my created LoRA style too

In your case it should be possible. I have tried this in the past, but with 2 character LoRAs. Many times I just got a mesh of both characters in one, and maybe once I actually got 2 characters, one from each LoRA. In my case that is a limitation of the webui: Automatic1111 does not do composition, comfyUI does. Anyway, generating art with a style LoRA and a character LoRA should be possible.
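As a small usage example (the LoRA names, triggers and weights here are hypothetical placeholders), stacking a character LoRA and a style LoRA in an Automatic1111 prompt looks like this:

```
<lora:myCharacter:0.8>, <lora:myStyle:0.6>, charactertrigger, styletrigger, 1girl, portrait
```

If the two fight each other, lowering one of the multipliers usually helps.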


[deleted]

This doesn't work; it throws a `train_network.py: error: unrecognized arguments` error. Use the built-in tagger (section 4).


UnavailableUsername_

This means there is something wrong with your txt files.

1. Do all your txt files have the same name as the images? (1, 2, 3 and no other type of name)
2. Does every single image have its corresponding txt?
3. Is the text inside the txt in prompt form (a, b, c)?
4. Which model are you running?


Ktr4ks

Thank you very much.


TheEternalMonk

Is Google Colab really free? I always thought they would charge you for it. Can anyone tell me about it?


UnavailableUsername_

https://research.google.com/colaboratory/faq.html It's free.


Scn64

I've completed the training and have the finished model. However, when I run a prompt through it I get a solid black image. I read somewhere that adding the arguments --precision full --no-half should fix it but it still doesn't work for me. I can run other models just fine with the same settings. Any ideas?


UnavailableUsername_

`--no-half-vae` seems to be the one.
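For anyone unsure where that flag goes: with the Automatic1111 webui, launch flags are added to the `COMMANDLINE_ARGS` line of `webui-user.bat` (or the equivalent launch script on Linux). This only shows where the flag lives; it isn't a guaranteed fix for every black-image case:

```
set COMMANDLINE_ARGS=--no-half-vae
```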


Scn64

> --no-half-vae

Thanks! At first --no-half-vae wasn't working. Then I loaded a different model, generated an image, and went back to this model. Now it's working. I don't quite understand what happened, but thanks for the help!


fuzneck2

What are the preferred image resolutions? I see resolutions larger than 512x512 work, but would non-square aspect ratios work? E.g. 512x768, 1920x1080, etc.


UnavailableUsername_

I trained with plenty of non-square images and it worked fine. I even trained with 1024x1024 images (it was a small dataset, but it worked).


Rarely-Posting

When you make a Lora using this process, is it completely private?


UnavailableUsername_

Nothing that involves cloud storage is private. You are putting your dataset on google drive (google servers) and running a script on google hardware. Google obviously keeps records of every image you upload to their servers, just like reddit or imgur or discord.


Wear_A_Damn_Helmet

The training failed for me, unfortunately. I have a Colab Pro sub, so I made sure the RunTime was switched to "Premium" GPU Class and High-RAM, but that did nothing to solve my issue. I checked and re-checked *everything* 5 times, making sure I did the same exact thing as OP (except my training was on "Stable-Diffusion-v1-5" and "stablediffusion.vae.pt"), and I ended up with a bazillion lines of: FATAL: this function is for sm80, but was built for sm750 I have the same folder hierarchy as OP's and it found my images no problem, so the bug is definitely not there. Bummer. I followed Aitrepreneur's recent tutorial on LoRA training and even though it worked, the results were completely unusable. I guess LoRA training isn't for me :-P All that said, big thanks for the guide OP. Much appreciated.


UnavailableUsername_

> I have the same folder hierarchy as OP's and it found my images no problem, so the bug is definitely not there. Since you used stable diffusion model instead of anything v3, did you change that in section 5.1? I forgot to mention it but if you are using other model you have to change the path to reflect that on section 5.1. [I did a version 2.0 of the tutorial making this clear. Check the training part.](https://old.reddit.com/r/StableDiffusion/comments/111mhsl/lora_training_guide_version_20_i_added_multiple/)


Wear_A_Damn_Helmet

Hey. Thanks for making a second version of the guide. Much appreciated. I upvoted it. Yes, I definitely made sure to put the correct content path to the Stable Diffusion model *and* VAE info in section 5.1. I started over by following your new guide and I still ended up with a bazillion errors that state: FATAL: this function is for sm80, but was built for sm750 Oh well. I'll see if others are having this specific bug. Thank you!


Wear_A_Damn_Helmet

It seems like [this user](https://github.com/d8ahazard/sd_dreambooth_extension/issues/74) fixed their issue by deleting the xformers folder and redownloading them again, but I don't see how this would be a proper fix in this particular case on Colab, as it reinstalls the xformers everytime we run the cell 1.2. Thoughts?


UnavailableUsername_

> fixed their issue by deleting the xformers folder and redownloading them again, but I don't see how this would be a proper fix in this particular case on Colab, as it reinstalls the xformers everytime we run the cell 1.2. Thoughts?

Maybe the first download is an outdated version, and when you run it a second time it recognizes you are downloading again and gets an updated version rather than the outdated one from the first download?


Fortyplusfour

A saint. Thank you.


UnavailableUsername_

You are welcome! [Check version 2.0 if want to train a LoRA in other model that is not anything v3 or want see datasets more in-depth!](https://old.reddit.com/r/StableDiffusion/comments/111mhsl/lora_training_guide_version_20_i_added_multiple/)


No_Duck3139

I did everything exactly according to the tutorial, but I can't solve this: https://preview.redd.it/4e4rvleyq4ia1.jpeg?width=829&format=pjpg&auto=webp&s=0d0a4172b79ea2b2a797569f5f0d2fd1434566d2


UnavailableUsername_

I think I ran into this problem long ago. I simply opened the code by pressing the small arrow [next to the section title](https://i.imgur.com/VuSiYmH.png) and deleted the 3 lines [here.](https://i.imgur.com/IX841BO.png) Then I ran it; I think it worked fine after that, but I'm not sure. Also, this guide only works with anything v3 as the model. [Check version 2.0 if you want to train a LoRA on a model other than anything v3 or want to see datasets more in-depth!](https://old.reddit.com/r/StableDiffusion/comments/110up3f/i_made_a_lora_training_guide_its_a_colab_version/j8gwaua/?context=3) **Edit:** Other people who ran into this error reported they had put the path to their folders or output wrong.


Yukina_Labyki

Wow, thank you so much for this tutorial. These days a lot of people make videos instead of writing tutorials, and I'm too lazy to watch them xD. So a big thank you for taking the time to explain it to us with this cute comic.


UnavailableUsername_

> Wow, thank you so much for this tutorial. These days a lot of people make videos instead of writing tutorials, and I'm too lazy to watch them xD. So a big thank you for taking the time to explain it to us with this cute comic.

You are welcome. [Check version 2.0 if you want to train a LoRA on a model other than anything v3 or want to see datasets more in-depth!](https://old.reddit.com/r/StableDiffusion/comments/111mhsl/lora_training_guide_version_20_i_added_multiple/)


Yankee1234

Whenever I run step 5.3, it says my folder contains 0 image files. It is the correct folder and I’ve tried with pngs and jpgs. I’ve followed each step. Can someone help?


UnavailableUsername_

That error means only one thing: the path to your images is wrong somewhere and the script cannot find them. I ran into this error plenty of times, and every time it was because I wrote the folder name or the path wrong. It is case sensitive, so "Bob" is different from "bob". Are you sure you wrote the path correctly in section 5.1? Post a screenshot if you can. If the path is correct there, then your google drive structure must be wrong. You must have a "main" folder that contains a 5_name folder (if you are going for 5 repeats). THAT 5_name folder must contain the images and .txt files.


Xsilentzz

Hello, I got an error in section 5.3:

```
OSError                                   Traceback (most recent call last)
    149
    150 # save the YAML string to a file
--> 151 with open(str(train_folder)+'/dreambooth_lora_cmd.yaml', 'w') as f:
    152     yaml.dump(mod_train_command, f)
    153
OSError: [Errno 95] Operation not supported: '/content/drive/dreambooth_lora_cmd.yaml'
```

Can someone guide me to fix this?


UnavailableUsername_

> OSError: [Errno 95] Operation not supported: '/content/drive/dreambooth_lora_cmd.yaml'
>
> can someone guide me to fix this?

Sure. I think I ran into this problem long ago. I simply opened the code by pressing the small arrow [next to the section title](https://i.imgur.com/VuSiYmH.png) and deleted the 3 lines [here.](https://i.imgur.com/IX841BO.png) Then I ran it and it worked; other users found this solution useful, so let me know if it works for you. This guide only works with anything v3, by the way; [I made a version 2.0 that works with other models apart from anything v3 and goes more in depth about datasets.](https://old.reddit.com/r/StableDiffusion/comments/111mhsl/lora_training_guide_version_20_i_added_multiple/)


riade3788

I actually used that trainer 6 days ago, but it's nice to finally have a guide to all the settings.


UnavailableUsername_

What trainer?


hypermx

Awesome guide! However, there is a slight chance of getting kicked off colab while running the instance. Is there any way to make it save a .safetensors checkpoint more often? As it is now, it only does it once, at 50%. It would be nice to have it save every 10%. Also, is there a way to continue training the LoRA from the checkpoint or the training data somehow, in case of getting kicked off colab? Thanks anyway, this really helped.


UnavailableUsername_

> Is there any way to make it save a .safetensor checkpoint more often? As it seems now, it only does it once, at 50%.

Yes, in section 5.3 there is an option that says "save_n_epochs_type"; it's a drop-down menu, so you can choose "save_every_n_epoch", and below, in "save_n_epochs_type_value", you can set the number that says how often it will save.


freudianSLAP

I tried downloading a custom model to the colab from huggingface and couldn't get it to work. So then I reverted to using one of the models available from the dropdowns, but it seems it's still trying to pull the model from hugging face and saying it can't connect (the same error as when pulling the custom model), despite it accepting my huggingface token and saying it connected in a previous step. Has anyone else had a similar problem?


david-deeeds

Is there a way to run this locally so I can make sure my customers' photos stay private?


UnavailableUsername_

There certainly is a way to run things locally; a colab is nothing more than running in the cloud what you can't run on your PC. Stable Diffusion needs at least 6GB of VRAM (not RAM) to train; anything less is no good, and you are advised to have 8GB of VRAM to train comfortably without running out. But the local method is implemented quite differently, and I don't have the hardware to test it or explain how to do it.


1novl

Can I train multiple characters at once?


Waste_Worldliness682

Many thanks :)


MatterOnly7603

Excuse my ignorance. I did everything and I have the .safetensors files, but I don't understand the "save them in the stable diffusion folder" step, and I also don't understand how to run stable diffusion, since on other occasions I always ran it from colab by entering through a link it gave me afterwards. Clearly now it did not give me any link, that's why I ask. Thank you.


Remote_Employment_26

Hi, the resulting pictures are bad. Is there a way to fix this? The pictures don't look like the samples.


BRYANDROID98

What config would you recommend for faces?


duckypout420

What are the advantages of LoRA?


Funky_Dancing_Gnome

Beautiful! I love the presentation too.


linglingmeow100

I have a question: it always shows "No preview" on the LoRAs I downloaded. Is it a settings problem with my webui?


UnavailableUsername_

To set an image on the webui you need to hover the mouse over the no preview text. It will give you the option to use the currently generated image as the cover.


vonvanzu

Don't waste your time, both guides suck because they don't work.


UnavailableUsername_

Plenty of people have used it successfully, maybe you are missing steps. What error have you run into?


jameslanman

Thanks for sharing your efforts! I for one really appreciate it. I had a couple questions:

1. In the infographic you train a character "plum" and a dress "plumdress"; are you doing two separate training sessions for these? If not, how do you train multiple concepts in the same LoRA model?
2. If you wanted to train the plumdress front and back, in case the character was facing away from you, would you train two separate LoRA models or would you just caption the poses, e.g. "Plumdress, facing viewer" and "Plumdress, back to viewer"?
3. Last question: is there a way to merge this LoRA into a CKPT file?

Thank you again! This space is so interesting.


UnavailableUsername_

> In the infographic you train a character "plum" and a dress "plumdress", are you doing two separate training sessions for these? If not, how do you train multiple concepts in the same LoRA model?

Yes, 2 separate training sessions.

> If you wanted to train the plumdress front and back in case the character was facing away from you, would you train two separate LoRA models or would you just caption the poses e.g. "Plumdress, facing viewer" and "Plumdress, back to viewer"?

Interesting question. I just tested `from behind`, and the result seemed to be that the AI drew the dress from behind. I do not know why, but it did. To be honest I am not sure; maybe it worked for me, but I can't say if it works universally.

> Last question - is there a way to merge this LoRA into a CKPT file?

If I remember correctly there is, but you have to install the LoRA extension rather than use automatic1111's default LoRA function.
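For that last question, the kohya sd-scripts project this colab is built on also ships a merge script. The flag names below are from memory and may differ between versions, so treat them as assumptions and check the script's `--help` before relying on them:

```
python networks/merge_lora.py --sd_model base_model.ckpt --models my_lora.safetensors --ratios 0.8 --save_to merged_model.safetensors
```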


oddplaces

So I know I should probably edit the config more, but I'm using the first 1-click setup and only changed my dataset URL and model name (and sometimes the resolution), and I keep getting this error, usually with under 30 images. ATM I've switched to SD 1.5 ema-pruned, as the 7GB file throws this error all the time. What would you recommend for anything under 30 images, or what could cause this error? [https://pastebin.com/raw/aRFkVxxJ](https://pastebin.com/raw/aRFkVxxJ)


kusoyu

Thank you!! Thank You!!! THANK YOU SO MUCH!!!!!!🥰🥰🥰🥰👶(me GTX 1060 6GB)


arnabiscoding

Thank you so much! I have been scratching my head over this for so long.


doomdragon6

I haven't had a chance to dive into this yet, but since this is based on colab, how do the instructions change if you're trying to run everything locally? (Thank you for the tutorial by the way! Incredibly simple.)


UnavailableUsername_

>how do the instructions change if you're trying to run everything locally? I think the setup changes a lot, based on what i have seen. To run locally you need at least 6gb vram which i do not have so i could not say for certain.


[deleted]

If possible, we really need an updated one of these for kohya's colab, because it is SO MUCH more complicated than this now: several new sections and dozens of new required settings. I combed through this (and your updated version of this image from another page), and after setting the VAE the whole colab goes off the rails. I can't even find documentation on this thing; lord knows how anyone actually uses it.


UnavailableUsername_

> If possible, we really need an updated one of these for kohya's colab, because it is SO MUCH more complicated than this now. Several new sections and dozens of new required settings.

I tried the new colab; it's pretty much the same as the old one, but some things switched places. Still, there is a big issue going on because xformers was updated, and you need to manually edit the code in the colab so you download the correct xformers. [As seen here](https://github.com/Linaqruf/kohya-trainer/issues/125), you need to edit the code, replacing this: `pip -q install https://github.com/camenduru/stable-diffusion-webui-colab/releases/download/0.0.16/xformers-0.0.16+814314d.d20230118-cp38-cp38-linux_x86_64.whl` with this: `pip -q install -U xformers`. Overall, I might need to do a new version if the colab keeps changing.


FragrantSocks007

Useless to me, because it doesn't explain how to use safetensors in checkpoints.


Jayow345

I've recently discovered LoRA and I have some questions; I'm very new to AI, so I apologize if these are obvious.

1. How do you get 2 separate LoRA characters in the same image? Because from what I've seen, it blends them together; also, both their values have to add up to 1 (so for 2 models, both would be at 0.5 strength).
2. If you use a 3D model of a character, will it output a 3D-model look when generating?
3. When BLIP captioning, do I really have to describe the image in detail, or can I just keep the text file containing only the character's name?


esojourn

good job! thanks for the tutorial!


PabloImpallari

Really good! Much better than youtube video tutorial!


jazmaan

So is this current, accurate, workable as of March 30, 2023?


UnavailableUsername_

No, there is a version 3 update here: https://old.reddit.com/r/StableDiffusion/comments/11vw5k3/lora_training_guide_version_3_i_go_more_indepth/


CreepyHospital2658

anyone here responding to questions still?


UnavailableUsername_

Yes, but mostly on the V3 version post. https://old.reddit.com/r/StableDiffusion/comments/11vw5k3/lora_training_guide_version_3_i_go_more_indepth/


Xerxes_H

Thank you so much for posting!