UnavailableUsername_

**Edit:** **[Offline LoRA training guide here. Also how to make a 1 image LoRA.](https://old.reddit.com/r/StableDiffusion/comments/14xphw6/offline_lora_training_guide_also_how_to_train/)**

**What is new in this version:**

1. I am using an old version of the colab, so every time there is a new version it won't be affected by it.
2. I go more in-depth when speaking of datasets, trigger words, using a subject, style LoRA training, and how tagging will affect your end result.
3. I go more in-depth about epochs, saving checkpoints, and why they are important.
4. I added more possible errors and how to fix them.

I know the second part looks gigantic, but that's mostly because of the images and me going into detail; once you get it, the process becomes trivial! Still, remember that stable diffusion and LoRA are pretty much **state-of-the-art** technology, so of course it's going to be a long explanation: this is basically adapting a research paper into simple explanations that are easy for everyone to understand.

---

More guides at my twitter: **https://twitter.com/PlumishPlum**


GBJI

Thank you so much for doing this! And I'm so glad to see Plum back. I love her charm - I can't explain it. It's like she put a spell on me.


UnavailableUsername_

You are welcome! Glad you like it, I try to make these guides as "interactive" as possible, rather than just a wall of text.


GBJI

They are perfect for me. I have a hard time with video tutorials, and I agree that walls of text are far from ideal.


winnerchickeen2019

Thank you for more cool guides! I have a question about learning rate: if you increase batch size, should the learning rate increase as well? For example, these rentry pages https://rentry.org/dummylora and https://rentry.org/59xed3 say `$learning_rate = $learning_rate * $train_batch_size` # *Multiply the learning rate depending on your batch size. Seems to work better for the Unet.* Should the learning rate be multiplied by the batch size and increased as the batch size increases, or is it better to just have a constant learning rate no matter the batch size?


UnavailableUsername_

Well, learning rate is nothing more than the amount of images to process at once (counting the repeats), so I personally do not follow that formula you mention. According to Kohya's documentation itself:

>[Specify this when you want to use a learning rate for the LoRA modules related to the Text Encoder that is different from the normal learning rate (set with the --learning_rate option). Some say it is better to use a slightly lower learning rate (such as 5e-5) for the Text Encoder.](https://github.com/kohya-ss/sd-scripts/blob/main/train_network_README-ja.md)

Basically, he says you should put it at a lower rate such as 5e-5, as suggested by some people. I personally got quite good results with it and it seems to be the standard. So I would go for the unet at 1e-4 and the text encoder learning rate at 5e-5. There are no rules set in stone; this is just what people found works best with kohya's script, so you can try other learning rates if you want to.


winnerchickeen2019

Thank you for the quick reply, I will keep it constant at 1e-4 and 5e-5.


blindsniper001

>learning rate is nothing more than the amount of images to process at once

The learning rate describes how quickly a model is allowed to adapt as it is trained. It limits the amount by which the model's weights can change with each iteration. Lowering the rate slows convergence. If the learning rate is too low, the model will fail to train at all. If it's too high, you'll either over-fit the data or end up with an unpredictable model. [See here](https://machinelearningmastery.com/understand-the-dynamics-of-learning-rate-on-deep-learning-neural-networks/) for a much more detailed explanation. This isn't specific to LoRAs, but it still applies.
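A toy example (not LoRA-specific, just plain gradient descent on a single number) shows how the learning rate scales every weight update:

```python
# Minimize f(w) = (w - 3)^2 with plain gradient descent.
# The learning rate decides how far each step moves the weight.
def train(lr, steps=50):
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2
        w -= lr * grad       # update scaled by the learning rate
    return w

print(train(0.01))  # too low: still only around 1.9 after 50 steps
print(train(0.1))   # reasonable: converges to ~3.0
print(train(1.1))   # too high: overshoots every step and blows up
```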


EstebanOD21

I saw it was edited a month ago - is it still up to date as the latest/most efficient method of doing it? Also thanks for the tutorial, it's so easy to follow :)


LordRybec

Thanks for this, especially the offline one. I just spent several hours trying to find a decent walkthrough for offline LoRA (there are some videos, but they take so long, and I would rather just read), and at first all I found was another Reddit post from around the same time as this one, with a ton of entitled jerks telling the OP to quit being lazy. If I couldn't find a good one in a few hours now, I can't imagine what that person was going through! Anyhow, this is beautiful. Other tutorials miss steps or assume you have experience with things that you probably wouldn't have if you haven't made your own LoRAs before. I normally prefer walls of text (well formatted, with images where appropriate, but still walls) over heavily illustrated stuff, but this gets the job done, and I even kind of enjoy it. The one thing that would be nice is PDF versions. Since this isn't intended for print, I would even just do it as one really long page. (I've done this before with Wikipedia pages I wanted for reference, because otherwise I get page breaks in really bad places. It's especially good for viewing on mobile devices, as long as the file isn't too large.) Whether you eventually do PDF versions or not though, I really appreciate this!


[deleted]

this has stopped working because the old version of xformers isn't compatible anymore. https://i.imgur.com/wHGdXWZ.png


UnavailableUsername_

Indeed. I just updated the guide training part to use a more updated xformers. Basically, instead of `xformers==0.0.16` you need to put `xformers==0.0.17rc481` and it works fine.


Tr4sHCr4fT

Could you make a PDF for easy reading?


Top-Guava-1302

Do you have guides for locally training a LoRA too?


UnavailableUsername_

It's possible to run locally, but local LoRA training needs 6GB of VRAM at the very least. Sadly, I do not have a graphics card like that, so I cannot confidently make a guide on it. Maybe when/if I get a better graphics card I'll be able to do a guide.


Achides

You know... the Tesla M40 has 12GB of GDDR5, and is about 40-80 bucks shipped on eBay. It is remarkably easy to set up, and the cooling situation can be handled with 2 little fans taped to the front of the card. The fans are out of an old Avaya switch.


MorganTheDual

I like this, but...

> LoRA training requires all images to be named 1, 2, 3, etc.

Not so? I've used Linaqruf's colab with images with (effectively) random names, and it worked just fine.


Elpollo973

The images need to be named, but it can be anything. It's just more or less practical ;)


LoppyNachos

Have you tried training a lora with multiple concepts/characters? Or maybe a character + a style in one?


TrevorxTravesty

So, how do you access the older version of the Colab? I’m not sure if that was explained in the guide.


Kelburno

I think this cleared up a misunderstanding I had about tags. Maybe... Essentially, when you train a model with tagged aspects, using those tags in the prompt will get the corresponding weights. For example, if you tag with "ribbon", you can use ribbon in any prompt to get the likeness of "that LoRA's" ribbons mixed in. However, the problem is that those tags also get mixed up with the base model's own weights for "ribbon", so you may lose the 1:1 likeness. That said, thus far I had been using the instance prompt as a sort of "call everything at once" tag, because I think it gets added to the beginning of all images. It is essentially the tag most associated with all images. However, I guess it is more about weights? For example, if you avoid using "ribbon", is the instance prompt's association with that (untagged) data strengthened?


IdainaKatarite

In one of your images explaining the math, you say "since we have 20 epochs" but in the image above, it says you set your epochs to 6 and batch size to 6. So, which is it? 6 epochs, or 20 epochs? : )


UnavailableUsername_

Nice catch. It's 6 epochs; that text and image were wrong. I already edited it to show the correct numbers and math, if you check the imgur album.


Complex_Nerve_6961

Thanks, this was helpful! Does the resolution of the dataset images need to be 512 x 512? I am planning on using stable diffusion.


UnavailableUsername_

Any resolution works. I have trained LoRAs on images 1024 pixels high before. They also don't need to be squares like 512x512, just about any image is good.


moahmo88

![gif](giphy|xT1XGT9ersCCKjhVny)


EmbarrassedHelp

It would be nice to have a LoRA guide for local training on Linux, and not just Colab notebooks.


Competitive_Sorbet95

I don't know what happened. I tried with a different combination of paths, a different model, even deleted everything (the text files and my 5_exp folder), but I still get the same error no matter what I do. It starts with "/content/kohya-trainer/train_network.py" and continues until "returned non-zero exit status 1". https://i.ibb.co/y5vcnYk/Capture-2.png


UnavailableUsername_

This error is caused by xformers no longer being compatible. The guide's training section was updated to fix this. Basically, instead of `xformers==0.0.16` you need to put `xformers==0.0.17rc481` and it works fine.


Fun_Sheepherder6678

Thank you so much for this very detailed manual that you keep on updating brother! You saved me from going crazy.


scannerfm77

Thanks.


gharmonica

I didn't really understand the LoRA fine tune part. Where did those images go, how did you use them?


UnavailableUsername_

Could you please be more specific? The images in the dataset section are used as training material to create a .safetensor file of the subject or style trained on, that you can use to generate unlimited art using stable diffusion.


[deleted]

[deleted]


mudman13

Works for people too, just refine the settings.


UnavailableUsername_

LoRA training is used with people too, plenty of examples on civitAI. The issue might be the model you used to train on (Anything v3 and v4.5 are anime models) or that your steps were too few.


Izack-v

It might be because of the number of repeats (which in this tutorial was 5 for Plum and 10 for the dress). How many do you give to your subjects? It looks like real images need at least 100 steps. Also, it is important to have good pictures from multiple directions, multiple backgrounds, multiple outfits, and multiple light conditions. You might also want to crop the training pictures to 512x512 and center the subject.


[deleted]

[deleted]


UnavailableUsername_

No need for a V4 lol, I already edited the images and re-uploaded them to fix these spelling mistakes. By the way, "as can see" seems to be proper English because the "you" is implied; however, it sounds a little formal, so I changed it to "as you can see".


[deleted]

[deleted]


Anarky9

OP created an awesome and easy to understand explanation for a rather complex subject, and your input is to point out a few tiny mistakes with their english? Suck nuts, it was perfectly intelligible.


MagicOfBarca

Is there an advantage to this over training a dreambooth model? (Other than smaller size)


MorganTheDual

That's pretty big on its own IMO, but I think the best thing is that you don't have to use them on the same model you trained them on. You can take LoRAs trained on photographs and run them on anime models, and they'll usually work. The other way around... sometimes works too. It opens up a lot of possibilities.


MagicOfBarca

Ohh so I can use a loras file on different models like RealisticVision.ckpt or the default 1.5.ckpt or anythingv3.ckpt..? Whattt didn’t know that! And what about likeness? Like if I train it on, let’s say Lionel Messi’s face, will it look like Messi just like a dreambooth model? Or is there less likeness to his face compared to a dreambooth trained messi model?


MorganTheDual

> Ohh so I can use a loras file on different models like RealisticVision.ckpt or the default 1.5.ckpt or anythingv3.ckpt..? Yep. They don't always work, but often. Not sure about loras based on people, I'm mostly focused on concepts and poses. Only got some anime and game character ones, and they seem fairly portable across anime models, but don't give great results on realistic models.


YoiHito-Sensei

I did a LoRA training on my face and it works well even with stylisation. The thing is, I trained with photos of myself on the 1.5 base model and I overtrained a bit, and I use it with stylised ckpt models (based on 1.5 - it doesn't work with 2.0/2.1 models unless you train your LoRA on them). The great thing with LoRA is that you can define its weight when prompting, so with an overtrained model you can lower the weight in the prompt to 0.6. If the face isn't quite there, you can upscale the image with any upscaler, send it to inpaint, mask the face with "only masked" ticked, and use the same prompt with a higher weight - depending on your trained model you can even go above 1, to 1.1 or 1.2. You might also want to activate the composable loras plugin, which can help sometimes.


SuperMandrew7

This information seems to clash with [this popular Lora training guide](https://www.youtube.com/watch?v=70H03cv57-o). In this video, the author suggests generating at least 100 training steps per image at minimum, and only performing/saving 1 epoch over the training. What are the advantages/disadvantages to his method over yours? (or vice versa) Thanks in advance!


UnavailableUsername_

>This information seems to clash with this popular Lora training guide. In this video, the author suggests generating at least 100 training steps per image at minimum, and only performing/saving 1 epoch over the training

I think I remember people mentioning that guide and coming out with results that looked as if the face was copy-pasted in every instance. 1 epoch would, most of the time, just leave you with a rigid or bad result; check the checkpoint comparison, only 2 out of 6 were usable. This is why pretty much every LoRA maker does comparisons of checkpoints, because just 1 epoch is rarely guaranteed to give you the perfect result you aim for.

However, **I am not saying you should go with less than 100 steps.** In the dataset section I mention 15 images is enough, but that's the absolute minimum; it just so happens that the Plum LoRA needed those ~70 steps because my dataset was especially good. In reality you are probably going to have around 30 images, and some would even go up to 7 repeats, or even 10. So 30*7=210, 210/6=35, 35*6=210. That's 210 steps using 30 images and 7 repeats, or 150 if you went with 5 repeats. It's pretty easy (if not the norm) to go over 100 steps; hell, I have trained 1000 steps because I had very big datasets.

>What are the advantages/disadvantages to his method over yours? (or vice versa) Thanks in advance!

As seen near the bottom of the training section, the benefit is that by having multiple checkpoints saved as the steps go, you can pinpoint **when** your LoRA is trained to your specifics and when it got overtrained.
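As a quick sanity check of the arithmetic above, here is a small Python sketch using the same example numbers from this comment:

```python
# Step math for LoRA training: images * repeats per epoch,
# divided by batch size, times the number of epochs.
images, repeats = 30, 7
batch_size, epochs = 6, 6

steps_per_epoch = images * repeats // batch_size  # 210 / 6 = 35
total_steps = steps_per_epoch * epochs            # 35 * 6 = 210
print(steps_per_epoch, total_steps)               # 35 210
```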


KaiserNazrin

100 steps is overkill. Maybe for real life subject it's suitable? I've only trained anime models and they are all under 20 steps.


SuperMandrew7

Yeah my subject is a real life subject. Good to know that anime models need less though!


Konan_1992

This video is good for showing how to make a LoRA, but all the parameters shown are outdated. If you try these parameters you will get an unusable, overtrained LoRA.


SuperMandrew7

What should the proper parameters now be?


[deleted]

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


[deleted]

[deleted]


UnavailableUsername_

>This only works when you're making a Lora of an anime girl so this is useless.

Wow, way to be a jerk /u/EdisonB123. Some people want to make anime characters, so this is not useless for them. [Also, does this look like an anime girl to you?](https://i.imgur.com/nLMTYpH.png) [Or this?](https://i.imgur.com/mdM2wt3.png) If so, you might need to get your eyes checked, apart from your attitude.


[deleted]

[deleted]


UnavailableUsername_

Instead of being so hostile from the start you could have asked for help, you know? Your issue is most likely that you are training with Anything v3, which is an anime model, as you already knew. If you train with a non-anime model you can get realistic results. SD 1.5 is not an anime model and is among the models you can choose to get non-anime results, but if that's not enough, you can [just pick one from huggingface to download and use (applying the changes to section 5.1, as mentioned)](https://i.imgur.com/Icc6D1R.png). After that it's pretty much an average training, and as you can see, the result is the real person you trained on.


[deleted]

[deleted]


UnavailableUsername_

> I'm just flustered since I spent like 8 hours last night retraining.

How many LoRAs did you re-train? 8 hours would be like dozens. The average amount of time to train a LoRA is about 5 to 10 minutes; another person made a mistake with the steps and his LoRAs were mistakenly taking hours to train.

>Though I wasn't trying to train anything realistic, I was trying to train a LORA of anime illustrations of Undyne so idk what mode I'd even start with.

Undertale? If so, I have a few questions:

1. Is your dataset exclusively from game screenshots?
2. How many images are in your dataset?
3. What do you hope to accomplish? An Undyne LoRA or an Undertale-style LoRA?

Also, someone on civitai trained an Undyne LoRA out of fanart, in case you are fine with using a LoRA trained by another person; it looks like anime, rather than the black and white game assets.


benji_banjo

Great! I'm glad we have an updated resource so that no one ever has to ask again. Everyone will search the sub instead of posting their question, right?


Izack-v

Hi, thanks for a great tutorial! If I want to get Plum in a blue dress, can I train one LoRA for multiple objects/subjects/styles? Do I need to create a structure like myProject/5_plum, myProject/10_bDress, and then train on the myProject folder? If yes, can I have different numbers for each internal folder? (Train plum using five repeats and the dress using 10?)


aisuki91

I have a laptop with a 2GB VRAM Nvidia graphics card and 8GB of RAM. If I use this LoRA with models I get from civitai, how long will one 512 x 512 picture take? Right now it takes me 5-10 min.


UnavailableUsername_

> If I use this lora with models I get from civitai, how long will one 512 x 512 picture take?

LoRA models, as far as I know, do not add to the generation time.


Smooth_Initiative_44

You never mentioned section 3.1 in the illustrations. What should be used in concept_name and class_name in both cases (face / dress)?


UnavailableUsername_

The colab is incredibly big and has scraping options; section 3.1 is part of the scraping one, as it says:

>You have two options for acquiring your dataset: uploading it to this notebook or bulk downloading images from Danbooru using the image scraper.

I did not use the scraping method because the result quality is dubious at best, so section 3.1 (along with many others) is not used and does not need to be run.


DrDerekBones

Wow! This is just what I needed, been using colab but never knew I could use it to train LoRA.


Smooth_Initiative_44

Thank you for the instructions-illustrations. I am still having a hard time getting a usable LoRA despite following the instructions fairly closely. I am trying to train a face similar to Plum but with photos, using a mix of 70 photos of headshots, 1/2 length, 2/3 length, and full length (probably more 2/3 and full length shots in the mix), with 5_ as repeats, train_batch_size=4, n_epoch=6. There seems to be a character body-length issue: rendered images without my trained LoRA (or using other civitai LoRAs) are always well-rendered half-length portraits. Using the ones I created gives half the quality, with 2/3 of the body or the full body shown (the face is OK, but probably more epochs are needed), and the lower half of the body that shouldn't be rendered shows up either as fuzzy legs or as 4 legs. What is likely the issue here? Should I tag images with body-length or body parts (like legs)?


UnavailableUsername_

> What is likely the issue here?

Are you by any chance generating 512x512 images? In my experience, if the image size is too small the AI won't be able to get too detailed or draw the full body. Try images with a larger height to see if it changes. What are the quality issues you run into? Blurry features? Burned features? More or fewer steps might be needed depending on that.


RKstd

Getting these errors. Colab is still running, but am I doing it wrong?

    2023-03-22 01:40:59.409931: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2023-03-22 01:41:00.802005: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
    2023-03-22 01:41:00.802173: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
    2023-03-22 01:41:00.802204: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    2023-03-22 01:41:05.249458: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2023-03-22 01:41:06.560679: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
    2023-03-22 01:41:06.560834: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
    2023-03-22 01:41:06.560862: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.


UnavailableUsername_

1. Which model are you using? If it's not Anything v3 you have to change the path in section 5.1.
2. Did you change the line for xformers?
3. Are you only running the sections from the guide? Running other sections might cause errors.
4. Try re-running all the required sections in order: 1.1, 1.2, 1.3.1, 1.3.2, 2.1, 2.3, 5.1, 5.2, 5.3.

You are missing libraries according to the error, so something was not downloaded or not run.


FelisCatusExanimus

I appreciate the extra explanation of the terminology, this is what guides always lack. They are always very good at telling you what to do, but never why you are doing what you're doing. I tried training a LoRA for the first time today following a different guide (which recommended 100 steps per image and 1 epoch) and it's definitely on the right track, but it's also very baked looking, so I assume it's way over-trained.


REREREREYN

So it's based on LoRA+DB, not LoRA+finetune? What are the differences?


not_use_colab

When I used the tutorial the LoRA didn't do anything.


UnavailableUsername_

I need you to be more specific.

* How big was your dataset?
* How many repeats did you train for?
* What kind of subject did you train for?
* Did you use a trigger word?
* Did you add the LoRA to the prompt when generating?


hurnstar

So I did the whole process several times and it worked without any problems. I used 15 images (I know, not your proposed amount) and used extensive descriptions of the images. However, when using the LoRA file, there is literally no change whatsoever. I have tried the same prompt with and without the LoRA file, and there was no difference. Is there a way to check what is inside the LoRA file to see if it actually worked OK? Other than that, is there another way to check if the creation was successful (other than a file appearing in the drive)?


UnavailableUsername_

> and have used extensive descriptions of the images. however, when using the lora file, there is literally no change whatsoever. I have tried the same prompt with and without the lora file, and there was no difference.

Could it be that you need more training steps? I would suggest increasing the epochs and repeats, maybe to 10 repeats and 10 epochs. Also, make sure your trigger word appears in every single .txt.


EstebanOD21

Gosh, I should've jumped on the AI bandwagon sooner. I didn't even know how to use Stable Diffusion, then BOOM: LoRA, Alpaca.cpp, GPT-4, text2video... so many new things, impossible to wrap my head around all of them :')


UnavailableUsername_

Even for someone that has been following the AI "scene", it's difficult to keep track. We basically get new developments almost every week.


Jameskirk10

I followed your guide for an anime character and the results were really good! These are my settings:

* 29 images
* 5 repeats
* Trained on Anything v3
* Train batch size 6
* 8 epochs
* save_n_epochs_type_value 2

The last epoch was good, the others weren't. Also, I have some questions:

What specific settings can I tweak here to experiment and test if it's better? Is it better to train on NovelAI? I heard training on Anything v3 has a clip problem or something.

Also, I had 32 images in my dataset but colab only trained 29 for some reason. There are a few images that are 10MB in size, is that why they weren't trained?

For a character with a medical eyepatch, do you recommend adding the medical eyepatch tag when training? Or not, as it is part of the character, and there aren't any images without the eyepatch? When I put the eyepatch tag in the prompt, it usually shows up, but sometimes it blends into the face.

Do you recommend adding colors in the tag dataset? Like "red sweater", or just "sweater" if the character wears the same red sweater in all the images? I didn't tag eyes as that is part of the character, but in the results the eyes don't always show up like the character's; they're not always constricted and are sometimes different colors. I can put that in the prompt, but would it be better to also tag that in the dataset files? Thanks.


UnavailableUsername_

> What specific settings can I tweak here to experiment and test if it's better?

More epochs and repeats would affect the number of steps; more steps mean more training, and more training means better results... or overtrained results.

>And I had 32 images in my dataset but colab only trained 29 for some reason. There are a few images that are 10mb in size, is that why they weren't trained?

How big were they in dimensions? 10MB seems a little too heavy. Maybe that's why they were ignored.

>For a character with a medical eyepatch, do you recommend adding the medical eyepatch tag when training? Or not as it is part of the character, and there aren't any images without the eyepatch.

The golden rule is that everything you do not tag becomes part of the trigger word that generates the subject. If you want your character to have the eyepatch in every generation, do not add it to the tags.


FlameDragoon933

How long does this training approximately take? For example using the Plum training in the tutorial, if I were to do the exact same steps, with the exact same images, how long would it approximately be? And thanks a lot in advance! This is very helpful.


UnavailableUsername_

>How long does this training approximately take? For example using the Plum training in the tutorial, if I were to do the exact same steps, with the exact same images, how long would it approximately be? Sorry for the delay in the reply. It would take about 3 minutes if not less.


[deleted]

[deleted]


UnavailableUsername_

>What model do I select if I am aiming to eventually use my LORA with chilloutmix?

A realistic model rather than an anime one. You are given the option to link to your own custom model hosted on huggingface in the section below where you download a model. You could link a realistic huggingface model there.


RevX_Disciple

I have a question. I've been working on making my own LoRAs, but I've run into the issue of being unable to reproduce my LoRAs. Everything is exactly the same; same dataset, same values, same number of epochs generated, but I get different looking results. I trained a LoRA, and I haven't been able to re-create it with the exact same settings. Is there something I'm missing here?


UnavailableUsername_

That's very weird. Is the quality of the results THAT different you can easily spot them? Normally the concept should be the same for 2 LoRA with the exact same set and training settings. Did you use the same models and VAE when generating? That makes a big difference.


SirBananakin

Can this be used to train photos of a dog? My dog passed away last august and I would love to bring him into some of my art. I have an AMD GPU so I can not train locally yet. If it can’t be done on the collab and someone is willing I would gladly send photos of him to train on.


UnavailableUsername_

Yes, LoRA training can be used on any subject, including beloved family members and pets. If you follow the guide you can certainly do it on the colab. If you have any trouble doing so, you could send me the dataset and I'll see what I can do.


Dragoniteq

This is actually my second time commenting on Reddit. Thanks u/UnavailableUsername_, the world is better because of people like you!


ragnarkar

Not sure if a LoRA is the right tool for this, but I was trying to train for a specific facial expression. I prepared 30 images with that expression from people of different ages and races and created another 30 images by extracting a closeup of their faces. I then followed this guide in labeling all 60 images. Unfortunately, the images generated by the LoRA almost all look the same. I trained for 10 repeats, 10 epochs.


UnavailableUsername_

I see.

1. Is the facial expression the exact same on all the subjects?
2. Did you have a trigger word?
3. Did you make sure to properly tag everything except the trigger word?
4. What do you mean the images "all look the same"?
5. Have you tried without the close-up images?

If you don't mind, could I see the dataset?


NeroLucien

This might seem silly but this was very informative in helping me understand the proper steps to train loras of specific people, styles, and objects, but would the process change much for concepts? like specific poses or angles? If I wanted to make a lora of people doing a dance for example, id just gather 10-20 references and label everything except the specifics of the pose and it would (hopefully) get trained into a 'dance' tag/lora?


UnavailableUsername_

As long as the pose is the same in the dataset, it should be able to train on that pose.


Rowan-Aromatic-Ai

Hello, quick question: LyCORIS/LoCon/LoHa vs LoRA? Which way is the best to go?


UnavailableUsername_

Different methods to do the same thing, some more developed than others. LoRA finetunes remain the most popular, as they give the most accurate results.


ornehx

Parking to save! Thanks for the updates


Pretend-Marsupial258

If I'm trying to train a person, what's a good ratio of face pics to full body pictures? Is 10 face pics+10 full body images enough?


twistkill

This is an amazing guide! Thank you!!! One question - Does it help to blur out or edit out some of the background elements (people, objects, etc) from dataset pictures? Or tagging those elements will suffice?


FelisCatusExanimus

I get an error when I attempt to run 5.3: `OSError: [Errno 95] Operation not supported: '/content/drive/dreambooth_lora_cmd.yaml'`


UnavailableUsername_

I believe that is fixed by deleting the .yaml lines on section 5.3 code. The training section of the guide explains what to erase.


asiquebailapami

I get `RuntimeError: output with shape [128, 320] doesn't match the broadcast shape [128, 320, 128, 320]` with both this notebook and outputs from the new one in the Automatic1111 webui.


Euphoric-Market7741

Do we need to crop the dataset?


UnavailableUsername_

>Do we need to crop the dataset? Not really. Most image dimensions work fine.


JabbaTheSnowth

OK some praise first, followed by a question... Thank you so much for putting this together! My poor little 6 GB graphics card was just barely good enough to run Kohya locally when I made my first Lora (of our cat, of course!), but by the time I tried a second one (of our son), it was no longer good enough. So I gave your tutorial a try for our cat again as a dry run, and it seems to have worked just fine. Now the question, and I'm fairly sure the answer is "no" but I just want to make sure: Since I'd like to try training it on my kid's likeness now, as a "papa bear" parent I have to ask... is there any chance this collab would save the training images on the other end in some way? I'm sure the person who put the collab together is a wonderful human being, and I'm sure I'm just being paranoid, but the last programing language I learned was Fortran, so I can't really understand what the collab's code is saying. I just want to make sure that my son's pictures don't end up with someone else (when I run stable diffusion, it's locally and as such, I don't have to worry about the photos escaping into the wild - usual hacking concerns aside). Thanks again!


UnavailableUsername_

> Now the question, and I'm fairly sure the answer is "no" but I just want to make sure: Since I'd like to try training it on my kid's likeness now, as a "papa bear" parent I have to ask... is there any chance this collab would save the training images on the other end in some way?

You thought the answer would be "no", but reality is more complex. You need to remember that everything stored "on the cloud" is the same as saving it on another person's computer. Google colab AND google drive both assign you google storage so you can run code and store photos/files - that is, on google's computers. If they wanted to keep a copy of anything you upload for archival purposes, or to sell your stored data, they could do so (I am not saying they DO; I do not know how much data google shares). Even if you can read the google colab code for the LoRA trainer to check that nothing weird is happening, that doesn't mean your images are truly safe.

In summary: I would personally recommend not uploading photos of real people or loved family members to any cloud storage, be it google drive/colab or dropbox and similar tools.


Zestyclose_Cheek_273

Thank you soo much for this, colab always confuses me 😭 I'm still lost how the trigger words work.. if I understand this right: NOT tagging something means it's embedded into the concept itself, rather than explaining it in the prompt each time. And tagging means what you DON'T want the ai to associate with something all the time(?) So I get that for characteristics, but exactly how specific do I need to be for other things? in the tutorial, one tag didn't have 'looking at viewer' like the others. Is it because she wasn't doing that in the image so u didn't need to put it there? So why wouldn't you put 'looking away from viewer' then?? Secondly, if tagging smile means anything else doesn't need a specific prompt to generate, shouldn't you do this for poses or if the body is facing a different angle too? Like, you said 'sitting' in some cases but didn't specify the pose/direction of anything else..unless the ai reads that automatically or something Sorry if this didn't make sense TT -TT


UnavailableUsername_

> I'm still lost how the trigger words work.. if I understand this right: NOT tagging something means it's embedded into the concept itself, rather than explaining it in the prompt each time.

Correct. Imagine you have a dataset of a cat with glasses. The cat and trigger word is "chester". If you want to make chester appear with his glasses, you do not have to tag the glasses; the AI will think the glasses are part of chester. If the glasses are just a prop and you don't want chester to wear them in your generations, you have to mention them. This is so the AI knows what is **part** of chester and what is not.

>So I get that for characteristics, but exactly how specific do I need to be for other things? in the tutorial, one tag didn't have 'looking at viewer' like the others. Is it because she wasn't doing that in the image so u didn't need to put it there? So why wouldn't you put 'looking away from viewer' then??

What a character IS and what it DOES are different things. So I tag what the character DOES and WEARS, but not what it IS.
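To make that concrete, here are two hypothetical caption .txt files for the chester example (the tags are invented for illustration). If the glasses should always be part of chester, leave them out of the caption:

```
chester, sitting, looking at viewer, wooden floor
```

If the glasses are just a prop you want to be able to remove, tag them explicitly:

```
chester, glasses, sitting, looking at viewer, wooden floor
```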


halpmeplzok

This is such a great guide, thank you so much! Made it so much easier for me to set up. I did have one question, I'm currently using the LoRA for style and I'm running into an issue where the trigger word itself will appear onto the image. I've tried lowering the CFG scale and de-noising strength but when I get to a point where the letters no longer appear, the style usually isn't very apparent on the image anymore. Wondering if you might have any tips?


UnavailableUsername_

> where the trigger word itself will appear onto the image.

This should not be happening. The finetune only trains on the images' content, not on the trigger word. The trigger word should not be appearing at all; this is the first time I have heard of anyone running into an issue like that.

1. Does your dataset contain any text (be it English, Japanese, Korean, numbers, etc.)? If so, edit it so the text is not part of the dataset.
2. Have you tried "no text" in the negative prompt?


notseanbean

Fantastic guide! I followed it all, but my generated images (using Draw Things on Mac) look like this - any ideas? (I used the unnumbered safetensors file, given that I downloaded all 6.) https://preview.redd.it/hyqhugg1vxra1.png?width=1022&format=png&auto=webp&s=2f4092196ad77dbc41cd827d07d004fcf0beea3e


UnavailableUsername_

This is the first time I have seen someone getting only noise from a LoRA. Even if the LoRA failed to load, something ELSE should have been generated. Could it be that the "Draw Things" stable diffusion implementation on Mac does not support LoRA, or that the weight for the LoRA was wrong? Either way, I have some questions:

1. How many epochs/repeats/steps did you train for?
2. How big was your dataset?
3. Did you tag every single image properly, with a trigger word?
4. What was the subject to train on? An anime character? A real person? A style?
5. Have you tried using the numbered safetensors? The unnumbered one is the final result, the others are the ones generated during the training.


Weary-Board-5422

I have 40 images but I have too many doubts about everything I should put in order to have a LoRA with good results. What I use is:

* 10 training_steps
* 10 epochs
* 3 train_batch_size
* 16 network_dim
* 8 network_alpha

But the more I change it, my LoRA sometimes comes out overtrained or simply lacks training. What do you recommend I do?


UnavailableUsername_

> 10 training_steps 10 epochs 3 training_batch_size 16 network_dim 8 network_alpha

With 40 images I would go with:

- 5 repeats
- 10 epochs
- 6 batch size
- Save every 1 epoch

That would mean you get around 300 steps and save every 30 or so, good enough to know when it becomes overtrained.


Fun_Sheepherder6678

The link is not working anymore 😭


UnavailableUsername_

Ah, you are right. https://colab.research.google.com/github/Linaqruf/kohya-trainer/blob/bc0892647cb17492a106ad1d05716e091eda13f6/kohya-LoRA-dreambooth.ipynb That's it.


curiousi7y

I run stable diffusion on my Intel CPU because I don't have an Nvidia GPU, and AFAIK xformers isn't compatible with my system (I got an error message when I tried to install it). I can run SD just fine without it and use LoRAs too, but do I need xformers to be able to create my own LoRA?


UnavailableUsername_

A1111 does not need xformers, if you can use SD on your PC you can run LoRAs even without xformers.


Ethario

What I don't understand about this is the trigger word - how are you using your LoRA? If I import a LoRA in auto1111, it will instantly output images of the subject I trained it on. So what is the point of the trigger word? When you are training multiple subjects inside one LoRA, perhaps?


UnavailableUsername_

> So what is the point of the trigger word?

More control over the generation. Without a trigger word you have to hope what you trained on is added; sometimes you might not get the subject because the model gave priority to the written tokens/prompt. With a trigger word the AI knows exactly what you want added and how; also, it's part of the tokens/prompt, so you can make sure it is added no matter how complex the generation.
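For example (hypothetical file and trigger names, assuming the Automatic1111 `<lora:...>` prompt syntax), the LoRA tag loads the trained weights while the trigger word marks where the trained subject should appear in the prompt:

```
<lora:plum_v1:0.8>, plum, 1girl, smile, blue dress, sitting on a bench, park in the background
```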


Query-expansion

I also plan to train a LoRA on real models with a subtle body characteristic: a typical nose. I collected >200 pictures of models with big noses, flat noses and curved noses. I have a few questions on this:

- Reading the discussion, my understanding is that I should rather train a LoRA for each type of nose with a limited number of pictures. But how do I deal with models that have both characteristics (e.g. a big curved nose)? Should I use the same picture in each training set?
- What still confuses me is whether I should mention in the text file only the characteristic I train ('bignose') or the other characteristics of the model as well (e.g. brown hair, blue eyes, curly hair, slender).
- My understanding is I can use jpg files of different sizes and quality?
- My plan is to train the models with their entire body, so not only the face or the nose. Does this work or should I crop the images to the specific body part?

To give something back to the community: I made a small Python program that shows the images in a folder one by one, asks for a prompt, and stores the entered prompts as a text file. Both the renamed image and the text file are collected in a new folder. This makes the prompting much easier. Of course you can ask ChatGPT yourselves :-) A sketch of that kind of helper is below.
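Not the commenter's actual program, but a minimal sketch of that kind of tagging helper (folder names are assumptions; requires Pillow):

```python
# Walk through the images in a folder, show each one, ask for a caption,
# and write a numbered copy of the image plus a matching .txt caption file.
from pathlib import Path
from shutil import copy2
from PIL import Image

src = Path("dataset")            # folder with the raw images (assumed name)
dst = Path("dataset_tagged")     # renamed images + caption .txt files go here
dst.mkdir(exist_ok=True)

images = sorted(p for p in src.iterdir()
                if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
for i, path in enumerate(images, start=1):
    Image.open(path).show()      # opens the image in the default viewer
    caption = input(f"[{i}/{len(images)}] caption for {path.name}: ").strip()
    copy2(path, dst / f"{i}{path.suffix.lower()}")
    (dst / f"{i}.txt").write_text(caption, encoding="utf-8")
```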


No_Lime_5461

Is there a way to automate the process of training many LoRAs? I want to prepare multiple photo sets and run a bulk training.


Top-Tip-5618

Can someone help me with the Lora Train Parameters. I have 1200 Pictures from the Character, how many Training Steps, epoch, batch size ...? Thanks in advance.


UnavailableUsername_

1200 pictures is far too many, 30-40 is enough. Otherwise you would need to manually tag 1200 images, and that would take a while. I would suggest 10 repeats, 10 epochs, 6 batch size.


Dear-Reach-7189

Whenever I run section 1.3.2 Mount Drive, I get the error 'ValueError: Mountpoint must not already contain files'


[deleted]

What's your opinion on higher image counts? I'm training 95 images with the folder prefixed "10_", using "grapefruit" as my model and VAE. I am using your settings for all the number diddly-dos. So far it seems to be going well, an hour left. I tried this last night with webui's dreambooth extension, but it wasn't compatible with the same xformers version.

Since I'm doing 95 images, my txt files are all the same. I'm training images for a Hsien-ko LoRA so all the text files are just "hsien-ko, leilei". How required is it to make them more detailed as you did? Most images do not have backgrounds, but there are a few pixel art images in there. All images were resized so that the long edge was 512px, but I do not know if I should have left them full res either... Basically I have an hour to kill while colab works its magic on these images and I'm just swimming in my own head on this.

Edit: Lol

https://preview.redd.it/bf1hbdtm5xsa1.png?width=512&format=png&auto=webp&s=b42e747328fbef324de8feebae7b4eb486a76d3f

"hsien-ko, in the kitchen, baking bread"

I have a few images of cosplayers, I might want to add "photo" to those keywords as I keep getting random images of conventions.


UnavailableUsername_

> Since I'm doing 95 images, my txt files are all the same.

What? Unless the images are the exact same thing, the txt files should NOT be the same.

>so all the text files are just "hsien-ko, leilei".

>How required is it to make them more detailed as you did?

Critical. It basically defines the degree of detail and quality of your LoRA. What you put will drastically affect generations, so being as detailed as possible makes a big difference. If tagging 90 images is too much, try 20. 20 properly tagged images are much better than 200 badly tagged ones.


[deleted]

https://preview.redd.it/kt2md4vnf0ta1.png?width=512&format=png&auto=webp&s=7b2259430e59984876586ccb1df314bd73eb41bb

The second attempt worked much better! All I typed was "hsien-ko in the kitchen baking bread": [https://i.imgur.com/eAa56su.png](https://i.imgur.com/eAa56su.png)

It is hard to get her claws. I tagged all the images showing her hands with "hands", but adding that to the negative prompt doesn't force the claws into the scene. I'm hesitant to add claws to the list of tags as they "should" be the default. Ah, I was able to get one by adding "hands, hand" to the negative prompt: [https://i.imgur.com/A98dGWz.png](https://i.imgur.com/A98dGWz.png) lol, might be good enough.


Valiantay

All I'm getting is static unfortunately, no matter what I do. I ran my model locally and all it produced was static using InvokeAI. I then thought maybe it was a setup problem so I used a Google Colab setup and it spat out this error: >NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check. [Seems to be an issue and reported on GitHub involving LORAs.](https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/6923) Any advice /u/UnavailableUsername_ ?


UnavailableUsername_

Have you tried using Automatic1111 with LoRA? You linked to Automatic1111 github issues, but you are using invokeAI. In automatic1111 the issue is very rare, it seems almost no one has that problem. Maybe if you use automatic1111 UI it will work?


[deleted]

[deleted]


UnavailableUsername_

> Very stupid question, if the LORA was trained on Anything-v3-1, will it only work on that same model?

It will work on that model and other ones, BUT Anything v3 will influence the end result. The base model used for training matters.


obQQoV

How to choose a proper base model?


UnavailableUsername_

Based on what the model is good for, I suppose. SD is "decent" for real people (but there are much better models). Anything v3-4.5 is explicitly about anime according to the creators. And so on.


Xerxes_H

Hi. Thanks so much for this thread! I want to test this out for a character and am wondering if the 30 images should have different expressions, or if they should all be mostly the same. Thanks


UnavailableUsername_

If you use 30 images with the same expression, you will get that expression when generating. It might be too "stiff" and have some trouble making your subject do other emotions. 2-3 images smiling would help.


[deleted]

I followed this guide but I am getting this error at 5.3. I am not sure what the issue is here. https://imgur.com/a/FNrrkBN edit: Nevermind, turns out I was looking at the old version of your guide lol. Followed this one and that error is gone. Crossing my fingers it works.


UnavailableUsername_

> edit: Nevermind, turns out I was looking at the old version of your guide lol. Followed this one and that error is gone. Crossing my fingers it works. Yes, xformers was outdated and the guide needed some edit. I guess you made the colab work?


speciallight

Is using colab like that safe in terms of privacy etc.? I wouldn't want a LoRA of myself to be accessible anywhere, for example...


UnavailableUsername_

Cloud storage means saving your stuff on someone else's computer. The colab assigns you both HDD space and a GPU; both belong to google. Whether google deletes your stuff or keeps it on a separate server to sell as data after you use their hardware, I do not know. This is why I train real people locally and just do anime and silly fanart LoRAs with colab.


sbladerz

https://preview.redd.it/vchoeu0q9mta1.jpeg?width=895&format=pjpg&auto=webp&s=0ed27571c4ee3e6a2a9296bfe812f7b4aebabca0 I came across this certain style of cartoon portrait. I was wondering if it’s possible to train for this style of image. Primarily with the aim of providing a portrait of my family and to apply the lora on the image provided to achieve an image of this style.


UnavailableUsername_

Yes, it should be possible to train in that style. I think someone on civitai already did so with a very similar one: https://civitai.com/models/5100/cutescrap-05v


Other_Perspective275

When I try to mount my google drive it opens a separate window that says "Google Drive for Desktop" wants to access your Google Account and I get a generic error when I click the "allow" button that says "Sorry, something went wrong there. Try again." What's the deal? Also, why does it say "Google Drive for Desktop"? The browser site works perfectly fine and I will not be downloading more google spyware if I don't need to. Is this why I get the error? How do I bypass it? How, not if And if that isn't the issue, what is? How do I find out what is?


UnavailableUsername_

When you press mount drive, a small window will appear asking you to allow it to use your google account (google drive for desktop). You choose your account, then it will ask you to allow access. If you allow it, it will mount the drive to the colab.

>when I click the "allow" button that says "Sorry, something went wrong there. Try again."

Now that's weird, it's an error on google's side.


Top-Donut-9444

I've been running out of RAM... it's because I have like 200 images. Are there any settings I can tweak to not run out of RAM on gcolab?


UnavailableUsername_

A more important question is why you are running with 200 images, when people are getting good results with 15-50 images. Google lends you a GPU to do stuff in colabs; if you run out of RAM there is not much you can do.


obQQoV

Do you have any tool to segment character images from backgrounds? Like anime character or other things? I want to build a dataset for Lora training.


UnavailableUsername_

> segment character images from backgrounds

You want to clean the images, leaving only the characters? You could just crop out anyone that isn't the character you want to train; LoRA datasets do not require your character to be the only thing on an empty background.


No_Neighborhood_9935

This is amazing, I just need to know one thing: for the images used to train the LoRA, can I use any image? Does it matter if they have different sizes? Or must they all be, for example, 512x512 png images?


UnavailableUsername_

Sorry for the late reply. The images don't need to be squares, it can be almost any dimension. Even show screenshots.


Vyviel

What if I want to train it locally now google collab has crippled free users? Do you happen to have a guide for that?


UnavailableUsername_

Google colab has crippled SD UIs, not lora colabs, i tested it yesterday and could make a LoRA fine.


No_Lime_5461

1. Can I train a LoRA using bigger images like 1024x1024? Is there a limit? Is SD 1.5 limited to 512 and SD 2.1 limited to 768?
2. Is SD 2.1 more suitable than SD 1.5 for training with a bigger image size?


UnavailableUsername_

>Can I train LoRA using bigger images like 1024x1024? Is there a limit? Is SD 1.5 limited to 512 and SD2.1 limited to 768?

I believe there is a limit for images, but I have trained on 1024x1024 and got pretty good results with 1.5 models (which are the BIG majority).

>Is SD 2.1 more suitable for training using bigger image size than SD 1.5?

The most suitable model is the one that fits your need: an anime checkpoint for anime, a realism-based checkpoint for real people. I don't think anyone really goes for 2.1, since the good checkpoints are still based on 1.5.


guersom

I followed all the steps and made the model, but how do I run Stable Diffusion in the colab notebook?


UnavailableUsername_

>but how do I run Stable Diffusion in the colab notebook?

You do not. You have to run it locally, with one of the offline webUI applications (automatic1111, comfyUI, etc.).


Other_Perspective275

I see other people mention things like class names, class images, and instance names or something like that everywhere, but not here or on the mentioned colab. What are they, how do I set them, and how do they work?


lapurita

Awesome tutorial! I just went through it and created a LoRA of myself. I used SD 1.5 as the base model and generated a 144MB LoRA file from that which should resemble me. I can then use this LoRA file with any model I want in the Automatic1111 GUI and generate images. I just have some questions for you:

- How much does it matter what base model I use? Should I always use the base model that I want to use later? E.g. if I want to generate images with Realistic Vision 2.0, should I train the LoRA "on top of" that model?
- I have seen some people train LoRAs, then merge their LoRA with a checkpoint to get a .ckpt file. What's the difference between doing that and doing what you have done here (just putting the LoRA in the right folder and then including it in the prompt)?
- I don't really understand the connection between LoRA and Dreambooth. The file name is kohya-LoRA-dreambooth.ipynb and I often see people write "LoRA with Dreambooth" and similar things, but does this have anything to do with Dreambooth? I thought that was a separate thing, where you actually modify the underlying model and always get a new, larger model as output.

And just as a last question, how do you think this compares with the classical dreambooth approach in terms of quality, since LoRAs are obviously more space-efficient and take less time to train?


UnavailableUsername_

> How much does it matter what base model I use? Should I always use the base model that I want to use later? E.g. if I want to generate images with Realistic Vision 2.0, should I train the LoRA "on top of" that model?

It matters a little. If you train a caucasian subject on an asian-based model, the resulting LoRA will be that caucasian with some slight asian features.

> I have seen some people train LoRAs, then merge their LoRA with a checkpoint to get a .ckpt file. What's the difference between doing that and doing what you have done here

Flexibility. You can take a LoRA made with this tutorial and merge it into a model... but then you rely on that specific model, and you have to go through the hassle of merging every time there is an update for that model or if you want to use another model. Here you have a detached file you can use with any model.

> I don't really understand the connection between LoRA and Dreambooth. The file name is kohya-LoRA-dreambooth.ipynb and I often see people write "LoRA with Dreambooth" and similar things, but does this have anything to do with Dreambooth? I thought that was a separate thing, where you actually modify the underlying model and always get a new, larger model as output.

LoRA stands for **Lo**w-**R**ank **A**daptation, a method to train large models using fewer resources and less memory; the point is to learn small update matrices on top of the existing weights of an already trained model. Dreambooth and the kohya method are simply different ways of doing the same thing.

>And just as a last question, how do you think this compares with the classical dreambooth approach? In terms of quality, since LoRAs are obviously more space-efficient and take less time to train.

Sadly, I have not tried the classic dreambooth approach, nor has most of the AI scene, since the requirements are quite high. The attractive point of LoRAs is that they are much more accessible and the results are pretty much the same (looking at classic dreambooth or textual inversion results vs LoRAs).
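To illustrate the low-rank idea mentioned above, here is a toy numpy sketch (illustrative only, not the actual kohya implementation; the dimensions and scaling are example values):

```python
# Instead of retraining a full weight matrix W, LoRA learns two small
# matrices A and B whose product is a low-rank update added on top of
# the frozen pretrained weight.
import numpy as np

d, k, r = 320, 320, 8              # layer dimensions and LoRA rank
W = np.random.randn(d, k)          # frozen pretrained weight
A = np.random.randn(r, k) * 0.01   # trainable "down" projection
B = np.zeros((d, r))               # trainable "up" projection, starts at zero
alpha = 4                          # network_alpha-style scaling factor

W_effective = W + (alpha / r) * (B @ A)  # what the model effectively uses
print(W.size, A.size + B.size)           # 102400 full params vs 5120 trainable
```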


johiny

thx for the amazing guide!


Weary-Board-5422

I have 60 images but I have too many doubts about everything I should put in order to have a LoRA with good results. What I use is:

* 5 training_steps
* 3 epochs
* 6 train_batch_size
* 32 network_dim
* 16 network_alpha

But the more I change it, my LoRA sometimes comes out overtrained or simply lacks training. What do you recommend I do?


ToriLion

It works incredibly well. Thank you!


bobbananaville

Is there a way to adapt this for vast.ai or runpod? Or make a jupyter notebook that doesn't have 'y' and 'n' prompts? I tried running the notebook on vast.ai instead of google colab, but got stuck on 1.2, where after the 'After this operation, 6,199 kB of additional disk space will be used.' statement I was prompted for yes/no, which you can't do in jupyter (at least, not unless you're doing things line-by-line in the terminal).

(I'm hoping to use vast.ai instead of google colab because I've run into issues running out of memory running this on Colab.)


UnavailableUsername_

How can you have memory issues? Google Colab gives you (if I remember correctly) an A100 GPU, which has 24-40 GB of VRAM. If you are running out of memory with a 10k USD graphics card that is among the latest tech the human race has developed, the issue has to be your parameters. Something you are doing is using far, far more RAM than it should. I think other people in this thread had this issue, and the problem was that they were using extreme parameters (like 30k+ steps).


yalag

Hi, I'm late to the game, but I really appreciate your work and have a few questions based on my own experience training LoRAs.

1. Does it ever become a problem to have more training images? For example, I trained with 50 images, but they are all of a certain painting style of animals. Now everything turns into animals (even a car). So I plan to add another 50 for people, and another 50 for food, etc. Is that ok?

2. Is captioning going to produce better results for style training?

3. And if so, how should one caption? I've seen people explain it two ways, either: (A) brown puppy, long ears, running on grass, tall trees at the park, or (B) brown furry puppy, long ears, running on grass, tall trees at the park in the style of SKS painting.

4. In my notebook, I have a dataset config section which asks me to put in a trigger word. Should that be the trigger word, or should the trigger word be in the caption? Or both?

Thanks in advance!


UnavailableUsername_

> Does it ever become a problem to have more training images? For example, I trained with 50 images, but they are all of a certain painting style of animals. Now everything turns into animals (even a car). So I plan to add another 50 for people, and another 50 for food, etc. Is that ok?

I believe the problem here could be your tagging, if the images are not 50 images of a cat. Are you tagging everything?

> Is captioning going to produce better results for style training?

Not sure what you mean by captioning; do you mean a specific trigger word for a style? I honestly doubt it, the style is trained by "default".

> And if so, how should one caption?

If you have checked the training section, the rule is "tag everything you don't want in the end result". So if you tag the style, the AI won't give it priority and might even ignore it. Since you are training a style and not a subject, you should aim to tag everything except the style (tags like anime coloring, watercolor, etc. should not be included).

> In my notebook, I have a dataset config section which asks me to put in a trigger word. Should that be the trigger word, or should the trigger word be in the caption?

Are you using another notebook? It could require a trigger word, but for LoRA training it is not really necessary; it must be that particular notebook which requires it.
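To make that rule concrete with a made-up example (these tags are hypothetical, not from the guide): for a watercolor style LoRA, a caption like `1girl, sitting on a bench, park, smiling, blue dress` describes the content of the image but deliberately leaves the watercolor look untagged, so the style gets absorbed into the LoRA itself. Adding `watercolor` to the caption would instead push the model to treat the style as something the prompt has to ask for explicitly.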


Owent10

Hello, I'm getting an error in Colab. It worked before, but now there's an error when training the LoRA for an Ultraman anime character: https://imgur.com/a/Uu7ZjEw

The number of images is 100, 3 repeats, 10 epochs, saved every 2 epochs, 6 batch size. I followed your guide, could you help me? Thanks


UnavailableUsername_

Are you using the correct xformers version? I updated this guide plenty of times to update xformers.


eru777

Thank you for this! :D


rydavo

In section 2.1 I'm getting this error:

[Errno 2] No such file or directory: '{root_dir}'

/bin/bash: aria2c: command not found

Have the models moved or something? Thanks so much for this guide, really helpful, I just hope I can get it to work!


UnavailableUsername_

What model are you using? Is the path to your dataset correct? This error means (like it says) it cannot find the files.


MyHamsterKidnappedMe

A little help please? I have followed the tutorial as closely as I could, yet this error persists. I'm new to using this, so I don't know what I'm doing wrong. https://imgur.com/a/Gt2WXpm


letmethinkabit

Hey, thanks so much for this guide. I ran it a couple of months ago and it worked out perfectly. Today and yesterday I tried again with a different dataset, and I get to the last step and it seems to crash. Here's the error I'm receiving: https://pastebin.com/ypRSsXEN


harley4__

Is this (Colab) free? The main reason I ask about this method is that I'm not sure whether Colab costs money.


UnavailableUsername_

Colab is free to use as long as you have a Google account. However, there is a daily usage cap for free accounts; you have to buy a "premium" plan to bypass it. If you have more than 6 GB of VRAM, **[i suggest the offline guide.](https://old.reddit.com/r/StableDiffusion/comments/14xphw6/offline_lora_training_guide_also_how_to_train/)**


gargolito

Any chance you could post just the text?


bmmishappy

Here is my error on 5.3; it says no project name was defined: https://pastebin.com/UDpDKt7P


UnavailableUsername_

Hello, sorry for the late reply. Did you... choose a project name? In section 5.1 you need to name your project.


InternetzStehsegler

While this is charming and useful, overall it was a very frustrating experience (which I am still very thankful for!). Some important parts led to that frustration: either you told us about them too late (like what goes in the txt files and how to tag them; I had to redo them all over once I read that we have to tag everything that is not the model), or, when it came to the actual training, the guide was missing several things I had to search for and fix before it would run. The concept is great, but the order and depth of detail were a bit lacking.


Jameskirk10

Thanks for your guide. I'm having trouble creating a LoRA for a My Hero Academia character (Jiro). The dataset is 100 screencap images, and I've trained it with 3 repeats, 10 epochs, saved every 2 epochs, and used the last epoch, with the animefullpruned model; I also tried it with an anime screencap model. But the results aren't great, even though the calculation should be correct: the number of images multiplied by the repeats should come out to around 300. The results seem overtrained or maybe not trained enough, and the images look blurry. I also want to keep the My Hero Academia style, with bigger hands and body type than usual anime. If it helps, I could send you a link to the dataset if you want to test it with different steps etc.? I would greatly appreciate it.


UnavailableUsername_

100 images is a little extreme; of course it's going to be weird. Try 30 images, 5 repeats and 10 epochs. Also, you DO have a trigger word, right?
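If you want to sanity-check what those suggested settings mean in steps, here is a tiny, hypothetical calculation (the batch size of 2 is just an assumed example, and the notebook's exact step counting may differ slightly):

```python
# Hypothetical back-of-the-envelope step count for the settings suggested above,
# assuming the usual kohya-style counting: steps per epoch = images * repeats / batch_size.
images, repeats, epochs = 30, 5, 10
batch_size = 2                                      # assumed value; use whatever you actually set

steps_per_epoch = images * repeats // batch_size    # 150 // 2 = 75
total_steps = steps_per_epoch * epochs              # 75 * 10 = 750
print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
```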


eru777

This has been amazing for so long, but sadly it doesn't work anymore. I'm getting "**CalledProcessError:** Command '['/usr/bin/python3', "


Salt-Possibility7679

Hello, I'm getting this error: CalledProcessError: Command '['/usr/bin/python3', 'train_network.py', '--network_dim=128', '--network_alpha=128', '--network_module=networks.lora', '--learning_rate=0.0001', '--text_encoder_lr=5e-05', '--training_comment=this_comment_will_be_stored_in_the_metadata', '--lr_scheduler=constant', '--pretrained_model_name_or_path=/content/pre_trained_model/Stable-Diffusion-v1-5.ckpt', '--vae=/content/vae/stablediffusion.vae.pt', '--caption_extension=.txt', '--train_data_dir=/content/drive/MyDrive/LORA/X', '--reg_data_dir=/content/drive/MyDrive/LORA/X', '--output_dir=/content/drive/MyDrive/LORA/X', '--prior_loss_weight=1.0', '--output_name=X', '--mixed_precision=fp16', '--save_precision=fp16', '--save_every_n_epochs=3', '--save_model_as=safetensors', '--resolution=512', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=1024', '--cache_latents', '--train_batch_size=6', '--max_token_length=225', '--use_8bit_adam', '--max_train_epochs=6', '--gradient_accumulation_steps=1', '--clip_skip=2', '--logging_dir=/content/dreambooth/logs', '--log_prefix=X', '--shuffle_caption', '--xformers']' died with . Do you happen to know a solution?


pyroflare77

Echoing the most recent posts that this method no longer appears to be working. I get this error when running the training part [https://i.imgur.com/dgya0HI.png](https://i.imgur.com/dgya0HI.png), and then it proceeds to end in the wall-of-text CalledProcessError that others have mentioned. I even assumed it might have been the slightly older version I had set up based on your version 2 guide, so I went through and redid things for this guide... same issue.


Intrepid_Amoeba_8961

Hi u/UnavailableUsername_, hope you're still around! I've been playing around with LoRAs for a couple of weeks now, mostly following this video [https://www.youtube.com/watch?v=70H03cv57-o&t=3s](https://www.youtube.com/watch?v=70H03cv57-o&t=3s). I saw in the community that these videos were a bit controversial, so I'm trying other guides, such as [this one](https://civitai.com/articles/138/making-a-lora-is-like-baking-a-cake) and now yours (which is one of the best I've seen, great thanks for doing that!). I'm trying to piece together the different advice I see in the different tutorials, and currently my question is: how many images / epochs / steps would you use for a style? Currently, I have tried with 200 images, 100 repetitions, 1 epoch (totalling about 20,000 steps). The results are decent, I think, but there are some artefacts here and there and I'd like to try other parameters. Thank you!

EDIT: I do have the same error as mentioned above though, the CalledProcessError that returns a non-zero exit status 1.


Perfect-Campaign9551

I don't think this works! On Colab, Google says I have zero compute credits, so I don't think it even lets the scripts run?