Commit Graph

130 Commits

Author SHA1 Message Date
comfyanonymous
1a4bd9e9a6 Refactor the attention functions.
There's no reason for the whole CrossAttention object to be repeated when
only the operation in the middle changes.
2023-10-11 20:38:48 -04:00
comfyanonymous
fff491b032 Model patches can now know which batch is positive and negative. 2023-09-27 12:04:07 -04:00
comfyanonymous
afa2399f79 Add a way to set output block patches to modify the h and hsp. 2023-09-22 20:26:47 -04:00
comfyanonymous
1938f5c5fe Add a force argument to soft_empty_cache to force a cache empty. 2023-09-04 00:58:18 -04:00
Simon Lui
2da73b7073 Revert changes in comfy/ldm/modules/diffusionmodules/util.py, which is unused. 2023-09-02 20:07:52 -07:00
Simon Lui
4a0c4ce4ef Some fixes to generalize CUDA specific functionality to Intel or other GPUs. 2023-09-02 18:22:10 -07:00
comfyanonymous
0e3b641172 Remove xformers related print. 2023-09-01 02:12:03 -04:00
comfyanonymous
bed116a1f9 Remove optimization that caused border. 2023-08-29 11:21:36 -04:00
comfyanonymous
1c794a2161 Fallback to slice attention if xformers doesn't support the operation. 2023-08-27 22:24:42 -04:00
comfyanonymous
d935ba50c4 Make --bf16-vae work on torch 2.0 2023-08-27 21:33:53 -04:00
comfyanonymous
cf5ae46928 Controlnet/t2iadapter cleanup. 2023-08-22 01:06:26 -04:00
comfyanonymous
b80c3276dc Fix issue with gligen. 2023-08-18 16:32:23 -04:00
comfyanonymous
d6e4b342e6 Support for Control Loras.
Control loras are controlnets where some of the weights are stored in
"lora" format: an up and a down low rank matrice that when multiplied
together and added to the unet weight give the controlnet weight.

This allows a much smaller memory footprint depending on the rank of the
matrices.

These controlnets are used just like regular ones.
2023-08-18 11:59:51 -04:00
comfyanonymous
2b13939044 Remove some useless code. 2023-07-30 14:13:33 -04:00
comfyanonymous
95d796fc85 Faster VAE loading. 2023-07-29 16:28:30 -04:00
comfyanonymous
4b957a0010 Initialize the unet directly on the target device. 2023-07-29 14:51:56 -04:00
comfyanonymous
9ba440995a It's actually possible to torch.compile the unet now. 2023-07-18 21:36:35 -04:00
comfyanonymous
ddc6f12ad5 Disable autocast in unet for increased speed. 2023-07-05 21:58:29 -04:00
comfyanonymous
103c487a89 Cleanup. 2023-07-02 11:58:23 -04:00
comfyanonymous
78d8035f73 Fix bug with controlnet. 2023-06-24 11:02:38 -04:00
comfyanonymous
05676942b7 Add some more transformer hooks and move tomesd to comfy_extras.
Tomesd now uses q instead of x to decide which tokens to merge because
it seems to give better results.
2023-06-24 03:30:22 -04:00
comfyanonymous
fa28d7334b Remove useless code. 2023-06-23 12:35:26 -04:00
comfyanonymous
f87ec10a97 Support base SDXL and SDXL refiner models.
Large refactor of the model detection and loading code.
2023-06-22 13:03:50 -04:00
comfyanonymous
9fccf4aa03 Add original_shape parameter to transformer patch extra_options. 2023-06-21 13:22:01 -04:00
comfyanonymous
8883cb0f67 Add a way to set patches that modify the attn2 output.
Change the transformer patches function format to be more future proof.
2023-06-18 22:58:22 -04:00
comfyanonymous
ae43f09ef7 All the unet weights should now be initialized with the right dtype. 2023-06-15 18:42:30 -04:00
comfyanonymous
7bf89ba923 Initialize more unet weights as the right dtype. 2023-06-15 15:00:10 -04:00
comfyanonymous
e21d9ad445 Initialize transformer unet block weights in right dtype at the start. 2023-06-15 14:29:26 -04:00
comfyanonymous
21f04fe632 Disable default weight values in unet conv2d for faster loading. 2023-06-14 19:46:08 -04:00
comfyanonymous
9d54066ebc This isn't needed for inference. 2023-06-14 13:05:08 -04:00
comfyanonymous
6971646b8b Speed up model loading a bit.
Default pytorch Linear initializes the weights which is useless and slow.
2023-06-14 12:09:41 -04:00
comfyanonymous
274dff3257 Remove more useless files. 2023-06-13 02:22:19 -04:00
comfyanonymous
f0a2b81cd0 Cleanup: Remove a bunch of useless files. 2023-06-13 02:19:08 -04:00
comfyanonymous
b8636a44aa Make scaled_dot_product switch to sliced attention on OOM. 2023-05-20 16:01:02 -04:00
comfyanonymous
797c4e8d3b Simplify and improve some vae attention code. 2023-05-20 15:07:21 -04:00
BlenderNeko
d9e088ddfd minor changes for tiled sampler 2023-05-12 23:49:09 +02:00
comfyanonymous
cb1551b819 Lowvram mode for gligen and fix some lowvram issues. 2023-05-05 18:11:41 -04:00
comfyanonymous
bae4fb4a9d Fix imports. 2023-05-04 18:10:29 -04:00
comfyanonymous
ba8a4c3667 Change latent resolution step to 8. 2023-05-02 14:17:51 -04:00
comfyanonymous
66c8aa5c3e Make unet work with any input shape. 2023-05-02 13:31:43 -04:00
comfyanonymous
5282f56434 Implement Linear hypernetworks.
Add a HypernetworkLoader node to use hypernetworks.
2023-04-23 12:35:25 -04:00
comfyanonymous
6908f9c949 This makes pytorch2.0 attention perform a bit faster. 2023-04-22 14:30:39 -04:00
comfyanonymous
3696d1699a Add support for GLIGEN textbox model. 2023-04-19 11:06:32 -04:00
comfyanonymous
73c3e11e83 Fix model_management import so it doesn't get executed twice. 2023-04-15 19:04:33 -04:00
EllangoK
e5e587b1c0 seperates out arg parser and imports args 2023-04-05 23:41:23 -04:00
comfyanonymous
e46b1c3034 Disable xformers in VAE when xformers == 0.0.18 2023-04-04 22:22:02 -04:00
comfyanonymous
539ff487a8 Pull latest tomesd code from upstream. 2023-04-03 15:49:28 -04:00
comfyanonymous
809bcc8ceb Add support for unCLIP SD2.x models.
See _for_testing/unclip in the UI for the new nodes.

unCLIPCheckpointLoader is used to load them.

unCLIPConditioning is used to add the image cond and takes as input a
CLIPVisionEncode output which has been moved to the conditioning section.
2023-04-01 23:19:15 -04:00
comfyanonymous
0d972b85e6 This seems to give better quality in tome. 2023-03-31 18:36:18 -04:00
comfyanonymous
18a6c1db33 Add a TomePatchModel node to the _for_testing section.
Tome increases sampling speed at the expense of quality.
2023-03-31 17:19:58 -04:00
comfyanonymous
61ec3c9d5d Add a way to pass options to the transformers blocks. 2023-03-31 13:04:39 -04:00
comfyanonymous
3ed4a4e4e6 Try again with vae tiled decoding if regular fails because of OOM. 2023-03-22 14:49:00 -04:00
comfyanonymous
c692509c2b Try to improve VAEEncode memory usage a bit. 2023-03-22 02:45:18 -04:00
comfyanonymous
54dbfaf2ec Remove omegaconf dependency and some ci changes. 2023-03-13 14:49:18 -04:00
comfyanonymous
83f23f82b8 Add pytorch attention support to VAE. 2023-03-13 12:45:54 -04:00
comfyanonymous
a256a2abde --disable-xformers should not even try to import xformers. 2023-03-13 11:36:48 -04:00
comfyanonymous
0f3ba7482f Xformers is now properly disabled when --cpu used.
Added --windows-standalone-build option, currently it only opens
makes the code open up comfyui in the browser.
2023-03-12 15:44:16 -04:00
comfyanonymous
1de86851b1 Try to fix memory issue. 2023-03-11 15:15:13 -05:00
edikius
165be5828a
Fixed import (#44)
* fixed import error

I had an
ImportError: cannot import name 'Protocol' from 'typing'
while trying to update so I fixed it to start an app

* Update main.py

* deleted example files
2023-03-06 11:41:40 -05:00
comfyanonymous
cc8baf1080 Make VAE use common function to get free memory. 2023-03-05 14:20:07 -05:00
comfyanonymous
798c90e1c0 Fix pytorch 2.0 cross attention not working. 2023-03-05 14:14:54 -05:00
comfyanonymous
c1f5855ac1 Make some cross attention functions work on the CPU. 2023-03-03 03:27:33 -05:00
comfyanonymous
1a612e1c74 Add some pytorch scaled_dot_product_attention code for testing.
--use-pytorch-cross-attention to use it.
2023-03-02 17:01:20 -05:00
comfyanonymous
9502ee45c3 Hopefully fix a strange issue with xformers + lowvram. 2023-02-28 13:48:52 -05:00
comfyanonymous
fcb25d37db Prepare for t2i adapter. 2023-02-24 23:36:17 -05:00
comfyanonymous
c9daec4c89 Remove prints that are useless when xformers is enabled. 2023-02-21 22:16:13 -05:00
comfyanonymous
09f1d76ed8 Fix an OOM issue. 2023-02-17 16:21:01 -05:00
comfyanonymous
4efa67fa12 Add ControlNet support. 2023-02-16 10:38:08 -05:00
comfyanonymous
1a4edd19cd Fix overflow issue with inplace softmax. 2023-02-10 11:47:41 -05:00
comfyanonymous
509c7dfc6d Use real softmax in split op to fix issue with some images. 2023-02-10 03:13:49 -05:00
comfyanonymous
1f6a467e92 Update ldm dir with latest upstream stable diffusion changes. 2023-02-09 13:47:36 -05:00
comfyanonymous
773cdabfce Same thing but for the other places where it's used. 2023-02-09 12:43:29 -05:00
comfyanonymous
df40d4f3bf torch.cuda.OutOfMemoryError is not present on older pytorch versions. 2023-02-09 12:33:27 -05:00
comfyanonymous
e8c499ddd4 Split optimization for VAE attention block. 2023-02-08 22:04:20 -05:00
comfyanonymous
5b4e312749 Use inplace operations for less OOM issues. 2023-02-08 22:04:13 -05:00
comfyanonymous
047775615b Lower the chances of an OOM. 2023-02-08 14:24:27 -05:00
comfyanonymous
1daccf3678 Run softmax in place if it OOMs. 2023-01-30 19:55:01 -05:00
comfyanonymous
50db297cf6 Try to fix OOM issues with cards that have less vram than mine. 2023-01-29 00:50:46 -05:00
comfyanonymous
051f472e8f Fix sub quadratic attention for SD2 and make it the default optimization. 2023-01-25 01:22:43 -05:00
comfyanonymous
220afe3310 Initial commit. 2023-01-16 22:37:14 -05:00