comfyanonymous
843a7ff70c
fp16 is actually faster than fp32 on a GTX 1080.
2024-08-21 23:23:50 -04:00
comfyanonymous
a60620dcea
Fix slow performance on 10 series Nvidia GPUs.
2024-08-21 16:39:02 -04:00
comfyanonymous
015f73dc49
Try a different type of flux fp16 fix.
2024-08-21 16:17:15 -04:00
comfyanonymous
904bf58e7d
Make --fast work on pytorch nightly.
2024-08-21 14:01:41 -04:00
Svein Ove Aas
5f50263088
Replace use of .view with .reshape ( #4522 )
...
When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view.
As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere.
Note that --fast is only marginally faster than the default.
2024-08-21 11:21:48 -04:00
comfyanonymous
03ec517afb
Remove useless line, adjust windows default reserved vram.
2024-08-21 00:47:19 -04:00
comfyanonymous
510f3438c1
Speed up fp8 matrix mult by using better code.
2024-08-20 22:53:26 -04:00
comfyanonymous
ea63b1c092
Simpletrainer lycoris format.
2024-08-20 12:05:13 -04:00
comfyanonymous
9953f22fce
Add --fast argument to enable experimental optimizations.
...
Optimizations that might break things/lower quality will be put behind
this flag first and might be enabled by default in the future.
Currently the only optimization is float8_e4m3fn matrix multiplication on
4000/ADA series Nvidia cards or later. If you have one of these cards you
will see a speed boost when using fp8_e4m3fn flux for example.
2024-08-20 11:55:51 -04:00
comfyanonymous
d1a6bd6845
Support loading long clipl model with the CLIP loader node.
2024-08-20 10:46:36 -04:00
comfyanonymous
83dbac28eb
Properly set if clip text pooled projection instead of using hack.
2024-08-20 10:46:36 -04:00
comfyanonymous
538cb068bc
Make cast_to a nop if weight is already good.
2024-08-20 10:46:36 -04:00
comfyanonymous
1b3eee672c
Fix potential issue with multi devices.
2024-08-20 10:46:36 -04:00
comfyanonymous
9eee470244
New load_text_encoder_state_dicts function.
...
Now you can load text encoders straight from a list of state dicts.
2024-08-19 17:36:35 -04:00
comfyanonymous
045377ea89
Add a --reserve-vram argument if you don't want comfy to use all of it.
...
--reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free.
This can also be useful if workflows are failing because of OOM errors but
in that case please report it if --reserve-vram improves your situation.
2024-08-19 17:16:18 -04:00
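The effect of --reserve-vram described above can be pictured with a small sketch. This is only an illustration, not ComfyUI's actual memory code: the helper name `usable_vram_bytes` is hypothetical, and treating 1 GB as 1024**3 bytes is an assumption.

```python
import argparse

GIB = 1024 ** 3  # assumption: "1GB" is treated as 1024**3 bytes


def usable_vram_bytes(free_bytes, reserve_gb):
    """Hypothetical helper: bytes left to use after holding back reserve_gb."""
    reserved = int(reserve_gb * GIB)
    # Never report a negative budget if the reserve exceeds what is free.
    return max(free_bytes - reserved, 0)


parser = argparse.ArgumentParser()
# The flag name comes from the commit message; the default here is assumed.
parser.add_argument("--reserve-vram", type=float, default=0.0)
args = parser.parse_args(["--reserve-vram", "1.0"])

budget = usable_vram_bytes(8 * GIB, args.reserve_vram)
print(budget // GIB, "GiB usable")
```

Clamping at zero keeps the sketch well defined when the requested reserve is larger than the free memory.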
comfyanonymous
4d341b78e8
Bug fixes.
2024-08-19 16:28:55 -04:00
comfyanonymous
6138f92084
Use better dtype for the lowvram lora system.
2024-08-19 15:35:25 -04:00
comfyanonymous
be0726c1ed
Remove duplication.
2024-08-19 15:26:50 -04:00
comfyanonymous
4506ddc86a
Better subnormal fp8 stochastic rounding. Thanks Ashen.
2024-08-19 13:38:03 -04:00
comfyanonymous
20ace7c853
Code cleanup.
2024-08-19 12:48:59 -04:00
comfyanonymous
22ec02afc0
Handle subnormal numbers in float8 rounding.
2024-08-19 05:51:08 -04:00
comfyanonymous
39f114c44b
Less broken non blocking?
2024-08-18 16:53:17 -04:00
comfyanonymous
6730f3e1a3
Disable non blocking.
...
It fixed some perf issues but caused other issues that need to be debugged.
2024-08-18 14:38:09 -04:00
comfyanonymous
73332160c8
Enable non blocking transfers in lowvram mode.
2024-08-18 10:29:33 -04:00
comfyanonymous
2622c55aff
Automatically use RF variant of dpmpp_2s_ancestral if RF model.
2024-08-18 00:47:25 -04:00
Ashen
1beb348ee2
dpmpp_2s_ancestral_RF for rectified flow (Flux, SD3 and Auraflow).
2024-08-18 00:33:30 -04:00
comfyanonymous
d31df04c8a
Indentation.
2024-08-17 23:00:44 -04:00
Xrvk
e68763f40c
Add Flux model support for InstantX style controlnet residuals ( #4444 )
...
* Add Flux model support for InstantX style controlnet residuals
* Refactor Flux controlnet residual step to a separate method
* Rollback minor change
* New format for applying controlnet residuals: input->double_blocks, output->single_blocks
* Adjust XLabs Flux controlnet to fit new syntax of applying Flux controlnet residuals
* Remove unnecessary import and minor style change
2024-08-17 22:58:23 -04:00
comfyanonymous
4f7a3cb6fb
unet -> diffusion_models.
2024-08-17 21:31:04 -04:00
comfyanonymous
bb222ceddb
Fix loras having a weak effect when applied on fp8.
2024-08-17 15:20:17 -04:00
comfyanonymous
fca42836f2
Add model_options for text encoder.
2024-08-17 11:17:20 -04:00
comfyanonymous
cd5017c1c9
calculate_weight function to use a different dtype.
2024-08-17 01:06:08 -04:00
comfyanonymous
83f343146a
Fix potential lowvram issue.
2024-08-16 17:12:42 -04:00
Matthew Turnshek
1770fc77ed
Implement support for taef1 latent previews ( #4409 )
...
* add taef1 handling to several places
* remove guess_latent_channels and add latent_channels info directly to flux model
* remove TODO
* fix numbers
2024-08-16 12:53:13 -04:00
comfyanonymous
5960f946a9
Move a few files from comfy -> comfy_execution.
...
Python code in the comfy folder should not import things from outside it.
2024-08-15 11:21:14 -04:00
guill
5cfe38f41c
Execution Model Inversion ( #2666 )
...
* Execution Model Inversion
This PR inverts the execution model -- from recursively calling nodes to
using a topological sort of the nodes. This change allows for
modification of the node graph during execution. This allows for two
major advantages:
1. The implementation of lazy evaluation in nodes. For example, if a
"Mix Images" node has a mix factor of exactly 0.0, the second image
input doesn't even need to be evaluated (and vice versa if the mix
factor is 1.0).
2. Dynamic expansion of nodes. This allows for the creation of dynamic
"node groups". Specifically, custom nodes can return subgraphs that
replace the original node in the graph. This is an incredibly
powerful concept. Using this functionality, it was easy to
implement:
a. Components (a.k.a. node groups)
b. Flow control (i.e. while loops) via tail recursion
c. All-in-one nodes that replicate the WebUI functionality
d. and more
All of those were able to be implemented entirely via custom nodes,
so those features are *not* a part of this PR. (There are some
front-end changes that should occur before that functionality is
made widely available, particularly around variant sockets.)
The custom nodes associated with this PR can be found at:
https://github.com/BadCafeCode/execution-inversion-demo-comfyui
Note that some of them require that variant socket types ("*") be
enabled.
* Allow `input_info` to be of type `None`
* Handle errors (like OOM) more gracefully
* Add a command-line argument to enable variants
This allows the use of nodes that have sockets of type '*' without
applying a patch to the code.
* Fix an overly aggressive assertion.
This could happen when attempting to evaluate `IS_CHANGED` for a node
during the creation of the cache (in order to create the cache key).
* Fix Pyright warnings
* Add execution model unit tests
* Fix issue with unused literals
Behavior should now match the master branch with regard to undeclared
inputs. Undeclared inputs that are socket connections will be used while
undeclared inputs that are literals will be ignored.
* Make custom VALIDATE_INPUTS skip normal validation
Additionally, if `VALIDATE_INPUTS` takes an argument named `input_types`,
that variable will be a dictionary of the socket type of all incoming
connections. If that argument exists, normal socket type validation will
not occur. This removes the last hurdle for enabling variant types
entirely from custom nodes, so I've removed that command-line option.
I've added appropriate unit tests for these changes.
* Fix example in unit test
This wouldn't have caused any issues in the unit test, but it would have
bugged the UI if someone copy+pasted it into their own node pack.
* Use fstrings instead of '%' formatting syntax
* Use custom exception types.
* Display an error for dependency cycles
Previously, dependency cycles that were created during node expansion
would cause the application to quit (due to an uncaught exception). Now,
we'll throw a proper error to the UI. We also make an attempt to 'blame'
the most relevant node in the UI.
* Add docs on when ExecutionBlocker should be used
* Remove unused functionality
* Rename ExecutionResult.SLEEPING to PENDING
* Remove superfluous function parameter
* Pass None for uneval inputs instead of default
This applies to `VALIDATE_INPUTS`, `check_lazy_status`, and lazy values
in evaluation functions.
* Add a test for mixed node expansion
This test ensures that a node that returns a combination of expanded
subgraphs and literal values functions correctly.
* Raise exception for bad get_node calls.
* Minor refactor of IsChangedCache.get
* Refactor `map_node_over_list` function
* Fix ui output for duplicated nodes
* Add documentation on `check_lazy_status`
* Add file for execution model unit tests
* Clean up Javascript code as per review
* Improve documentation
Converted some comments to docstrings as per review
* Add a new unit test for mixed lazy results
This test validates that when an output list is fed to a lazy node, the
node will properly evaluate previous nodes that are needed by any inputs
to the lazy node.
No code in the execution model has been changed. The test already
passes.
* Allow kwargs in VALIDATE_INPUTS functions
When kwargs are used, validation is skipped for all inputs as if they
had been mentioned explicitly.
* List cached nodes in `execution_cached` message
This was previously just bugged in this PR.
2024-08-15 11:21:11 -04:00
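The two ideas in the PR description above, ordering nodes with a topological sort and skipping inputs a lazy node does not need, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the graph representation (node id mapped to the ids it depends on) and the names `topological_order` and `mix_needed_inputs` are hypothetical.

```python
from collections import deque


def topological_order(graph):
    """Kahn's algorithm. graph maps node id -> list of node ids it depends on."""
    pending = {node: len(deps) for node, deps in graph.items()}
    dependents = {node: [] for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            dependents[dep].append(node)

    # Start with nodes that have no dependencies (sorted for a stable order).
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in dependents[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    # Any node left unscheduled is part of a dependency cycle.
    if len(order) != len(graph):
        raise ValueError("dependency cycle detected")
    return order


def mix_needed_inputs(factor):
    """Hypothetical lazy check for a 'Mix Images' style node."""
    if factor == 0.0:
        return ["image1"]  # the second image is never read
    if factor == 1.0:
        return ["image2"]  # the first image is never read
    return ["image1", "image2"]


graph = {"load_a": [], "load_b": [], "mix": ["load_a", "load_b"], "save": ["mix"]}
print(topological_order(graph))
print(mix_needed_inputs(0.0))
```

Because the order is computed from an explicit graph rather than from recursive calls, the graph can be changed between steps, which is what makes node expansion possible.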
comfyanonymous
0f9c2a7822
Try to fix SDXL OOM issue on some configurations.
2024-08-14 23:08:54 -04:00
comfyanonymous
f1d6cef71c
Revert "Disable cuda malloc by default."
...
This reverts commit 50bf66e5c4.
2024-08-14 08:38:07 -04:00
comfyanonymous
33fb282d5c
Fix issue.
2024-08-14 02:51:47 -04:00
comfyanonymous
50bf66e5c4
Disable cuda malloc by default.
2024-08-14 02:49:25 -04:00
comfyanonymous
a5af64d3ce
Revert "Not sure if this actually changes anything but it can't hurt."
...
This reverts commit 34608de2e9.
2024-08-14 01:05:17 -04:00
comfyanonymous
34608de2e9
Not sure if this actually changes anything but it can't hurt.
2024-08-13 13:29:16 -04:00
comfyanonymous
39fb74c5bd
Fix bug when model cannot be partially unloaded.
2024-08-13 03:57:55 -04:00
comfyanonymous
74e124f4d7
Fix some issues with TE being in lowvram mode.
2024-08-12 23:42:21 -04:00
comfyanonymous
a562c17e8a
load_unet -> load_diffusion_model with a model_options argument.
2024-08-12 23:20:57 -04:00
comfyanonymous
5942c17d55
Order of operations matters.
2024-08-12 21:56:18 -04:00
comfyanonymous
c032b11e07
xlabs Flux controlnet implementation. ( #4260 )
...
* xlabs Flux controlnet.
* Fix not working on old python.
* Remove comment.
2024-08-12 21:22:22 -04:00
comfyanonymous
b8ffb2937f
Memory tweaks.
2024-08-12 15:07:11 -04:00
comfyanonymous
5d43e75e5b
Fix some issues with the model sometimes not getting patched.
2024-08-12 12:27:54 -04:00
comfyanonymous
517f4a94e4
Fix some lora loading slowdowns.
2024-08-12 11:50:32 -04:00
comfyanonymous
52a471c5c7
Change name of log.
2024-08-12 10:35:06 -04:00
comfyanonymous
ad76574cb8
Fix some potential issues with the previous commits.
2024-08-12 00:23:29 -04:00
comfyanonymous
9acfe4df41
Support loading directly to vram with CLIPLoader node.
2024-08-12 00:06:01 -04:00
comfyanonymous
9829b013ea
Fix mistake in last commit.
2024-08-12 00:00:17 -04:00
comfyanonymous
5c69cde037
Load TE model straight to vram if certain conditions are met.
2024-08-11 23:52:43 -04:00
comfyanonymous
e9589d6d92
Add a way to set model dtype and ops from load_checkpoint_guess_config.
2024-08-11 08:50:34 -04:00
comfyanonymous
0d82a798a5
Remove the ckpt_path from load_state_dict_guess_config.
2024-08-11 08:37:35 -04:00
ljleb
925fff26fd
alternative to load_checkpoint_guess_config that accepts a loaded state dict ( #4249 )
...
* make alternative fn
* add back ckpt path as 2nd argument?
2024-08-11 08:36:52 -04:00
comfyanonymous
75b9b55b22
Fix issues with #4302 and support loading diffusers format flux.
2024-08-10 21:28:24 -04:00
Jaret Burkett
1765f1c60c
FLUX: Added full diffusers mapping for FLUX.1 schnell and dev. Adds full LoRA support from diffusers LoRAs. ( #4302 )
2024-08-10 21:26:41 -04:00
comfyanonymous
1de69fe4d5
Fix some issues with inference slowing down.
2024-08-10 16:21:25 -04:00
comfyanonymous
ae197f651b
Speed up hunyuan dit inference a bit.
2024-08-10 07:36:27 -04:00
comfyanonymous
1b5b8ca81a
Fix regression.
2024-08-09 21:45:21 -04:00
comfyanonymous
6678d5cf65
Fix regression.
2024-08-09 14:02:38 -04:00
TTPlanetPig
e172564eea
Update controlnet.py to fix the default controlnet weight as constant ( #4285 )
2024-08-09 13:40:05 -04:00
comfyanonymous
a3cc326748
Better fix for lowvram issue.
2024-08-09 12:16:25 -04:00
comfyanonymous
86a97e91fc
Fix controlnet regression.
2024-08-09 12:08:58 -04:00
comfyanonymous
5acdadc9f3
Fix issue with some lowvram weights.
2024-08-09 03:58:28 -04:00
comfyanonymous
55ad9d5f8c
Fix regression.
2024-08-09 03:36:40 -04:00
comfyanonymous
a9f04edc58
Implement text encoder part of HunyuanDiT loras.
2024-08-09 03:21:10 -04:00
comfyanonymous
a475ec2300
Cleanup HunyuanDit controlnets.
...
Use the: ControlNetApply SD3 and HunyuanDiT node.
2024-08-09 02:59:34 -04:00
来新璐
06eb9fb426
feat: add support for HunYuanDit ControlNet ( #4245 )
...
* add support for HunYuanDit ControlNet
* fix hunyuandit controlnet
* fix typo in hunyuandit controlnet
* fix typo in hunyuandit controlnet
* fix code format style
* add control_weight support for HunyuanDit Controlnet
* use control_weights in HunyuanDit Controlnet
* fix typo
2024-08-09 02:59:24 -04:00
comfyanonymous
413322645e
Raw torch is faster than einops?
2024-08-08 22:09:29 -04:00
comfyanonymous
11200de970
Cleaner code.
2024-08-08 20:07:09 -04:00
comfyanonymous
037c38eb0f
Try to improve inference speed on some machines.
2024-08-08 17:29:27 -04:00
comfyanonymous
1e11d2d1f5
Better prints.
2024-08-08 17:29:27 -04:00
comfyanonymous
66d4233210
Fix.
2024-08-08 15:16:51 -04:00
comfyanonymous
591010b7ef
Support diffusers text attention flux loras.
2024-08-08 14:45:52 -04:00
comfyanonymous
08f92d55e9
Partial model shift support.
2024-08-08 14:45:06 -04:00
comfyanonymous
8115d8cce9
Add Flux fp16 support hack.
2024-08-07 15:08:39 -04:00
comfyanonymous
6969fc9ba4
Make supported_dtypes a priority list.
2024-08-07 15:00:06 -04:00
comfyanonymous
cb7c4b4be3
Workaround for lora OOM on lowvram mode.
2024-08-07 14:30:54 -04:00
comfyanonymous
1208863eca
Fix "Comfy" lora keys.
...
They are in this format now:
diffusion_model.full.model.key.name.lora_up.weight
2024-08-07 13:49:31 -04:00
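The key layout quoted above can be illustrated with a tiny helper. This is only a sketch: the function name `comfy_lora_up_key` is hypothetical, and the assumption is that "full.model.key.name" stands for the module's full key inside the model, without a trailing ".weight".

```python
def comfy_lora_up_key(module_key, prefix="diffusion_model"):
    """Hypothetical helper: build a "Comfy" format lora_up key for a module."""
    # Layout taken from the commit message: <prefix>.<module key>.lora_up.weight
    return f"{prefix}.{module_key}.lora_up.weight"


print(comfy_lora_up_key("full.model.key.name"))
```

Only the `lora_up` key appears in the commit message, so the sketch does not guess at the names of any companion keys.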
comfyanonymous
e1c528196e
Fix bundled embed.
2024-08-07 13:30:45 -04:00
comfyanonymous
17030fd4c0
Support for "Comfy" lora format.
...
The keys are just: model.full.model.key.name.lora_up.weight
It is supported by all comfyui supported models.
Now people can just convert loras to this format instead of having to ask
for me to implement them.
2024-08-07 13:18:32 -04:00
comfyanonymous
c19dcd362f
Controlnet code refactor.
2024-08-07 12:59:28 -04:00
comfyanonymous
1c08bf35b4
Support format for embeddings bundled in loras.
2024-08-07 03:45:25 -04:00
comfyanonymous
b334605a66
Fix OOMs happening in some cases.
...
A cloned model patcher sometimes reported a model was loaded on a device
when it wasn't.
2024-08-06 13:36:04 -04:00
comfyanonymous
c14ac98fed
Unload models and load them back in lowvram mode when there is no free vram.
2024-08-06 03:22:39 -04:00
comfyanonymous
2d75df45e6
Flux tweak memory usage.
2024-08-05 21:58:28 -04:00
comfyanonymous
8edbcf5209
Improve performance on some lowend GPUs.
2024-08-05 16:24:04 -04:00
a-One-Fan
a178e25912
Fix Flux FP64 math on XPU ( #4210 )
2024-08-05 01:26:20 -04:00
comfyanonymous
78e133d041
Support simple diffusers Flux loras.
2024-08-04 22:05:48 -04:00
Silver
7afa985fba
Correct spelling 'token_weight_pars_t5' to 'token_weight_pairs_t5' ( #4200 )
2024-08-04 17:10:02 -04:00
comfyanonymous
3b71f84b50
ONNX tracing fixes.
2024-08-04 15:45:43 -04:00
comfyanonymous
0a6b008117
Fix issue with some custom nodes.
2024-08-04 10:03:33 -04:00
comfyanonymous
f7a5107784
Fix crash.
2024-08-03 16:55:38 -04:00
comfyanonymous
91be9c2867
Tweak lowvram memory formula.
2024-08-03 16:44:50 -04:00
comfyanonymous
03c5018c98
Lower lowvram memory to 1/3 of free memory.
2024-08-03 15:14:07 -04:00
comfyanonymous
2ba5cc8b86
Fix some issues.
2024-08-03 15:06:40 -04:00
comfyanonymous
1e68002b87
Cap lowvram to half of free memory.
2024-08-03 14:50:20 -04:00
comfyanonymous
ba9095e5bd
Automatically use fp8 for diffusion model weights if:
...
Checkpoint contains weights in fp8.
There isn't enough memory to load the diffusion model in GPU vram.
2024-08-03 13:45:19 -04:00
comfyanonymous
f123328b82
Load T5 in fp8 if it's in fp8 in the Flux checkpoint.
2024-08-03 12:39:33 -04:00
comfyanonymous
63a7e8edba
More aggressive batch splitting.
2024-08-03 11:53:30 -04:00
comfyanonymous
ea03c9dcd2
Better per model memory usage estimations.
2024-08-02 18:09:24 -04:00
comfyanonymous
3a9ee995cf
Tweak regular SD memory formula.
2024-08-02 17:34:30 -04:00
comfyanonymous
47da42d928
Better Flux vram estimation.
2024-08-02 17:02:35 -04:00
Alexander Brown
ce9ac2fe05
Fix clip_g/clip_l mixup ( #4168 )
2024-08-01 21:40:56 -04:00
comfyanonymous
e638f2858a
Hack to make all resolutions work on Flux models.
2024-08-01 21:39:18 -04:00
comfyanonymous
d420bc792a
Tweak the memory usage formulas for Flux and SD.
2024-08-01 17:53:45 -04:00
comfyanonymous
d965474aaa
Make ComfyUI split batches a higher priority than weight offload.
2024-08-01 16:39:59 -04:00
comfyanonymous
1c61361fd2
Fast preview support for Flux.
2024-08-01 16:28:11 -04:00
comfyanonymous
a6decf1e62
Fix bfloat16 potentially not being enabled on mps.
2024-08-01 16:18:44 -04:00
comfyanonymous
48eb1399c0
Try to fix mac issue.
2024-08-01 13:41:27 -04:00
comfyanonymous
d7430a1651
Add a way to load the diffusion model in fp8 with UNETLoader node.
2024-08-01 13:30:51 -04:00
comfyanonymous
f2b80f95d2
Better Mac support on flux model.
2024-08-01 13:10:50 -04:00
comfyanonymous
1aa9cf3292
Make lowvram more aggressive on low memory machines.
2024-08-01 12:11:57 -04:00
comfyanonymous
eb96c3bd82
Fix .sft file loading (they are safetensors files).
2024-08-01 11:32:58 -04:00
comfyanonymous
5f98de7697
Load flux t5 in fp8 if weights are in fp8.
2024-08-01 11:05:56 -04:00
comfyanonymous
8d34211a7a
Fix old python versions no longer working.
2024-08-01 09:57:20 -04:00
comfyanonymous
1589b58d3e
Basic Flux Schnell and Flux Dev model implementation.
2024-08-01 09:49:29 -04:00
comfyanonymous
7ad574bffd
Mac supports bf16; just make sure you are using the latest pytorch.
2024-08-01 09:42:17 -04:00
comfyanonymous
e2382b6adb
Make lowvram less aggressive when there are large amounts of free memory.
2024-08-01 03:58:58 -04:00
comfyanonymous
c24f897352
Fix to get fp8 working on T5 base.
2024-07-31 02:00:19 -04:00
comfyanonymous
a5991a7aa6
Fix hunyuan dit text encoder weights always being in fp32.
2024-07-31 01:34:57 -04:00
comfyanonymous
2c038ccef0
Lower CLIP memory usage by a bit.
2024-07-31 01:32:35 -04:00
comfyanonymous
b85216a3c0
Lower T5 memory usage by a few hundred MB.
2024-07-31 00:52:34 -04:00
comfyanonymous
82cae45d44
Fix potential issue with non clip text embeddings.
2024-07-30 14:41:13 -04:00
comfyanonymous
25853d0be8
Use common function for casting weights to input.
2024-07-30 10:49:14 -04:00
comfyanonymous
79040635da
Remove unnecessary code.
2024-07-30 05:01:34 -04:00
comfyanonymous
66d35c07ce
Improve artifacts on hydit, auraflow and SD3 on specific resolutions.
...
This breaks seeds for resolutions that are not a multiple of 16 in pixel
resolution by using circular padding instead of reflection padding but
should lower the amount of artifacts when doing img2img at those
resolutions.
2024-07-29 20:48:50 -04:00
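The difference between the two padding modes named above is easiest to see on a single row of values. The sketch below is a plain Python illustration, not the model code: the function names are hypothetical, and padding only on the right-hand side is an assumption made to keep it short.

```python
def pad_right_reflect(row, n):
    """Reflection padding: mirror the values next to the edge (edge not repeated)."""
    if n >= len(row):
        raise ValueError("reflection padding needs n < len(row)")
    return row + [row[len(row) - 2 - i] for i in range(n)]


def pad_right_circular(row, n):
    """Circular padding: wrap around and continue from the start of the row."""
    return row + [row[i % len(row)] for i in range(n)]


row = [1, 2, 3, 4]
print(pad_right_reflect(row, 2))
print(pad_right_circular(row, 2))
```

The padded values differ between the two modes, which is why seeds change at resolutions that need padding.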
comfyanonymous
4ba7fa0244
Refactor: Move sd2_clip.py to text_encoders folder.
2024-07-28 01:19:20 -04:00
comfyanonymous
cf4418b806
Don't treat Bert model like CLIP.
...
Bert can accept up to 512 tokens so any prompt with more than 77 should
just be passed to it as is instead of splitting it up like CLIP.
2024-07-26 13:08:12 -04:00
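The behaviour change described above amounts to chunking prompts for CLIP but not for Bert. A simplified sketch, with hypothetical function names; real CLIP tokenization also handles start, end and padding tokens, which are ignored here:

```python
def split_for_clip(tokens, chunk_size=77):
    """Split a long prompt into CLIP-sized chunks (simplified)."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def prepare_for_bert(tokens):
    """Pass the prompt through as one sequence instead of splitting it."""
    return [tokens]


tokens = list(range(100))  # stand-in for 100 token ids
print([len(chunk) for chunk in split_for_clip(tokens)])
print([len(chunk) for chunk in prepare_for_bert(tokens)])
```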
comfyanonymous
8328a2d8cd
Let hunyuan dit work with all prompt lengths.
2024-07-26 12:11:32 -04:00
comfyanonymous
afe732bef9
Hunyuan dit can now accept longer prompts.
2024-07-26 11:52:58 -04:00
comfyanonymous
a9ac56fc0d
Own BertModel implementation that works with lowvram.
2024-07-26 04:47:17 -04:00
comfyanonymous
25b51b1a8b
Hunyuan DiT lora support.
2024-07-25 22:42:54 -04:00
comfyanonymous
a5f4292f9f
Basic hunyuan dit implementation. ( #4102 )
...
* Let tokenizers return weights to be stored in the saved checkpoint.
* Basic hunyuan dit implementation.
* Fix some resolutions not working.
* Support hydit checkpoint save.
* Init with right dtype.
* Switch to optimized attention in pooler.
* Fix black images on hunyuan dit.
2024-07-25 18:21:08 -04:00
comfyanonymous
f87810cd3e
Let tokenizers return weights to be stored in the saved checkpoint.
2024-07-25 10:52:09 -04:00
comfyanonymous
10c919f4c7
Make it possible to load tokenizer data from checkpoints.
2024-07-24 16:43:53 -04:00
comfyanonymous
10b43ceea5
Remove duplicate code.
2024-07-24 01:12:59 -04:00
comfyanonymous
0a4c49c57c
Support MT5.
2024-07-23 15:35:28 -04:00
comfyanonymous
88ed893034
Allow SPieceTokenizer to load model from a byte string.
2024-07-23 14:17:42 -04:00
comfyanonymous
334ba48cea
More generic unet prefix detection code.
2024-07-23 14:13:32 -04:00
comfyanonymous
14764aa2e2
Rename LLAMATokenizer to SPieceTokenizer.
2024-07-22 12:21:45 -04:00
comfyanonymous
b2c995f623
"auto" type is only relevant to the SetUnionControlNetType node.
2024-07-22 11:30:38 -04:00
Chenlei Hu
4151fbfa8a
Add error message on union controlnet ( #4081 )
2024-07-22 11:27:32 -04:00
comfyanonymous
95fa9545f1
Only append zero to noise schedule if last sigma isn't zero.
2024-07-20 12:37:30 -04:00
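The rule in this commit message can be written as a small guard. This is a sketch with a hypothetical function name, operating on a plain list of floats rather than on tensors:

```python
def finalize_sigmas(sigmas):
    """Append a trailing 0.0 only when the schedule does not already end at zero."""
    sigmas = list(sigmas)
    if sigmas and sigmas[-1] == 0.0:
        return sigmas
    return sigmas + [0.0]


print(finalize_sigmas([14.6, 3.2, 0.5]))
print(finalize_sigmas([14.6, 3.2, 0.0]))
```

Without the check, a schedule that already ends at zero would gain a second zero.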
comfyanonymous
6ab8cad22e
Implement beta sampling scheduler.
...
It is based on: https://arxiv.org/abs/2407.12173
Add "beta" to the list of schedulers and the BetaSamplingScheduler node.
2024-07-19 18:05:09 -04:00
喵哩个咪
855789403b
support clip-vit-large-patch14-336 ( #4042 )
...
* support clip-vit-large-patch14-336
* support clip-vit-large-patch14-336
2024-07-17 13:12:50 -04:00
comfyanonymous
6f7869f365
Get clip vision image size from config.
2024-07-17 13:05:38 -04:00
comfyanonymous
281ad42df4
Fix lowvram union controlnet bug.
2024-07-17 10:16:31 -04:00
Thomas Ward
c5a48b15bd
Make default hash lib configurable without code changes via CLI argument ( #3947 )
...
* cli_args: Add --duplicate-check-hash-function.
* server.py: compare_image_hash configurable hash function
Uses an argument added in cli_args to specify the type of hashing to default to for duplicate hash checking. Uses an `eval()` to identify the specific hashlib class to utilize, but ultimately safely operates because we have specific options and only those options/choices in the arg parser. So we don't have any unsafe input there.
* Add hasher() to node_helpers
* hashlib selection moved to node_helpers
* default-hashing-function instead of dupe checking hasher
This makes a default-hashing-function option instead of previous selected option.
* Use args.default_hashing_function
* Use safer handling for node_helpers.hasher()
Uses a safer handling method than `eval` to evaluate default hashing function.
* Stray parentheses are evil.
* Indentation fix.
Somehow when I hit save I didn't notice I missed a space to make indentation work proper. Oops!
2024-07-16 18:27:09 -04:00
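One common way to pick a hash function by name without `eval`, as the later bullets in this PR describe, is an explicit lookup table. The sketch below is an assumption about the general shape, not the code in `node_helpers`: the name `pick_hasher` and the exact set of choices are hypothetical.

```python
import hashlib

# Assumed set of allowed names; only these can ever be selected.
HASH_FUNCTIONS = {
    "md5": hashlib.md5,
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
}


def pick_hasher(name):
    """Return the hashlib constructor for name, rejecting anything unknown."""
    try:
        return HASH_FUNCTIONS[name]
    except KeyError:
        raise ValueError(f"unsupported hash function: {name!r}") from None


digest = pick_hasher("sha256")(b"image bytes").hexdigest()
print(len(digest))
```

A lookup table never executes the user-supplied string, so an unexpected value can only produce an error.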
comfyanonymous
8270c62530
Add SetUnionControlNetType to set the type of the union controlnet model.
2024-07-16 17:04:53 -04:00
comfyanonymous
821f93872e
Allow model sampling to set number of timesteps.
2024-07-16 15:18:40 -04:00
Chenlei Hu
99458e8aca
Add FrontendManager to manage non-default front-end impl ( #3897 )
...
* Add frontend manager
* Add tests
* nit
* Add unit test to github CI
* Fix path
* nit
* ignore
* Add logging
* Install test deps
* Remove 'stable' keyword support
* Update test
* Add web-root arg
* Rename web-root to front-end-root
* Add test on non-exist version number
* Use repo owner/name to replace hard coded provider list
* Inline cmd args
* nit
* Fix unit test
2024-07-16 11:26:11 -04:00
comfyanonymous
1305fb294c
Refactor: Move some code to the comfy/text_encoders folder.
2024-07-15 17:36:24 -04:00
comfyanonymous
7914c47d5a
Quick fix for the promax controlnet.
2024-07-14 10:07:36 -04:00
comfyanonymous
a3dffc447a
Support AuraFlow Lora and loading model weights in diffusers format.
...
You can load model weights in diffusers format using the UNETLoader node.
2024-07-13 13:51:40 -04:00
comfyanonymous
29c2e26724
Better tokenizing code for AuraFlow.
2024-07-12 01:15:25 -04:00
comfyanonymous
8e012043a9
Add a ModelSamplingAuraFlow node to change the shift value.
...
Set the default AuraFlow shift value to 1.73 (sqrt(3)).
2024-07-11 17:57:36 -04:00
comfyanonymous
9f291d75b3
AuraFlow model implementation.
2024-07-11 16:52:26 -04:00
comfyanonymous
f45157e3ac
Fix error message never being shown.
2024-07-11 11:46:51 -04:00
comfyanonymous
5e1fced639
Cleaner support for loading different diffusion model types.
2024-07-11 11:37:31 -04:00
comfyanonymous
ffe0bb0a33
Remove useless code.
2024-07-10 20:33:12 -04:00
comfyanonymous
391c1046cf
More flexibility with text encoder return values.
...
Text encoders can now return other values to the CONDITIONING than the cond
and pooled output.
2024-07-10 20:06:50 -04:00
comfyanonymous
e44fa5667f
Support returning text encoder attention masks.
2024-07-10 19:31:22 -04:00
Extraltodeus
f1a01c2c7e
Add sampler_pre_cfg_function ( #3979 )
...
* Update samplers.py
* Update model_patcher.py
2024-07-09 16:20:49 -04:00
comfyanonymous
ade7aa1b0c
Remove useless import.
2024-07-09 11:05:05 -04:00
comfyanonymous
faa57430b0
Controlnet union model basic implementation.
...
This is only the model code itself, it currently defaults to an empty
embedding [0] * 6 which seems to work better than treating it like a
regular controlnet.
TODO: Add nodes to select the image type.
2024-07-08 23:49:02 -04:00
comfyanonymous
bb663bcd6c
Rename clip_t5base to t5base for stable audio text encoder.
2024-07-08 08:53:55 -04:00
comfyanonymous
2dc84d1444
Add a way to set the timestep multiplier in the flow sampling.
2024-07-06 04:06:03 -04:00
comfyanonymous
ff63893d10
Support other types of T5 models.
2024-07-06 02:42:53 -04:00
comfyanonymous
4040491149
Better T5xxl detection.
2024-07-06 00:53:33 -04:00
comfyanonymous
b8e58a9394
Cleanup T5 code a bit.
2024-07-06 00:36:49 -04:00
comfyanonymous
80c4590998
Allow specifying the padding token for the tokenizer.
2024-07-06 00:06:49 -04:00
comfyanonymous
ce649d61c0
Allow zeroing out of embeds with unused attention mask.
2024-07-05 23:48:17 -04:00
comfyanonymous
739b76630e
Remove useless code.
2024-07-04 15:14:13 -04:00
comfyanonymous
d7484ef30c
Support loading checkpoints with the UNETLoader node.
2024-07-03 11:34:32 -04:00
comfyanonymous
537f35c7bc
Don't update dict if contiguous.
2024-07-02 20:21:51 -04:00
Alex "mcmonkey" Goodwin
3f46362d22
fix non-contiguous tensor saving (from channels-last) ( #3932 )
2024-07-02 20:16:33 -04:00
Chenlei Hu
9dd549e253
Add --no-custom-node cmd flag ( #3903 )
...
* Add --no-custom-node cmd flag
* nit
2024-07-01 17:54:03 -04:00
comfyanonymous
05e831697a
Switch to the real cfg++ method in the samplers.
...
The old _pp ones will be updated automatically to the regular ones with 2x
the cfg.
My fault for not checking what the "_pp" samplers actually did.
2024-06-29 11:59:48 -04:00
comfyanonymous
264caca20e
ControlNetApplySD3 node can now be used to use SD3 controlnets.
2024-06-27 18:43:11 -04:00
comfyanonymous
f8f7568d03
Basic SD3 controlnet implementation.
...
Still missing the node to properly use it.
2024-06-27 18:43:11 -04:00
comfyanonymous
66aaa14001
Controlnet refactor.
2024-06-27 18:43:11 -04:00
comfyanonymous
8ceb5a02a3
Support saving stable audio checkpoint that can be loaded back.
2024-06-27 11:06:52 -04:00
comfyanonymous
4f9d2b057c
Remove print.
2024-06-27 02:54:15 -04:00
comfyanonymous
44947e7ad4
Add DEIS order 3 sampler.
...
Order 4 seems to give bad results.
2024-06-26 22:40:05 -04:00
comfyanonymous
69d710e40f
Implement my alternative take on CFG++ as the euler_pp sampler.
...
Add euler_ancestral_pp which is the ancestral version of euler with the
same modification.
2024-06-25 07:41:52 -04:00
comfyanonymous
73ca780019
Add SamplerEulerCFG++ node.
...
This node should match the DDIM implementation of CFG++ when "regular" is
selected.
"alternative" is a slightly different take on CFG++
2024-06-23 13:21:18 -04:00
comfyanonymous
2f360ae898
Support OneTrainer SD3 lora format.
2024-06-22 13:08:04 -04:00
comfyanonymous
4ef1479dcd
Multi dimension tiled scale function and tiled VAE audio encoding fallback.
2024-06-22 11:57:49 -04:00
comfyanonymous
1e2839f4d9
More proper tiled audio decoding.
2024-06-20 16:50:31 -04:00
comfyanonymous
d5efde89b7
Add ipndm_v sampler, works best with the exponential scheduler.
2024-06-20 08:51:49 -04:00
comfyanonymous
028a583bef
Fix issue with full diffusers SD3 loras.
2024-06-19 22:32:04 -04:00
comfyanonymous
0d6a57938e
Support loading diffusers SD3 model format with UNETLoader node.
2024-06-19 22:21:18 -04:00
comfyanonymous
b08a9dd04b
Remove empty line.
2024-06-19 20:20:35 -04:00
Mario Klingemann
eee815ec99
Update sd1_clip.py ( #3684 )
...
Made token instance check more flexible so it also works with integers from numpy arrays or long tensors
2024-06-19 16:42:41 -04:00
comfyanonymous
e11052afcf
Add ipndm sampler.
2024-06-19 16:32:30 -04:00
comfyanonymous
3914d5a2ae
Support full SD3 loras.
2024-06-19 10:13:33 -04:00
comfyanonymous
a45df69570
Basic tiled decoding for audio VAE.
2024-06-17 22:48:23 -04:00
Janek Mann
b7c473d1ab
Fix lora keys for SimpleTuner ( #3759 )
2024-06-17 07:55:06 -04:00
comfyanonymous
6425252c4f
Use fp16 as the default vae dtype for the audio VAE.
2024-06-16 13:12:54 -04:00
comfyanonymous
8ddc151a4c
Squash deprecation warning on new pytorch.
2024-06-16 13:06:23 -04:00
comfyanonymous
ca9d300a80
Better estimation for memory usage during audio VAE encoding/decoding.
2024-06-16 11:47:32 -04:00
comfyanonymous
746a0410d4
Fix VAEEncode with taesd3.
2024-06-16 03:10:04 -04:00
comfyanonymous
04e8798c37
Improvements to the TAESD3 implementation.
2024-06-16 02:04:24 -04:00
Dr.Lt.Data
df7db0e027
support TAESD3 ( #3738 )
2024-06-16 02:03:53 -04:00
comfyanonymous
bb1969cab7
Initial support for the stable audio open model.
2024-06-15 12:14:56 -04:00
comfyanonymous
1281f933c1
Small optimization.
2024-06-15 02:44:38 -04:00
comfyanonymous
f2e844e054
Optimize some unneeded if conditions in the sampling code.
2024-06-15 02:26:19 -04:00
comfyanonymous
0ec513d877
Add a --force-channels-last argument to run inference on models in channels-last mode.
2024-06-15 01:08:12 -04:00
comfyanonymous
0e06b370db
Print key names for easier debugging.
2024-06-14 18:18:53 -04:00
Simon Lui
5eb98f0092
Exempt IPEX from non_blocking previews fixing segmentation faults. ( #3708 )
2024-06-13 18:51:14 -04:00
comfyanonymous
ac151ac169
Support SD3 diffusers lora.
2024-06-13 18:26:10 -04:00
comfyanonymous
37a08a41b3
Support setting weight offsets in weight patcher.
2024-06-13 17:21:26 -04:00
comfyanonymous
605e64f6d3
Fix lowvram issue.
2024-06-12 10:39:33 -04:00
comfyanonymous
1ddf512fdc
Don't auto convert clip and vae weights to fp16 when saving checkpoint.
2024-06-12 01:07:58 -04:00
comfyanonymous
694e0b48e0
SD3 better memory usage estimation.
2024-06-12 00:49:00 -04:00
comfyanonymous
69c8d6d8a6
Single and dual clip loader nodes support SD3.
...
You can use the CLIPLoader to use the t5xxl only or the DualCLIPLoader to
use CLIP-L and CLIP-G only for sd3.
2024-06-11 23:27:39 -04:00
comfyanonymous
0e49211a11
Load the SD3 T5xxl model in the same dtype stored in the checkpoint.
2024-06-11 17:03:26 -04:00
comfyanonymous
5889b7ca0a
Support multiple text encoder configurations on SD3.
2024-06-11 13:14:43 -04:00
comfyanonymous
9424522ead
Reuse code.
2024-06-11 07:20:26 -04:00
Dango233
73ce178021
Remove redundancy in mmdit.py ( #3685 )
2024-06-11 06:30:25 -04:00
comfyanonymous
a82fae2375
Fix bug with cosxl edit model.
2024-06-10 16:00:03 -04:00
comfyanonymous
8c4a9befa7
SD3 Support.
2024-06-10 14:06:23 -04:00
comfyanonymous
a5e6a632f9
Support sampling non 2D latents.
2024-06-10 01:31:09 -04:00
comfyanonymous
742d5720d1
Support zeroing out text embeddings with the attention mask.
2024-06-09 16:51:58 -04:00
comfyanonymous
6cd8ffc465
Reshape the empty latent image to the right number of channels if needed.
2024-06-08 02:35:08 -04:00
comfyanonymous
56333d4850
Use the end token for the text encoder attention mask.
2024-06-07 03:05:23 -04:00
comfyanonymous
104fcea0c8
Add function to get the list of currently loaded models.
2024-06-05 23:25:16 -04:00
comfyanonymous
b1fd26fe9e
pytorch xpu should be flash or mem efficient attention?
2024-06-04 17:44:14 -04:00
comfyanonymous
809cc85a8e
Remove useless code.
2024-06-02 19:23:37 -04:00
comfyanonymous
b249862080
Add an annoying print to a function I want to remove.
2024-06-01 12:47:31 -04:00
comfyanonymous
bf3e334d46
Disable non_blocking when --deterministic or directml.
2024-05-30 11:07:38 -04:00
JettHu
b26da2245f
Fix UnetParams annotation typo ( #3589 )
2024-05-27 19:30:35 -04:00
comfyanonymous
0920e0e5fe
Remove some unused imports.
2024-05-27 19:08:27 -04:00
comfyanonymous
ffc4b7c30e
Fix DORA strength.
...
This is a different version of #3298 with more correct behavior.
2024-05-25 02:50:11 -04:00
comfyanonymous
efa5a711b2
Reduce memory usage when applying DORA: #3557
2024-05-24 23:36:48 -04:00
comfyanonymous
6c23854f54
Fix OSX latent2rgb previews.
2024-05-22 13:56:28 -04:00
Chenlei Hu
7718ada4ed
Add type annotation UnetWrapperFunction ( #3531 )
...
* Add type annotation UnetWrapperFunction
* nit
* Add types.py
2024-05-22 02:07:27 -04:00
comfyanonymous
8508df2569
Work around black image bug on Mac 14.5 by forcing attention upcasting.
2024-05-21 16:56:33 -04:00
comfyanonymous
83d969e397
Disable xformers when tracing model.
2024-05-21 13:55:49 -04:00
comfyanonymous
1900e5119f
Fix potential issue.
2024-05-20 08:19:54 -04:00
comfyanonymous
09e069ae6c
Log the pytorch version.
2024-05-20 06:22:29 -04:00
comfyanonymous
11a2ad5110
Fix controlnet not upcasting on models that have it enabled.
2024-05-19 17:58:03 -04:00
comfyanonymous
0bdc2b15c7
Cleanup.
2024-05-18 10:11:44 -04:00
comfyanonymous
98f828fad9
Remove unnecessary code.
2024-05-18 09:36:44 -04:00
comfyanonymous
19300655dd
Don't automatically switch to lowvram mode on GPUs with low memory.
2024-05-17 00:31:32 -04:00
comfyanonymous
46daf0a9a7
Add debug options to force on and off attention upcasting.
2024-05-16 04:09:41 -04:00
comfyanonymous
2d41642716
Fix lowvram dora issue.
2024-05-15 02:47:40 -04:00
comfyanonymous
ec6f16adb6
Fix SAG.
2024-05-14 18:02:27 -04:00
comfyanonymous
bb4940d837
Only enable attention upcasting on models that actually need it.
2024-05-14 17:00:50 -04:00
comfyanonymous
b0ab31d06c
Refactor attention upcasting code part 1.
2024-05-14 12:47:31 -04:00
Simon Lui
f509c6fe21
Fix Intel GPU memory allocation accuracy and documentation update. ( #3459 )
...
* Change calculation of memory total to be more accurate, allocated is actually smaller than reserved.
* Update README.md install documentation for Intel GPUs.
2024-05-12 06:36:30 -04:00
comfyanonymous
fa6dd7e5bb
Fix lowvram issue with saving checkpoints.
...
The previous fix didn't cover the case where the model was loaded in
lowvram mode right before.
2024-05-12 06:13:45 -04:00
comfyanonymous
49c20cdc70
No longer necessary.
2024-05-12 05:34:43 -04:00
comfyanonymous
e1489ad257
Fix issue with lowvram mode breaking model saving.
2024-05-11 21:55:20 -04:00
comfyanonymous
93e876a3be
Remove warnings that confuse people.
2024-05-09 05:29:42 -04:00
comfyanonymous
cd07340d96
Typo fix.
2024-05-08 18:36:56 -04:00
comfyanonymous
c61eadf69a
Make the load checkpoint with config function call the regular one.
...
I was going to completely remove this function because it is unmaintainable
but I think this is the best compromise.
The clip skip and v_prediction parts of the configs should still work but
not the fp16 vs fp32.
2024-05-06 20:04:39 -04:00
Simon Lui
a56d02efc7
Change torch.xpu to ipex.optimize, xpu device initialization and remove workaround for text node issue from older IPEX. ( #3388 )
2024-05-02 03:26:50 -04:00
comfyanonymous
f81a6fade8
Fix some edge cases with samplers and arrays with a single sigma.
2024-05-01 17:05:30 -04:00
comfyanonymous
2aed53c4ac
Workaround xformers bug.
2024-04-30 21:23:40 -04:00
Garrett Sutula
bacce529fb
Add TLS Support ( #3312 )
...
* Add TLS Support
* Add to readme
* Add guidance for windows users on generating certificates
* Add guidance for windows users on generating certificates
* Fix typo
2024-04-30 20:17:02 -04:00
Jedrzej Kosinski
7990ae18c1
Fix error when more cond masks passed in than batch size ( #3353 )
2024-04-26 12:51:12 -04:00
comfyanonymous
8dc19e40d1
Don't init a VAE model when there are no VAE weights.
2024-04-24 09:20:31 -04:00
comfyanonymous
c59fe9f254
Support VAE without quant_conv.
2024-04-18 21:05:33 -04:00
comfyanonymous
719fb2c81d
Add basic PAG node.
2024-04-14 23:49:50 -04:00
comfyanonymous
258dbc06c3
Fix some memory related issues.
2024-04-14 12:08:58 -04:00
comfyanonymous
58812ab8ca
Support SDXS 512 model.
2024-04-12 22:12:35 -04:00
comfyanonymous
831511a1ee
Fix issue with sampling_settings persisting across models.
2024-04-09 23:20:43 -04:00
comfyanonymous
30abc324c2
Support properly saving CosXL checkpoints.
2024-04-08 00:36:22 -04:00
comfyanonymous
0a03009808
Fix issue with controlnet models getting loaded multiple times.
2024-04-06 18:38:39 -04:00
kk-89
38ed2da2dd
Fix typo in lowvram patcher ( #3209 )
2024-04-05 12:02:13 -04:00
comfyanonymous
1088d1850f
Support for CosXL models.
2024-04-05 10:53:41 -04:00
comfyanonymous
41ed7e85ea
Fix object_patches_backup not being the same object across clones.
2024-04-05 00:22:44 -04:00
comfyanonymous
0f5768e038
Fix missing arguments in cfg_function.
2024-04-04 23:38:57 -04:00
comfyanonymous
1f4fc9ea0c
Fix issue with get_model_object on patched model.
2024-04-04 23:01:02 -04:00
comfyanonymous
1a0486bb96
Fix model needing to be loaded on GPU to generate the sigmas.
2024-04-04 22:08:49 -04:00
comfyanonymous
c6bd456c45
Make zero denoise a NOP.
2024-04-04 11:41:27 -04:00
comfyanonymous
fcfd2bdf8a
Small cleanup.
2024-04-04 11:16:49 -04:00
comfyanonymous
0542088ef8
Refactor sampler code for more advanced sampler nodes part 2.
2024-04-04 01:26:41 -04:00
comfyanonymous
57753c964a
Refactor sampling code for more advanced sampler nodes.
2024-04-03 22:09:51 -04:00
comfyanonymous
6c6a39251f
Fix saving text encoder in fp8.
2024-04-02 11:46:34 -04:00
comfyanonymous
e6482fbbfc
Refactor calc_cond_uncond_batch into calc_cond_batch.
...
calc_cond_batch can take an arbitrary number of cond inputs.
Added a calc_cond_uncond_batch wrapper with a warning so custom nodes
won't break.
2024-04-01 18:07:47 -04:00
comfyanonymous
575acb69e4
IP2P model loading support.
...
This is the code to load the model and run inference on it with only a text
prompt. This commit does not contain the nodes to properly use it with an
image input.
This supports both the original SD1 instructpix2pix model and the
diffusers SDXL one.
2024-03-31 03:10:28 -04:00
comfyanonymous
94a5a67c32
Cleanup to support different types of inpaint models.
2024-03-29 14:44:13 -04:00
comfyanonymous
5d8898c056
Fix some performance issues with weight loading and unloading.
...
Lower peak memory usage when changing model.
Fix case where model weights would be unloaded and reloaded.
2024-03-28 18:04:42 -04:00
comfyanonymous
327ca1313d
Support SDXS 0.9
2024-03-27 23:58:58 -04:00
comfyanonymous
ae77590b4e
dora_scale support for lora file.
2024-03-25 18:09:23 -04:00
comfyanonymous
c6de09b02e
Optimize memory unload strategy for better performance.
2024-03-24 02:36:30 -04:00
comfyanonymous
0624838237
Add inverse noise scaling function.
2024-03-21 14:49:11 -04:00
comfyanonymous
5d875d77fe
Fix regression with lcm not working with batches.
2024-03-20 20:48:54 -04:00
comfyanonymous
4b9005e949
Fix regression with model merging.
2024-03-20 13:56:12 -04:00
comfyanonymous
c18a203a8a
Don't unload model weights for non weight patches.
2024-03-20 02:27:58 -04:00
comfyanonymous
150a3e946f
Make LCM sampler use the model noise scaling function.
2024-03-20 01:35:59 -04:00
comfyanonymous
40e124c6be
SV3D support.
2024-03-18 16:54:13 -04:00
comfyanonymous
cacb022c4a
Make saved SD1 checkpoints match the official one more closely.
2024-03-18 00:26:23 -04:00