comfyanonymous
8773ccf74d
Better memory estimation for ROCm cards that support memory-efficient attention.
...
There is no way to check whether the card actually supports it, so support
is assumed when you pass --use-pytorch-cross-attention.
2025-02-13 08:32:36 -05:00
comfyanonymous
1d5d6586f3
Fix ruff.
2025-02-12 06:49:16 -05:00
zhoufan2956
35740259de
mix_ascend_bf16_infer_err ( #6794 )
2025-02-12 06:48:11 -05:00
comfyanonymous
ab888e1e0b
Add add_weight_wrapper function to model patcher.
...
Functions can now easily be added to wrap/modify model weights.
2025-02-12 05:55:35 -05:00
comfyanonymous
d9f0fcdb0c
Cleanup.
2025-02-11 17:17:03 -05:00
HishamC
b124256817
Fix for running via DirectML ( #6542 )
...
* Fix for running via DirectML
Fix DirectML empty image generation issue with Flux1. Add CPU fallback for unsupported path. Verified the model works on AMD GPUs.
* Fix formatting
* Update causal mask calculation
2025-02-11 17:11:32 -05:00
comfyanonymous
af4b7c91be
Make --force-fp16 actually force the diffusion model to be fp16.
2025-02-11 08:33:09 -05:00
comfyanonymous
4027466c80
Make lumina model work with any latent resolution.
2025-02-10 00:24:20 -05:00
comfyanonymous
095d867147
Remove useless function.
2025-02-09 07:02:57 -05:00
Pam
caeb27c3a5
res_multistep: Fix cfgpp and add ancestral samplers ( #6731 )
2025-02-08 19:39:58 -05:00
comfyanonymous
3d06e1c555
Make error more clear to user.
2025-02-08 18:57:24 -05:00
catboxanon
43a74c0de1
Allow FP16 accumulation with --fast ( #6453 )
...
Currently only applies to PyTorch nightly releases. (>=20250208)
2025-02-08 17:00:56 -05:00
comfyanonymous
079eccc92a
Don't compress http response by default.
...
Remove argument to disable it.
Add new --enable-compress-response-body argument to enable it.
2025-02-07 03:29:21 -05:00
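As a rough illustration of the opt-in behaviour described in 079eccc92a, the sketch below registers the `--enable-compress-response-body` flag with `argparse`. Only the flag name comes from the commit message; the parser setup, the help text, and the `build_parser` helper are assumptions made for this example and are not the project's actual code.

```python
import argparse


def build_parser():
    """Hypothetical parser; only the flag name is taken from the commit message."""
    parser = argparse.ArgumentParser()
    # store_true means compression stays off unless the flag is given explicitly.
    parser.add_argument(
        "--enable-compress-response-body",
        action="store_true",
        help="Compress HTTP response bodies (off by default).",
    )
    return parser


default_args = build_parser().parse_args([])
enabled_args = build_parser().parse_args(["--enable-compress-response-body"])
```

argparse converts the dashes to underscores, so the value is read as `args.enable_compress_response_body`.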
comfyanonymous
14880e6dba
Remove some useless code.
2025-02-06 05:00:37 -05:00
comfyanonymous
37cd448529
Set the shift for Lumina back to 6.
2025-02-05 14:49:52 -05:00
comfyanonymous
94f21f9301
Upcasting rope to fp32 seems to make no difference in this model.
2025-02-05 04:32:47 -05:00
comfyanonymous
60653004e5
Use regular numbers for rope in lumina model.
2025-02-05 04:17:25 -05:00
comfyanonymous
a57d635c5f
Fix lumina 2 batches.
2025-02-04 21:48:11 -05:00
comfyanonymous
8ac2dddeed
Lower the default shift of lumina to reduce artifacts.
2025-02-04 06:50:37 -05:00
comfyanonymous
3e880ac709
Fix on python 3.9
2025-02-04 04:20:56 -05:00
comfyanonymous
e5ea112a90
Support Lumina 2 model.
2025-02-04 04:16:30 -05:00
comfyanonymous
44e19a28d3
Use maximum negative value instead of -inf for masks in text encoders.
...
This is probably more correct.
2025-02-02 09:46:00 -05:00
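One reason a finite "maximum negative value" can behave better than -inf in a mask (44e19a28d3) is the fully masked row: with -inf every score is -inf, and softmax produces NaN. The sketch below shows this with plain Python floats. It is an illustration only; the `softmax` helper is hypothetical, and the actual text encoder code works on tensors and is not reproduced here.

```python
import math
import sys


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


neg_inf = float("-inf")
most_negative = -sys.float_info.max  # largest-magnitude finite negative float

# Fully masked row: (-inf) - (-inf) is NaN, so every probability becomes NaN.
masked_with_inf = softmax([neg_inf, neg_inf])

# Same row masked with a finite value: the result is a valid uniform distribution.
masked_with_finite = softmax([most_negative, most_negative])

# Partially masked row: the masked position still receives zero probability.
partially_masked = softmax([1.0, 2.0, most_negative])
```

In the partially masked case both choices give the same answer, so the finite value only changes behaviour in the degenerate all-masked case.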
Dr.Lt.Data
0a0df5f136
better guide message for sageattention ( #6634 )
2025-02-02 09:26:47 -05:00
KarryCharon
24d6871e47
add disable-compres-response-body cli args; add compress middleware; ( #6672 )
2025-02-02 09:24:55 -05:00
comfyanonymous
9e1d301129
Only use stable cascade lora format with cascade model.
2025-02-01 06:35:22 -05:00
comfyanonymous
8d8dc9a262
Allow batch of different sigmas when noise scaling.
2025-01-30 06:49:52 -05:00
filtered
222f48c0f2
Allow changing folder_paths.base_path via command line argument. ( #6600 )
...
* Reimpl. CLI arg directly inside folder_paths.
* Update tests to use CLI arg mocking.
* Revert last-minute refactor.
* Fix test state pollution.
2025-01-29 08:06:28 -05:00
comfyanonymous
13fd4d6e45
More friendly error messages for corrupted safetensors files.
2025-01-28 09:41:09 -05:00
comfyanonymous
255edf2246
Lower minimum ratio of loaded weights on Nvidia.
2025-01-27 05:26:51 -05:00
comfyanonymous
67feb05299
Remove redundant code.
2025-01-25 19:04:53 -05:00
comfyanonymous
14ca5f5a10
Remove useless code.
2025-01-24 06:15:54 -05:00
comfyanonymous
96e2a45193
Remove useless code.
2025-01-23 05:56:23 -05:00
Chenlei Hu
dfa2b6d129
Remove unused function lcm in conds.py ( #6572 )
2025-01-23 05:54:09 -05:00
comfyanonymous
d6bbe8c40f
Remove support for python 3.8.
2025-01-22 17:04:30 -05:00
chaObserv
e857dd48b8
Add gradient estimation sampler ( #6554 )
2025-01-22 05:29:40 -05:00
comfyanonymous
fb2ad645a3
Add FluxDisableGuidance node to disable using the guidance embed.
2025-01-20 14:50:24 -05:00
comfyanonymous
d8a7a32779
Cleanup old TODO.
2025-01-20 03:44:13 -05:00
Sergii Dymchenko
ebf038d4fa
Use torch.special.expm1 ( #6388 )
...
* Use `torch.special.expm1`
This function provides greater precision than `exp(x) - 1` for small values of `x`.
Found with TorchFix https://github.com/pytorch-labs/torchfix/
* Use non-alias
2025-01-19 04:54:32 -05:00
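The precision claim in ebf038d4fa can be checked without PyTorch. The sketch below uses the standard library's `math.expm1` as a stand-in for `torch.special.expm1` (an assumption made so the example runs anywhere) and compares both forms against the series approximation x + x²/2, which is accurate for very small x.

```python
import math

x = 1e-10

# exp(x) is extremely close to 1.0, so the subtraction cancels most significant digits.
naive = math.exp(x) - 1.0

# expm1 evaluates exp(x) - 1 directly and avoids that cancellation.
accurate = math.expm1(x)

# For small x, exp(x) - 1 is approximately x + x**2 / 2; used here as the reference.
reference = x + x * x / 2.0

naive_rel_err = abs(naive - reference) / reference
accurate_rel_err = abs(accurate - reference) / reference
```

On IEEE 754 doubles the naive form keeps only about half of the significant digits at this magnitude, while `expm1` stays within a few units in the last place.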
catboxanon
b1a02131c9
Remove comfy.samplers self-import ( #6506 )
2025-01-18 17:49:51 -05:00
comfyanonymous
507199d9a8
Uni pc sampler now works with audio and video models.
2025-01-18 05:27:58 -05:00
comfyanonymous
2f3ab40b62
Add warning when using old pytorch versions.
2025-01-17 18:47:27 -05:00
comfyanonymous
0aa2368e46
Fix some cosmos fp8 issues.
2025-01-16 17:45:37 -05:00
comfyanonymous
cca96a85ae
Fix cosmos VAE failing with videos longer than 121 frames.
2025-01-16 16:30:06 -05:00
comfyanonymous
31831e6ef1
Code refactor.
2025-01-16 07:23:54 -05:00
comfyanonymous
88ceb28e20
Tweak hunyuan memory usage factor.
2025-01-16 06:31:03 -05:00
comfyanonymous
23289a6a5c
Clean up some debug lines.
2025-01-16 04:24:39 -05:00
comfyanonymous
9d8b6c1f46
More accurate memory estimation for cosmos and hunyuan video.
2025-01-16 03:48:40 -05:00
comfyanonymous
6320d05696
Slightly lower hunyuan video memory usage.
2025-01-16 00:23:01 -05:00
comfyanonymous
25683b5b02
Lower cosmos diffusion model memory usage.
2025-01-15 23:46:42 -05:00
comfyanonymous
4758fb64b9
Lower cosmos VAE memory usage by a bit.
2025-01-15 22:57:52 -05:00
comfyanonymous
008761166f
Optimize first attention block in cosmos VAE.
2025-01-15 21:48:46 -05:00
comfyanonymous
cba58fff0b
Remove unsafe embedding load for very old pytorch.
2025-01-15 04:32:23 -05:00
comfyanonymous
2feb8d0b77
Force safe loading of files in torch format on pytorch 2.4+
...
If this breaks something for you make an issue.
2025-01-15 03:50:27 -05:00
Pam
c78a45685d
Rewrite res_multistep sampler and implement res_multistep_cfg_pp sampler. ( #6462 )
2025-01-14 18:20:06 -05:00
comfyanonymous
3aaabb12d4
Implement Cosmos Image/Video to World (Video) diffusion models.
...
Use CosmosImageToVideoLatent to set the input image/video.
2025-01-14 05:14:10 -05:00
comfyanonymous
1f1c7b7b56
Remove useless code.
2025-01-13 03:52:37 -05:00
comfyanonymous
90f349f93d
Add res_multistep sampler from the cosmos code.
...
This sampler should work with all models.
2025-01-12 03:10:07 -05:00
Jedrzej Kosinski
6c9bd11fa3
Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook ( #6377 )
...
* Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas'
* Cleaned up hooks.py, refactored Hook.should_register and add_hook_patches to use target_dict instead of target so that more information can be provided about the current execution environment if needed
* Refactor WrapperHook into TransformerOptionsHook, as there is no need to separate out Wrappers/Callbacks/Patches into different hook types (all affect transformer_options)
* Refactored HookGroup to also store a dictionary of hooks separated by hook_type, modified necessary code to no longer need to manually separate out hooks by hook_type
* In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch
* Refactored 'registered' to be HookGroup instead of a list of Hooks, made AddModelsHook operational and compliant with should_register result, moved TransformerOptionsHook handling out of ModelPatcher.register_all_hook_patches, support patches in TransformerOptionsHook properly by casting any patches/wrappers/hooks to proper device at sample time
* Made hook clone code sane, made clear ObjectPatchHook and SetInjectionsHook are not yet operational
* Fix performance of hooks when hooks are appended via Cond Pair Set Props nodes by properly caching between positive and negative conds, make hook_patches_backup behave as intended (in the case that something pre-registers WeightHooks on the ModelPatcher instead of registering it at sample time)
* Filter only registered hooks on self.conds in CFGGuider.sample
* Make hook_scope functional for TransformerOptionsHook
* Removed 4 whitespace lines to satisfy Ruff
* Add a get_injections function to ModelPatcher
* Made TransformerOptionsHook contribute to registered hooks properly, added some doc strings and removed a so-far unused variable
* Rename AddModelsHooks to AdditionalModelsHook, rename SetInjectionsHook to InjectionsHook (not yet implemented, but at least getting the naming figured out)
* Clean up a typehint
2025-01-11 12:20:23 -05:00
comfyanonymous
ee8a7ab69d
Fast latent preview for Cosmos.
2025-01-11 04:41:24 -05:00
comfyanonymous
2ff3104f70
WIP support for Nvidia Cosmos 7B and 14B text to world (video) models.
2025-01-10 09:14:16 -05:00
comfyanonymous
129d8908f7
Add argument to skip the output reshaping in the attention functions.
2025-01-10 06:27:37 -05:00
comfyanonymous
ff838657fa
Cleaner handling of attention mask in ltxv model code.
2025-01-09 07:12:03 -05:00
comfyanonymous
2307ff6746
Improve some logging messages.
2025-01-08 19:05:22 -05:00
comfyanonymous
d0f3752e33
Properly calculate inner dim for t5 model.
...
This is required to support some different types of t5 models.
2025-01-07 17:33:03 -05:00
comfyanonymous
4209edf48d
Make a few more samplers deterministic.
2025-01-07 02:12:32 -05:00
Chenlei Hu
d055325783
Document get_attr and get_model_object ( #6357 )
...
* Document get_attr and get_model_object
* Update model_patcher.py
* Update model_patcher.py
* Update model_patcher.py
2025-01-06 20:12:22 -05:00
comfyanonymous
916d1e14a9
Make ancestral samplers more deterministic.
2025-01-06 03:04:32 -05:00
Jedrzej Kosinski
c496e53519
In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch ( #6360 )
2025-01-06 01:36:47 -05:00
comfyanonymous
d45ebb63f6
Remove old unused function.
2025-01-04 07:20:54 -05:00
comfyanonymous
9e9c8a1c64
Clear cache as often on AMD as Nvidia.
...
I think the issue this was working around has been solved.
If you notice that this change slows things down or causes stutters on
your AMD GPU with ROCm on Linux please report it.
2025-01-02 08:44:16 -05:00
Andrew Kvochko
0f11d60afb
Fix temporal tiling for decoder, remove redundant tiles. ( #6306 )
...
This commit fixes the temporal tile size calculation, and removes
a redundant tile at the end of the range when its elements are
completely covered by the previous tile.
Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>
2025-01-01 16:29:01 -05:00
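The redundant-tile case described in 0f11d60afb can be sketched with a small helper that computes tile start offsets along one axis. This is a hypothetical illustration, not the decoder's code: `tile_starts`, its parameters, and the stopping rule are assumptions chosen to show the idea.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of overlapping tiles covering range(length).

    Generation stops as soon as a tile reaches the end of the range, so no
    trailing tile is emitted whose elements the previous tile already covers.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    starts = []
    pos = 0
    while True:
        starts.append(pos)
        if pos + tile >= length:  # this tile already reaches the end
            break
        pos += stride
    return starts


# Stepping over the whole range by the stride yields one extra start (8):
# its tile covers frames 8..9, which the tile starting at 6 already covers.
naive_starts = list(range(0, 10, 4 - 2))
trimmed_starts = tile_starts(10, 4, 2)
```

With length 10, tile 4, and overlap 2, the naive stepping produces five tiles and the trimmed version four, with identical coverage.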
comfyanonymous
79eea51a1d
Fix and enforce all ruff W rules.
2025-01-01 03:08:33 -05:00
blepping
c0338a46a4
Fix unknown sampler error handling in calculate_sigmas function ( #6280 )
...
Modernize calculate_sigmas function
2024-12-31 17:33:50 -05:00
Jedrzej Kosinski
1c99734e5a
Add missing model_options param ( #6296 )
2024-12-31 14:46:55 -05:00
filtered
67758f50f3
Fix custom node type-hinting examples ( #6281 )
...
* Fix import in comfy_types doc / sample
* Clarify docstring
2024-12-31 03:41:09 -05:00
comfyanonymous
b7572b2f87
Fix and enforce no trailing whitespace.
2024-12-31 03:16:37 -05:00
blepping
a90aafafc1
Add kl_optimal scheduler ( #6206 )
...
* Add kl_optimal scheduler
* Rename kl_optimal_schedule to kl_optimal_scheduler to be more consistent
2024-12-30 05:09:38 -05:00
comfyanonymous
d9b7cfac7e
Fix and enforce new lines at the end of files.
2024-12-30 04:14:59 -05:00
Jedrzej Kosinski
3507870535
Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas' ( #6273 )
2024-12-30 03:42:49 -05:00
comfyanonymous
a618f768e0
Auto reshape 2d to 3d latent for single image generation on video model.
2024-12-29 02:26:49 -05:00
comfyanonymous
b504bd606d
Add ruff rule for empty line with trailing whitespace.
2024-12-28 05:23:08 -05:00
comfyanonymous
d170292594
Remove some trailing white space.
2024-12-27 18:02:30 -05:00
filtered
9cfd185676
Add option to log non-error output to stdout ( #6243 )
...
* nit
* Add option to log non-error output to stdout
- No change to default behaviour
- Adds CLI argument: --log-stdout
- With this arg present, any logging of a level below logging.ERROR will be sent to stdout instead of stderr
2024-12-27 14:40:05 -05:00
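The routing described in 9cfd185676 (records below `logging.ERROR` go to stdout, the rest to stderr) can be sketched with two stream handlers and a level filter. The `build_logger` helper and its `log_stdout` parameter are hypothetical names for this example; how the project wires up `--log-stdout` internally is not shown in the log.

```python
import logging
import sys


def build_logger(name, log_stdout=False):
    """Hypothetical helper: split log output between stdout and stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False

    if log_stdout:
        # Records below ERROR are written to stdout.
        out = logging.StreamHandler(sys.stdout)
        out.addFilter(lambda record: record.levelno < logging.ERROR)
        # ERROR and above still go to stderr.
        err = logging.StreamHandler(sys.stderr)
        err.setLevel(logging.ERROR)
        logger.addHandler(out)
        logger.addHandler(err)
    else:
        # Default behaviour: everything goes to stderr.
        logger.addHandler(logging.StreamHandler(sys.stderr))
    return logger


split_logger = build_logger("demo.split", log_stdout=True)
default_logger = build_logger("demo.default")
```

Keeping errors on stderr means shell redirection such as `2> errors.log` still captures only failures.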
comfyanonymous
4b5bcd8ac4
Closer memory estimation for hunyuan dit model.
2024-12-27 07:37:00 -05:00
comfyanonymous
ceb50b2cbf
Closer memory estimation for pixart models.
2024-12-27 07:30:09 -05:00
comfyanonymous
160ca08138
Use python 3.9 in launch test instead of 3.8
...
Fix ruff check.
2024-12-26 20:05:54 -05:00
Huazhong Ji
c4bfdba330
Support ascend npu ( #5436 )
...
* support ascend npu
Co-authored-by: YukMingLaw <lymmm2@163.com>
Co-authored-by: starmountain1997 <guozr1997@hotmail.com>
Co-authored-by: Ginray <ginray0215@gmail.com>
2024-12-26 19:36:50 -05:00
comfyanonymous
ee9547ba31
Improve temporal VAE Encode (Tiled) math.
2024-12-26 07:18:49 -05:00
comfyanonymous
19a64d6291
Cleanup some mac related code.
2024-12-25 05:32:51 -05:00
comfyanonymous
b486885e08
Disable bfloat16 on older mac.
2024-12-25 05:18:50 -05:00
comfyanonymous
0229228f3f
Clean up the VAE dtypes code.
2024-12-25 04:50:34 -05:00
comfyanonymous
99a1fb6027
Make fast fp8 take a bit less peak memory.
2024-12-24 18:05:19 -05:00
comfyanonymous
73e04987f7
Prevent black images in VAE Decode (Tiled) node.
...
Overlap should be at least 1 when tiling is 2 for tiled temporal VAE decoding.
2024-12-24 07:36:30 -05:00
comfyanonymous
5388df784a
Add temporal tiling to VAE Encode (Tiled) node.
2024-12-24 07:10:09 -05:00
comfyanonymous
bc6dac4327
Add temporal tiling to VAE Decode (Tiled) node.
...
You can now do tiled VAE decoding on the temporal direction for videos.
2024-12-23 20:03:37 -05:00
comfyanonymous
15564688ed
Add a try except block so if torch version is weird it won't crash.
2024-12-23 03:22:48 -05:00
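The defensive pattern named in 15564688ed can be sketched as a version parser that falls back to a default instead of raising. The helper name, the fallback value, and the parsing rule are assumptions for illustration; the commit's actual code is not shown in the log.

```python
def parse_major_minor(version, fallback=(0, 0)):
    """Return (major, minor) from a version string, or fallback if it is malformed."""
    try:
        major, minor = version.split(".")[:2]
        return int(major), int(minor)
    except (ValueError, AttributeError):
        # ValueError: too few components or non-numeric parts.
        # AttributeError: the value is not a string at all (for example None).
        return fallback


normal = parse_major_minor("2.4.0+cu121")
weird = parse_major_minor("unknown-build")
missing = parse_major_minor(None)
```

Catching only the expected exception types keeps unrelated bugs visible instead of hiding them behind the fallback.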
Simon Lui
c6b9c11ef6
Add oneAPI device selector for xpu and some other changes. ( #6112 )
...
* Add oneAPI device selector and some other minor changes.
* Fix device selector variable name.
* Flip minor version check sign.
* Undo changes to README.md.
2024-12-23 03:18:32 -05:00
comfyanonymous
e44d0ac7f7
Make --novram completely offload weights.
...
This flag is mainly used for testing the weight offloading; it shouldn't
actually be used in practice.
Remove useless import.
2024-12-23 01:51:08 -05:00
comfyanonymous
56bc64f351
Comment out some useless code.
2024-12-22 23:51:14 -05:00
zhangp365
f7d83b72e0
fixed a bug in ldm/pixart/blocks.py ( #6158 )
2024-12-22 23:44:20 -05:00