Subaquatic World - Houdini & ComfyUI
Introduction to the Project
💡Big Picture:
This toolset has unlocked creative opportunities too significant to overlook. I appreciate being the guiding force behind all the elements: I'm not as keen on the unpredictability of AI as I am on building a robust workflow that makes my process more efficient. I enjoy having full control over every aspect of my system while letting AI work its magic.
🚨Workflow Steps:
Initial Animations: Created in Houdini using Vellum and KineFX. I developed a separate workflow to generate new masks by combining the matte with the color channels of the Beauty pass.
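The actual mask setup lived inside Houdini/ComfyUI, but the core idea can be sketched in a few lines of Python. Below is a minimal sketch with NumPy and Pillow, assuming hypothetical file names and a simple luminance-times-matte combine (the real workflow may weight the channels differently):

```python
# Sketch: derive a mask by combining a matte with the Beauty pass's
# color channels. File names are hypothetical; the actual workflow ran
# inside Houdini/ComfyUI rather than a standalone script.
import numpy as np
from PIL import Image

beauty = np.asarray(Image.open("beauty_0001.png")).astype(np.float32) / 255.0
matte = np.asarray(Image.open("matte_0001.png").convert("L")).astype(np.float32) / 255.0

# Luminance of the beauty pass (Rec. 709 weights), gated by the matte.
luma = 0.2126 * beauty[..., 0] + 0.7152 * beauty[..., 1] + 0.0722 * beauty[..., 2]
mask = np.clip(luma * matte, 0.0, 1.0)

Image.fromarray((mask * 255).astype(np.uint8)).save("mask_0001.png")
```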
Elements: The corals, reef, environment, and fish were all upscaled and processed separately using AnimateDiff, then recombined at the end through a KSampler and another RIFE node for smooth interpolation. The result was exported as image sequences and run through another upscaler.
IPAdapter was trained on low-quality square images, so I built a setup in Comfy to split a panoramic image into two square images and process them separately. This allowed for a more precise extraction of the image's aesthetics through IPAdapter. Thanks to Matteo (@latent.vision) for his guidance.
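For a sense of how the split works, here is a minimal Pillow sketch. It assumes a panorama wider than it is tall and takes one square from each end (for a 2:1 pano the two squares tile it exactly); the crop points in the actual Comfy setup may differ:

```python
# Sketch: split a panoramic image into two squares so each crop matches
# the square aspect ratio IPAdapter expects. Filenames are hypothetical;
# the original used ComfyUI crop nodes, not this script.
from PIL import Image

pano = Image.open("panorama.png")
w, h = pano.size  # assumes w >= h

left = pano.crop((0, 0, h, h))       # square anchored at the left edge
right = pano.crop((w - h, 0, w, h))  # square anchored at the right edge

left.save("pano_left.png")
right.save("pano_right.png")
```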
The fish was processed again at the end to improve its fidelity.
Upscaling: I also experimented with a few upscaling methods; most can be really taxing on your VRAM. Regardless, I found it much more efficient to work with image sequences instead of trying to upscale a generated MP4 video (see the sketch below).
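To illustrate why sequences are convenient: each frame is an independent file, so frames can be processed one at a time (and re-run selectively) without decoding or re-encoding a video. A minimal sketch, using Lanczos resampling as a stand-in for a real upscale model such as 4x-UltraSharp; the paths are assumptions:

```python
# Sketch: upscale an image sequence frame by frame. Lanczos resampling
# stands in for an ESRGAN-style model (e.g. 4x-UltraSharp), which would
# be loaded in ComfyUI instead. Paths are hypothetical.
from pathlib import Path
from PIL import Image

src = Path("renders/beauty")
dst = Path("renders/beauty_2x")
dst.mkdir(parents=True, exist_ok=True)

for frame in sorted(src.glob("*.png")):
    img = Image.open(frame)
    up = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
    up.save(dst / frame.name)  # one file per frame: easy to resume or redo
```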
💡Quick Masking Tip
This is a simple way to slice a depth layer of your image and turn it into a mask; a minimal sketch follows below.
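Here is the idea in NumPy, assuming an 8-bit grayscale depth pass and illustrative near/far thresholds; in Comfy the equivalent is a threshold or remap on the depth map:

```python
# Sketch: slice a band out of a depth pass and save it as a mask.
# Thresholds and filenames are illustrative; tune them per shot.
import numpy as np
from PIL import Image

depth = np.asarray(Image.open("depth_0001.png").convert("L")).astype(np.float32) / 255.0

near, far = 0.3, 0.6  # keep only pixels whose depth falls in this band
mask = ((depth >= near) & (depth <= far)).astype(np.float32)

Image.fromarray((mask * 255).astype(np.uint8)).save("depth_mask_0001.png")
```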
Generate more realistic images by using LoRAs
How to train one:
Step 1. Install Training Software:
Kohya's Stable Diffusion GUI is a popular, easy-to-use platform for training LoRAs and checkpoints. To install it, follow the instructions on the GitHub page.
🚨Be aware that installing Kohya can run into numerous Python and CUDA issues. It is highly recommended to follow the installation steps carefully to avoid these problems.
https://github.com/bmaltais/kohya_ss
Step 2. Prepare Datasets
At this point, you should have the training software running and your images selected and resized.
Follow Kohya's recommended folder and file structure; a minimal sketch of it follows below.
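For reference, Kohya expects a top-level training folder containing img, log, and model, with the images inside a subfolder named <repeats>_<trigger word> <class>. The repeat count (10) and the "coral reef" names below are placeholders, so adjust them to your dataset:

```python
# Sketch: create the folder skeleton the Kohya GUI expects. The repeat
# count (10) and the "coral reef" trigger/class names are placeholders.
from pathlib import Path

root = Path("lora_training")
for sub in ("img/10_coral reef", "log", "model"):
    (root / sub).mkdir(parents=True, exist_ok=True)

# Put your resized training images (and optional .txt caption files)
# into lora_training/img/10_coral reef/ before starting training.
print(f"Dataset skeleton created under {root.resolve()}")
```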
💡Watch the following videos for step-by-step instructions.
Handy Tools:
Bulk Image Resizer:
https://imageresizer.com/bulk-resize
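If you'd rather script the resize step than use the web tool, here is a minimal Pillow sketch; the 512x512 target is an assumption (a common size for SD 1.5 training), and the paths are placeholders:

```python
# Sketch: bulk-resize a folder of images to 512x512 for training.
# Center-crops to square first so nothing gets stretched.
from pathlib import Path
from PIL import Image, ImageOps

src, dst = Path("raw_images"), Path("resized_512")
dst.mkdir(exist_ok=True)

for path in sorted(src.iterdir()):
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        continue
    img = Image.open(path).convert("RGB")
    img = ImageOps.fit(img, (512, 512), Image.LANCZOS)  # crop + resize
    img.save(dst / f"{path.stem}.png")
```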
Booru Dataset Tag Manager:
UPSCALING PROCESS
Upscaling steps:
In the Load Images (Path) node, paste the path to your image sequence into the directory field.
In the CR Upscale Image node, select the upscale_model and set the rescale_factor. I used 4x-UltraSharp as the upscale_model and rescaled the footage to 2x.
In the FILM VFI node, set the multiplier; I used 2.
In the Video Combine node, set the frame_rate. My input video's frame rate is 24 fps; after interpolation it becomes 24 × 2 = 48 fps, so I set frame_rate to 48 to match (see the sketch after these steps).
Alternatively, you can use the 'Save Image' node to preserve the image sequence as is, and then apply any necessary modifications in Nuke or After Effects at a later stage.
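The frame-rate bookkeeping in the last step generalizes: output fps equals input fps times the interpolation multiplier. A tiny sketch of that check, using the values from this example:

```python
# Sketch: keep Video Combine's frame_rate consistent with interpolation.
def output_fps(input_fps: float, multiplier: int) -> float:
    """FILM/RIFE interpolation multiplies the frame count, so the
    output frame rate must scale by the same factor to keep duration."""
    return input_fps * multiplier

assert output_fps(24, 2) == 48  # the values used in this example
```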
💡List of the models I have used for this example:
Animatediff v3:
https://huggingface.co/guoyww/animatediff/blob/main/v3_sd15_mm.ckpt
AnimateDiff v3 LoRA:
https://huggingface.co/guoyww/animatediff/blob/main/v3_sd15_adapter.ckpt
ControlNet-v1-1_fp16_safetensors:
EpicRealism Checkpoint:
https://civitai.com/models/25694?modelVersionId=143906
Underwater LoRAs:
https://civitai.com/models/107690/underwater-specializationorcoral-and-stone
https://civitai.com/models/59994/undersea-depths-fc-lora
Join our VFX-AI newsletter:
-Ardy