RealRestorer A new king of open-source image rescue. An all-in-one DiT (Step1X-Edit) that handles nine restoration tasks. 🔴 Deblurring, denoising, rain/haze removal, low-light enhancement, moiré and reflection removal, and compression artifacts. 🔴1024x1024 pixels. 🔴 Step1X-Edit + Flux-VAE 🔴Outperforming FLUX.1-Kontext-dev and Qwen-Image-Edit. 👉 Project 👉 Code 👉 Model
TokenVibe
AI dev ⬥ vibe‑coding ⬥ creativity One author ⬦ one POV ⬦ zero noise ComfyUI, models, AI stuff
Top posts
LTX2.3 I2V/T2V ID-LoRA Custom spoken-audio input strips the ambient sound away. With ID-LoRA you instead prompt what the person should say, the background sound, and so on. All it needs is a 5-second reference audio clip, from which you can prompt any dialog, giving you full flexibility. 👉 Workflow 👉 ID-LoRA CelebVHQ-3K 👉 ID-LoRA TalkVid-3K
MACRO Fixes multi-reference image generation by providing a large-scale dataset and benchmark designed to handle up to 10 input images per prompt. Bagel, OmniGen2, Qwen-Image-Edit-2511. 👉 Project 👉 HF 👉 GitHub
ComfyUI-DaVinci-MagiHuman 🔴Block-level CPU/GPU swapping 🔴Async CUDA prefetching 🔴Distill mode 🔴1080p super-resolution 🔴TurboVAE decoder 🔴Audio + video 👉 GitHub
Cohere Transcribe Local ASR just got a massive efficiency upgrade. Hit #1 on the ASR leaderboard (5.42% WER). 🔴 Apache 2.0 🔴 Beats Whisper v3 and ElevenLabs Scribe v2 🔴Robust on messy boardroom acoustics (AMI/Voxpopuli) 🔴 High-throughput Conformer architecture 👉 Blog 👉 HF 👉 Playground
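For context on the WER figure: word error rate is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. A minimal sketch of that standard definition follows. The function name and the sample sentences are illustrative only, and leaderboards typically normalize text (casing, punctuation) before scoring, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example: one substitution ("sat" -> "sit") and one deletion
# (second "the") over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.2%}")
```

A 5.42% WER therefore means roughly 5 to 6 word errors per 100 reference words on the leaderboard's test sets.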
PixelSmile Continuous, high-fidelity facial expression editing across both photorealistic portraits and stylized characters. 🔴12 expression categories with linear intensity control via flow-matching-based latent interpolation. 🔴 Qwen-Image-Edit-2511 + LoRA. 🔴 Preserves subject identity, hairstyle, and background consistency through ArcFace-supervised identity loss. 👉 Project 👉 Model 👉 GitHub
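To illustrate what "linear intensity control" can mean in general, here is a generic linear-interpolation sketch between two latent vectors. This is an assumption for illustration, not PixelSmile's actual implementation: the function name, the plain-list latents, and the idea of a "neutral" and a "target" latent are all hypothetical.

```python
def lerp_latent(neutral, target, intensity):
    """Blend two latent vectors: 0.0 returns neutral, 1.0 returns target.

    Hypothetical helper: real latents would be tensors, not Python lists.
    """
    if len(neutral) != len(target):
        raise ValueError("latents must have the same length")
    return [n + intensity * (t - n) for n, t in zip(neutral, target)]


# Half-strength expression: the midpoint between the two toy latents.
print(lerp_latent([0.0, 2.0], [1.0, 4.0], 0.5))  # [0.5, 3.0]
```

Sweeping `intensity` from 0 to 1 moves the result along a straight line between the two latents, which is the property a continuous intensity slider relies on.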
ShotStream Turns narrative video into a next-shot generation task. 🔴 Dual-Cache Memory: keeps global context (inter-shot) + local context (intra-shot). 🔴 4-step generation via DMD. 🔴25x throughput gain. 👉 Project 👉 Models 👉 GitHub
X-Dub Context-rich visual dubbing: a video-to-video editing approach that synchronizes lip movements with new audio while preserving identity and handling occlusions. 🔴Noise-level specialized LoRA experts for structure, lip-sync, and texture refinement. 🔴1B-parameter DiT + Whisper audio encoding. 🔴 96.36% success rate on a complex dataset. 👉 Project 👉 GitHub 👉 Model
PSDesigner An automated graphic design system built on Qwen2.5-VL-7B. 🔴Uses three specialized components (AssetCollector, GraphicPlanner, and ToolExecutor) to perform 70+ Photoshop operations via JavaScript APIs. 🔴Natively renders text layers, eliminating the spelling and distortion issues common in standard T2I models. 🔴Self-refinement mechanism that autonomously evaluates and fixes suboptimal layers and layouts. 🔴Complex hierarchies, visual effects, and coherent typography from simple text p...
Calibri Calibration is the new fine-tuning. 🔴2x speedup (FLUX 30 -> 15 steps). 🔴 3.3x for Qwen (100 -> 30). 🔴 Zero extra compute or params. 👉 Project 👉 Models 👉 GitHub
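The speedup figures follow directly from the step counts if each sampling step is assumed to cost the same, so that speedup is simply baseline steps divided by reduced steps. A quick check of that arithmetic (the helper name is illustrative):

```python
def step_speedup(baseline_steps: int, reduced_steps: int) -> float:
    """Speedup from running fewer sampling steps, assuming equal cost per step."""
    if baseline_steps <= 0 or reduced_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / reduced_steps


print(round(step_speedup(30, 15), 1))   # FLUX: 2.0
print(round(step_speedup(100, 30), 1))  # Qwen: 3.3
```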