[CVPR 2024] HUGS paper review
HUGS: Human Gaussian Splats
HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video
Shape of Motion: 4D Reconstruction from a Single Video
Dance In the Wild: Monocular Human Animation with Neural Dynamic Appearance Synthesis
InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion
Sitcom3D: The One Where They Reconstructed 3D Humans and Environments in TV shows
multishot: Human Mesh Recovery from Multiple Shots
Spatial VLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
CoCoOp: Conditional Prompt Learning for Vision-Language Models
CoOp: Learning to Prompt for Vision-Language Models
CLIP: Learning Transferable Visual Models From Natural Language Supervision
METRO: End-to-End Human Pose and Mesh Reconstruction with Transformers
VIBE: Video Inference for Human Body Pose and Shape Estimation
Deformable DETR: Deformable Transformers for End-to-End Object Detection
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection