ICCV 2023 Paper Express (2023.8.16)! Several diffusion model and SAM-related studies
Latest results and demos:
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model
Paper: http://arxiv.org/pdf/2308.07918
Code: https://github.com/chuhanxx/helping_hand_for_egocentric_videos
Memory-and-Anticipation Transformer for Online Action Understanding
Paper: http://arxiv.org/pdf/2308.07893
Code: https://github.com/echo0125/memory-and-anticipation-transformer
ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces
Paper: http://arxiv.org/pdf/2308.07868
Code: https://github.com/qianyiwu/objectsdf_plus
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
Paper: http://arxiv.org/pdf/2308.07863
Code: None
ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition
Paper: http://arxiv.org/pdf/2308.07815
Code: https://github.com/cool-xuan/imbalanced_sam
Learning to Identify Critical States for Reinforcement Learning from Videos
Paper: http://arxiv.org/pdf/2308.07795
Code: https://github.com/ai-initiative-kaust/videorlcs
Identity-Consistent Aggregation for Video Object Detection
Paper: http://arxiv.org/pdf/2308.07737
Code: None
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
Paper: http://arxiv.org/pdf/2308.07732
Code: https://github.com/haiyang-w/unitr
DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models
Paper: http://arxiv.org/pdf/2308.07687
Code: https://github.com/cure-lab/diffguard
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
Paper: http://arxiv.org/pdf/2308.07686
Code: https://github.com/lihong2303/agm_iccv2023
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
Paper: http://arxiv.org/pdf/2308.07648
Code: None
Backpropagation Path Search On Adversarial Transferability
Paper: http://arxiv.org/pdf/2308.07625
Code: None
Story Visualization by Online Text Augmentation with Context Memory
Paper: http://arxiv.org/pdf/2308.07575
Code: None
3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack
Paper: http://arxiv.org/pdf/2308.07546
Code: None
Boosting Semi-Supervised Learning by bridging high and low-confidence predictions
Paper: http://arxiv.org/pdf/2308.07509
Code: None
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Paper: http://arxiv.org/pdf/2308.07498
Code: https://github.com/hanqingwangai/Dreamwalker
Probabilistic MIMO U-Net: Efficient and Accurate Uncertainty Estimation for Pixel-wise Regression
Paper: http://arxiv.org/pdf/2308.07477
Code: https://github.com/antonbaumann/mimo-unet
PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects
Paper: http://arxiv.org/pdf/2308.07391
Code: None
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
Paper: http://arxiv.org/pdf/2308.07787
Code: https://github.com/joannahong/diffv2s
