Sato Lab./Sugano Lab.
Yifei Huang
Latest
ActionVOS: Action as Prompts for Video Object Segmentation
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Matching Compound Prototypes for Few-Shot Action Recognition
Proposal-based Temporal Action Localization with Point-level Supervision
FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotations
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
Technical Report for EgoTracks in Ego4D Challenge 2023
Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training
Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos
Compound Prototype Matching for Few-shot Action Recognition
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition
Precise Affordance Annotation for Egocentric Action Video Datasets
Spatio-Temporal Perturbations for Video Attribution
Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data
Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips
EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report
Toward Visually Explaining Video Understanding Networks by Perturbation
Mutual Context Network for Jointly Estimating Egocentric Gaze and Action
Improving Action Segmentation via Graph Based Temporal Reasoning
An ego-vision system for discovering human joint attention
Manipulation-Skill Assessment from Videos with Spatial Attention Network
Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions
Predicting Gaze in Egocentric Video by Learning Task-Dependent Attention Transition
Temporal Localization and Spatial Segmentation of Joint Attention in Multiple First-Person Videos