Publications
Research
I’m broadly interested in computer vision and machine learning. Much of my research is about 3D vision, graph neural networks, hand-object interaction and robotics.
DAS3R: Dynamics-Aware Gaussian Splatting for Static Scene Reconstruction Kai Xu, Tze Ho Elden Tse, Jizong Peng, Angela Yao arXiv, 2024 [pdf] [code] [webpage] We present a novel framework for scene decomposition and static background reconstruction from unposed videos.
Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery Fengyuan Yang, Kerui Gu, Ha Linh Nguyen, Tze Ho Elden Tse, Angela Yao arXiv, 2024 [pdf] [webpage] We present an optimization-free scale calibration framework for global human motion recovery.
Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics Tze Ho Elden Tse, Runyang Feng, Linfang Zheng, Jiho Park, Yixing Gao, Jihie Kim, Ales Leonardis, Hyung Jin Chang AAAI, 2025 [pdf] We introduce a collaborative learning framework for 3D hand-object reconstruction and compositional action recognition using superquadrics.
GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement Linfang Zheng, Tze Ho Elden Tse, Chen Wang, Yinghan Sun, Hua Chen, Ales Leonardis, Wei Zhang, Hyung Jin Chang CVPR, 2024 [pdf] [code] We introduce a novel framework for category-level object pose refinement that integrates an HS-layer and learnable affine transformations to enhance the extraction and alignment of geometric information.
Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images Tze Ho Elden Tse, Franziska Mueller, Zhengyang Shen, Danhang Tang, Thabo Beeler, Mingsong Dou, Yinda Zhang, Sasa Petrovic, Hyung Jin Chang, Jonathan Taylor, Bardia Doosti ICCV, 2023 [pdf] [webpage] [code] We present a spectral graph-based Transformer framework that reconstructs two high-fidelity hands from multi-view RGB images. The proposed framework combines ideas from spectral graph theory and Transformers.
DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang ICCV, 2023 [pdf] [webpage] We present a diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
Mutual Information-based Temporal Difference Learning for Human Pose Estimation in Video Runyang Feng, Yixing Gao, Xueqing Ma, Tze Ho Elden Tse, Hyung Jin Chang CVPR, 2023 [pdf] [webpage] We present a multi-frame human pose estimation framework that employs temporal differences across frames to model dynamic contexts.
S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning Tze Ho Elden Tse$^*$, Zhongqun Zhang$^*$, Kwang In Kim, Ales Leonardis, Feng Zheng, Hyung Jin Chang ECCV, 2022 [pdf] [webpage] We propose a semi-supervised framework that learns contact from monocular videos.
Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution Tze Ho Elden Tse, Kwang In Kim, Ales Leonardis, Hyung Jin Chang CVPR, 2022 [pdf] [webpage] We propose a collaborative learning framework that jointly reconstructs the hand and object from a single RGB image.
TP-AE: Temporally Primed 6D Object Pose Tracking with Auto-Encoders Linfang Zheng, Ales Leonardis, Tze Ho Elden Tse, Nora Horanyi, Wei Zhang, Hua Chen, Hyung Jin Chang ICRA, 2022 [pdf] This paper focuses on instance-level 6D object pose tracking. In particular, it targets symmetric and textureless objects under occlusion.
No Need to Scream: Robust Sound-based Speaker Localisation in Challenging Scenarios Tze Ho Elden Tse, D. De Martini and L. Marchegiani ICSR, 2019 [pdf] Master's project at the Oxford Robotics Institute.