Publications

Research

I’m broadly interested in computer vision and machine learning. Much of my research is about 3D vision, graph neural networks, hand-object interaction and robotics.

DAS3R: Dynamics-Aware Gaussian Splatting for Static Scene Reconstruction
Kai Xu, Tze Ho Elden Tse, Jizong Peng, Angela Yao
arXiv, 2024
[pdf] [code] [webpage]
We present a novel framework for scene decomposition and static background reconstruction from unposed videos.
Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery
Fengyuan Yang, Kerui Gu, Ha Linh Nguyen, Tze Ho Elden Tse, Angela Yao
arXiv, 2024
[pdf] [webpage]
We present an optimization-free scale calibration framework for global human motion recovery.
Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics
Tze Ho Elden Tse, Runyang Feng, Linfang Zheng, Jiho Park, Yixing Gao, Jihie Kim, Ales Leonardis, Hyung Jin Chang
AAAI, 2025
[pdf]
We introduce a collaborative learning framework for 3D hand-object reconstruction and compositional action recognition using superquadrics.
GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement
Linfang Zheng, Tze Ho Elden Tse, Chen Wang, Yinghan Sun, Hua Chen, Ales Leonardis, Wei Zhang, Hyung Jin Chang
CVPR, 2024
[pdf] [code]
We introduce a novel framework for category-level object pose refinement which integrates an HS-layer and learnable affine transformations to enhance the extraction and alignment of geometric information.
Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images
Tze Ho Elden Tse, Franziska Mueller, Zhengyang Shen, Danhang Tang, Thabo Beeler, Mingsong Dou, Yinda Zhang, Sasa Petrovic, Hyung Jin Chang, Jonathan Taylor, Bardia Doosti
ICCV, 2023
[pdf] [webpage] [code]
We present a spectral graph-based Transformer framework that reconstructs two high-fidelity hands from multi-view RGB images. The framework combines ideas from spectral graph theory and Transformers.
DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation
Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang
ICCV, 2023
[pdf] [webpage]
We present a diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
Mutual Information-based Temporal Difference Learning for Human Pose Estimation in Video
Runyang Feng, Yixing Gao, Xueqing Ma, Tze Ho Elden Tse, Hyung Jin Chang
CVPR, 2023
[pdf] [webpage]
We present a multi-frame human pose estimation framework, which employs temporal differences across frames to model dynamic contexts.
S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning
Tze Ho Elden Tse$^*$, Zhongqun Zhang$^*$, Kwang In Kim, Ales Leonardis, Feng Zheng, Hyung Jin Chang
ECCV, 2022
[pdf] [webpage]
We propose a semi-supervised framework that learns contact from monocular videos.
Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution
Tze Ho Elden Tse, Kwang In Kim, Ales Leonardis, Hyung Jin Chang
CVPR, 2022
[pdf] [webpage]
We propose a collaborative learning framework that jointly reconstructs the hand and the object from a single RGB image.
TP-AE: Temporally Primed 6D Object Pose Tracking with Auto-Encoders
Linfang Zheng, Ales Leonardis, Tze Ho Elden Tse, Nora Horanyi, Wei Zhang, Hua Chen, Hyung Jin Chang
ICRA, 2022
[pdf]
This paper focuses on instance-level 6D object pose tracking. In particular, it targets symmetric and textureless objects under occlusion.
No Need to Scream: Robust Sound-based Speaker Localisation in Challenging Scenarios
Tze Ho Elden Tse, D. De Martini, L. Marchegiani
ICSR, 2019
[pdf]
Master's project at the Oxford Robotics Institute.