Publications
Research
I’m broadly interested in computer vision and machine learning. Much of my research is about 3D vision, graph neural networks, hand-object interaction and robotics.
GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement. Linfang Zheng, Tze Ho Elden Tse, Chen Wang, Yinghan Sun, Hua Chen, Ales Leonardis, Wei Zhang, Hyung Jin Chang. CVPR, 2024. [pdf] [code] We introduce a novel framework for category-level object pose refinement that integrates an HS-layer and learnable affine transformations to enhance the extraction and alignment of geometric information.
Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images. Tze Ho Elden Tse, Franziska Mueller, Zhengyang Shen, Danhang Tang, Thabo Beeler, Mingsong Dou, Yinda Zhang, Sasa Petrovic, Hyung Jin Chang, Jonathan Taylor, Bardia Doosti. ICCV, 2023. [pdf] [webpage] [code] We present a spectral graph-based Transformer framework that reconstructs two high-fidelity hands from multi-view RGB images. The proposed framework combines ideas from spectral graph theory and Transformers.
DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation. Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang. ICCV, 2023. [pdf] [webpage] We present a diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
Mutual Information-based Temporal Difference Learning for Human Pose Estimation in Video. Runyang Feng, Yixing Gao, Xueqing Ma, Tze Ho Elden Tse, Hyung Jin Chang. CVPR, 2023. [pdf] [webpage] We present a multi-frame human pose estimation framework that uses temporal differences across frames to model dynamic contexts.
S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning. Tze Ho Elden Tse$^*$, Zhongqun Zhang$^*$, Kwang In Kim, Ales Leonardis, Feng Zheng, Hyung Jin Chang. ECCV, 2022. [pdf] [webpage] We propose a semi-supervised framework that learns contact from monocular videos.
Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution. Tze Ho Elden Tse, Kwang In Kim, Ales Leonardis, Hyung Jin Chang. CVPR, 2022. [pdf] [webpage] We propose a collaborative learning framework that jointly reconstructs the hand and object from a single RGB image.
TP-AE: Temporally Primed 6D Object Pose Tracking with Auto-Encoders. Linfang Zheng, Ales Leonardis, Tze Ho Elden Tse, Nora Horanyi, Wei Zhang, Hua Chen, Hyung Jin Chang. ICRA, 2022. [pdf] This paper focuses on instance-level 6D object pose tracking, targeting symmetric and textureless objects under occlusion.
No Need to Scream: Robust Sound-based Speaker Localisation in Challenging Scenarios. Tze Ho Elden Tse, D. De Martini and L. Marchegiani. ICSR, 2019. [pdf] Master's project at the Oxford Robotics Institute.