Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones

TL;DR: We introduce MobilePoG, a benchmark for evaluating point-of-gaze calibration, and propose a user-friendly dynamic calibration strategy which is robust to pose variation.

Abstract

Although appearance-based point-of-gaze (PoG) estimation has improved, the estimators still struggle to generalize across individuals due to personal differences. Therefore, person-specific calibration is required for accurate PoG estimation. However, calibrated PoG estimators are often sensitive to head pose variations. To address this, we investigate the key factors influencing calibrated estimators and explore pose-robust calibration strategies. Specifically, we first construct a benchmark, MobilePoG, which includes facial images from 32 individuals focusing on designated points under either fixed or continuously changing head poses. Using this benchmark, we systematically analyze how the diversity of calibration points and head poses influences estimation accuracy. Our experiments show that introducing a wider range of head poses during calibration improves the estimator’s ability to handle pose variation. Building on this insight, we propose a dynamic calibration strategy in which users fixate on calibration points while moving their phones. This strategy naturally introduces head pose variation during a user-friendly and efficient calibration process, ultimately producing a better calibrated PoG estimator that is less sensitive to head pose variations than those using conventional calibration strategies.

MobilePoG: A New Benchmark for Personalized PoG Calibration

Figure 1: Pipeline of dataset collection.

We introduce MobilePoG, a new phone-based PoG dataset that captures facial images of users while they fixate on predefined points under either fixed or continuously changing head poses. MobilePoG makes it possible to simulate diverse calibration scenarios, so it serves as a benchmark for evaluating calibration strategies and for systematically analyzing the factors that influence them. Figure 1 shows the collection pipeline of MobilePoG.

Figure 2: Examples of Static-MobilePoG.

Figure 3: Examples of Dynamic-MobilePoG.

Figure 2 shows examples from Static-MobilePoG, which contains 12 types of static head poses. Figure 3 shows examples from Dynamic-MobilePoG, whose frame sequences have continuously varying head poses. Together, the two subsets give MobilePoG rich head pose diversity, which makes it well suited for evaluating a model's robustness to head pose variation under different experimental settings.

Factor Analysis of Personalized PoG Calibration

Table 1: Results of a single calibration head pose and different numbers of calibration points on Static-MobilePoG.

Table 1 reports the performance of estimators calibrated on samples collected under a single head pose. The results indicate that calibration on samples from only one head pose fails to generalize to other head poses, regardless of the number of calibration points tested. Incorporating pose diversity in the calibration data is therefore essential for robust and reliable PoG estimation in real-world scenarios.
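A per-pose breakdown of the error makes this kind of generalization gap visible. The following is a minimal sketch, assuming the error is the Euclidean distance between the predicted and ground-truth on-screen points; the record layout and the names `mean_pog_error` and `error_by_pose` are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def mean_pog_error(preds, targets):
    """Mean Euclidean distance between predicted and true 2-D gaze points."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(preds, targets)) / len(preds)


def error_by_pose(records):
    """Report the mean error separately for each head pose.

    Each record is a (pose_id, pred_xy, target_xy) tuple (assumed layout).
    """
    grouped = defaultdict(lambda: ([], []))
    for pose_id, pred, target in records:
        grouped[pose_id][0].append(pred)
        grouped[pose_id][1].append(target)
    return {pose: mean_pog_error(p, t) for pose, (p, t) in grouped.items()}


# Toy records with made-up coordinates, purely for illustration.
records = [
    ("pose_a", (1.0, 1.0), (1.0, 2.0)),
    ("pose_a", (0.0, 0.0), (3.0, 4.0)),
    ("pose_b", (2.0, 2.0), (2.0, 2.0)),
]
per_pose = error_by_pose(records)
```

An estimator calibrated on one pose would be expected to show a low error for that pose and noticeably higher errors for the others in such a breakdown.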

Figure 4: Comparison of increasing calibration points and head poses in calibration samples. (a) shows increasing points with a single head pose. (b) shows increasing head poses with one point. (c) shows increasing head poses with five points.

Figure 4 illustrates how the average error of the calibrated estimator changes with increasing numbers of calibration points and head poses in the calibration samples. The results strongly suggest that head pose diversity is a critical factor in personalized calibration. Incorporating varied head poses into calibration samples substantially enhances the generalization ability of calibrated estimators and should be prioritized when designing practical calibration strategies.
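The three settings in Figure 4 can be simulated by drawing calibration subsets that vary the number of points and the number of head poses independently. The sketch below is one way to do that, assuming each sample carries a `point_id` and a `pose_id`; those field names, the function `select_calibration_subset`, and the 9-point grid are hypothetical, while the 12 poses match Static-MobilePoG.

```python
import random


def select_calibration_subset(samples, n_points, n_poses, seed=0):
    """Keep samples whose point and pose both fall in randomly chosen sets."""
    rng = random.Random(seed)
    points = sorted({s["point_id"] for s in samples})
    poses = sorted({s["pose_id"] for s in samples})
    if n_points > len(points) or n_poses > len(poses):
        raise ValueError("requested more points or poses than are available")
    chosen_points = set(rng.sample(points, n_points))
    chosen_poses = set(rng.sample(poses, n_poses))
    return [
        s for s in samples
        if s["point_id"] in chosen_points and s["pose_id"] in chosen_poses
    ]


# Toy index: 9 hypothetical gaze points, each seen under 12 static poses.
samples = [{"point_id": p, "pose_id": h} for p in range(9) for h in range(12)]

single_pose = select_calibration_subset(samples, n_points=5, n_poses=1)  # setting (a)
single_point = select_calibration_subset(samples, n_points=1, n_poses=4)  # setting (b)
multi_pose = select_calibration_subset(samples, n_points=5, n_poses=4)   # setting (c)
```

Fixing the seed keeps the subsets reproducible, so a change in error can be attributed to the number of points or poses rather than to which ones were drawn.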

Pose-Robust Calibration Strategy

As shown in Figure 1 (b), we propose a new dynamic calibration strategy. For each calibration point, the user rotates and translates the mobile phone within a limited range while keeping their gaze fixed on that point. This collects calibration samples with sufficiently diverse head poses, and it is also user-friendly and easy to perform on a mobile device.
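To illustrate how the frames from such a session could feed a person-specific correction, here is a deliberately simple sketch that fits a constant offset to the pooled frames. This offset model is only a stand-in chosen for brevity; it is not one of the calibration algorithms evaluated in the paper, and the `session` layout and the names `fit_offset` and `apply_offset` are hypothetical.

```python
def fit_offset(preds, targets):
    """Fit a constant 2-D offset as the mean residual (target - prediction)."""
    n = len(preds)
    if n == 0 or n != len(targets):
        raise ValueError("preds and targets must be non-empty and of equal length")
    dx = sum(t[0] - p[0] for p, t in zip(preds, targets)) / n
    dy = sum(t[1] - p[1] for p, t in zip(preds, targets)) / n
    return (dx, dy)


def apply_offset(pred, offset):
    """Shift an uncalibrated prediction by the fitted offset."""
    return (pred[0] + offset[0], pred[1] + offset[1])


# One calibration point; each frame is the uncalibrated prediction obtained
# while the phone was being moved (made-up values for illustration).
session = {
    (2.0, 3.0): [(1.5, 2.0), (1.0, 2.5), (2.0, 3.0)],
}

preds, targets = [], []
for target, frame_preds in session.items():
    for pred in frame_preds:
        preds.append(pred)
        targets.append(target)  # every frame shares the fixated point as its label

offset = fit_offset(preds, targets)
calibrated = apply_offset((1.0, 1.0), offset)
```

The point of the sketch is the data flow: every frame recorded during the motion shares the same label, so a single fixation yields many labeled samples under different head poses.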

Table 2: Comparison of the static and the dynamic strategy on Dynamic-MobilePoG.

Table 2 reports the errors of the calibrated estimators under the static and the dynamic strategy across various configurations (two models, four calibration algorithms, and four different numbers of calibration points). These results show that the dynamic strategy, which collects calibration samples with rich pose diversity, significantly improves the robustness of the calibrated model.

Figure 5: Calibration stability of the static and the dynamic strategy.

Figure 5 shows the performance of iTracker and AFFNet across calibration epochs when calibrated with Finetune MLP and Full Finetune. With the proposed dynamic strategy, the person-specific estimator performs both well and stably across epochs, enabling efficient personalized PoG calibration.

BibTeX


@misc{zhao2025poserobustcalibrationstrategypointofgaze,
  title={Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones},
  author={Yujie Zhao and Jiabei Zeng and Shiguang Shan},
  year={2025},
  eprint={2508.10268},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.10268},
}