Accurate baseball player pose refinement using motion prior guidance

ICT Express 2025

Seunghyun Oh, Heewon Kim
Soongsil University

We propose a novel baseball player pose refinement method that leverages motion prior guidance to achieve accurate pose estimation for baseball swing motions.

Abstract

Human pose estimation (HPE) is challenging due to the need to accurately capture rapid and occluded body movements, often resulting in uncertain predictions. In the context of fast sports actions like baseball swings, existing HPE methods insufficiently leverage domain-specific prior knowledge about these movements. To address this gap, we propose the Baseball Player Pose Corrector (BPPC), an optimization framework that utilizes high-quality 3D standard motion data to refine 2D keypoints in baseball swing videos. BPPC operates in two stages: first, it aligns the 3D standard motion to test swing videos through action recognition, offset learning, and 3D-to-2D projection. Next, it applies movement-aware optimization to refine the keypoints, ensuring robustness to variations in swing patterns. Notably, BPPC does not rely on additional datasets; it only requires manually annotated 3D standard motion data for baseball swings. Experimental results demonstrate that BPPC improves keypoint estimation accuracy by up to 2.4% on a baseball swing dataset, particularly enhancing keypoints with confidence scores below 0.5. Qualitative analysis further highlights BPPC’s ability to correct rapidly moving joints, such as elbows and wrists.


Baseball Player Pose Corrector (BPPC)

Method Pipeline

BPPC (Baseball Player Pose Corrector) is a framework that precisely refines baseball player poses via two optimization steps. First, it spatiotemporally aligns 3D standard movements with the test video. Second, it uses the aligned data as a reference to finely adjust keypoint locations for improved precision.

4D Motion Projection focuses on resolving the spatiotemporal misalignment between the 3D standard motion (\(s\)) and the keypoints (\(x\)) estimated from the test video. It aligns the start and end of the swing through Action Recognition using a grid sampler, compensates for individual swing pattern differences via Offset Learning, and matches camera viewpoints using a projection matrix \(P\). This entire process is designed to be differentiable, allowing the standard motion to be accurately projected onto the subject's position and pose in the test video using gradient descent, without requiring a separate training dataset.
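The three alignment steps can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`resample_motion`, `add_offsets`, `project_frame`) are hypothetical, linear interpolation over time stands in for the grid sampler, offsets are assumed to be additive per-frame, per-keypoint 3D displacements, and \(P\) is assumed to be a 3x4 matrix applied in homogeneous coordinates. In the actual method the swing boundaries, the offsets, and \(P\) would be optimized by gradient descent; here they are fixed inputs.

```python
def resample_motion(motion, start, end, num_frames):
    """Temporally align a motion by sampling it between fractional frame
    indices `start` and `end` with linear interpolation.

    motion: list of frames, each frame a list of (x, y, z) keypoints.
    """
    n = len(motion)
    resampled = []
    for i in range(num_frames):
        # Fractional source frame index for output frame i.
        t = start + (end - start) * i / max(num_frames - 1, 1)
        t = min(max(t, 0.0), n - 1.0)  # clamp to the valid range
        lo = int(t)
        hi = min(lo + 1, n - 1)
        w = t - lo
        frame = [
            tuple((1.0 - w) * a + w * b for a, b in zip(p, q))
            for p, q in zip(motion[lo], motion[hi])
        ]
        resampled.append(frame)
    return resampled


def add_offsets(motion, offsets):
    """Add a 3D offset to every keypoint (assumed additive form)."""
    return [
        [tuple(a + d for a, d in zip(p, o)) for p, o in zip(frame, frame_off)]
        for frame, frame_off in zip(motion, offsets)
    ]


def project_frame(frame, P):
    """Project 3D keypoints to 2D with a 3x4 matrix P (perspective divide)."""
    projected = []
    for x, y, z in frame:
        h = (x, y, z, 1.0)
        u, v, w = [sum(P[r][c] * h[c] for c in range(4)) for r in range(3)]
        projected.append((u / w, v / w))
    return projected
```

Because every step is built from sums, products, and one division, the same computation written in an automatic-differentiation framework would be differentiable with respect to the swing boundaries, the offsets, and \(P\), which is the property the text relies on.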

Pose Refinement generates the final 2D keypoints (\(y\)) from the previously optimized standard motion (\(\overline{s}^{*}\)). To correct keypoints with low confidence (\(c_{f,k}\)) caused by motion blur or occlusion, the framework uses the velocity and acceleration of the standard motion as constraints, expressed as the loss terms \(\mathcal{L}_{vel}\) and \(\mathcal{L}_{accel}\). This lets fast-moving joints, such as wrists and elbows, be adjusted along the standard motion's trajectory, yielding poses that are consistent with actual baseball swing mechanics.
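To make the role of the confidence scores and the motion terms concrete, the sketch below refines a single 1D keypoint coordinate over time. It is a toy under stated assumptions, not the paper's objective: the loss form (a confidence-weighted data term plus squared differences of first- and second-order finite differences), the weights `w_vel` and `w_accel`, the function names, and the numerical gradient are all illustrative choices.

```python
def finite_diff(seq):
    """First-order temporal difference of a trajectory."""
    return [b - a for a, b in zip(seq, seq[1:])]


def refinement_loss(y, x, conf, ref, w_vel=1.0, w_accel=1.0):
    """Assumed objective: stay near confident detections x, and match the
    velocity and acceleration of the reference (standard) motion ref."""
    data = sum(c * (yi - xi) ** 2 for yi, xi, c in zip(y, x, conf))
    vel_y, vel_ref = finite_diff(y), finite_diff(ref)
    l_vel = sum((a - b) ** 2 for a, b in zip(vel_y, vel_ref))
    acc_y, acc_ref = finite_diff(vel_y), finite_diff(vel_ref)
    l_accel = sum((a - b) ** 2 for a, b in zip(acc_y, acc_ref))
    return data + w_vel * l_vel + w_accel * l_accel


def refine(x, conf, ref, steps=2000, lr=0.02, eps=1e-5):
    """Plain gradient descent on y, using central-difference gradients."""
    y = list(x)
    for _ in range(steps):
        grad = []
        for i in range(len(y)):
            plus = y[:i] + [y[i] + eps] + y[i + 1:]
            minus = y[:i] + [y[i] - eps] + y[i + 1:]
            g = (refinement_loss(plus, x, conf, ref)
                 - refinement_loss(minus, x, conf, ref)) / (2 * eps)
            grad.append(g)
        y = [yi - lr * g for yi, g in zip(y, grad)]
    return y


# Toy example: frame 2 is a low-confidence outlier (5.0 instead of about 2.0).
ref = [0.0, 1.0, 2.0, 3.0, 4.0]    # reference trajectory
x = [0.0, 1.0, 5.0, 3.0, 4.0]      # detected keypoints
conf = [1.0, 1.0, 0.1, 1.0, 1.0]   # detector confidence per frame
y = refine(x, conf, ref)
```

In this toy setting the low-confidence outlier is pulled back toward the reference trajectory, while the high-confidence frames move only slightly, because the data term holds them close to their detections.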


Results

Quantitative Results


BPPC enhances keypoint estimation accuracy particularly when confidence scores are low. In the low-confidence range \([0, 0.5)\), the framework achieves an accuracy improvement of over \(2.1\%\) compared to HRNet and DARK (HRNet-W48), whereas no notable improvement is observed in the high-confidence range \([0.9, 1.0]\). Given that low confidence often results from fast-moving or occluded joints during a swing, these findings suggest that prior-based optimization serves as a vital complement to data-driven Human Pose Estimation (HPE) models.
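A per-confidence-range breakdown of this kind can be computed by bucketing keypoints by their confidence score before scoring them. The sketch below is illustrative only: it assumes a simple distance-threshold (PCK-style) accuracy, which may differ from the metric used in the paper, and the function name `accuracy_by_confidence` and its parameters are hypothetical.

```python
import math


def accuracy_by_confidence(preds, targets, confs, bins, threshold):
    """Fraction of keypoints within `threshold` pixels of the ground truth,
    reported separately for each confidence range in `bins`.

    bins: list of (low, high) pairs, treated as [low, high), except that a
    range ending at 1.0 also includes confidence exactly 1.0.
    """
    results = {}
    for low, high in bins:
        hits = 0
        total = 0
        for (px, py), (tx, ty), c in zip(preds, targets, confs):
            in_bin = low <= c < high or (high == 1.0 and c == 1.0)
            if not in_bin:
                continue
            total += 1
            if math.hypot(px - tx, py - ty) <= threshold:
                hits += 1
        # None marks an empty bin rather than reporting a misleading 0.
        results[(low, high)] = hits / total if total else None
    return results
```

Running the same function on the baseline keypoints and on the refined keypoints, with the baseline's confidence scores defining the bins, gives the kind of side-by-side comparison described above.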


Qualitative Results


BPPC refines keypoints that baseline models often fail to localize. As shown in the visual comparisons, BPPC corrects the positions of elbows and wrists, joints that are frequently affected by motion blur, occlusion, or visual similarity with the background and equipment (e.g., the bat). Correcting these errors in the output of baselines such as HRNet-W48 yields more reliable poses under the challenging visual conditions of high-speed batting motions.


BibTeX

@article{OH2025,
    title = {Accurate baseball player pose refinement using motion prior guidance},
    journal = {ICT Express},
    year = {2025},
    issn = {2405-9595},
    doi = {https://doi.org/10.1016/j.icte.2025.03.008},
    url = {https://www.sciencedirect.com/science/article/pii/S2405959525000360},
    author = {Seunghyun Oh and Heewon Kim},
    keywords = {Human pose estimation, Human pose refinement, Deep learning}
}