Project Overview
Collecting real training data for human pose estimation is slow, expensive, and legally complicated. Generating it synthetically is faster, cheaper, and far easier to scale, provided you can close the gap between rendered images and real ones.
This pipeline generates photorealistic training images with automatic keypoint annotations using TensorFlow and HRNet, reducing manual labeling costs while achieving a 40% improvement in training efficiency for pose estimation tasks. The core challenge was not rendering quality but domain adaptation: making models trained on synthetic data generalize to real-world images without losing accuracy.
The pipeline covers rendering with domain randomization (lighting, textures, backgrounds), automated COCO-format annotation, and augmentation steps designed to push the synthetic distribution toward the real one during training.
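As a rough illustration of the first two stages, the sketch below samples one set of randomized scene settings and packs rendered keypoints into a COCO-style keypoint annotation. It is not the project's actual code: the function names, the parameter names and ranges in `sample_scene_params`, and the choice to derive the bounding box from the labeled keypoints are all assumptions made for the example. Only the annotation fields and the 17 keypoint names follow the public COCO person-keypoints format.

```python
import random

# The 17 keypoint names defined by the COCO person-keypoints format.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def sample_scene_params(rng):
    """Draw one set of randomized render settings.

    All keys and ranges here are hypothetical placeholders; a real renderer
    would expose its own lighting, texture, and background controls.
    """
    return {
        "light_intensity": rng.uniform(0.3, 2.0),
        "light_azimuth_deg": rng.uniform(0.0, 360.0),
        "texture_id": rng.randrange(100),
        "background_id": rng.randrange(500),
    }


def to_coco_annotation(ann_id, image_id, keypoints_xyv):
    """Build a COCO-style keypoint annotation from (x, y, visibility) triples.

    Visibility follows COCO: 0 = not labeled, 1 = labeled but occluded,
    2 = labeled and visible.
    """
    if len(keypoints_xyv) != len(COCO_KEYPOINT_NAMES):
        raise ValueError("expected one (x, y, v) triple per COCO keypoint")

    flat = []     # COCO stores keypoints as [x1, y1, v1, x2, y2, v2, ...]
    labeled = []  # (x, y) of every keypoint with v > 0
    for x, y, v in keypoints_xyv:
        if v == 0:
            flat.extend([0, 0, 0])  # unlabeled keypoints are zeroed
        else:
            flat.extend([float(x), float(y), int(v)])
            labeled.append((float(x), float(y)))

    # Assumption: the box is the tight extent of the labeled keypoints.
    # A renderer would more likely take it from the person's mask.
    if labeled:
        xs = [p[0] for p in labeled]
        ys = [p[1] for p in labeled]
        x0, y0 = min(xs), min(ys)
        w, h = max(xs) - x0, max(ys) - y0
    else:
        x0 = y0 = w = h = 0.0

    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,  # "person" in COCO
        "keypoints": flat,
        "num_keypoints": len(labeled),
        "bbox": [x0, y0, w, h],  # COCO boxes are [x, y, width, height]
        "area": w * h,
        "iscrowd": 0,
    }
```

Because the renderer already knows where every joint projects in the image, annotations like this can be written at render time with no manual labeling, and a seeded `random.Random` instance keeps the sampled scene settings reproducible.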