# ST-Gaze: Spatio-Temporal Feature Representations for Video-Based Gaze Estimation

This repository contains the official implementation and supplementary materials for our WACV 2026 paper, _Learning Spatio-Temporal Feature Representations for Video-Based Gaze Estimation_.

## Overview

ST-Gaze is a deep learning framework for estimating gaze direction from video sequences. It leverages spatio-temporal modeling to improve accuracy over single-frame approaches. The codebase includes training, evaluation, and visualization scripts, as well as pretrained weights and configuration files.

## Directory Structure

- `models/` — Model architectures and loss functions
- `datasources/` — Data loading and preprocessing utilities
- `config/` — Configuration files and helpers
- `experiments/` — Training logs and experiment outputs
- `weights/` — Pretrained model weights
- `segmentation_cache/` — Cached segmentation results
- Main scripts: `training.py`, `training_combined.py`, `training_vectorized.py`, `evaluation.py`, `visualization.py`, `convert_numpy.py`, `complexity_stats.py`, `inference_speed.py`, `eval_codalab.py`, `eval_codalab_combined.py`

## Installation

1. Create a conda environment with Python 3.12 or later:
    ```bash
    conda create -n stgaze python=3.12
    conda activate stgaze
    ```
2. Install dependencies:
    ```bash
    pip install -r requirements.txt
    ```

## Usage

### Configuration

Training and evaluation parameters are set in the configuration files under `config/`.
Either edit `config_default.py` directly or create a new configuration file based on it.

### Training

To train a model:
```bash
python training.py
```

### Evaluation

To evaluate a trained model:
```bash
python evaluation.py
```
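
Gaze estimation is commonly scored by the angular error between predicted and ground-truth gaze directions. The sketch below shows that metric in plain Python for reference. Whether `evaluation.py` computes it in exactly this way is an assumption, and the function names `angular_error_deg` and `mean_angular_error_deg` are hypothetical, not part of this codebase.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze direction vectors.

    Hypothetical helper for illustration; not taken from this repository.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_sim))


def mean_angular_error_deg(preds, targets):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)
```

Because both vectors are normalized inside the function, the inputs do not need to be unit length.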

### Visualization

To visualize metrics on validation predictions:
```bash
python visualization.py
```

### Complexity and Inference Speed

To assess model complexity and inference speed:
```bash
python complexity_stats.py
```
```bash
python inference_speed.py
```
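
For readers who want to time their own forward pass, the following is a minimal sketch of a latency benchmark with warm-up iterations. It is not the implementation in `inference_speed.py`; the function name `benchmark` and its `warmup` and `runs` parameters are assumptions made for illustration.

```python
import statistics
import time


def benchmark(fn, *, warmup=5, runs=20):
    """Time a zero-argument callable and return (mean_ms, stdev_ms).

    Hypothetical helper for illustration; not taken from this repository.
    """
    # Warm-up calls are excluded from the measurement so that one-time
    # costs (lazy initialization, caching) do not skew the result.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(samples_ms)
    # stdev needs at least two samples.
    stdev_ms = statistics.stdev(samples_ms) if runs > 1 else 0.0
    return mean_ms, stdev_ms
```

A call such as `benchmark(lambda: model(batch))` would wrap a model's forward pass, where `model` and `batch` are placeholders. When timing on a GPU with PyTorch, the callable should also call `torch.cuda.synchronize()` before returning, since CUDA kernels run asynchronously and the clock would otherwise stop too early.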

## Pretrained Weights

Pretrained weights for EfficientNet-B3 are provided in `weights/efficientnet_b3/best.pth`.

## Supplementary Materials

- Training logs and experiment results are available in the `experiments/` folder.
- Segmentation caches for different sequence lengths are in `segmentation_cache/`.