# HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformer

## Testing Environment
* OS: Ubuntu 18.04
* CUDA: 11.4
* Python: 3.7
* GPU: RTX 3090

## Requirements
1. pytorch >= 1.10.1 
2. spconv >= 2.1.21
3. mmdet==2.14.0
4. mmsegmentation==0.14.1
5. mmcv-full==1.4.0
6. mmdetection3d, opencv, open3d, tqdm, tensorboardX, numpy, cython, einops, pyyaml, shapely
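
To confirm that an environment matches the pinned versions above, a small check along the following lines may help. This is only a sketch: the import names (`torch`, `spconv`, `mmdet`, `mmseg`, `mmcv`) and the reliance on each module exposing a `__version__` attribute are assumptions, and the helper names `parse_version`, `satisfies`, and `report` are hypothetical, not part of this repository.

```python
import importlib
import operator
import re

# Constraints copied from the Requirements list above.
# Assumption: these import names correspond to the listed packages.
REQUIREMENTS = {
    "torch": (">=", "1.10.1"),
    "spconv": (">=", "2.1.21"),
    "mmdet": ("==", "2.14.0"),
    "mmseg": ("==", "0.14.1"),
    "mmcv": ("==", "1.4.0"),
}

_OPS = {">=": operator.ge, "==": operator.eq}


def parse_version(text):
    """Turn '1.10.1+cu113' into (1, 10, 1), ignoring local build tags."""
    core = text.split("+")[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings with '>=' or '=='."""
    a, b = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so '2.2' compares like '2.2.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return _OPS[op](a, b)


def report(requirements=REQUIREMENTS):
    """Return a {module_name: status} dict for the given constraints."""
    results = {}
    for module_name, (op, required) in requirements.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            results[module_name] = "not installed"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            results[module_name] = "version unknown"
        elif satisfies(installed, op, required):
            results[module_name] = "ok ({})".format(installed)
        else:
            results[module_name] = "found {}, need {}{}".format(
                installed, op, required)
    return results
```

Calling `report()` and printing the returned dictionary gives a quick overview of which requirements are missing or mismatched before building the extensions.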

## Environment preparation
Please build the extension modules first:
```bash
python3 hmvit/utils/setup.py build_ext --inplace
```

Then install the package in development mode:
```bash
python setup.py develop
```

## Running instructions
* Please download the OPV2V dataset to a local path.
* Please set the dataset path in `hmvit/hypes_yaml/opcl/bevformer_point_pillar_hetero.yaml` to the location of your local OPV2V copy.
* To train with multiple GPUs, please use the following command:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --use_env hmvit/tools/train_camera.py --hypes_yaml hmvit/hypes_yaml/opcamera/cvt.yaml --model_dir hmvit/logs/cvt_att_fuse
```
* After training the model, please run the following inference command:
```bash
CUDA_VISIBLE_DEVICES=0 python -m hmvit.tools.inference_camera --hypes_yaml hmvit/hypes_yaml/opcamera/cvt.yaml --model_dir hmvit/logs/saved_model_dir --fusion_method intermediate --ego_mode mixed --camera_to_lidar_ratio 0.5
```
* To visualize the results, please add the `--show_vis` flag to the inference command.
* To save the visualizations, please add the `--save_vis` flag to the inference command.
* To change the sensor modalities, please change the `--ego_mode` and `--camera_to_lidar_ratio` arguments.
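
When comparing several sensor configurations, it can be convenient to build the inference command programmatically. The sketch below only reuses flags that appear in the inference command above. The helper names `build_inference_command` and `run_sweep` are hypothetical, and the ratio values in the usage note are illustrative assumptions rather than documented settings.

```python
import os
import subprocess
import sys


def build_inference_command(model_dir,
                            ego_mode="mixed",
                            camera_to_lidar_ratio=0.5,
                            hypes_yaml="hmvit/hypes_yaml/opcamera/cvt.yaml",
                            show_vis=False,
                            save_vis=False):
    """Return the README inference command as an argument list."""
    command = [
        sys.executable, "-m", "hmvit.tools.inference_camera",
        "--hypes_yaml", hypes_yaml,
        "--model_dir", model_dir,
        "--fusion_method", "intermediate",
        "--ego_mode", ego_mode,
        "--camera_to_lidar_ratio", str(camera_to_lidar_ratio),
    ]
    if show_vis:
        command.append("--show_vis")
    if save_vis:
        command.append("--save_vis")
    return command


def run_sweep(model_dir, ratios, gpu="0"):
    """Run inference once per ratio on a single GPU (hypothetical helper)."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for ratio in ratios:
        command = build_inference_command(
            model_dir, camera_to_lidar_ratio=ratio)
        # check=True stops the sweep as soon as one run fails.
        subprocess.run(command, env=env, check=True)
```

For example, `run_sweep("hmvit/logs/saved_model_dir", [0.25, 0.5, 0.75])` would run inference three times from the repository root, assuming those ratio values are accepted by the script.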