UMBC data science MPS capstone, Fall 2023
| | Google Street View | Objects365 | Mapillary Vistas |
---|---|---|---|
Size (train, val, test) | 473 / 119 / 79 | 393 / 107 / 54 | 3,202 / 929 / 484 |
Setting | Street | Outdoors & indoors | Street |
Used for | Detection & classification | Detection | Detection |
Release | Maybe a TOS violation? | Released for research | Released for research |
Source | Sheng, Yao, and Goel (2021) | Shao et al. (2019) | Neuhold et al. (2017) |
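
To compare how the three datasets are partitioned, the split sizes in the table can be turned into proportions. The sketch below copies the counts from the table; the names `SPLITS`, `split_fractions`, and `summarize` are illustrative and not part of the project code, and the counts are assumed to be numbers of images.

```python
# Split sizes copied from the table above, in (train, val, test) order.
# Assumption: each count is a number of images.
SPLITS = {
    "Google Street View": (473, 119, 79),
    "Objects365": (393, 107, 54),
    "Mapillary Vistas": (3202, 929, 484),
}


def split_fractions(sizes):
    """Return each split's share of the dataset total."""
    total = sum(sizes)
    return tuple(n / total for n in sizes)


def summarize(splits=SPLITS):
    """Map dataset name -> (total size, (train, val, test) fractions)."""
    return {
        name: (sum(sizes), split_fractions(sizes))
        for name, sizes in splits.items()
    }


for name, (total, fracs) in summarize().items():
    train, val, test = fracs
    print(f"{name}: {total} total, "
          f"{train:.0%} train / {val:.0%} val / {test:.0%} test")
```

All three datasets come out close to a 70 / 20 / 10 split, with Mapillary Vistas several times larger than the other two.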
Tool | Purpose |
---|---|
Ultralytics YOLOv8 | Models with built-in modules for training, tuning, & validation |
PyTorch | Underlies Ultralytics models |
Roboflow | Dataset creation & management |
Weights & Biases | Experiment tracking |
Paperspace | Virtual machine (8 CPUs, 16GB GPU) |
YOLO | RT-DETR |
---|---|
Latest generation YOLO model | Transformer-based model from Baidu |
Detection & classification (& others) | Detection only |
Smaller architecture (medium variant has 26M parameters) | Larger architecture (large variant has 33M parameters) |
Trains quickly; small models can be trained on a laptop | Trains slowly & needs more GPU RAM |
Doesn’t perform as well | Performs better |
Well-documented & integrated | New, not yet fully integrated into the Ultralytics ecosystem (e.g. no tune method) |
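
Both model families in the comparison above are loaded and trained through the same Ultralytics interface. The following is a minimal sketch, not the project's actual training script: it assumes the `ultralytics` package is installed, that the medium YOLOv8 and large RT-DETR variants were started from the stock `yolov8m.pt` and `rtdetr-l.pt` checkpoints, and that a dataset YAML exists at a path you supply. The helper names, epoch count, and image size are placeholders.

```python
# Pretrained checkpoints for the two sizes named in the table.
# Assumption: the stock Ultralytics checkpoints were the starting point.
WEIGHTS = {"yolo": "yolov8m.pt", "rtdetr": "rtdetr-l.pt"}


def load_model(kind):
    """Build a YOLOv8 or RT-DETR model from its pretrained checkpoint."""
    if kind not in WEIGHTS:
        raise ValueError(f"unknown model kind: {kind!r}")
    # Imported lazily so this module loads without the package installed.
    from ultralytics import RTDETR, YOLO

    model_class = YOLO if kind == "yolo" else RTDETR
    return model_class(WEIGHTS[kind])


def train_and_validate(kind, data_yaml, epochs=50, imgsz=640):
    """Fine-tune on a dataset YAML, then score the model on its val split.

    `epochs` and `imgsz` are placeholder values, not the project's settings.
    """
    model = load_model(kind)
    model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)
    metrics = model.val()
    return {"mAP50": metrics.box.map50, "mAP50-95": metrics.box.map}
```

Calling `train_and_validate("yolo", "data.yaml")` and then `train_and_validate("rtdetr", "data.yaml")` on the same dataset file (here a hypothetical `data.yaml`) gives directly comparable validation scores. Hyperparameter search is left out because, as the table notes, RT-DETR had no tune method at the time.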
After lots of trial & error, best bets for detection:
YOLO works well on tiled images, but it will need to transfer to full-sized images to be useful
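
One common way to bridge tiled and full-sized images is to slice each full image into overlapping tiles, run the detector per tile, and shift the resulting boxes back into full-image coordinates. The sketch below shows only that geometry, in plain Python. The function names, the 640-pixel tile size, and the 128-pixel overlap are illustrative assumptions rather than the project's settings, and merging duplicate detections from overlapping tiles (for example with non-maximum suppression) is not shown.

```python
def tile_origins(length, tile, stride):
    """Start offsets along one axis so that tiles cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_boxes(width, height, tile=640, overlap_px=128):
    """Return (x0, y0, x1, y1) crop boxes that cover the whole image."""
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must be in [0, tile)")
    stride = tile - overlap_px
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, stride)
        for x in tile_origins(width, tile, stride)
    ]


def to_full_image(box, tile_box):
    """Shift a detection box from tile coordinates to full-image coordinates."""
    x0, y0, x1, y1 = box
    offset_x, offset_y = tile_box[0], tile_box[1]
    return (x0 + offset_x, y0 + offset_y, x1 + offset_x, y1 + offset_y)
```

For a 1000 x 800 image with the defaults, `tile_boxes(1000, 800)` yields four overlapping tiles, and a detection at `(10, 20, 50, 60)` inside the tile whose origin is `(360, 160)` maps to `(370, 180, 410, 220)` in the full image.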
Working interactive demo: https://camilleseab-surveillance.hf.space