I've hosted my code for this project at https://gitlab.com/dibz15/pytorch-fasterrcnn-trained-on-wider.
This project was my first independent foray into deep learning with PyTorch. I had used PyTorch before on my super-resolution project, but I wanted to try something more challenging and (at the time I started) state-of-the-art.
My model is a modified FasterRCNN that uses the MobileNetV2 classifier network as its backbone, with the backbone weights frozen so they are not updated during training. It is trained on the publicly available WIDER dataset, for which I wrote my own PyTorch dataloader. I trained the model for only 65 epochs on my RTX 2060. The final model is about 300 MB and, at an IoU threshold of 0.6, gives an mAP of about 0.5.
Here's a graph of mAP vs. IoU for my trained model:
This doesn't look very good, but many of the examples in WIDER are very challenging. The hero image of this article is a more typical example, and closer to my own use cases, and there the model performs well! Also, check out my model's results on some animated examples:
I'm really happy with it. If you'd like to learn more about the model or about my source, check out my GitLab repository linked above. I have tried to document it well, but if you have any questions, feel free to contact me.
Now that I have a working face detection model, I would ideally like to feed its output to a facial recognition model. If I can find the right type of recognition model, my hope is that I can train it to recognize custom faces from my own dataset (like my own face) and end up with a highly accurate facial recognition tool. If I do end up starting such a project, I will link it here.