You are on page 1of 10

Computer Vision Pretrained Models

What is pre-trained Model?


A pre-trained model is a model created by some one else to solve a similar problem. Instead of building
a model from scratch to solve a similar problem, we can use the model trained on other problem as a
starting point. A pre-trained model may not be 100% accurate in your application.

For example, if you want to build a self learning car. You can spend years to build a decent image
recognition algorithm from scratch or you can take inception model (a pre-trained model) from Google
which was built on ImageNet data to identify images in those pictures.

Other Pre-trained Models

NLP Pre-trained Models.


Audio and Speech Pre-trained Models

Framework
Tensorflow
Keras
PyTorch
Caffe
MXNet

Model visualization
You can see visualizations of each model's network architecture by using Netron.

CV logo

Tensorflow

Model Name Description Framework

ObjectDetection Localizing and identifying multiple objects in a single image. Tensorflow

The model generates bounding boxes and segmentation


masks for each instance of an object in the image. It's based
Mask R-CNN Tensorflow
on Feature Pyramid Network (FPN) and a ResNet101
backbone.

This is an experimental Tensorflow implementation of Faster


Faster-RCNN RCNN - a convnet for object detection with a region proposal Tensorflow
network.

YOLO This is tensorflow implementation of the YOLO:Real-Time


Tensorflow
TensorFlow Object Detection.

TensorFlow implementation of 'YOLO: Real-Time Object


YOLO
Detection', with training and an actual support for real-time Tensorflow
TensorFlow ++
running on mobile devices.

MobileNets trade off between latency, size and accuracy while


MobileNet Tensorflow
comparing favorably with popular models from the literature.

DeepLab Deep labeling for semantic image segmentation. Tensorflow

Colornet Neural Network to colorize grayscale images. Tensorflow

Photo-Realistic Single Image Super-Resolution Using a


SRGAN Tensorflow
Generative Adversarial Network.

Train TensorFlow neural nets with OpenStreetMap features and


DeepOSM Tensorflow
satellite imagery.
Model Name Description Framework

Domain
Implementation of Unsupervised Cross-Domain Image
Transfer Tensorflow
Generation.
Network

Show, Attend
Attention Based Image Caption Generator. Tensorflow
and Tell

Real-time object detection on Android using the YOLO network,


android-yolo Tensorflow
powered by TensorFlow.

This is a tensorflow implementation of "Fast and Accurate


DCSCN Super Image Super Resolution by Deep CNN with Skip Connection
Tensorflow
Resolution and Network in Network", a deep learning based Single-Image
Super-Resolution (SISR) model.

This is an experimental tensorflow implementation of


GAN-CLS Tensorflow
synthesizing images.

U-Net For Brain Tumor Segmentation. Tensorflow

Improved
Unpaired Image to Image Translation. Tensorflow
CycleGAN

Im2txt Image-to-text neural network for image captioning. Tensorflow

Identify the name of a street (in France) from an image using a


Street Tensorflow
Deep RNN.

SLIM Image classification models in TF-Slim. Tensorflow

DELF Deep local features for image matching and retrieval. Tensorflow

Compressing and decompressing images using a pre-trained


Compression Tensorflow
Residual GRU network.

AttentionOCR A model for real-world image text extraction. Tensorflow

↥ Back To Top

Keras

Model Name Description Framework


Model Name Description Framework

The model generates bounding boxes and segmentation masks


Mask R-CNN for each instance of an object in the image. It's based on Feature Keras
Pyramid Network (FPN) and a ResNet101 backbone.

Very Deep Convolutional Networks for Large-Scale Image


VGG16 Keras
Recognition.

Very Deep Convolutional Networks for Large-Scale Image


VGG19 Keras
Recognition.

ResNet Deep Residual Learning for Image Recognition. Keras

Image analogies Generate image analogies using neural matching and blending. Keras

Popular Image
Segmentation Implementation of Segnet, FCN, UNet and other models in Keras. Keras
Models

Ultrasound
This tutorial shows how to use Keras library to build deep neural
nerve Keras
network for ultrasound image nerve segmentation.
segmentation

DeepMask This is a Keras-based Python implementation of DeepMask- a


object complex deep neural network for learning object segmentation Keras
segmentation masks.

Monolingual
and Multilingual This is the source code that accompanies Multilingual Image
Keras
Image Description with Neural Sequence Models.
Captioning

Keras implementation of Image-to-Image Translation with


pix2pix Conditional Adversarial Networks by Phillip Isola, Jun-Yan Zhu, Keras
Tinghui Zhou, Alexei A.

Colorful Image
B&W to color. Keras
colorization

Implementation of Unpaired Image-to-Image Translation using


CycleGAN Keras
Cycle-Consistent Adversarial Networks.

Implementation of DualGAN: Unsupervised Dual Learning for


DualGAN Keras
Image-to-Image Translation.
Model Name Description Framework

Super- Implementation of Photo-Realistic Single Image Super-Resolution


Keras
Resolution GAN Using a Generative Adversarial Network.

↥ Back To Top

PyTorch

Model Name Description Framework

A Closed-form Solution to Photorealistic Image


FastPhotoStyle PyTorch
Stylization.

pytorch-CycleGAN-and- A Closed-form Solution to Photorealistic Image


PyTorch
pix2pix Stylization.

Fast, modular reference implementation of


maskrcnn-benchmark Instance Segmentation and Object Detection PyTorch
algorithms in PyTorch.

Image restoration with neural networks but without


deep-image-prior PyTorch
learning.

StarGAN: Unified Generative Adversarial Networks


StarGAN PyTorch
for Multi-Domain Image-to-Image Translation.

This project is a faster faster R-CNN


faster-rcnn.pytorch implementation, aimed to accelerating the training PyTorch
of faster R-CNN object detection models.

Synthesizing and manipulating 2048x1024 images


pix2pixHD PyTorch
with conditional GANs.

Image augmentation library in Python for machine


Augmentor PyTorch
learning.

albumentations Fast image augmentation library. PyTorch

Deep Video Analytics is a platform for indexing and


Deep Video Analytics PyTorch
extracting information from videos and images
Model Name Description Framework

Pytorch implementation for Semantic


semantic-segmentation-
Segmentation/Scene Parsing on MIT ADE20K PyTorch
pytorch
dataset.

This software implements the Convolutional


An End-to-End Trainable Recurrent Neural Network (CRNN), a combination
Neural Network for Image- of CNN, RNN and CTC loss for image-based PyTorch
based Sequence Recognition sequence recognition tasks, such as scene text
recognition and OCR.

PyTorch Implementation of our Coupled VAE-GAN


UNIT algorithm for Unsupervised Image-to-Image PyTorch
Translation.

Sequence labeling models are quite popular in


Neural Sequence labeling many NLP tasks, such as Named Entity
PyTorch
model Recognition (NER), part-of-speech (POS) tagging
and word segmentation.

This is a PyTorch implementation of Faster RCNN.


This project is mainly based on py-faster-rcnn and
TFFRCNN. For details about R-CNN please refer to
faster rcnn the paper Faster R-CNN: Towards Real-Time Object PyTorch
Detection with Region Proposal Networks by
Shaoqing Ren, Kaiming He, Ross Girshick, Jian
Sun.

pytorch-semantic-
PyTorch for Semantic Segmentation. PyTorch
segmentation

PyTorch version of the paper 'Enhanced Deep


EDSR-PyTorch Residual Networks for Single Image Super- PyTorch
Resolution'.

Collection of classification models pretrained on


image-classification-mobile PyTorch
the ImageNet-1K.

Fader Networks: Manipulating Images by Sliding


FaderNetworks PyTorch
Attributes - NIPS 2017.

Image captioning model in pytorch (finetunable


neuraltalk2-pytorch PyTorch
cnn in branch with_finetune).
Model Name Description Framework

Implementation of: "Exploring Randomly Wired


RandWireNN PyTorch
Neural Networks for Image Recognition".

Pytorch implementation for reproducing


stackGAN-v2 PyTorch
StackGAN_v2 results in the paper StackGAN++.

This code allows to use some of the Detectron


Detectron models for Object
models for object detection from Facebook AI PyTorch
Detection
Research with PyTorch.

This paper explores the use of extreme points in an


object (left-most, right-most, top, bottom pixels) as
DEXTR-PyTorch PyTorch
input to obtain precise object segmentation for
images and videos.

Pytorch implementation for "PointNet: Deep


pointnet.pytorch Learning on Point Sets for 3D Classification and PyTorch
Segmentation.

This repository includes the unofficial


implementation Self-critical Sequence Training for
self-critical.pytorch Image Captioning and Bottom-Up and Top-Down PyTorch
Attention for Image Captioning and Visual
Question Answering.

A Pytorch implementation for V-Net: Fully


vnet.pytorch Convolutional Neural Networks for Volumetric PyTorch
Medical Image Segmentation.

Pixel-wise segmentation on VOC2012 dataset


piwise PyTorch
using pytorch.

PyTorch implementation of PSPNet segmentation


pspnet-pytorch PyTorch
network.

Pytorch implementation for Photo-Realistic Single


pytorch-SRResNet Image Super-Resolution Using a Generative PyTorch
Adversarial Network.

PyTorch implementation of PNASNet-5 on


PNASNet.pytorch PyTorch
ImageNet.
Model Name Description Framework

Quickly comparing your image classification


img_classification_pk_pytorch PyTorch
models with the state-of-the-art models.

Deep Neural Networks are High Confidence Predictions for Unrecognizable


PyTorch
Easily Fooled Images.

PyTorch implementation of "Image-to-Image


pix2pix-pytorch Translation Using Conditional Adversarial PyTorch
Networks".

A PyTorch Implementation of Improving Semantic


NVIDIA/semantic-
Segmentation via Video Propagation and Label PyTorch
segmentation
Relaxation, In CVPR2019.

A PyTorch Implementation of Neural IMage


Neural-IMage-Assessment PyTorch
Assessment.

Pretrained models for chest X-ray (CXR) pathology


torchxrayvision PyTorch
predictions. Medical, Healthcare, Radiology

↥ Back To Top

Caffe

Model Name Description Framework

OpenPose represents the first real-time multi-person


OpenPose system to jointly detect human body, hand, and facial Caffe
keypoints (in total 130 keypoints) on single images.

Fully Convolutional
Networks for
Fully Convolutional Models for Semantic Segmentation. Caffe
Semantic
Segmentation

Colorful Image
Colorful Image Colorization. Caffe
Colorization

R-FCN: Object Detection via Region-based Fully


R-FCN Caffe
Convolutional Networks.
Model Name Description Framework

Inspired by Google's recent Inceptionism blog post, cnn-vis


cnn-vis is an open-source tool that lets you use convolutional Caffe
neural networks to generate images.

Learning Deconvolution Network for Semantic


DeconvNet Caffe
Segmentation.

↥ Back To Top

MXNet

Model Name Description Framework

Region Proposal Network solves object detection as a


Faster RCNN MXNet
regression problem.

SSD is an unified framework for object detection with a


SSD MXNet
single network.

Faster RCNN+Focal The code is unofficial version for focal loss for Dense
MXNet
Loss Object Detection.

I realize three different models for text recognition, and


CNN-LSTM-CTC all of them consist of CTC loss layer to realize no MXNet
segmentation for text images.

This is the official repo of paper DOTA: A Large-scale


Faster_RCNN_for_DOTA MXNet
Dataset for Object Detection in Aerial Images.

RetinaNet Focal loss for Dense Object Detection. MXNet

This is a MXNet implementation of MobileNetV2


architecture as described in the paper Inverted Residuals
MobileNetV2 MXNet
and Linear Bottlenecks: Mobile Networks for
Classification, Detection and Segmentation.

This code is a re-implementation of the imagenet


neuron-selectivity-
classification experiments in the paper Like What You MXNet
transfer
Like: Knowledge Distill via Neuron Selectivity Transfer.
Model Name Description Framework

This is a Gluon implementation of MobileNetV2


architecture as described in the paper Inverted Residuals
MobileNetV2 MXNet
and Linear Bottlenecks: Mobile Networks for
Classification, Detection and Segmentation.

This code is a re-implementation of the imagenet


sparse-structure-
classification experiments in the paper Data-Driven MXNet
selection
Sparse Structure Selection for Deep Neural Networks.

A Closed-form Solution to Photorealistic Image


FastPhotoStyle MXNet
Stylization.

↥ Back To Top

Contributions

Your contributions are always welcome!! Please have a look at contributing.md

License

MIT License

You might also like