| Model | Full Name | Description | Developed By | Released |
|-------|-----------|-------------|--------------|----------|
| BERT | Bidirectional Encoder Representations from Transformers | A pre-trained language model that learns bidirectional context, notably used by Google to interpret search queries. | Google | October 2018 |
| GPT-3 | Generative Pre-trained Transformer 3 | A large autoregressive language model capable of generating human-like text from prompts. | OpenAI | June 2020 |
| ResNet | Residual Network | A deep residual learning framework whose skip connections make very deep networks trainable. | Microsoft Research | December 2015 |
| VGG | Visual Geometry Group | A convolutional network architecture known for its simplicity and depth, used primarily for image classification. | University of Oxford | September 2014 |
| YOLO | You Only Look Once | A real-time object detection system that predicts bounding boxes and classes in a single network pass over images and video. | Joseph Redmon et al. | June 2015 |
| SSD | Single Shot MultiBox Detector | A single-shot detector that localizes and classifies objects with one deep network, without a separate region-proposal stage. | Wei Liu et al. | December 2015 |
| Faster R-CNN | Faster Region-based Convolutional Neural Network | An object detector that combines a region proposal network with Fast R-CNN. | Microsoft Research | June 2015 |
| Mask R-CNN | Mask Region-based Convolutional Neural Network | An extension of Faster R-CNN that adds a branch for predicting segmentation masks on each Region of Interest (RoI). | Facebook AI Research | March 2017 |
| U-Net | U-shaped Convolutional Network | A convolutional encoder-decoder architecture for biomedical image segmentation. | University of Freiburg | May 2015 |
| SegNet | Segmentation Network | A deep convolutional encoder-decoder architecture for semantic segmentation. | University of Cambridge | November 2015 |
| PSPNet | Pyramid Scene Parsing Network | A semantic segmentation model that captures global context through a pyramid pooling module. | The Chinese University of Hong Kong | December 2016 |
| DeepLabV3 | DeepLab Version 3 | A semantic segmentation model that uses atrous (dilated) convolution to capture multi-scale context; see the sketch after this table. | Google | June 2017 |
| DeepLabV3+ | DeepLab Version 3 Plus | An extension of DeepLabV3 that adds an encoder-decoder structure for sharper segmentation boundaries. | Google | February 2018 |
| EfficientNet | Efficient Neural Network | A family of convolutional networks that scale depth, width, and input resolution together for efficient accuracy gains. | Google | May 2019 |
| MobileNet | Mobile Neural Network | A lightweight model for mobile and edge devices, optimized for speed and efficiency. | Google | April 2017 |
| ShuffleNet | Shuffle Neural Network | A lightweight convolutional architecture for mobile devices built on grouped convolutions and channel shuffling. | Megvii Technology | July 2017 |
| SqueezeNet | Squeeze Neural Network | A small, efficient convolutional network that reaches AlexNet-level accuracy with far fewer parameters. | DeepScale | February 2016 |
| GAN | Generative Adversarial Network | A framework that trains a generator and a discriminator against each other; see the training-loop sketch after this table. | Ian Goodfellow et al. | June 2014 |
| VAE | Variational Autoencoder | A generative model that encodes data into a latent distribution and decodes samples back into data space; see the sketch after this table. | Diederik P. Kingma et al. | December 2013 |
| DCGAN | Deep Convolutional GAN | A GAN variant that uses deep convolutional networks to generate images. | Alec Radford et al. | November 2015 |
| WGAN | Wasserstein GAN | A GAN variant that minimizes the Wasserstein distance for more stable training. | Martin Arjovsky et al. | January 2017 |
| LSTM | Long Short-Term Memory | A recurrent neural network whose gated memory cells can learn long-term dependencies. | Sepp Hochreiter & Jürgen Schmidhuber | November 1997 |
| GRU | Gated Recurrent Unit | A recurrent unit that is simpler than the LSTM yet also captures long-term dependencies. | Kyunghyun Cho et al. | June 2014 |
| Seq2Seq | Sequence-to-Sequence | An encoder-decoder model for sequence-to-sequence tasks such as machine translation. | Google | September 2014 |
| Transformer | Transformer Model | An architecture that replaces recurrence with self-attention for sequence processing; see the self-attention sketch after this table. | Google | June 2017 |
| BART | Bidirectional and Auto-Regressive Transformers | A model that combines a bidirectional encoder with an autoregressive decoder for sequence generation. | Facebook AI Research | October 2019 |
| T5 | Text-to-Text Transfer Transformer | A transformer that handles diverse NLP tasks by casting them all as text-to-text generation. | Google | October 2019 |
| XLNet | Generalized Autoregressive Pretraining | A generalized autoregressive pretraining model that captures bidirectional context via permutation language modeling. | Carnegie Mellon University & Google | June 2019 |
| RoBERTa | A Robustly Optimized BERT Pretraining Approach | An optimized BERT variant that improves performance across NLP tasks through longer training on more data. | Facebook AI Research | July 2019 |
| DistilBERT | Distilled BERT | A distilled version of BERT that is smaller and faster while retaining most of its performance. | Hugging Face | October 2019 |
| ALBERT | A Lite BERT | A lighter BERT that shrinks model size through parameter sharing and factorized embeddings. | Google | September 2019 |
| XLM | Cross-lingual Language Model | A cross-lingual language model that learns shared representations across multiple languages. | Facebook AI Research | January 2019 |
| XLM-RoBERTa | Cross-lingual RoBERTa | A multilingual RoBERTa that performs well on a range of cross-lingual tasks. | Facebook AI Research | November 2019 |
| FlauBERT | French Language Model | A French language model pre-trained on a large corpus of French text. | GETALP | December 2019 |
| mBART | Multilingual BART | A multilingual sequence-to-sequence model for NLP tasks across languages. | Facebook AI Research | January 2020 |
| ELECTRA | Efficiently Learning an Encoder that Classifies Token Replacements Accurately | A model trained as a discriminator to distinguish real tokens from replacements, improving pretraining efficiency. | Google | March 2020 |
| DeBERTa | Decoding-enhanced BERT with Disentangled Attention | A model that enhances BERT with disentangled attention and an improved mask decoder. | Microsoft | June 2020 |
| REALM | Retrieval-Augmented Language Model | A model that integrates document retrieval into language-model pretraining. | Google | February 2020 |
| CTRL | Conditional Transformer Language Model | A conditional transformer language model that generates text steered by control codes. | Salesforce | September 2019 |
| GPT-2 | Generative Pre-trained Transformer 2 | The predecessor of GPT-3, known for generating coherent and contextually relevant text. | OpenAI | February 2019 |
| Megatron-LM | Megatron Language Model | A large language model designed for efficient model-parallel training across many GPUs. | NVIDIA | September 2019 |
| DALL-E | Portmanteau of "Dalí" and "WALL-E" | A model that generates images from textual descriptions. | OpenAI | January 2021 |
| CLIP | Contrastive Language-Image Pre-training | A model that connects images and text, allowing zero-shot classification of images from text prompts; see the toy sketch after this table. | OpenAI | January 2021 |
| StyleGAN | Style Generative Adversarial Network | A GAN that generates high-quality images with controllable style at each scale. | NVIDIA | December 2018 |
| BigGAN | Large Scale GAN | A large-scale GAN that generates high-fidelity images with improved diversity. | DeepMind | September 2018 |
| CycleGAN | Cycle-Consistent Adversarial Networks | A model that translates images between domains without paired training examples. | UC Berkeley | March 2017 |
| StarGAN | Star Generative Adversarial Network | A unified model for multi-domain image-to-image translation. | Yunjey Choi et al. | November 2017 |
| pix2pix | Image-to-Image Translation with Conditional Adversarial Networks | A conditional GAN for paired image-to-image translation tasks. | Phillip Isola et al. | November 2016 |
| DeepSpeech | Deep Speech | An end-to-end automatic speech recognition system based on deep learning, later maintained as an open-source project by Mozilla. | Baidu Research / Mozilla | December 2014 |
| Wav2Vec | Unsupervised Pre-training for Speech Recognition | A self-supervised model that learns speech representations from raw audio. | Facebook AI Research | April 2019 |
| HuBERT | Hidden-Unit BERT | A self-supervised model for speech representation learning. | Facebook AI Research | June 2021 |
| Whisper | Web-scale Supervised Pretraining for Speech Recognition | A model for automatic speech recognition and transcription, trained on large-scale weakly supervised audio. | OpenAI | September 2022 |
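
The DeepLabV3 row mentions atrous (dilated) convolution. Below is a minimal PyTorch sketch of the mechanism only, not the paper's configuration: the channel counts and the dilation rate of 6 are illustrative assumptions.

```python
# Atrous (dilated) convolution: a dilation rate of r inserts r-1 gaps between
# kernel taps, enlarging the receptive field without adding parameters or
# shrinking the feature map.
import torch
import torch.nn as nn

atrous = nn.Conv2d(
    in_channels=256,
    out_channels=256,
    kernel_size=3,
    padding=6,      # padding = dilation preserves spatial size for a 3x3 kernel
    dilation=6,
)

x = torch.randn(1, 256, 64, 64)   # dummy feature map
y = atrous(x)
print(y.shape)                    # torch.Size([1, 256, 64, 64])
```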
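
The GAN row describes adversarial training. The sketch below shows that loop with toy MLPs and random stand-in data; the architectures, dimensions, and optimizer settings are assumptions for illustration, not those of the original paper.

```python
# Minimal GAN training loop: the discriminator learns to separate real from
# generated samples, while the generator learns to fool it.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 32
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.randn(64, data_dim)        # stand-in for real training data
    fake = G(torch.randn(64, latent_dim))

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: push D(fake) toward 1, i.e. fool the discriminator.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```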
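
The VAE row describes encoding data into a latent space and decoding it back. The sketch below shows the standard reparameterization-trick formulation with toy linear layers; all dimensions are illustrative assumptions.

```python
# Minimal VAE: the encoder outputs a mean and log-variance, a latent code is
# sampled via the reparameterization trick, and the decoder reconstructs the
# input. Training minimizes reconstruction error plus a KL penalty.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, data_dim=32, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(data_dim, 2 * latent_dim)  # -> [mu, logvar]
        self.dec = nn.Linear(latent_dim, data_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        recon = self.dec(z)
        # KL divergence between q(z|x) and the standard normal prior N(0, I)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return recon, kl.mean()

vae = TinyVAE()
x = torch.randn(4, 32)
recon, kl = vae(x)
loss = nn.functional.mse_loss(recon, x) + kl   # negative ELBO (up to constants)
```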
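
The Transformer row refers to self-attention. Below is a minimal single-head scaled dot-product attention in plain tensor ops, with no masking or multi-head logic; the dimensions and random weights are illustrative.

```python
# Scaled dot-product self-attention: every token attends to every token,
# weighted by the similarity of its query to the others' keys.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project tokens to Q, K, V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = scores.softmax(dim=-1)               # attention distribution per token
    return weights @ v                             # mix values by attention weights

d_model = 16
x = torch.randn(10, d_model)                       # 10 token embeddings
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)                                   # torch.Size([10, 16])
```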
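
The CLIP row describes zero-shot classification from text prompts. The toy sketch below keeps only the scoring logic; the random linear layers stand in for CLIP's actual image and text encoders, so this is not the real CLIP API.

```python
# CLIP-style zero-shot classification: embed an image and a set of candidate
# text prompts in a shared space, then pick the prompt most similar to the
# image embedding.
import torch
import torch.nn.functional as F

embed_dim = 64
image_encoder = torch.nn.Linear(3 * 32 * 32, embed_dim)  # stand-in image tower
text_encoder = torch.nn.Linear(128, embed_dim)           # stand-in text tower

image = torch.randn(1, 3 * 32 * 32)    # flattened dummy image
prompts = torch.randn(3, 128)          # dummy features for prompts like
                                       # "a photo of a {cat, dog, car}"

img_emb = F.normalize(image_encoder(image), dim=-1)
txt_emb = F.normalize(text_encoder(prompts), dim=-1)

# Cosine similarity between the image and every prompt; softmax yields the
# zero-shot class probabilities.
probs = (img_emb @ txt_emb.T).softmax(dim=-1)
print(probs)                           # shape (1, 3): one probability per prompt
```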