Paresh Wagh

Odoo Setup and Customization Expert

Artificial Intelligence (AI)

Notes along the AI journey

Responsible AI

Prompt Engineering Roadmap

Prompt Engineering Guides

Best Practices

Cheatsheets

Prompt Engineering Patterns

Online Courses

Compilations

Tools

Development Frameworks

Communities

| Name | Description |
| --- | --- |
| Hugging Face | One of the largest AI model and dataset repositories. Offers pre-trained models for various tasks like text generation, translation, image classification. Allows community contributions. Provides hosting solutions like Inference Endpoints. |
| Kaggle | World’s largest data science community with tools and resources for machine learning and data science projects. Users can participate in coding competitions, access datasets, notebooks, and pre-trained models. Provides a platform for hosting and sharing models, datasets, and projects. |

Research

Frontier Models

| Model Name | Capabilities | Release Date |
| --- | --- | --- |
| GPT-5 | Next-generation language model | August 2025 |
| Gemini Ultra | Advanced language and multimodal capabilities | February 2024 |
| Claude 3 | Enhanced conversational AI and reasoning | March 2024 |
| Mistral 7B | Efficient language generation | September 2023 |
| Llama 2 70B | Large language model for various tasks | July 2023 |
| Falcon 180B | High-performance language model | September 2023 |
| Gemini | Multimodal capabilities for AI tasks | December 2023 |
| Mixtral | Sparse mixture-of-experts language model | December 2023 |
| OpenAssistant | Open-source conversational agent | April 2023 |
| DALL-E 3 | Enhanced image generation capabilities | September 2023 |
| Stable Diffusion XL | Enhanced image generation model | July 2023 |
| LLaVA | Language and vision integration | April 2023 |
| Claude 2 | Improved conversational AI | July 2023 |
| LLaMA 2 | Open-source language model | July 2023 |
| Ray Serve | Scalable model serving | July 2023 |
| vLLM | Efficient language model serving | June 2023 |
| Vision Assistant | Multimodal vision processing | June 2023 |
| PaLM 2 | Multilingual language processing and reasoning | May 2023 |
| Bard | Conversational AI with web integration | March 2023 |
| Firefly | Generative design and image editing | March 2023 |
| Anthropic’s Claude | Safety-oriented conversational AI | March 2023 |
| GPT-3.5 Turbo | Faster and cheaper version of GPT-3.5 | March 2023 |
| GPT-4 | State-of-the-art language understanding | March 2023 |
| GPT-4.5 | Improved conversational capabilities | February 2025 |
| LangChain | Framework for building LLM applications | October 2022 |
| LlamaIndex | Data indexing for LLMs | November 2022 |
| EleutherAI’s GPT-NeoX | Open-source large language model | February 2022 |
| Text Generation Inference | Efficient text generation | April 2023 |
| Cohere Command R | Language model for retrieval-augmented generation and tool use | March 2024 |
| DeepMind’s Gato | Generalist agent for various tasks | May 2022 |
| Stable Diffusion 2.0 | Advanced image synthesis | November 2022 |
| ChatGPT (GPT-3.5) | Conversational AI with improved context | November 2022 |
| Jurassic-2 | Advanced natural language processing | March 2023 |
| DeepAI’s Text to Image | Text-to-image generation | November 2022 |
| DALL-E 2 | Image generation from text prompts | September 2022 |
| OpenAI’s Whisper | Speech recognition and transcription | September 2022 |
| OpenAI Codex | Code generation and understanding | August 2021 |

Pre-trained Models

| Model | Full Name | Description | Company | Date |
| --- | --- | --- | --- | --- |
| BERT | Bidirectional Encoder Representations from Transformers | A bidirectional transformer encoder pre-trained to capture the context of words in text; also used to interpret search queries. | Google | October 2018 |
| GPT-3 | Generative Pre-trained Transformer 3 | A state-of-the-art language model capable of generating human-like text based on prompts. | OpenAI | June 2020 |
| ResNet | Residual Network | A deep residual learning framework that helps in training very deep neural networks. | Microsoft | December 2015 |
| VGG | Visual Geometry Group | A convolutional neural network architecture known for its simplicity and depth, used primarily for image classification. | University of Oxford | September 2014 |
| YOLO | You Only Look Once | A real-time object detection system that detects objects in images and videos. | Joseph Redmon | June 2015 |
| SSD | Single Shot MultiBox Detector | A single-shot detector that detects objects in images using a single deep neural network. | Wei Liu | December 2015 |
| Faster R-CNN | Faster Region-based Convolutional Neural Network | An object detection model that combines region proposal networks with Fast R-CNN. | Microsoft | June 2015 |
| Mask R-CNN | Mask Region-based Convolutional Neural Network | An extension of Faster R-CNN that adds a branch for predicting segmentation masks on each Region of Interest (RoI). | Facebook AI Research | March 2017 |
| U-Net | U-Net Convolutional Network | A convolutional network architecture for biomedical image segmentation. | University of Freiburg | May 2015 |
| SegNet | Segmentation Network | A deep convolutional encoder-decoder architecture for semantic segmentation. | University of Cambridge | November 2015 |
| PSPNet | Pyramid Scene Parsing Network | A semantic segmentation model that captures global context information through a pyramid pooling module. | The Chinese University of Hong Kong and SenseTime | December 2016 |
| DeepLabV3 | DeepLab Version 3 | A semantic segmentation model that uses atrous convolution to capture multi-scale context. | Google | June 2017 |
| DeepLabV3+ | DeepLab Version 3 Plus | An improved version of DeepLabV3 that includes an encoder-decoder structure for better segmentation. | Google | February 2018 |
| EfficientNet | Efficient Neural Network | A family of convolutional neural networks that scale efficiently with model size and input resolution. | Google | May 2019 |
| MobileNet | Mobile Neural Network | A lightweight model designed for mobile and edge devices, optimized for speed and efficiency. | Google | April 2017 |
| ShuffleNet | Shuffle Neural Network | A lightweight convolutional neural network architecture designed for mobile devices. | Megvii Technology | July 2017 |
| SqueezeNet | Squeeze Neural Network | A small, efficient convolutional neural network that achieves AlexNet-level accuracy with 50x fewer parameters. | DeepScale | February 2016 |
| GAN | Generative Adversarial Network | A framework for training generative models using adversarial training. | Ian Goodfellow et al. | June 2014 |
| VAE | Variational Autoencoder | A generative model that learns to encode data into a latent space and decode it back. | Diederik P Kingma et al. | December 2013 |
| DCGAN | Deep Convolutional GAN | A type of GAN that uses deep convolutional networks for generating images. | indico Research and Facebook AI Research | November 2015 |
| WGAN | Wasserstein GAN | A variant of GAN that uses Wasserstein distance for training stability. | Martin Arjovsky et al. | January 2017 |
| LSTM | Long Short-Term Memory | A type of recurrent neural network that can learn long-term dependencies. | Sepp Hochreiter and Jürgen Schmidhuber | November 1997 |
| GRU | Gated Recurrent Unit | A type of recurrent neural network that is simpler than LSTM and also captures long-term dependencies. | Kyunghyun Cho et al. | June 2014 |
| Seq2Seq | Sequence-to-Sequence | A model used for sequence-to-sequence tasks, such as translation. | Google | September 2014 |
| Transformer | Transformer Model | A model architecture that uses self-attention mechanisms for sequence processing. | Google | June 2017 |
| BART | Bidirectional and Auto-Regressive Transformers | A model that combines a bidirectional encoder with an autoregressive decoder for sequence generation. | Facebook AI Research | October 2019 |
| T5 | Text-to-Text Transfer Transformer | A text-to-text transformer that can handle various NLP tasks by converting them into a text generation format. | Google | October 2019 |
| XLNet | Generalized Autoregressive Pretraining | A generalized autoregressive pretraining model that captures bidirectional context. | Carnegie Mellon University and Google | June 2019 |
| RoBERTa | A Robustly Optimized BERT Pretraining Approach | An optimized version of BERT that improves performance on various NLP tasks. | Facebook AI Research | July 2019 |
| DistilBERT | Distilled BERT | A distilled version of BERT that is smaller and faster while retaining most of its performance. | Hugging Face | October 2019 |
| ALBERT | A Lite BERT | A lite version of BERT that reduces model size with parameter sharing and factorized embeddings. | Google | September 2019 |
| XLM | Cross-lingual Language Model | A cross-lingual language model that learns representations for multiple languages. | Facebook AI Research | January 2019 |
| XLM-RoBERTa | Cross-lingual RoBERTa | A multilingual version of RoBERTa that performs well on various cross-lingual tasks. | Facebook AI Research | November 2019 |
| FlauBERT | French Language Model | A French language model that is pre-trained on a large corpus of French text. | GETALP | December 2019 |
| mBART | Multilingual BART | A multilingual sequence-to-sequence model that can perform various NLP tasks across languages. | Facebook AI Research | January 2020 |
| ELECTRA | Efficiently Learning an Encoder that Classifies Token Replacements Accurately | A model that trains a discriminator to detect which input tokens were replaced by a small generator, improving pre-training efficiency. | Stanford University and Google | March 2020 |
| DeBERTa | Decoding-enhanced BERT with Disentangled Attention | A model that enhances BERT with disentangled attention and an improved mask decoder. | Microsoft | June 2020 |
| REALM | Retrieval-Augmented Language Model | A model that integrates retrieval into language modeling for enhanced understanding. | Google | February 2020 |
| CTRL | Conditional Transformer Language Model | A conditional transformer language model that can generate text based on control codes. | Salesforce | September 2019 |
| GPT-2 | Generative Pre-trained Transformer 2 | A predecessor to GPT-3, known for generating coherent and contextually relevant text. | OpenAI | November 2019 |
| Megatron-LM | Megatron Language Model | A large language model designed for efficient training on multiple GPUs. | Nvidia | September 2019 |
| DALL-E | DALL-E: Creating Images from Text | A model that generates images from textual descriptions, showcasing creativity and understanding. | OpenAI | January 2021 |
| CLIP | Contrastive Language-Image Pre-training | A model that connects images and text, allowing for zero-shot classification of images based on text prompts. | OpenAI | January 2021 |
| StyleGAN | Style Generative Adversarial Network | A generative adversarial network that creates high-quality images with controllable styles. | Nvidia | December 2018 |
| BigGAN | Large Scale GAN | A large-scale GAN that generates high-fidelity images with improved diversity. | DeepMind | September 2018 |
| CycleGAN | Cycle-Consistent Adversarial Networks | A model that translates images from one domain to another without paired examples. | UC Berkeley | March 2017 |
| StarGAN | Star Generative Adversarial Network | A unified model for multi-domain image-to-image translation. | Yunjey Choi et al. | November 2017 |
| pix2pix | Image-to-Image Translation with Conditional Adversarial Networks | A conditional GAN for image-to-image translation tasks. | Phillip Isola et al. | November 2016 |
| DeepSpeech | Deep Speech Recognition | An automatic speech recognition system based on deep learning. | Mozilla | November 2017 |
| Wav2Vec | Wav2Vec: Unsupervised Pre-training for Speech Recognition | A self-supervised model for learning representations from raw audio data. | Facebook AI Research | April 2019 |
| HuBERT | Hidden Unit BERT | A self-supervised learning model for speech representation learning. | Facebook AI Research | June 2021 |
| Whisper | Whisper: Automatic Speech Recognition | A model for automatic speech recognition and transcription. | OpenAI | September 2022 |
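
The Transformer entry in the table above mentions self-attention. As a rough illustration only, the sketch below computes scaled dot-product attention over a toy sequence in plain Python. The function names (`softmax`, `self_attention`) and the input numbers are made up for this note, and the learned query/key/value projections of a real Transformer are left out for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention for one sequence (hypothetical helper).

    Each argument is a list of equal-length vectors, one per token.
    Learned projection matrices are omitted; this is an assumption
    made to keep the sketch short.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: three tokens with 2-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence, and each row sums to 1. Production models add learned projections, multiple heads, and masking on top of this core step.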