
Last Updated: 2024-03-19 22:26:12

Large Language Models

Advanced language models
Large Language Models (LLMs) excel in language generation and NLP tasks through self-supervised learning from text. They are artificial neural networks, with the most advanced using transformer-based architecture. Notable LLMs include GPT series, PaLM, Gemini, Grok, LLaMA, Claude models, Mistral AI models, and DBRX.
1883
Development of Semantics by Michel Bréal
Michel Bréal developed the concept of semantics in 1883, which laid the foundation for the history of large language models. He studied language organization, evolution, and word connections.
1966
Development of ELIZA
ELIZA, an early natural language processing program, was developed by Joseph Weizenbaum at MIT in 1966. It used simple pattern-matching rules to mimic human conversation and is an early precursor of conversational language technology.
1970
Development of SHRDLU
SHRDLU was a system developed to understand and respond to commands in a restricted world of geometric shapes. It was one of the first programs to demonstrate natural language understanding, but its capabilities were limited to the specific domain it was designed for.
1989
Inception of Large Language Models
Around 1989, the idea of language models trained on large amounts of text data began to take shape, laying groundwork for later advances in natural language processing.
1997
Introduction of Long Short-Term Memory (LSTM) networks
The advent of LSTM networks made it practical to train recurrent neural networks on long sequences by mitigating the vanishing-gradient problem, enabling deeper models that could learn from larger amounts of sequential data and contributing to the later development of large language models.
2010
Introduction of Stanford’s CoreNLP suite
Stanford’s CoreNLP suite allowed developers to perform sentiment analysis and named entity recognition, marking a significant stage of growth in the development of large language models.
2011
Introduction of advanced features in Google Brain
Google Brain, founded in 2011, advanced neural approaches to NLP; techniques such as word embeddings gave NLP systems a clearer representation of word meaning and context, marking a turning point on the path toward large language models.
2015-04-22
Attention
The attention mechanism, introduced for neural machine translation, lets a model weight different parts of the input when producing each output; it later became the core building block of the transformer architecture and revolutionized natural language processing.
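For readers unfamiliar with the mechanism, here is a minimal NumPy sketch of scaled dot-product attention in the form later standardized by transformers; the shapes and random inputs are purely illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal attention: weight each value by how well its key matches the query."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                         # weighted sum of values

# Toy example: 3 query tokens attending over 4 key/value tokens, dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```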
2016-12-18
Attention → UL2
The UL2 model, based on the attention mechanism, has been a major development in large language models.
2017-06-20
Attention → GPT
The GPT model, utilizing the attention mechanism, has had a profound impact on natural language processing.
2018-06-11
Improving Language Understanding by Generative Pre-Training
OpenAI published "Improving Language Understanding by Generative Pre-Training" (GPT-1), showing that a transformer pre-trained on unlabeled text and then fine-tuned can outperform task-specific architectures on many NLP benchmarks.
2018-08-25
Attention → LaMDA
The LaMDA model, leveraging the attention mechanism, has significantly improved large language models.
2018-10-11
BERT
BERT stands for Bidirectional Encoder Representations from Transformers. It is a pre-training technique for natural language processing based on deep bidirectional transformers.
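An illustrative sketch of BERT's masked-language-modeling behavior, assuming the Hugging Face transformers package and the public bert-base-uncased checkpoint:

```python
from transformers import pipeline  # Hugging Face Transformers (assumed installed)

# BERT is pre-trained with a masked-language-modeling objective:
# it predicts tokens hidden behind [MASK] using context from both directions.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for candidate in fill_mask("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```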
2018-11-13
Attention → Megatron-LM
The Megatron-LM model, built on the attention mechanism, has made significant contributions to large language models.
2019-02-14
Introduction of Language Models are Unsupervised Multitask Learners
OpenAI released the GPT-2 paper, "Language Models are Unsupervised Multitask Learners", showing that a sufficiently large language model can perform many tasks zero-shot without task-specific training.
2019-04-23
Generating Long Sequences with Sparse Transformers
The Sparse Transformer is presented, a method for generating long sequences that uses sparse attention patterns to address the cost of processing lengthy inputs in language models.
2019-05-30
Attention → Whisper
The Whisper model, incorporating attention, has made notable advancements in large language models.
2019-09-17
Megatron-LM
Megatron-LM is a method for training multi-billion parameter language models using model parallelism.
2019-10-05
Attention → T5
The T5 model, incorporating attention, has significantly advanced the capabilities of large language models.
2019-10-23
T5
T5 (Text-to-Text Transfer Transformer), which casts every NLP task as a text-to-text problem, was introduced on October 23, 2019.
2019-11-06
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
FlashAttention computes exact attention quickly and memory-efficiently by tiling the computation to respect the GPU memory hierarchy (IO-awareness), avoiding materializing the full attention matrix in slow GPU memory.
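FlashAttention itself is a GPU kernel; as a hedged sketch of how such a kernel is typically reached in practice, PyTorch's built-in scaled_dot_product_attention may dispatch to a FlashAttention-style backend on supported hardware (this is an illustration, not the paper's own code):

```python
import torch
import torch.nn.functional as F

# Exact attention, but computable in tiles so the full (seq_len x seq_len)
# score matrix never has to be materialized in slow GPU memory.
q = torch.randn(1, 8, 1024, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)

# On supported hardware, PyTorch can route this call to a FlashAttention-style kernel.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 8, 1024, 64])
```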
2020-04-10
Longformer: The Long-Document Transformer
The Longformer, a long-document transformer, is introduced to effectively process and analyze lengthy documents, offering a solution for handling extensive textual data.
2020-05-28
Introduction of Language Models are Few-Shot Learners
OpenAI published the GPT-3 paper, "Language Models are Few-Shot Learners", demonstrating that a 175-billion-parameter model can perform many tasks from a few in-context examples without any gradient updates.
2020-06-19
Denoising Diffusion Probabilistic Models
Denoising diffusion probabilistic models (DDPMs) are introduced: generative models that learn to reverse a gradual noising process, later forming the basis of modern text-to-image systems.
2021-02-26
Learning Transferable Visual Models From Natural Language Supervision
OpenAI introduces CLIP, which learns transferable visual representations by training image and text encoders jointly on image-caption pairs with a contrastive objective.
2021-05-18
LaMDA: our breakthrough conversation technology
LaMDA is a conversational technology that represents a significant advancement in natural language processing. It enables more sophisticated and contextually relevant conversations compared to traditional models.
2021-06-04
GPT-J-6B: 6B JAX-Based Transformer
On June 4, 2021, EleutherAI released GPT-J-6B, a 6-billion-parameter open-source transformer implemented in JAX and a significant open alternative in the field of large language models.
2021-07-07
OpenAI Codex
OpenAI Codex is a large language model designed for code understanding and generation.
2021-09-03
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
On September 3, 2021, CodeT5 was launched, offering identifier-aware unified pre-trained encoder-decoder models for code understanding and generation.
2021-10-15
Multitask Prompted Training
Multitask prompted training (the approach behind the T0 model) fine-tunes a language model on many tasks expressed as natural-language prompts, enabling zero-shot generalization to tasks the model was never explicitly trained on.
2021-12-16
WebGPT Announcement
OpenAI announced WebGPT in December 2021: a GPT-3 model fine-tuned with human feedback to browse the web and answer long-form questions with citations.
2021-12-20
High-Resolution Image Synthesis with Latent Diffusion Models
The "High-Resolution Image Synthesis with Latent Diffusion Models" paper, the basis of Stable Diffusion, runs the diffusion process in a compressed latent space rather than in pixel space, greatly reducing the computational cost of generating high-resolution images.
2022-01-27
Training language models to follow instructions with human feedback
OpenAI's InstructGPT work trains language models to follow instructions through supervised fine-tuning on human demonstrations followed by reinforcement learning from human feedback (RLHF), so that models interpret and execute instructions more reliably.
2022-01-28
Chain-of-Thought (CoT) Prompting
Chain-of-thought prompting elicits intermediate reasoning steps from large language models, substantially improving performance on arithmetic, commonsense, and symbolic reasoning tasks.
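A minimal sketch of a few-shot chain-of-thought prompt; the exemplar mirrors the arithmetic word-problem style used in the chain-of-thought literature, and the exact wording is illustrative:

```python
# A few-shot chain-of-thought prompt: the exemplar shows its reasoning steps,
# which encourages the model to reason step by step before answering.
COT_PROMPT = """\
Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11. The answer is 11.

Q: A cafeteria had 23 apples. They used 20 to make lunch and bought 6 more. How many apples do they have?
A:"""

# Sent to an LLM, this prompt typically yields a completion that walks through
# "23 - 20 = 3, 3 + 6 = 9. The answer is 9." rather than guessing directly.
print(COT_PROMPT)
```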
2022-04-04
Pathways Language Model (PaLM)
The Pathways Language Model (PaLM) is scaled to 540 billion parameters for achieving breakthrough performance.
2022-04-12
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Anthropic's work on training a helpful and harmless assistant with reinforcement learning from human feedback, focused on an assistant that learns from human preference data to be useful while avoiding harmful behavior.
2022-04-13
DALL-E 2
On April 13, 2022, DALL-E 2 was released, building upon the capabilities of the original DALL-E.
2022-05-10
UL2: Unifying Language Learning Paradigms
UL2 proposes a unified pre-training framework that combines different language modeling objectives into a single mixture-of-denoisers objective within one model.
2022-05-22
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Google's Imagen pairs a large frozen text encoder with cascaded diffusion models to generate photorealistic images from text, showing that deep language understanding improves text-image alignment.
2022-05-27
A New Open Source Flan 20B with UL2
The release of Flan-UL2 20B, an open-source, instruction-tuned 20-billion-parameter model built on the UL2 architecture, makes a capable instruction-following model freely available to developers.
2022-08-22
Stable Diffusion
Stable Diffusion, an open-source text-to-image latent diffusion model, was publicly released by Stability AI and quickly spread through the AI community.
2022-09-20
ScienceQA
ScienceQA ("Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering") introduces a multimodal benchmark of science questions annotated with lectures and explanations for studying chain-of-thought reasoning.
2022-10-20
Flamingo: a Visual Language Model for Few-Shot Learning
Flamingo is a visual language model designed for few-shot learning, allowing it to understand and learn from a small amount of data. It aims to improve the performance of language models in tasks requiring minimal training examples.
2022-11-09
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM is introduced as a 176 billion parameter open-access multilingual language model.
2022-11-20
The Stack
The Stack, a 3 TB dataset of permissively licensed source code, is released to support training language models for code.
2022-11-29
Releasing GPT-JT powered by open-source AI
GPT-JT, powered by open-source AI, is released on November 29, 2022.
2022-11-30
Introducing ChatGPT
OpenAI introduced ChatGPT, a conversational assistant fine-tuned from a GPT-3.5 model with reinforcement learning from human feedback (RLHF).
2022-12-20
Self-Instruct
Self-Instruct is introduced: a framework that aligns language models with instructions by bootstrapping instruction-following data (instructions, inputs, and outputs) generated by the model itself.
2023-01-29
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
AudioLDM is introduced on January 29, 2023, generating audio from text descriptions using latent diffusion models.
2023-01-30
BLIP-2
BLIP-2 refers to Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. It is a significant advancement in AI that involves pre-training models using both language and image data.
2023-02-10
ControlNet
ControlNet adds conditional control (for example edge maps, depth maps, or poses) to text-to-image diffusion models such as Stable Diffusion.
2023-02-27
LLaMA: Open and Efficient Foundation Language Models
Introduction of LLaMA as an open and efficient foundation language model.
2023-03-03
Scaling Instruction-Finetuned Language Models
The instruction-finetuning recipe from "Scaling Instruction-Finetuned Language Models" (Flan) is applied to further models, notably Flan-UL2, improving their instruction-following capabilities and performance.
2023-03-07
Training Language Models to Follow Instructions with Human Feedback
Efforts are made to train language models to follow instructions with human feedback, aiming to improve their understanding and responsiveness.
2023-03-10
OpenChatKit Open-Sourced
Together Computer released OpenChatKit, an open-source toolkit for building general-purpose and specialized chatbots on top of large language models, on March 10, 2023.
2023-03-13
Alpaca: A Strong, Replicable Instruction-Following Model
Stanford's Alpaca, an instruction-following model fine-tuned from LLaMA 7B on 52K self-instruct-style demonstrations, is released as a strong and replicable baseline.
2023-03-15
GPT-4
GPT-4 was released, a large multimodal model accepting image and text inputs and representing the next generation of the Generative Pre-trained Transformer (GPT) series.
2023-03-21
Bard
On March 21, 2023, Google opened early access to Bard, its conversational AI assistant, initially powered by a lightweight version of LaMDA.
2023-03-22
Sparks of Artificial General Intelligence
A Microsoft Research paper, "Sparks of Artificial General Intelligence: Early experiments with GPT-4", argues that GPT-4 exhibits behaviors suggestive of early, incomplete forms of general intelligence.
2023-03-28
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
LLaMA-Adapter is introduced, an efficient fine-tuning method that keeps the base LLaMA weights frozen and adds a small number of learnable prompt parameters with zero-initialized attention.
2023-03-30
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
Vicuna is an open-source chatbot fine-tuned from LLaMA on user-shared ChatGPT conversations; a GPT-4-based evaluation judged it to reach roughly 90% of ChatGPT's quality, showcasing rapid progress in open-source chatbot development.
2023-04-03
Koala: A Dialogue Model for Academic Research
Koala is a dialogue model fine-tuned from LLaMA on dialogue data gathered from the web and released for academic research, intended as a research artifact rather than a production chatbot.
2023-04-07
Generative Agents: Interactive Simulacra of Human Behavior
The "Generative Agents" paper populates a simulated town with LLM-driven characters that remember, plan, and interact, demonstrating interactive simulacra of human behavior.
2023-04-12
Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
The event marks the introduction of the World's First Truly Open Instruction-Tuned Large Language Model (LLM) called Dolly 2.0, which aims to democratize the magic of ChatGPT with open models.
2023-04-15
OpenAssistant Conversations - Democratizing Large Language Model Alignment
On April 15, 2023, the OpenAssistant Conversations dataset (OASST1), a crowd-sourced corpus of assistant-style dialogues, was released to democratize research on large language model alignment.
2023-04-17
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
MiniGPT-4 aligns a frozen visual encoder with an open large language model (Vicuna) to enhance vision-language understanding with comparatively little training.
2023-04-18
VideoLDM
Video LDM extends latent diffusion models to high-resolution video synthesis by adding temporal layers to a pre-trained image diffusion model.
2023-04-19
StableLM
Stability AI released StableLM, a suite of open-source language models, in alpha with 3B- and 7B-parameter variants.
2023-04-26
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
On April 26, 2023, the survey "Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond" was published, reviewing the practical applications of large language models such as ChatGPT and their impact.
2023-04-27
We're Afraid Language Models Aren't Modeling Ambiguity
The paper introduces AmbiEnt, a benchmark of ambiguous sentences, and finds that current language models struggle to recognize and represent ambiguity.
2023-04-28
DeepFloyd IF Release
Stability AI releases DeepFloyd IF, a powerful text-to-image model that can intelligently integrate text into images.
2023-05-02
OpenLLaMA
OpenLLaMA is a permissively licensed open reproduction of Meta's LLaMA, trained on the RedPajama dataset to democratize access to strong foundation models.
2023-05-04
StarCoderBase
StarCoderBase was introduced as a state-of-the-art LLM for code on May 4, 2023.
2023-05-05
Introducing MPT-7B
MPT-7B sets a new standard for open-source, commercially usable large language models, trained on 1 trillion tokens and intended as a benchmark for a wide range of tasks and applications.
2023-05-10
PaLM 2 Technical Report
The technical report for PaLM 2 is released, providing detailed information about the model.
2023-05-20
CodeT5+
On May 20, 2023, CodeT5+ was introduced, a family of open encoder-decoder models for code understanding and generation that extends CodeT5 with more flexible training objectives.
2023-05-22
RWKV
RWKV ("Reinventing RNNs for the Transformer Era") combines transformer-style parallelizable training with RNN-style constant-memory recurrent inference.
2023-05-24
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Direct Preference Optimization (DPO) fine-tunes a language model directly on preference pairs with a simple classification-style loss, showing that the language model itself implicitly defines a reward model and removing the separate reward model and RL loop used in RLHF.
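A minimal sketch of the DPO objective described in the paper, assuming the response log-probabilities have already been computed under the trained policy and a frozen reference model; variable names and values are illustrative:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO: prefer the chosen response by a margin measured against a frozen reference model.

    Each argument is the summed log-probability of a full response (chosen or
    rejected) under the policy being trained or under the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp        # log pi(y_w|x) - log pi_ref(y_w|x)
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # log pi(y_l|x) - log pi_ref(y_l|x)
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy batch of 4 preference pairs (in practice the log-probs come from the two models).
loss = dpo_loss(torch.tensor([-12.0, -9.5, -20.1, -7.3]),
                torch.tensor([-14.2, -11.0, -19.8, -9.9]),
                torch.tensor([-12.5, -10.0, -20.0, -7.5]),
                torch.tensor([-13.8, -10.5, -20.2, -9.5]))
print(loss.item())
```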
2023-06-22
Stable Diffusion XL 0.9
Stable Diffusion XL 0.9 was released, aiming to improve Latent Diffusion Models for High-Resolution Image Synthesis.
2023-06-28
Long Sequence Modeling with XGen
XGen, a 7B large language model from Salesforce trained with an 8K input sequence length, was introduced.
2023-07-18
Llama 2: Open Foundation and Fine-Tuned Chat Models
Introduction of Llama 2, a family of open foundation models and fine-tuned chat models ranging from 7B to 70B parameters.
2023-07-26
SDXL 1.0
The release of SDXL 1.0 was announced.
2023-08-01
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
MetaGPT is a multi-agent collaboration framework that assigns LLM agents different roles (for example product manager or engineer) and standardized operating procedures so they can work together on software tasks.
2023-08-24
Code Llama: Open Foundation Models for Code
On August 24, 2023, Meta released Code Llama, a family of open foundation models for code built on Llama 2, aiming to improve code understanding and generation.
2023-09-11
Textbooks Are All You Need II: phi-1.5 technical report
The technical report for Textbooks Are All You Need II with phi-1.5 was released on September 11, 2023.
2023-09-27
Mistral 7B
On September 27, 2023, Mistral AI released Mistral 7B, an open-weight model using grouped-query and sliding-window attention that outperforms larger Llama 2 models on many benchmarks.
2023-10-19
DALL-E 3
On October 19, 2023, DALL-E 3 was introduced, improving text-conditional image generation through better prompt following and integration with ChatGPT.
2023-11-06
Robust Speech Recognition via Large-Scale Weak Supervision
Whisper demonstrates robust speech recognition and translation, trained with large-scale weak supervision on 680,000 hours of multilingual audio.
2023-12-06
Gemini: A Family of Highly Capable Multimodal Models
Gemini, a family of highly capable multimodal models, was introduced to address various tasks effectively.
2023-12-11
Mixtral of experts
On December 11, 2023, Mistral AI released Mixtral 8x7B, a sparse mixture-of-experts model in which a router activates only a subset of expert feed-forward networks for each token.
2023-12-12
Phi-2: The surprising power of small language models
On December 12, 2023, Microsoft released Phi-2, a 2.7-billion-parameter model demonstrating the surprising power of small language models trained on curated, textbook-quality data.
2023-12-26
Viktor Garske Last Update
Viktor Garske made the last update to his AI/ML/LLM/Transformer Models timeline and list on this date.
2023-12-28
Development of Large Language Models
Large language models (LLMs) are artificial neural networks that have moved from research prototypes to widespread use within a few years. ChatGPT, built on an LLM fine-tuned for dialogue, brought this generative AI technology to a mass audience.
2024-02-26
Supervised Fine-Tuning for Customizing LLM
A guide on how to perform supervised fine-tuning to customize Large Language Models for specific applications.
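A minimal sketch of the core supervised fine-tuning step, assuming the Hugging Face transformers package and a small public checkpoint (gpt2) as a stand-in; the prompt/response pair and the exact token-masking boundary are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative stand-in: any small causal LM checkpoint works the same way.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Instruction: Summarize: The cat sat on the mat.\nResponse:"
response = " A cat sat on a mat."

prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
full_ids = tok(prompt + response + tok.eos_token, return_tensors="pt").input_ids

# Supervised fine-tuning is ordinary next-token cross-entropy, except that only
# the response tokens contribute to the loss: prompt positions are masked to -100.
# (The masking boundary is approximate in this sketch.)
labels = full_ids.clone()
labels[:, :prompt_len] = -100

loss = model(full_ids, labels=labels).loss
loss.backward()  # in a real loop: optimizer.step() and zero_grad() per batch
print(float(loss))
```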
2024-03-13
Understanding the Difference Between Labeled and Unlabeled Data
An article explaining the distinction between labeled and unlabeled data in the context of data labeling and machine learning.
2024-03-18
LLM-based GPT Products Learning and Communication
LLM-based GPT products can learn and communicate in ways that resemble humans, raising concerns about job security across industries and about the obsolescence of traditional academic essays, while also generating excitement about the technology's broad potential and the opportunities it creates.
2024-04-30
Improving LLM Output by Combining RAG and Fine-Tuning
Large language model (LLM) output quality can be improved by combining retrieval-augmented generation (RAG) with fine-tuning: retrieval grounds responses in up-to-date or domain-specific documents, while fine-tuning adapts the model's style and task behavior.
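A minimal sketch of the retrieval half of that combination, assuming the sentence-transformers package and the public all-MiniLM-L6-v2 embedding model; the documents, question, and prompt template are illustrative:

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend

documents = [
    "Invoices are due within 30 days of the billing date.",
    "Support tickets are answered within one business day.",
    "Refunds are processed after the returned item is inspected.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(documents, normalize_embeddings=True)

def retrieve(question, k=2):
    """RAG retrieval step: return the k documents most similar to the question."""
    q_vec = encoder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity, since the vectors are normalized
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

question = "When do I have to pay an invoice?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"
# `prompt` would then be sent to the (optionally fine-tuned) LLM for generation.
print(prompt)
```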
2025
Expansion of Business Applications for Large Language Models
Large Language Models (LLMs) are anticipated to further expand their capabilities in handling business applications, particularly in terms of translating content across different contexts. This expansion is expected to make LLMs more accessible to business users with varying levels of technical expertise.
2026
LLMs Sentiment Analysis
Most LLMs can be used for sentiment analysis to help users better understand the intent of a piece of content or a particular response.
End of the Timeline