[November 23, 2023] Large Language Models (in 2023) - Hyung Won Chung P…

Posted by Administrator on 2023-11-28 16:49



Abstract

A unique aspect of large language models (LLMs) is that larger models exhibit abilities not present in smaller models. These emergent abilities have far-reaching consequences for how we should work in the field of AI. I will share some of my observations on the implications of scaling and emergent abilities. After that, I will introduce the multiple stages involved in the current generation of LLM training: pre-training and post-training (including instruction fine-tuning and RLHF).


Bio

Hyung Won is a research scientist on the ChatGPT team at OpenAI. He has worked on various aspects of large language models: pre-training, instruction fine-tuning, reinforcement learning from human feedback, reasoning, multilinguality, parallelism strategies, etc. Before OpenAI, he spent 3.5 years at Google Brain. His notable work includes the scaling Flan paper (Flan-T5, Flan-PaLM) and T5X, the training framework used to train the PaLM language model. He has participated in open-source projects such as Flan-T5, Switch Transformer, and UL2. Before Google, he received a PhD from MIT, where he worked on renewable energy and clean water systems.

