Must-watch video: "Intro to Large Language Models" by Andrej Karpathy


After reading about Andrej Karpathy's second departure from OpenAI last week, I decided to rewatch his YouTube video "Intro to Large Language Models." If you haven't watched it, and are interested in learning more about GenAI/LLMs, I highly recommend it. Andrej does an amazing job of describing the technology in an entertaining and relatable way with limited use of AI jargon. You can find links below.


Money slide #1 (@4:21):


Key slide takeaways:
  • Llama 2 70B was trained on roughly 10TB of text scraped from the internet
  • It cost about $2MM in compute to train (note to self, don't try this)
  • The resulting base model is a 140GB file (manageable)
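
For anyone wondering where the 140GB figure comes from, here is a quick back-of-the-envelope sketch. It assumes the 70-billion-parameter variant of Llama 2 with each parameter stored as a 2-byte float16 value (both numbers are my recollection of the video, not stated on this page), and it uses decimal units (1GB = 1e9 bytes). Treat it as rough arithmetic, not an exact accounting.

```python
# Rough arithmetic only; the parameter count and bytes-per-parameter
# are assumptions (70B parameters, float16 storage).
PARAMS = 70e9                 # assumed number of model parameters
BYTES_PER_PARAM = 2           # assumed float16 storage: 2 bytes each
TRAINING_TEXT_BYTES = 10e12   # ~10TB of training text, from the slide

model_bytes = PARAMS * BYTES_PER_PARAM
model_gb = model_bytes / 1e9

# How many bytes of training text per byte of model weights.
text_to_model_ratio = TRAINING_TEXT_BYTES / model_bytes

print(f"Model size: {model_gb:.0f} GB")
print(f"Text-to-model ratio: {text_to_model_ratio:.0f}x")
```

Under these assumptions, the training text is on the order of 70 times larger than the weights file, which is why the model is often described as a lossy compression of its training data.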

Money slide #2 (@17:56):

This slide provides a nice overview of how to fine-tune a base model. Note that stage 2 (fine-tuning) can be repeated across multiple datasets, depending on your training goals.
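
To make the "stage 2 can be repeated" point concrete, here is a toy sketch of the workflow. Nothing here trains a real model: `Model`, `pretrain`, `finetune`, and the dataset names are all hypothetical placeholders I made up to show the shape of the pipeline, where pretraining happens once and fine-tuning is applied once per dataset.

```python
from dataclasses import dataclass, field

@dataclass
class Model:
    """Hypothetical stand-in for a model; it only records what was done to it."""
    name: str
    history: list = field(default_factory=list)

def pretrain(corpus: str) -> Model:
    # Stage 1 (placeholder): the expensive, done-once step that yields a base model.
    return Model(name="base", history=[f"pretrain:{corpus}"])

def finetune(model: Model, dataset: str) -> Model:
    # Stage 2 (placeholder): a cheaper step that can be repeated per dataset.
    return Model(name="assistant", history=model.history + [f"finetune:{dataset}"])

base = pretrain("internet_text")

# Apply stage 2 once per dataset; these dataset names are illustrative only.
assistant = base
for dataset in ["qa_conversations", "support_tickets"]:
    assistant = finetune(assistant, dataset)

print(assistant.history)
```

The base model is left untouched, so the same base can be fine-tuned again later for a different goal.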

Money slide #3 (@42:19):


Andrej introduces the "LLM OS" as a mental model. Others in the industry are having similar thoughts. In fact, similar terminology was used to describe Alexa as a "voice AI operating system" several years ago. It will be interesting to see how this plays out as OpenAI drives inference optimizations from the cloud all the way to devices and the underlying silicon.

Thank you, Andrej. I look forward to learning more from you. Others feel the same, as evidenced by the comments on this video.

@BAIR68 - "I am a college professor and I am learning from Andrej how to teach. Every time I watch his video, I not only I learn the contents, also how to deliver any topic effectively. I would vote him as the best “AI teacher in YouTube”. Salute to Andrej for his outstanding lectures."

@aryanrahman3212 - "You know when someone makes a topic so accessible and understandable you feel like you're hearing a story but learning a lot. This happened in this video."

You can watch the video here: https://youtu.be/zjkBMFhNj_g?si=uCoE0zuZi9qVBv8S

And read about Andrej's OpenAI departure here: https://www.reuters.com/technology/openai-researcher-andrej-karpathy-departs-firm-2024-02-14/



