Transformer Neural Networks - EXPLAINED! (Attention is all you need)

  • Published 2020-01-12
    REFERENCES
    [1] The main Paper: arxiv.org/abs/1706.03762
    [2] Tensor2Tensor has some code with a tutorial: www.tensorflow.org/tutorials/...
    [3] Transformer very intuitively explained - Amazing: jalammar.github.io/illustrated...
    [4] Medium Blog on intuitive explanation: / what-is-a-transformer
    [5] Pretrained word embeddings: nlp.stanford.edu/projects/glove/
    [6] Intuitive explanation of Layer normalization: mlexplained.com/2018/11/30/an...
    [7] Paper that gives even better results than transformers (Pervasive Attention): arxiv.org/abs/1808.03867
    [8] BERT uses transformers to pretrain neural nets for common NLP tasks. : ai.googleblog.com/2018/11/ope...
    [9] Stanford Lecture on RNN: cs231n.stanford.edu/slides/201...
    [10] Colah’s Blog: colah.github.io/posts/2015-08...
    [11] Wiki for timeseries of events: en.wikipedia.org/wiki/Transfo...)

Comments • 687

  • @CodeEmporium
    @CodeEmporium year(s) ago +11

    For more details and code on building a translator using a transformer neural network, check out my playlist "Transformers from scratch": krplus.net/bidio/gaeCg3dmdqyshWE

  • @ShotReverseShot
    @ShotReverseShot 2 years ago +63

    2:03 I died at the Vsauce reference. Well played.

  • @ThisNameWasntTaken
    @ThisNameWasntTaken 4 years ago +769

    what a hugely underrated video. You did such a better job at explaining this on multiple abstraction layers in such a short video than most videos I could find on the topic which were more than twice as long.

    • @CodeEmporium
      @CodeEmporium 4 years ago +66

      Thanks a ton Jeffrey! Means a lot. I've come to realize (fairly recently) that only speaking in jargon isn't going to help. Peeling it down from highly abstract to more technical goes a long way for viewers and myself. I understand more when I break the jargon down. I'll be using this more in future videos.

    • @ThisNameWasntTaken
      @ThisNameWasntTaken 4 years ago +7

      @CodeEmporium Of course everyone has a different approach to understanding a topic. I recently had to get into a few topics quite quickly, and for me the best way of getting there fast was to start out with very general videos to get a feel for the general ideas and how everything works together on a high level. Then I would watch some more detailed videos or switch to reading more detailed articles, until finally reading the actual papers and looking at the formulas and all that stuff.
      Having an understanding of the bigger picture helped me comprehend the details better.
      I also think no explanation can ever be "too simple", because when explanations try to save time by glossing over parts or taking things for granted, you spend way more time rewinding and trying to wrap your head around some small detail just because you might be missing some needed knowledge.
      I think an explanation is like spices on food: better to keep it simple and easy. Individuals can always skip parts for themselves. It's the same with spices: better not to add too much thinking everyone will be able to take it; rather add a little, and if it's not enough for someone they can add more themselves.

    • @bhargavyagnik
      @bhargavyagnik 4 years ago

      So true, man!! I scraped the net to find a simple explanation!! You are a genius :)

    • @joosthorskamp1736
      @joosthorskamp1736 3 years ago +1

      True, I did not understand the use of attention until watching this video

    • @sindhuorigins
      @sindhuorigins 3 years ago +2

      @CodeEmporium Great approach that you've taken. A high level understanding followed by deeper understanding of the topic pretty much clears up the concept. Subscribed.

  • @himeshph
    @himeshph 3 years ago +98

    2:02 that Vsauce thing was cool

  • @Elanus19
    @Elanus19 3 years ago +89

    Incredibly well explained and concise. I can't believe you pulled off such a complete explanation in just 13 minutes!

    • @CodeEmporium
      @CodeEmporium 2 years ago +1

      Thank you for the kind words. Super glad you liked it :)

  • @lingding77
    @lingding77 3 years ago +20

    I love the multi-pass way of explanation so that the viewer can process high level concepts and then build upon that knowledge, great job.

  • @seungjungjin9217
    @seungjungjin9217 3 years ago +6

    Great video! I love how you go through multiple passes, each getting into deeper specifics!

  • @danbochman
    @danbochman 3 years ago +38

    Wow.
    I've seen lectures that are 45m+ long trying to understand this architecture, even lectures from the original authors.
    Your video was hands-down the best, really helped me piece some key missing intuition pieces together.
    You have a gift for teaching and explaining -- I wholeheartedly hope you're able to leverage that in your professional career!

  • @gearoidmurphy4988
    @gearoidmurphy4988 3 years ago +84

    The multi-pass approach to progressively explaining the internals worked well. Thanks for your content!

    • @ajcosta
      @ajcosta year(s) ago +5

      The understanding converges!

  • @PhilbertLin
    @PhilbertLin 4 years ago +225

    Great video! Watched it a few times already so these timestamps will help me out:
    0:00 Problems with RNNs and LSTMs
    3:34 First pass overview over transformer architecture
    8:10 Second pass with more detail
    10:34 Third pass focusing on attention networks and normalization
    11:57 Other resources (code & blog posts)

    • @CodeEmporium
      @CodeEmporium 4 years ago +6

      Thanks for this! It'll help others watching too.

    • @akkipant
      @akkipant 3 years ago +6

      @CodeEmporium Pin this comment.

    • @mattcoakes5682
      @mattcoakes5682 3 years ago

      Thank you so much! I planned to watch this a few times for reference as I delve into transformer code. This will be very useful.

  • @frederikbrammer
    @frederikbrammer year(s) ago +5

    This is by far the best explanation of the Transformers architecture that I have ever seen! Thanks a lot!

  • @abhijoysar
    @abhijoysar 3 years ago +2

    Very Underrated. Please keep doing these videos. You have no idea what a great amount of service this is doing to the young research communities who are just learning to read research papers. Instant subscribed.🙌

  • @MrKfirlevi
    @MrKfirlevi 2 years ago +17

    Great video!! I am taking a course at my university and one of the lectures was about RNNs and transformers. Your 13-minute video explains it way better than the 100-minute lecture I attended. Thank you!

  • @vahekassardjian5032
    @vahekassardjian5032 4 years ago +16

    Outstanding explanations: to the point and well illustrated. Thank you.

  • @newginsam670
    @newginsam670 16 days ago

    Bro
    TBH I have no words to appreciate such a well structured video in such a short time, and the explanation was easily understandable even for people with less knowledge.
    Thanks for the video man.

  • @logicloudy2851
    @logicloudy2851 3 years ago +2

    Thanks, man. This is a really clear and high-level explanation. Really helpful for some guys like me who just stepped into this area.
    I read many explanations online. They give tons of details but fail to explain these abstract items. These explanations always use other abstract definitions to explain this one. The same problem happens again in the explanation of the "other abstract item". Sometimes I just forget what I originally wanted to understand. Or even worse, the definitions go in a circle...
    Thank you so much! This video helped me a lot in understanding the paper

  • @rodrigoklosowski8219

    Amazing! The best explanation on YouTube of Transformer neural networks. The visual representations of the matrices help a lot!

  • @ScriptureFirst
    @ScriptureFirst 3 years ago +2

    KEEP IT UP! Please! This is outstanding :) love the diagrams, dry crisp active speaking, & overview technique.

  • @vtrandal
    @vtrandal year(s) ago +2

    With videos like this one you should be having 100,000+ subscribers soon. Adding a bit of humor to uncompromising technical content is a very good way to go.

  • @PRUTHVIRAJRGEEB
    @PRUTHVIRAJRGEEB 4 years ago +8

    This is heavily underrated!! Such an awesome video! Thanks Man!

  • @lonewolf2547
    @lonewolf2547 3 years ago

    I can't believe you explained such complicated things in a crystal clear format. Excellent job, dude.

  • @moneyinahurry
    @moneyinahurry year(s) ago +2

    One of the best or probably the best explanation I've seen. Thank you very much for the effort.

  • @Ashwin436
    @Ashwin436 3 years ago

    This was very helpful! You broke it down into such simpler concepts. I'm sure I'll be needing you again. Please keep at it. Thanks!

  • @eashwaraerahan861
    @eashwaraerahan861 3 years ago +2

    Great work. I really like the infographics. I'm a person with no background in NLP but I was still able to follow until the second pass, thanks to your great work.

  • @snehashishpaul2740

    I had to make 2 passes of your video to fully understand and appreciate the underlying mathematics and working of the model. You have put a great effort in making it simpler to understand with illustration and animation.

  • @adwaitpatil8300
    @adwaitpatil8300 month(s) ago

    One of the cleanest explanations of transformers without dabbling too much in the theory!! Thanks man

  • @jonathanburrell7055
    @jonathanburrell7055 11 months ago +5

    This is awesome!!! Thank you for breaking it down concisely, understandably, and deeply! It’s hard to find explanations that aren’t so simplistic they’re useless, or so involved they don’t save time in achieving understanding. Thank you!!

    • @CodeEmporium
      @CodeEmporium 11 months ago

      My pleasure! If you are into building the transformer piece by piece from scratch, I suggest checking out the “Transformers from scratch” playlist.

  • @simonevagnoni1758
    @simonevagnoni1758 2 years ago +2

    Your way of breaking things down step by step is very effective! Congrats and thanks. School systems should use it more.

  • @Themojii
    @Themojii year(s) ago

    Very well done: clearly explained concepts and nice visualizations. I spent a couple of hours reading multiple blogs and tutorials about transformers, but I learned a lot more from your 13-minute video than from those tutorials. Great job. I subscribed to your channel after watching this. Keep up the good work.

  • @paragrk1
    @paragrk1 3 years ago +2

    I went through several videos on the 'Attention is all you need' paper before this. All the detail you managed to cover in thirteen minutes is amazing. I could not find an explanation that is so easy to understand anywhere else. Great job!

    • @enjakuro7048
      @enjakuro7048 2 years ago

      right? I couldn't believe this video is only 13 minutes! That's a very good talent to have.

  • @fadeyduh6422
    @fadeyduh6422 3 years ago

    That was the best way of explaining things, in my opinion. Start with the big picture, then get more detailed over time.

  • @TheJonathanLugo
    @TheJonathanLugo 9 months ago

    Wow, I am so glad I found your channel. The concept is clearly explained and assumes an intelligent audience. Well done!

  • @Kasper-ev3zh
    @Kasper-ev3zh 3 years ago +1

    Thank you so much for uploading this, appreciate the work you put into making it! Your explanation really helped my understanding, and much more so than a lot of other videos covering the subject.

    • @CodeEmporium
      @CodeEmporium 3 years ago +1

      Absolutely! Thanks for taking the time to watch. I will be making more videos on the subject, so stay tuned.

  • @gajeshladhar3777
    @gajeshladhar3777 3 years ago

    It is a really straightforward video that covers exactly what we need to learn!!

  • @mattcoakes5682
    @mattcoakes5682 3 years ago +3

    LOVE the multipass strategy for explaining the architecture. I don't think I've seen this approach used with ML, and it's a shame as this is an incredibly useful strategy for people like me trying to play catch up. I hopped on the ML train a little late, but stuff like this makes me feel not nearly as lost.

  • @paramveersingh2919

    Watched Andrew Ng, then watched this. You got me to stick through the video, and Andrew, who I consider one of the best in this field, did not manage to express it as clearly as you did! Cheers man, amazing video!

  • @klam77
    @klam77 year(s) ago

    BEAUTIFUL......deep, concise, pithy, each word is meaningful.....well done.

  • @derrxb
    @derrxb 4 years ago +4

    This is one of the best explanations for transformers I've come across online! Awesome job, man! Thanks. I'll totally recommend your channel to some classmates!! :)

    • @CodeEmporium
      @CodeEmporium 4 years ago

      Thanks! And Glad it's helpful! Spreading the word of my channel is the best thing you can do to help :)

  • @M0I0D
    @M0I0D 3 years ago +2

    Thank you so much! Thanks to you it finally clicked. Totally agree with the top comment: why on earth isn't this more famous?!!! Thank you for your clear explanation!! This is what I was looking for!!!!!!

  • @navinahmed
    @navinahmed 3 years ago +3

    Didn't know what the transformer hype was about until I landed on this video. Thanks a lot! Subscribed. Gotta check out more content on this channel now.

  • @shaina2231
    @shaina2231 year(s) ago

    very good presentation indeed!! explains even complex concepts in very simple and easy way, well done

  • @lisandrocesaratto3012

    Best video on Transformers I have seen so far! The examples really help to understand how the architecture works. Subscribed.

  • @karakadir8860
    @karakadir8860 2 months ago

    Dude, you absolutely deserve each and every subscriber. Thank you very much for your highly helpful, quality content.

    • @CodeEmporium
      @CodeEmporium 2 months ago

      Thanks so much for the lovely comment! And also for subscribing! More to come!

  • @fahdciwan8709
    @fahdciwan8709 4 years ago

    Thanks a lot man!! I know you explained it clearly, but I'll still need to watch the video a few more times to digest the concept.

  • @farnazfaramarzi4137

    Super informative, thorough explanation! Thanks!

  • @yahyasowti9711
    @yahyasowti9711 3 years ago

    Great job in explaining such a complicated concept!

  • @pipe_runner_lab
    @pipe_runner_lab year(s) ago +2

    I saw Yanik's explanation and now I saw yours. Yanik does a terrible job at explaining papers; he usually just jokes around. Your explanation is probably one of the best I have seen so far. Thanks man.

  • @leoisikdogan
    @leoisikdogan 4 years ago +2

    Very well explained! Great video, as always.

  • @piyalikarmakar5979

    What an explanation!!! You have made such complex concepts so easy..Thank you so much...

  • @Sidnv
    @Sidnv 2 years ago +2

    Really great video. As someone transitioning from pure math into machine learning and AI, I find the language barrier to be the biggest hurdle and you broke down these concepts in a really clear way. I love the multiple layer approach you took to this video, I think it worked really well to first give a big picture overview of the architecture before delving deeper.

  • @bauwndule
    @bauwndule 2 years ago

    Best explanation ever. I have an interview with Microsoft tomorrow. This was the best brushing up I could get.

  • @Fruchtkotzekiddy
    @Fruchtkotzekiddy 3 years ago

    this video was one of the best learning videos i EVER SAW
    first you give a high level overview, then u step in deeper
    every step with an understandable example
    THANK YOU SO MUCH!!!

    • @CodeEmporium
      @CodeEmporium 3 years ago

      You are very welcome. Thank you for the compliments :)

  • @utsabkhakurel9742
    @utsabkhakurel9742 3 months ago

    Simple and easy to understand. Great job!

  • @emeralde3761
    @emeralde3761 2 years ago

    Just want to say this video is amazing. Watched like three other 30+ mins videos but they all failed to train my stupid brain. This 13 minutes video is intuitive, detailed, and beginner-friendly. Thank you :3

  • @mikashaw7926
    @mikashaw7926 3 years ago

    this is literally the best explanation ever, thank you!

  • @ritwikdubey5331
    @ritwikdubey5331 6 months ago

    I was searching for this particular explanation for a long time! Thanks for this!

  • @yonistoller1
    @yonistoller1 3 years ago

    This is one of the best tutorial videos I've watched on any subject, thank you!

  • @cristianarteaga
    @cristianarteaga 3 years ago

    What a great explanation! Thanks a lot and please keep creating material like this. It's such a great help.

  • @sshubam
    @sshubam year(s) ago

    Wow! Truly magnificent video! I really understood transformers very well. Thank you so much. Please keep making these videos. You are very good at it. Thanks again.

  • @ryanhewitt9902
    @ryanhewitt9902 year(s) ago +1

    Thank you for making this! As a curious outsider I have been anxious about falling behind in recent years and this was perfect to bring me up to speed - at least enough to follow the conversation.

  • @usamahussain4461
    @usamahussain4461 2 years ago

    Man! Brilliant video. I watched a 27-minute video and was totally spent and didn't even understand much. But this was just awesome, and in half the time!
    The only thing lacking might be examples of keys, values and queries, but I mostly got the hang of it.

  • @ApprovingSeal
    @ApprovingSeal year(s) ago

    Finally a basic explanation I can understand. I tried reading the original "Attention is all you need" paper, but it felt like it was assuming I was already familiar with the basics of NLP, like the encoder-decoder setup. Which I wasn't.

  • @guygirineza4001
    @guygirineza4001 4 years ago

    Your explanations are just superb !! Thank you very much, you earned a new subscriber !

  • @tictacX1
    @tictacX1 3 years ago +2

    Good job CodeEmporium! Very well made overview. thanks.

  • @KrazeeKrab
    @KrazeeKrab 7 months ago +1

    This was a phenomenal video. You managed to explain transformers in 13 minutes better than my professor could in three hours.
    Thank you and keep on creating content!

  • @tvanpeer
    @tvanpeer 10 months ago +1

    Great video! I love the layered approach for explaining the concepts. Very well done. Thank you!

  • @gokuson6399
    @gokuson6399 2 years ago +1

    Best explanation so far! Keep up the good work!

  • @motherbear55
    @motherbear55 3 years ago +1

    Great explanation. Could you do another video on positional encoding specifically? It seems to be very important, but I’ve found it the most confusing part of this architecture.

  • @VaibhavPatil-rx7pc
    @VaibhavPatil-rx7pc 9 months ago

    TOP OF TOP clear explanation you provided !!!

  • @theneilpowers
    @theneilpowers 4 months ago

    This earned a subscription! Excellent explanation!

  • @ahmedelayek2110
    @ahmedelayek2110 2 years ago

    what a guide for the transformer in just 13 min.
    thanks a lot for this simplicity.

  • @dannysuarez6265
    @dannysuarez6265 3 years ago +1

    How is it possible that such a beautiful video doesn't have more views/likes? Thank you CodeEmporium :)

  • @shipan5940
    @shipan5940 2 years ago +2

    By far the MOST comprehensible explanation of the Transformer available anywhere on the internet.

    • @shipan5940
      @shipan5940 2 years ago +1

      You deserve 1M subscribers at least.

    • @CodeEmporium
      @CodeEmporium 2 years ago

      Thank you for the kind words! Maybe one day

  • @tarat.techhh
    @tarat.techhh 3 years ago +6

    2:04 or are we... not gonna lie this is the best channel and best explanation ever....................

  • @gurudevilangovan
    @gurudevilangovan 3 years ago

    Wow. Amazing video. Better than anything I've watched on the topic, all in thirteen minutes.

  • @d63810728
    @d63810728 3 years ago

    This is by far the most comprehensive yet short video I have seen.

  • @TusharKale9
    @TusharKale9 2 years ago

    Superb explanation in 13 minutes. I have been watching videos over 1 hour long to get this concept. Well done and keep it up. Regards

  • @abhishekswain2502
    @abhishekswain2502 4 years ago

    Thank you, bro! This is a very well made and precise video. Helped me a lot!

  • @rashedulhasanrijul5506

    Very simple explanation. Thank you

  • @HIMANSHUKUMARSINHA7

    I bought a Udemy course on Transformer and BERT, but it was no help and wasted my time, money and energy. This video and your BERT video made my day. Thanks. I may now be able to explain them well in my interview. :)

  • @NoOne-sy5fg
    @NoOne-sy5fg year(s) ago +2

    Great video bro! You're underrated af. One of the best if not the best explanation of some neural network architecture. Keep up the good work. Kudos!

    • @CodeEmporium
      @CodeEmporium year(s) ago +1

      Thanks so much! I am making more related videos. So do check ‘em out :)

  • @hareshwedanayake7427

    Great video. Definitely struggled to understand this concept but I think I understand it a bit now. Will definitely have to read up on this

  • @k.sladkina872
    @k.sladkina872 3 years ago

    Wow!! Thank you for this video. It really makes things clear.

  • @anishpratheepkumar4184

    Nice work man, that's a pretty nice explanation of transformers. Thanks!

  • @user-uc5xc5sb9z
    @user-uc5xc5sb9z 10 months ago

    Very helpful and insightful, nice job!

  • @prateek4546
    @prateek4546 2 years ago

    Wow, the best explanation on YouTube! Had to subscribe after watching!

  • @sairamsubramaniam8316

    This is the best explanation! I came in search for transformers but I found Gold.

  • @punamroyce2484
    @punamroyce2484 year(s) ago

    That's a wonderful explanation.. now i understood the topic in detail.. Thank you so much :)

  • @gabrielcournelle3055
    @gabrielcournelle3055 3 years ago +3

    That was an awesome explanation. I have a question about the Add & Norm block. Do you add the embedded vector before or after performing normalization? Is there even a difference if we do one instead of the other?

  • @ayushkant392
    @ayushkant392 3 years ago

    This channel is so under-rated. Amazing video

  • @tariqnahmad
    @tariqnahmad 3 years ago

    Absolutely outstanding and amazing. Explained complex concepts succinctly and brilliantly. Thanks a lot.👍

  • @ctyuang
    @ctyuang 3 years ago

    Amazing explanation! Thank you sir!

  • @gautamgalada8774
    @gautamgalada8774 3 years ago

    I was exploring all the videos on transformers, and tbh this was the best video to start with. Things got pretty clear. Thanks a lot for this video. Can you suggest some good PyTorch tutorials?

  • @QuanNguyen-vo2xh
    @QuanNguyen-vo2xh 4 years ago

    This video deserves more views. It's awesome. Good job buddy.

  • @nazanin5162
    @nazanin5162 4 years ago +1

    excellent explanation!! Thank you so much

  • @danielpwagner
    @danielpwagner year(s) ago

    Outstanding. Best explanation of transformers that I’ve seen by far.

  • @maverick3069
    @maverick3069 3 years ago

    Brilliantly explained! Thanks man!

  • @wangy01
    @wangy01 year(s) ago +1

    I watched this video four times. After each time, I feel I understand this topic better than the previous one.

  • @gagecarpenter7532

    The best video I have seen on this topic. Great job

    • @CodeEmporium
      @CodeEmporium year(s) ago

      Glad it was helpful! And Thanks for watching :)

  • @halittalhature2438

    Such a great explanation! Many thanks...