3Blue1Brown
  • 174 Videos
  • 497,018,285 views
Attention in transformers, visually explained | Chapter 6, Deep Learning
Demystifying attention, the key mechanism inside transformers and LLMs.
Instead of sponsored ad reads, these lessons are funded directly by viewers: 3b1b.co/support
Special thanks to these supporters: www.3blue1brown.com/lessons/attention#thanks
An equally valuable form of support is to simply share the videos.
Demystifying self-attention, multiple heads, and cross-attention.
The first pass for the translated subtitles here is machine-generated, and therefore notably imperfect. To contribute edits or fixes, visit translate.3blue1brown.com/
And yes, at 22:00 (and elsewhere), "breaks" is a typo.
-----------...
Views: 792,847

Videos

But what is a convolution?
2.5M views · years ago
The Summer of Math Exposition
721K views · 2 years ago

Comments

  • @VividhKothari-rd5ll · 2 hours ago

    Completely beyond my capabilities to understand, but fascinating, nonetheless. The visual aspects make it so much more accessible to everyone. Sometimes I ask ChatGPT a complex question and it comes up with an amazing answer. And I find it so strange and fascinating that all these techniques led to an amazing and correct answer from basically "just a software."

  • @Juliet_Papa · 2 hours ago

    Better hope both prisoners have good pattern recognition.

  • @morpher44 · 2 hours ago

    now go on to understand coil geometry

  • @Wtfinc · 3 hours ago

    Wait! Does light go through an object or does it absorb and re-emit it?

  • @user-om4by2ig8g · 3 hours ago

    how can you make these videos? what are the basics

  • @adwinang4188 · 4 hours ago

    ngl, I got a bit delirious at this point after watching all the previous 11 chapters in one shot as a last ditch effort for tomorrow's Linear Algebra exam

  • @Matt-qi5ff · 4 hours ago

    bro is mesmerizing

  • @annika6081 · 4 hours ago

    So basically all of English is a vector space now? That's awesome!

  • @Ladyoftheroundtable · 4 hours ago

    What is the purpose of the 0 movement vector at the start? Surely an offset would do the exact same function

  • @josephholland6787 · 5 hours ago

    Flip the coin, and then put it in your pocket.

  • @toagodnameddream · 6 hours ago

    i love you T_T

  • @jamesbond_AMK · 6 hours ago

    n x (n-1) / 2, where n is the number of points

  • @maxisunleashed1855 · 7 hours ago

    I believe this is proof that we live in a simulation with some static variables that solve certain problems

  • @michaelkeller5714 · 7 hours ago

    These visuals and your explanations are unparalleled.

  • @oinvestigard · 8 hours ago

    For what is this usefull? Also, I found a big number... 3.161978e+286 or 3161697779300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

  • @Killerkraft975 · 8 hours ago

    So its like reading right to left, where the first matrix is the positions, and the subsequent matrices are the 'functions' to transform the positions

  • @tomasseeber · 8 hours ago

    I believe sometimes you overcomplicate things. - This comment is intended to give you an external point of view so you can work on it, which I'm sure you would be happy to, if you find it true, since it is evident that you have put a lot of work in your psyche, which I admire. I = -Log2(p) was the objective, so the explanation and the steps are correct, necessary and clear, but the objective was wrong. If instead of probability we speak in terms of possibilities, we get rid of the fraction, and the need for a negative logarithm. P = 2^I, then I = Log2(P)

  • @samriddhi6129 · 8 hours ago

    i still don't understand, perhaps i should rewatch it multiple times

  • @Szwifty_ · 8 hours ago

    im confused, i thought everyone knew this.

  • @user-swathi999 · 9 hours ago

    1 - 16

  • @Alexander_Sannikov · 9 hours ago

    is there any difference to the key and the query matrices, or is their difference only conceptual, as in "this is what these matrices could do"?

  • @evanlara7107 · 10 hours ago

    What is the website name 🙏😁

  • @lhard123l · 10 hours ago

    that was The most interesting trip on shŕooms

  • @UltraSuperDuperFreak · 10 hours ago

    It would only be a paradox if it actually happens. Since it hasn't, there is literally no reason to waste your time thinking about it. Also, 1000 is too small a test pool, and if all are from the same area it's even worse. It needs to be more, preferably many millions, and they need to be from all over the world to get as many X factors factored into the test for an overall result, since people live differently. Or make a test for each area. You should instead spend your time on a real test. Or go enjoy life doing something else :)

  • @ThanQRadu · 10 hours ago

    helical particle waves

  • @itsmrmostepic · 11 hours ago

    12:50 look closely and try to find out whats wrong

  • @auxzioplays6758 · 11 hours ago

    min 11:50 s(2) is 8 not 2 why did s suddenly disappear and 2 remains? Please someone answer

  • @theamazinghippopotomonstro9942

    This is why I hate math

  • @petersinger5459 · 11 hours ago

    Thanks for this. While there are several series that give a value for pi, the Basel series converges really quite quickly. I was learning to program in Python and wondered how easy it would be to calculate the first 50 or so digits of pi. Using a fairly simple program and the Basel sequence you can churn out the first 50 or 100 digits quite quickly and easily - a lot of fun!

  • @vivianegoncalves9287 · 12 hours ago

    What is this

  • @bernardogeocometto5562

    Beautiful

  • @SmartAPresident · 12 hours ago

    Thank you.

  • @hgp314 · 12 hours ago

    0:39 casually dropped the most elegant and clear equation as a throwaway line

  • @toqup_inua8023 · 12 hours ago

    My time have come, you are the dragon warrior now

  • @AtariDays80 · 13 hours ago

    Awesome video. However, brakes ≠ breaks

  • @pavelpospisil5918 · 13 hours ago

    "Don't you bet, you'll be dead!"

  • @tolulopemakanjuola2588

    Great video! How did you visualize the rotations for the matrices brought up around 4:30?

  • @stevenrs11 · 13 hours ago

    Seems familiar

  • @postblitz · 13 hours ago

    When you deal with the infinite either way, no monster's size is special.

  • @ryanbartram7122 · 14 hours ago

    Physics🤯🤯🤯 Math🤯🤯🤯

  • @LilKrobik · 14 hours ago

    In the czech language, the word education comes from farming, where farmers would "educate" the land - prepare it for actually growing (in work, but also life in general). I think this series is exactly what education is supposed to be. Thank you!

  • @ChairmanHehe · 14 hours ago

    bad computer!

  • @sudsierspace9010 · 15 hours ago

    Just gotta say thank you for the best of videos of math in the universe.

  • @ryumak · 15 hours ago

    As always, great content - thanks!

  • @lee-oe7rr · 16 hours ago

    Gambling trinkit plus 10 luck

  • @christianquintili · 16 hours ago

    This video is an act of democracy. Thank you

  • @Faroshkas · 16 hours ago

    Well I heard there was a sequence of chords
    splits the circle to 1, 2 and 4
    n points seem to cut in the powers of 2, yeah
    It goes on like this, with the 4th and the 5th
    but something's odd, when you add a 6th
    it cuts in 31
    patterns fool ya
    how they fool ya, how they fool ya
    how they fool ya, how they fooo-ooo-ool ya

    When your faith is strong you still need proof
    what seems natural to guess can lead to goof
    each integral up on the left is pi over 2, yeah
    you might think that's true for the next, which is fair
    but like a joke we've shown that it's off by a hair
    it's a subtle slip, but it's true
    the pattern fooled ya
    how they fool ya, how they fool ya
    how they fool ya, how they fooo-ooo-oool ya

    Now take a prime, and write it in base 4
    read those digits like you'd have before
    each prime gives a new prime with this rule, yeah
    or does it though? you'll eventually find
    new primes are not so simply designed
    patterns hold then they're broken
    how they fool ya
    how they fool ya, how they fool ya
    how they fool ya, how they fooo-ooo-oool ya

  • @martinda7446 · 17 hours ago

    The phase slowing at the beginning (passing through the 'glass') shows a wave with a higher frequency. That can't be right as it would change the colour... Anyhow, I'll continue...😁

  • @RedEMM1 · 18 hours ago

    my dumbass doesnt see the pattern to begin with