Dot products and duality | Essence of linear algebra, chapter 9


Traditionally, dot products are something that’s
introduced really early on in a linear algebra course, typically right at the start. So it might seem strange that I pushed them
back this far in the series. I did this because there’s a standard way
to introduce the topic which requires nothing more than a basic understanding
of vectors, but a fuller understanding of the role that
dot products play in math can only really be found under the light of
linear transformations. Before that, though, let me just briefly cover the standard way that dot products are introduced, which I’m assuming is at least partially review
for a number of viewers. Numerically, if you have two vectors of the
same dimension, two lists of numbers with the same length, taking their dot product means pairing up all of the coordinates, multiplying those pairs together, and adding the results. So the vector [1, 2] dotted with [3, 4] would be 1 x 3 + 2 x 4. The vector [6, 2, 8, 3] dotted with [1, 8,
5, 3] would be: 6 x 1 + 2 x 8 + 8 x 5 + 3 x 3. Luckily, this computation has a really nice
geometric interpretation. To think about the dot product between two
vectors v and w, imagine projecting w onto the line that passes
through the origin and the tip of v. Multiplying the length of this projection
by the length of v, you have the dot product v・w. Except when this projection of w is pointing
in the opposite direction from v, that dot product will actually be negative. So when two vectors are generally pointing
in the same direction, their dot product is positive. When they’re perpendicular, meaning, the projection of one onto the other is the
0 vector, the dot product is 0. And if they’re pointing in generally the opposite
direction, their dot product is negative. Now, this interpretation is weirdly asymmetric: it treats the two vectors very differently. So when I first learned this, I was surprised
that order doesn’t matter. You could instead project v onto w; multiply the length of the projected v by
the length of w and get the same result. I mean, doesn’t that feel like a really different
process? Here’s the intuition for why order doesn’t
matter: if v and w happened to have the same length, we could leverage some symmetry, since projecting w onto v, then multiplying the length of that projection
by the length of v, is a complete mirror image of projecting v
onto w, then multiplying the length of that projection by the length of w. Now, if you “scale” one of them, say v,
by some constant like 2, so that they don’t have equal length, the symmetry is broken. But let’s think through how to interpret the
dot product between this new vector 2v and w. If you think of w as getting projected onto
v, then the dot product 2v・w will be exactly twice the dot product v・w. This is because when you “scale” v by
2, it doesn’t change the length of the projection
of w but it doubles the length of the vector that
you’re projecting onto. But, on the other hand, let’s say you’re thinking
about v getting projected onto w. Well, in that case, the length of the projection
is the thing that gets “scaled” when we multiply v by 2. The length of the vector that you’re projecting
onto stays constant. So the overall effect is still to just double
the dot product. So, even though symmetry is broken in this
case, the effect that this “scaling” has on
the value of the dot product, is the same under both interpretations. There’s also one other big question that confused
me when I first learned this stuff: Why on earth does this numerical process of
matching coordinates, multiplying pairs and adding them together, have anything to do with projection? Well, to give a satisfactory answer, and also to do full justice to the significance
of the dot product, we need to unearth something a little bit
deeper going on here which often goes by the name “duality”. But, before getting into that, I need to spend some time talking about linear
transformations from multiple dimensions to one dimension which is just the number line. These are functions that take in a 2D vector
and spit out some number. But linear transformations are, of course, much more restricted than your run-of-the-mill
function with a 2D input and a 1D output. As with transformations in higher dimensions, like the ones I talked about in chapter 3, there are some formal properties that make
these functions linear. But I’m going to purposely ignore those here
so as to not distract from our end goal, and instead focus on a certain visual property
that’s equivalent to all the formal stuff. If you take a line of evenly spaced dots and apply a transformation, a linear transformation will keep those dots
evenly spaced, once they land in the output space, which
is the number line. Otherwise, if there’s some line of dots that
gets unevenly spaced then your transformation is not linear. As with the cases we’ve seen before, one of these linear transformations is completely determined by where it takes
i-hat and j-hat but this time, each one of those basis vectors
just lands on a number. So when we record where they land as the columns
of a matrix, each of those columns just has a single number. This is a 1 x 2 matrix. Let’s walk through an example of what it means
to apply one of these transformations to a vector. Let’s say you have a linear transformation
that takes i-hat to 1 and j-hat to -2. To follow where a vector with coordinates,
say, [4, 3] ends up, think of breaking up this vector as 4 times
i-hat + 3 times j-hat. A consequence of linearity, is that after
the transformation the vector will be: 4 times the place where
i-hat lands, 1, plus 3 times the place where j-hat lands,
-2, which in this case implies that it lands on
-2. When you do this calculation purely numerically,
it’s a matrix-vector multiplication. Now, this numerical operation of multiplying
a 1 by 2 matrix by a vector, feels just like taking the dot product of
two vectors. Doesn’t that 1 x 2 matrix just look like a
vector that we tipped on its side? In fact, we could say right now that there’s
a nice association between 1 x 2 matrices and 2D vectors, defined by tilting the numerical representation
of a vector on its side to get the associated matrix, or tipping the matrix back up to get the associated
vector. Since we’re just looking at numerical expressions
right now, going back and forth between vectors and 1
x 2 matrices might feel like a silly thing to do. But this suggests something that’s truly awesome
from the geometric view: there’s some kind of connection between linear
transformations that take vectors to numbers and vectors themselves. Let me show an example that clarifies the
significance and which just so happens to also answer the
dot product puzzle from earlier. Unlearn what you have learned and imagine that you don’t already know that
the dot product relates to projection. What I’m going to do here is take a copy of
the number line and place it diagonally in space somehow,
with the number 0 sitting at the origin. Now think of the two-dimensional unit vector whose tip sits where the number 1 on the number
line is. I want to give that guy a name: u-hat. This little guy plays an important role in
what’s about to happen, so just keep him in the back of your mind. If we project 2D vectors straight onto this
diagonal number line, in effect, we’ve just defined a function that
takes 2D vectors to numbers. What’s more, this function is actually linear since it passes our visual test that any line of evenly spaced dots remains
evenly spaced once it lands on the number line. Just to be clear, even though I’ve embedded the number line
in 2D space like this, the outputs of the function are numbers, not
2D vectors. You should think of it as a function that takes
in two coordinates and outputs a single coordinate. But that vector u-hat is a two-dimensional
vector living in the input space. It’s just situated in such a way that it overlaps
with the embedding of the number line. With this projection, we just defined a linear
transformation from 2D vectors to numbers, so we’re going to be able to find some kind
of 1 x 2 matrix that describes that transformation. To find that 1 x 2 matrix, let’s zoom in on
this diagonal number line setup and think about where i-hat and j-hat each
land, since those landing spots are going to be
the columns of the matrix. This part’s super cool, we can reason through
it with a really elegant piece of symmetry: since i-hat and u-hat are both unit vectors, projecting i-hat onto the line passing through
u-hat looks totally symmetric to projecting u-hat
onto the x-axis. So when we ask what number i-hat lands
on when it gets projected, the answer is going to be the same as whatever
u-hat lands on when it’s projected onto the x-axis. But projecting u-hat onto the x-axis just means taking the x-coordinate of u-hat. So, by symmetry, the number where i-hat lands
when it’s projected onto that diagonal number line is going to be the x-coordinate of u-hat. Isn’t that cool? The reasoning is almost identical for the
j-hat case. Think about it for a moment. For all the same reasons, the y-coordinate
of u-hat gives us the number where j-hat lands when
it’s projected onto the number line copy. Pause and ponder that for a moment; I just
think that’s really cool. So the entries of the 1 x 2 matrix describing
the projection transformation are going to be the coordinates of u-hat. And computing this projection transformation
for arbitrary vectors in space, which requires multiplying that matrix by
those vectors, is computationally identical to taking a dot
product with u-hat. This is why taking the dot product with a
unit vector, can be interpreted as projecting a vector
onto the span of that unit vector and taking the length. So what about non-unit vectors? For example, let’s say we take that unit vector u-hat, but we “scale” it up by a factor of 3. Numerically, each of its components gets multiplied
by 3. So looking at the matrix associated with that
vector, it takes i-hat and j-hat to 3 times the values
where they landed before. Since this is all linear, it implies more generally, that the new matrix can be interpreted as
projecting any vector onto the number line copy and multiplying where it lands by 3. This is why the dot product with a non-unit
vector can be interpreted as first projecting onto
that vector then scaling up the length of that projection
by the length of the vector. Take a moment to think about what happened
here. We had a linear transformation from 2D space
to the number line, which was not defined in terms of numerical
vectors or numerical dot products. It was just defined by projecting space onto
a diagonal copy of the number line. But because the transformation is linear, it was necessarily described by some 1 x 2
matrix, and since multiplying a 1 x 2 matrix by a
2D vector is the same as turning that matrix on its
side and taking a dot product, this transformation was, inescapably, related
to some 2D vector. The lesson here, is that anytime you have
one of these linear transformations whose output space is the number line, no matter how it was defined there’s going
to be some unique vector v corresponding to that transformation, in the sense that applying the transformation
is the same thing as taking a dot product with that vector. To me, this is utterly beautiful. It’s an example of something in math called
“duality”. “Duality” shows up in many different ways
and forms throughout math and it’s super tricky to actually define. Loosely speaking, it refers to situations
where you have a natural but surprising correspondence between two types of mathematical thing. For the linear algebra case that you just
learned about, you’d say that the “dual” of a vector
is the linear transformation that it encodes. And the dual of a linear transformation from
space to one dimension, is a certain vector in that space. So, to sum up, on the surface, the dot product
is a very useful geometric tool for understanding projections and for testing whether or not vectors tend
to point in the same direction. And that’s probably the most important thing
for you to remember about the dot product. But at a deeper level, dotting two vectors together is a way to translate one of them into the
world of transformations. Again, numerically, this might feel like a
silly point to emphasize; it’s just two computations that happen to
look similar. But the reason I find this so important, is that throughout math, when you’re dealing
with a vector, once you really get to know its personality, sometimes you realize that it’s easier to
understand it, not as an arrow in space, but as the physical embodiment of a linear
transformation. It’s as if the vector is really just a conceptual
shorthand for a certain transformation, since it’s easier for us to think about arrows
in space rather than moving all of that space to the
number line. In the next video, you’ll see another really
cool example of this “duality” in action as I talk about the cross product.
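
The numerical claims in this chapter can be checked with a short script. What follows is a minimal sketch in plain Python, not anything shown in the video: the helper names `dot`, `apply_row_matrix`, and `projected_length`, and the particular choices u-hat = (0.6, 0.8) and w = (4, 3), are assumptions made purely for illustration. The projection helper works from angles only, so it never touches the "multiply pairs and add" formula it is being compared against.

```python
import math

def dot(v, w):
    """Pair up coordinates, multiply each pair, and add the results."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(v, w))

def apply_row_matrix(row, v):
    """Apply a 1 x n matrix to v; `row` lists where each basis vector lands."""
    return sum(landing * coord for landing, coord in zip(row, v))

def projected_length(w, onto):
    """Signed length of the 2D vector w projected onto the line through `onto`.

    Uses only the length of w and the angle between the two vectors.
    """
    angle = math.atan2(w[1], w[0]) - math.atan2(onto[1], onto[0])
    return math.hypot(w[0], w[1]) * math.cos(angle)

# The two numerical examples from the transcript.
print(dot([1, 2], [3, 4]))              # 1*3 + 2*4 = 11
print(dot([6, 2, 8, 3], [1, 8, 5, 3]))  # 6 + 16 + 40 + 9 = 71

# The transformation taking i-hat to 1 and j-hat to -2, applied to [4, 3].
print(apply_row_matrix([1, -2], [4, 3]))  # 4*1 + 3*(-2) = -2

# Hypothetical unit vector u-hat and test vector w (illustrative values).
u_hat = [0.6, 0.8]
w = [4.0, 3.0]

# Projecting w onto the line through u-hat agrees with dotting against u-hat.
print(math.isclose(projected_length(w, u_hat), dot(u_hat, w)))

# Scale u-hat by 3: the dot product is the projection times the length of v.
v = [3 * c for c in u_hat]
length_v = math.hypot(v[0], v[1])
print(math.isclose(dot(v, w), projected_length(w, v) * length_v))

# Order does not matter: project v onto w instead and scale by the length of w.
length_w = math.hypot(w[0], w[1])
print(math.isclose(dot(v, w), projected_length(v, w) * length_w))
```

Note that `apply_row_matrix` and `dot` have identical bodies. That is the observation the chapter builds on: multiplying by a 1 x 2 matrix and dotting with the vector you get by tipping that matrix on its side are the same computation.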

  1. I've understood all the previous videos very well, and have reviewed this one for hours, but I still fail to see the "beauty" or rather the "significance" of the duality explanation. It just makes sense naturally… so I don't get what is so enlightening about the unit vector projection example.

  2. What was that piano music that just started running near the end? I know the music but not its name

  3. 7:02 i feel stupid for just realizing that theres always "three blue and one brown" π in those talking animations

  4. He lost me at the diagonally positioned number line part… how can a number line be diagonally positioned and why is that important, what's the point?

  5. To me, duality is not clear after this. The other video's were really clear, but this is the first one I've watched twice and I still don't understand it. Duality, and how projection is brought into this, are two things that are still vague.

  6. From what I understand dot products are a way to transform n dimension vector into a point on the number line. Question is what's it use ???

  7. I think it would be good to mention that the length of the projection for v will also be doubled because the projection triangles with original v and 2v side are similar to each other. It's not that hard to realize it, but I feel without it the explanation seems a bit confusing

  8. thank you very much for the good explination! but, could you maybe explain, what is actually the meaning of the dot product between two matrices?

  9. I am simply overwhelmed by the clarity that I get after watching your video. Inspiring. Keep making these videos. Please.

  10. Something is still not clear to me.

    10:34 I understood why we the associated transformation for unit vector u is [ux uy], why i-hat and j-hat land where they land. But why would i-hat and j-hat land on 3ux and 3uy when we scale up u? If we take their projections on the "number line" they would still land on just ux and uy, wouldn't they? Why can we use the associated transformation [3ux 3uy]?

  11. I really wish they'd emphasised these more in school in the 90's, given how important they are for game programming today.

  12. I join the group of commenters who found this to be the first confusing video. I wonder if it could be split into two videos, each of approx. 10 minutes (the typical length): (1) dot products, expanded, (2) relationship between dot products and linear transformations, expanded.

    … 10 minutes later… Lol I just realized that's exactly what you did with cross products.

  13. I love your channel and I consistently struggle to understand over and over until I get it. I think I heard once that a dot product is a measure of the parallelness of two vectors and the cross product was a measure of their perpendicularity. Is this accurate? Anyone? Thanks for the correction in advance.

  14. This is really how dot-products should be introduced.
    Here in germany, we don't even learn about matrices in our highschool-equivalent, just about vectors. Matrices then only come up in college…
    I honestly think that separating these subjects in such a way doesn't do any good.
    As can be seen here, an intuition about the dot-product can be easily obtained when one has understood the basic concepts of matrices as linear transformations.

  15. "Sometimes you realize that it's easier to understand it (a vector) not as an arrow in space, but as the physical embodiment of a linear transformation – it's as if a vector is a conceptual shorthand for a linear transformation." Definitely one of the most beautiful ideas I have ever learned, thank you for articulating it so well.

  16. I don't know…I feel this tutorial invested too much time into duality and matrices rather than the dot product itself and what overall uses it has. I learned how to calculate it for sure, but not so much what I can use that calculation for in the grand scheme of things.

  17. To those who are struggling with this topic like me, perhaps this might help – https://www.youtube.com/watch?v=KDHuWxy53uM&t=328s

  18. Is the dot product supposed to tell you how much "energy" two vectors contribute to each other when both are experienced together or is it something else?

    If you are on a boat going north with vector (0, 4) and the current is going northeast (2, 1), then you would end up going (2, 5) and the length of the resulting vector would be (2)^2+(5)^2= sqrt(29).

    What does the dot product tell you then since the new vector has a magnitude of sqrt(29) while the dot product is 4?

  19. Hi Grant. Like everyone else here I think your videos are truly amazing. They really get me thinking beyond the dry presentation of my college course. One thing that struck me was if you are thinking of the transposed vector in the dot product as just a 1 by 2 matrix that transforms another vector to the number line, can we think of the transposed vector in the dot product as really a 2 by 2 matrix with the transposed vector just the first row and zeroes in the second row. Indeed you can think of any vector (transposed or not) like this no? In the case of a column vector it would obviously be a column of zeroes. Always bugged me what exactly the relationship between matrix and vectors actually was.

  20. Also, just one thing I don't fully understand. Why can one just transpose a vector. The dot product would not be possible without the ability to transpose vectors (ie we couldn't then transform the second vector to the number line). Is a transposed vector a fundamentally different object to the vector itself

  21. I understand what the dot product has to do with projections and the angle between two vectors, but where you've lost me is your tangent about duality. I think bringing in the number line somehow made it harder to understand by distracting me from the fact that the dot product is really just the magnitude of two vectors and the cosine of their angle all multiplied together.

    In other words, it's a measure of, pretending some object is at the origin, how well those two vectors help each other pull that object in their respective directions. If they're pointing in roughly the same direction, they help each other. If they're perpendicular, they're not really helping each other but not hurting, either. If they're pointing in the opposite direction, they're fighting against each other, which is the negative of helping each other. You can then break this interpretation into x and y components to informally derive the familiar x_1 * x_2 + y_1 * y_2 form. You multiply their efforts in the x direction with each other to determine how well they work together in that direction, do the same thing for the y direction, and then add them together.

    I'm not a physicist, but imagining the physics of vectors really helps to understand dot products. Maybe the duality tangent and involving number lines helps some people, but I think this is a topic that benefits from multiple interpretations of what's going on.

  22. I think of the dual space as having negative dimension, and scalars as having 0 dimension, so mulplying say a row 3 vector with a column 3 vector gives a scalar, dim 3 and dim -3 gives 0

  23. But matrix-vector miltiplication and the dot product aren’t exactly the same. The output for matrix-vector multiplication is another vector and the output of the dot product is a number

  24. This magic only applies if you had already assumed there was only one number line – rather than the coordinate system being composed of an infinite number of number lines heading in all directions and functioning as a measuring system in waiting. This may be a conflation of the abstraction of a number-line-measuring-tool with an actual one dimensional number line. But then, you would have to believe that there is a such actual entity of a one dimensional number line: if there is please show me (sssh that will be hard as my brain is only able to secure proof of information in three dimensions – Kantian Categories of thought).

  25. I just asked myself, why don't we have a way to multiply 2 vectors v and w, with v = (a, b, c) and w = (d, e, f), with a, b, c, d, e, f being in the real numbers, so that v*w = (a*d, b*e, c*f)

  26. This is a unique, incredible, and amazing course that will without a doubt remain. This is how Linear Algebra should be taught and it astounds me it's not taught this way. I thank you again, for you're amazing and have turned Linear Algebra into one of my favourite subjects from my least favourite

  27. Took me a day to wrap my head completely around this and explain to myself why the dot product takes into account the length of the vector that it's projecting onto, but I'm happy now and finally sleep.

  28. As a mathematician myself who likes explaining stuff, I somewhat adore your work. However, I’d like to make the following remark. The example of duality could have been more conceptually formulated as follows: the dual of a linear transformation from a vector space to the number line is actually a vector in the dual vector space. It just so happens that if the vector space in question is endowed with a Euclidian metric then the dual vector space is canonically identified with the original vector space.

  29. Watched video 1st time = Got confused
    2nd time= understood a little
    3rd time = connected points from previous videos
    4th time = mind blown 😦😮😯

  30. 7:22 "unlearn what you have learned" makes me want to smash the like button more times than YouTube will allow, Master Yoda.

  31. First took Linear Algebra at a college (learned the mechanics but never grasped the concept; used David C. Lay's textbook which I would only recommend for learning the mechanics of the Linear Algebra), then reviewed it using Gilbert Strang's textbook (got a better grasp on the concept and a different view at the mechanics); finally got to your videos (the gist of it all is clearly presented). My learning experience would be perfect if it was reversed! Thank you.

  32. I feel like we're jumbling things up too much here. Suddenly [1, -2] is a transform matrix that can be applied to a two-dimensional vector, when the "y" coordinate of the original 2-dimensional i-hat and j-hat are completely absent? Are they just zeroes? If so, you should make that clear.

  33. when he explained why matrix vector product equals dot product I had tears in my eyes. It is just so beautiful why the 2 things are kind of related to each other

  34. It's the most complicated part in these linear algebra series, IMO. However dot product isn't too complicated thing to imaging it in mind

  35. The much simpler real life application i think for finding a "dot product", imagine you and your friend are pulling a big stone, tied by 2 ropes. Your friend is pulling the rope making an angle Theta, like V, with your rope. Now the dot product gives the actual amount of force your friend is adding to yours, given there is Theta angle between you and your friend's direction of rope pulling. But when your friend stand in same line as you, pulls exactly in the same direction where you want to move the stone towards, then thats the most efficient way of pulling the stone, all his force magnitude will add up to yours i.e Cos(0)=1. Otherwise the lesser the angle between the forces, the higher the magnitude can be.
    But if your friends force direction is making some angle with yours, then though he's putting some X amount of force, only a fraction of X (projection of your friend's on yours) will add up to yours. If your friend makes an angle 90 degree with yours, then he's not adding up anything to your effort, cos(90)=0. But if your friend is pulling the stone in opposite direction, then the stone moves in the direction where the force is higher, but the amount of displacement would be much lesser, yourForce-friendForce, cos(180)=-1. Means your friend is working against you.
    So basically the Dot Product tells us, how much a vector is working FOR/AGAINST the other.

  36. Hi, im spanish speaker and i think your videos are very very interesting and didactics,so i would like very much see them in spanish, in apart inspanish there are nobody make videos like this (math). is one opportunity to expand the maths , im sure the comunity of spanishspeakers will be thanks for you. please translate in a future!!! 🙁 thanks you so much for your attention 🙂

  37. I find the statement "Applying a transformation is the same as taking the dot product with that vector" confusing since the output of a transformation is a vector and output of a dot product is a scalar. Appreciate any clarifications!

  38. Very nicely explained :). Far better than what most professors/lecturers are teaching at leading universities based on my experience.

  39. I did not understand a few things from this video
    If we consider a vector as a transformation matrix giving a single output, so what was the need of using a diagonal vector uhat, could we not represent a vector onto the axis of ihat. Also does projection of vector v1 onto v2 means the position of vector v2 in the span of v1

  40. This whole presentation is just…beautiful. Thank you so much for your consistently thoughtful and original work.

    One concept I’m still struggling to absorb here is that the transformation of i-hat and j-hat onto the superimposed number line should necessarily be a “projection” of i-hat and j-hat onto u-hat and the number line. Somewhere you seem to have smuggled in the notion that transforming and projecting are one and the same, but I can imagine countless transformations onto a number line that aren’t also projections. My gut tells me the fact that i-hat, j-hat, and u-hat are all unit vectors plays an important role in this case, but I can’t seem to reason my way through the “how” and “why” of it. If anyone has light to shed, I’d be most grateful!

  41. Hello, i love these series. I'm in first year of college and, if it wasnt for this, i dont know what i had done with the lineal algebra subject.

    Well, I wanted to ask you a question. What happens when you simplify a matrix? I'm refering to when it comes to multiplications, additions o sustractions or even changing the order of the rows in, for example, the Gauss method. The thing is that i don't understand how can you get a solution when you "destroy" the matrix, and the vectors aren't the same nor the coordinates. I don't know if I'm explaining myself. Thank you very much.

  42. The number line imposed in the 2d space seems random to me. The tip of u^ must land inside the square formed by i^ and j^, right? And u^ is just 1 on the number line. What if you stretched the number line out?

  43. Fantastic. Another simple geometric explanation for the fact, that "order does not matter", is that the projections create "two similar right triangles." Thanks-a-million for the excellent series.

  44. Normally I watch videos on 1.25. this is the only video that I have to put on.75 and write down the caption and spend 2 hrs reviewing it to fully understand the concept. the caption is such a well-crafted sentences that need to be fully and slowly comprehend. thank you.

  45. This is the most enlightening video I’ve ever watched on YT and probably in my entire life… I still can’t stop smiling :’) thanks a ton, Grant! 🤲🏼

  46. I am confused. Is a 2D vector represented by a column vector or a row vector? Does it make a difference? If or if not, how so?
