Linear Algebra for Deep Learning
- 1 Linear Algebra
- 2 Vector
- 3 Matrix
- 4 Linear Algebra in Deep Learning
- 5 References
Linear Algebra is a branch of mathematics that seeks to describe lines and planes using structures like vectors and matrices.
Vectors in geometry are 1-dimensional arrays of numbers or functions used to operate on points on a line or plane. Vectors store the magnitude and direction of a potential change to a point. A 2-dimensional array of numbers, arranged in rows and columns, is called a matrix.
There are a variety of ways to represent vectors:
Or more simply:
How Vectors Work
Vectors typically represent movement from a point. They store both the magnitude and direction of potential changes to a point. The vector (-2, 5), for example, says move left 2 units and up 5 units.
A vector can be applied to any point on a plane:
The vector’s direction equals the slope created by moving up 5 and left 2, which is 5 / -2 = -2.5. Its magnitude equals the length of the hypotenuse (the long side of a right-angled triangle), here sqrt(2^2 + 5^2) = sqrt(29), or about 5.39.
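The same quantities can be checked numerically. The following is a minimal NumPy sketch, assuming the vector (-2, 5) described above; the starting point (3, 1) is an arbitrary choice made only for illustration.

```python
import numpy as np

# The vector from the text: move left 2 units and up 5 units.
v = np.array([-2.0, 5.0])

# Applying the vector to a point is element-wise addition.
# The starting point (3, 1) is an assumption chosen for illustration.
point = np.array([3.0, 1.0])
moved = point + v              # lands at (1, 6)

# Magnitude: length of the hypotenuse, sqrt((-2)^2 + 5^2) = sqrt(29).
magnitude = np.linalg.norm(v)

# Direction expressed as a slope: rise over run.
slope = v[1] / v[0]            # 5 / -2 = -2.5
```

Adding the same vector to a different starting point moves that point by the same amount, which is why a vector can be applied anywhere on the plane.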
Why are vectors useful?
Vectors can be used to manipulate gradients. Vector operations on gradients help us find the slope of a function at a point in any direction.
Vector addition is fairly straightforward:
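Addition combines two vectors of equal length component by component. A small NumPy sketch, with example values chosen only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise sum: [1 + 4, 2 + 5, 3 + 6]
total = a + b
print(total)  # [5 7 9]
```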
The most common form of vector multiplication is the dot product. The vectors must be of equal length, and the output is a scalar: the sum of the products of corresponding entries.
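As a rough illustration, the dot product can be computed with NumPy's `np.dot` or written out by hand. The example values are assumptions chosen for readability.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Multiply matching components, then sum: 1*4 + 2*5 + 3*6 = 32.
dot = np.dot(a, b)

# The same calculation spelled out without NumPy's helper.
manual = sum(x * y for x, y in zip(a, b))
```

Both `dot` and `manual` hold the same single number, not a vector.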
A unit vector is a vector with a magnitude of 1. Unit vectors can have any slope (move in any direction), but the magnitude (length of vector) must equal 1. They are useful when you care only about the direction of the change, and not the magnitude. Unit vectors are used by directional derivatives.
Given the vector (3, 4), the magnitude is 5 (hypotenuse). This is not a unit vector. To find the unit vector of this vector, we divide each value by the magnitude of the vector. The new vector (3/5, 4/5) points in the same direction and has a magnitude of 1.
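The normalization described above can be sketched in NumPy as follows, using the same vector (3, 4):

```python
import numpy as np

v = np.array([3.0, 4.0])

# Magnitude: sqrt(3^2 + 4^2) = 5.
magnitude = np.linalg.norm(v)

# Divide each component by the magnitude to get the unit vector (3/5, 4/5).
unit = v / magnitude

# The unit vector keeps the direction but has length 1.
unit_length = np.linalg.norm(unit)
```

This sketch assumes the vector is non-zero; a zero vector has no direction, so dividing by its magnitude is undefined.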
A vector field is a diagram that shows, for a given point in space (a, b), where that point would move if you applied a vector function to it. Given a point, the vector field shows the “power” and “direction” of our vector function. Here is an example vector field:
Wait, why does the vector field point in different directions? Shouldn’t it look like this:
The difference is that the second vector contains only scalar numbers, so from any point we always move over 2 and up 5. The first vector, on the other hand, contains functions, so for each point we derive the direction by inputting the coordinates into those functions. For non-linear functions, things can become very fancy indeed.
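The contrast between the two kinds of field can be sketched in code. The constant field below uses the "over 2 and up 5" vector from the text; `function_field` is a hypothetical example invented for illustration, not the field from the original diagram.

```python
import numpy as np

def constant_field(x, y):
    # Scalars only: the same vector at every point (over 2, up 5).
    return np.array([2.0, 5.0])

def function_field(x, y):
    # Hypothetical field (an assumption, not from the text): each
    # component is a function of the coordinates, so the vector
    # changes from point to point.
    return np.array([2.0 * x, y ** 2])

# Sample points chosen arbitrarily for illustration.
points = [(0.0, 1.0), (1.0, 2.0), (-3.0, 0.5)]
constant_vectors = [constant_field(x, y) for x, y in points]
varying_vectors = [function_field(x, y) for x, y in points]
```

Every entry of `constant_vectors` is identical, while the entries of `varying_vectors` differ by point, which is what makes the arrows in such a field point in different directions.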
A matrix is a rectangular grid of numbers, like an Excel spreadsheet. We describe the dimensions of a matrix as rows x columns. There are a variety of matrix operations, but we will focus on multiplication as it is the most relevant to deep learning.
Matrix multiplication specifies a set of rules for multiplying matrices to produce a new matrix.
Why is it Useful?
It turns complicated problems into simpler ones that can be calculated more efficiently. It’s used in a number of fields including machine learning, computer graphics, and population ecology.
Not every pair of matrices can be multiplied. In addition, the dimensions of the product are determined by the dimensions of the two matrices being multiplied.
- The number of columns in the first matrix must equal the number of rows in the second
- The product of an M x N matrix and an N x K matrix is an M x K matrix. The new matrix takes its number of rows from the first matrix and its number of columns from the second.
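Both rules can be checked with a quick NumPy sketch. The shapes 2 x 3 and 3 x 4 are arbitrary choices for illustration.

```python
import numpy as np

A = np.ones((2, 3))        # M x N = 2 x 3
B = np.ones((3, 4))        # N x K = 3 x 4

# Allowed: A has 3 columns and B has 3 rows.
C = A @ B
product_shape = C.shape    # (2, 4), i.e. M x K

# Reversing the order breaks the rule: B has 4 columns, A has 2 rows.
try:
    B @ A
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

The failed `B @ A` also shows that matrix multiplication is order-sensitive: swapping the operands can change the result or make the product undefined.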
Matrix multiplication uses Dot Product to multiply various combinations of rows and columns to derive its product. In the image below, each entry in Matrix C is the dot product of a row in matrix A and a column in matrix B.
In the image above, each symbol in the product stands for a whole vector: a row of matrix A or a column of matrix B. The first entry of C, for example, is the dot product of the first row in matrix A and the first column in matrix B.
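To make the row-by-column rule concrete, here is a minimal sketch that builds the product one entry at a time. The helper name `matmul_by_dot_products` and the 2 x 2 example matrices are assumptions for illustration; in practice NumPy's `@` operator does the same job.

```python
import numpy as np

def matmul_by_dot_products(A, B):
    """Build C entry by entry: C[i, j] = (row i of A) . (column j of B)."""
    rows, inner = A.shape
    inner_b, cols = B.shape
    if inner != inner_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            C[i, j] = np.dot(A[i, :], B[:, j])
    return C

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = matmul_by_dot_products(A, B)

# First entry: row (1, 2) . column (5, 7) = 1*5 + 2*7 = 19.
```

The explicit loops are only for clarity; the result matches `A @ B`.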
Q1: What are the dimensions of the matrix product?
Q2: What are the dimensions of the matrix product?
Q3: What is the matrix product?
Why Does It Work This Way?
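Matrix multiplication is a human-made definition rather than a law of nature, but it is not arbitrary. It is defined so that multiplying two matrices corresponds to composing the linear transformations they represent, which is a large part of why this approach turned out to be so useful in real life.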
Linear Algebra in Deep Learning
Explanation of how linear algebra is used in Deep Learning.
- matrix product
- matrix inverse
- orthogonal matrices