Starting With Weights and Vectors
00:00 In this lesson, you’ll begin to implement a neural network. To get started, you’ll need to represent the inputs. Start off by implementing some operations on vectors, first with Python lists and later using NumPy arrays.
00:15 A vector has a direction and magnitude. You can represent a vector as an arrow in a graph. The magnitude of the vector is the length of the arrow. Here, you can see three vectors: two for the weights of a neural network and another for the input.
00:32 Visualizing the vectors makes it obvious that weights_2 is more similar to the input than weights_1. But how can you determine this in Python?
00:43 This is the purpose of the dot product. As the vectors are two-dimensional, you can compute the dot product of the input and weights_1 by first multiplying the first element of the input by the first element of weights_1.
00:57 Then, multiply the second element of the input by the second element of weights_1. And finally, sum the products. Implementing this in Python is straightforward.
01:09 Using Python lists for the vectors and manipulating the indices explicitly, you can calculate the dot product to be 2.1672.
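Here is a minimal sketch of that calculation with plain Python lists. The values of input_vector and weights_1 are the ones quoted in the comment that follows this transcript. The names first_product, second_product, and dot_product_1 are only illustrative.

```python
# Values as quoted in the comment that follows this transcript.
input_vector = [1.72, 1.23]
weights_1 = [1.26, 0]

# Multiply the matching elements, then sum the two products.
first_product = input_vector[0] * weights_1[0]
second_product = input_vector[1] * weights_1[1]
dot_product_1 = first_product + second_product

print(f"The dot product is: {dot_product_1:.4f}")
```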
01:19 However, not all vectors are two-dimensional. As you work with higher-dimensional structures, the code to compute the dot products gets more complex and slower as well.
01:29 But NumPy ndarrays support dot products out of the box. Simply call the np.dot() function and give it the input and weight vectors. As you can see, the result is the same.
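As a rough sketch, the NumPy version might look like the following. The zip()-based line is an illustrative way to generalize the list approach to any number of dimensions. It isn't shown in the lesson and is included only for comparison.

```python
import numpy as np

# Values as quoted in the comment that follows this transcript.
input_vector = np.array([1.72, 1.23])
weights_1 = np.array([1.26, 0])

# np.dot() handles vectors of any length in a single call.
dot_product_1 = np.dot(input_vector, weights_1)

# The same computation with plain lists, generalized with zip().
manual = sum(x * w for x, w in zip([1.72, 1.23], [1.26, 0]))

print(f"np.dot: {dot_product_1:.4f}, manual: {manual:.4f}")
```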
01:45 Compute the dot product of the input and weights_2, and the result is 4.1259. So, how can you use this to determine the similarity?
01:56 If the dot product is zero, the vectors are not similar at all. However, larger values mean that they are more similar. Since the dot product of the input and weights_2 is greater than the dot product of the input and weights_1, weights_2 is more similar.
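The comparison can be sketched as follows. Note that the values of weights_2 are not given in this transcript. The ones below are an assumption, chosen because they give a result consistent with the quoted 4.1259.

```python
import numpy as np

input_vector = np.array([1.72, 1.23])
weights_1 = np.array([1.26, 0])
# Assumed values: not listed in the transcript, chosen to match the quoted result.
weights_2 = np.array([2.17, 0.32])

dot_product_1 = np.dot(input_vector, weights_1)
dot_product_2 = np.dot(input_vector, weights_2)

# A larger dot product is read here as "more similar to the input".
more_similar = "weights_2" if dot_product_2 > dot_product_1 else "weights_1"
print(f"{dot_product_1:.4f} vs {dot_product_2:.4f}: {more_similar} is more similar")
```

Keep in mind that the dot product also grows with the lengths of the vectors, so it is only a rough measure of similarity when the vectors differ a lot in magnitude.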
02:15 The model you will train in this course will only have two outcomes: 0 or 1. This makes it a classification problem, as there is a fixed set of possible outcomes.
02:26 Suppose that you have this dataset. The input vectors are what the neural network will use to make a prediction, and the target is what you want to predict.
02:35 This, of course, is much simpler than a real-life network, which would work with text or image data. So far, you’ve seen the dot product and the sum, and both of these are linear operations.
02:47 And if all the operations in the network are linear, adding more layers will yield nothing more than linear results. Therefore, you need to introduce nonlinear layers with activation functions.
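To see why stacking linear layers adds nothing, here is a small sketch. The two layers, their shapes, and the random weights are hypothetical and exist only to illustrate the point: two linear layers without an activation function collapse into a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Two hypothetical linear layers with no activation function between them.
w1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
w2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)

x = np.array([1.72, 1.23])
two_layers = w2 @ (w1 @ x + b1) + b2

# The same mapping collapsed into one linear layer.
w_combined = w2 @ w1
b_combined = w2 @ b1 + b2
one_layer = w_combined @ x + b_combined

print(np.allclose(two_layers, one_layer))
```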
02:59 For example, the rectified linear unit, or ReLU function, is nonlinear in that it sets any negative value to zero. Essentially, it turns off all negative inputs.
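A minimal ReLU sketch, assuming NumPy arrays as input. The function name relu() and the sample values are illustrative and don't come from the lesson.

```python
import numpy as np

def relu(x):
    """Return x where x is positive and 0 elsewhere."""
    return np.maximum(0, x)

values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(values))
```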
03:11 The network you build will use the sigmoid activation function. This squashes the output between 0 and 1, and this is appropriate as the outputs in the dataset are 0 and 1.
03:23 Thus, you’ll treat any value less than or equal to 0.5 as 0 and all others as 1. This is the formula for the sigmoid function.
03:33 You don’t need to worry about proving it, but you will implement it here in code. That will be straightforward. The symbol e is a mathematical constant known as Euler’s number, and the function np.exp() in NumPy will handle that.
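The formula on the slide isn't reproduced in this transcript. The sketch below assumes the standard logistic sigmoid, 1 / (1 + e^(-x)), and prints a few sample values to show how the output stays between 0 and 1.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1 / (1 + np.exp(-x))

# Large negative inputs approach 0, large positive inputs approach 1.
for x in (-5, 0, 5):
    print(x, round(float(sigmoid(x)), 4))
```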
03:50 This diagram shows the different parts that you’ll implement. Hexagons are functions, and the purple rectangles are outputs. You’ll simply compute the dot product of the input and weights, add the bias, and then apply the sigmoid function to get a number between 0 and 1.
04:08 And here’s the code. First, express the input, weights, and bias as NumPy ndarrays. Then define a function for the sigmoid activation. Again, use np.exp() for Euler’s number.
04:22 The function make_prediction() will implement the layers. The first layer takes the dot product of the input and the weights, then adds the bias. The second layer uses the sigmoid() function, and the return value is the prediction. If you run the code, the prediction is 0.798.
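The code from the video isn't included in this transcript, so the following is a sketch of what it might look like. The input, weight, and bias values are assumptions, picked because they produce a prediction consistent with the quoted 0.798.

```python
import numpy as np

# Assumed values: the transcript doesn't list them.
input_vector = np.array([1.66, 1.56])
weights_1 = np.array([1.45, -0.66])
bias = np.array([0.0])

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def make_prediction(input_vector, weights, bias):
    # First layer: dot product of input and weights, plus the bias.
    layer_1 = np.dot(input_vector, weights) + bias
    # Second layer: sigmoid activation squashes the result between 0 and 1.
    layer_2 = sigmoid(layer_1)
    return layer_2

prediction = make_prediction(input_vector, weights_1, bias)
print(f"The prediction result is: {prediction}")
```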
04:40 This is higher than 0.5, so assume it to be 1.
04:45 Looking back at the training data, the expected outcome is 1, so this prediction was correct. Try it again with the second example. It yields a prediction of 0.871, which you will again assume to be 1.
05:02 But the expected outcome is 0. This prediction is incorrect. You’ll need to adjust the weights to do better next time, but how much should each weight be modified? Watch the next lesson to find out.
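The second example can be sketched in the same way, this time applying the 0.5 threshold and comparing the result with the target. As before, the input and weight values are assumptions, chosen to be consistent with the quoted 0.871. The names predicted_class and target are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def make_prediction(input_vector, weights, bias):
    return sigmoid(np.dot(input_vector, weights) + bias)

# Assumed values: the transcript doesn't list them.
weights_1 = np.array([1.45, -0.66])
bias = np.array([0.0])
input_vector = np.array([2.0, 1.5])
target = 0  # Expected outcome for the second example, per the lesson.

prediction = make_prediction(input_vector, weights_1, bias)

# Values above 0.5 count as 1, everything else as 0.
predicted_class = 1 if prediction[0] > 0.5 else 0
print(f"Prediction: {prediction[0]:.3f}, class: {predicted_class}, "
      f"correct: {predicted_class == target}")
```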
Rick Hubbard on May 18, 2023
Do not understand the dot product coordinate system used in this lesson.
Is it |magnitude_a| * |magnitude_b| * cos(theta)? Or, is it ((vector_1_x * vector_2_x) + (vector_1_y) * (vector_2_y))?
Further, the coordinates in the Python code (~00:40; for example: input_vector = [1.72, 1.23]) do not seem to reflect the vectors as shown on the chart (00:00 - ~00:39; for example: the chart’s input vector Cartesian coordinates seem to be ~(2.3, 2.1) and ~(4.1, 4.2)). Is the Python code intended to reflect the chart’s coordinates? If so, are there typos on the chart?
Also, given the apparent difference between the chart and the coordinates, it is unclear (at least to me) which coordinate value is [presumably] the x-value and which is the y-value. Although I believe I understand weights_1 = [1.26, 0], it sure seems like the x-Cartesian coord and y-Cartesian coord are reversed. Is this the case? Or, am I completely missing the point?
Any help gaining an understanding regarding this question is invited and warmly welcomed!