Customizing Vectorization

00:00 In the previous lesson, I showed you how to do scalar and vector operations with NumPy arrays. In this lesson, I’ll show you how to write functions that can be used as custom operators.

00:11 Simple operations are well and good, but sometimes you want to do something more complicated. NumPy has a mechanism to turn a function into one that is compatible with vector operations on NumPy arrays.

00:22 The vectorize() function takes a regular function and gives you something you can use to operate on your array. Once more into the REPL, this time to write a custom function.

00:34 I’m in the same REPL from the last session and still have the portfolio data loaded. Let’s say you are a day trader who gets a bonus if your stock picks show a decent profit.

00:44 If you’re up by more than 1%, you get 10% of the profit to keep for yourself. But if you’re down or up by less than 1%, no bonus. Forget vectorization for the moment.

00:55 Let’s write a simple function that calculates the resulting value.

01:03 My function takes the start and end values.

01:08 If I’m up by more than 1%,

01:14 I get a 10% bonus.

01:20 Otherwise, I just get the difference. Now, no brokerage firm is ever going to give this to you. It would be crazy profitable and you’re not being penalized for loss.
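A minimal sketch of the function described above, reconstructed from the narration rather than copied from the screen. The name profit_with_bonus, the parameter names, and the reading of "a 10% bonus" as multiplying the difference by 1.1 are all assumptions.

```python
def profit_with_bonus(start, end):
    """Return the profit between two values, with a bonus on gains above 1%."""
    if end > start * 1.01:
        # Up by more than 1%: the difference plus a 10% bonus (assumed reading).
        return (end - start) * 1.1
    # Down, or up by 1% or less: just the difference.
    return end - start


print(profit_with_bonus(100.0, 103.0))  # bonus case
print(profit_with_bonus(100.0, 100.5))  # no bonus, plain difference
```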

01:30 But the conditional makes it more complicated than any of the vector operations you’ve seen so far. Let me just quickly test this out and there you go. Bit of a bonus. I can’t just pass in my Monday and Friday columns to this function because the operations have to be applied item by item.
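To see what goes wrong when whole columns are passed to the plain function, here is a small sketch. The monday and friday arrays are made-up stand-ins for the portfolio columns, and profit_with_bonus is the assumed function name from the narration.

```python
import numpy as np


def profit_with_bonus(start, end):
    if end > start * 1.01:
        return (end - start) * 1.1
    return end - start


# Hypothetical stand-ins for the Monday and Friday columns.
monday = np.array([100.0, 200.0, 50.0])
friday = np.array([103.0, 201.0, 45.0])

error_message = None
try:
    # The comparison yields an array of booleans, which `if` cannot
    # reduce to a single True or False.
    profit_with_bonus(monday, friday)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The if statement needs one truth value, but comparing two arrays produces one boolean per pair of items, so NumPy raises a ValueError instead of guessing.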

01:49 The conditional comparing the two columns doesn’t make sense as the conditional is meant to apply to each pair of items, not to the full vectors. Instead, I use NumPy’s vectorize() function to turn our custom function into a vector-compatible operation.

02:14 Note that what gets passed to vectorize() is a reference to the function. You’re not calling the function, you’re passing the function as an object.

02:22 Likewise, when vectorize() returns, it’s another function object. Let me demonstrate. Evaluating the function in the REPL without using the parentheses means it isn’t being called and the REPL tells you its object ID.

02:43 Likewise for the vectorized function. Now that I have it, I can call the vectorized version, passing in the columns, and NumPy magic has translated it into a vector operation.
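The steps just described might look roughly like this. The arrays are made-up stand-ins for the portfolio columns, and the names profit_with_bonus and vectorized_profit_with_bonus are assumptions, not necessarily what appears on screen.

```python
import numpy as np


def profit_with_bonus(start, end):
    if end > start * 1.01:
        return (end - start) * 1.1
    return end - start


# Pass the function object itself: no parentheses after the name.
vectorized_profit_with_bonus = np.vectorize(profit_with_bonus)

# Without parentheses, neither name is called; each is just an object.
print(profit_with_bonus)
print(vectorized_profit_with_bonus)

# Hypothetical stand-ins for the Monday and Friday columns.
monday = np.array([100.0, 200.0, 50.0])
friday = np.array([103.0, 201.0, 45.0])

# The wrapped function is applied to each pair of items in turn.
bonuses = vectorized_profit_with_bonus(monday, friday)
print(bonuses)
```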

03:02 The result is a new array. You might remember some of these values from the previous lesson, because anything less than 1% profit is just the difference in the two values, which I did as a vector operation before.

03:14 I’ve got one more thing to show you with this data. If you’re coding along, don’t close your REPL yet.

03:21 The advantage of writing a function and then calling vectorize() is you get to keep the original function. But often when you’re meaning to write a vector operation, you don’t care about the original function.

03:32 In this case, you can use vectorize() as a decorator, wrap your function and then you don’t have to make a separate call. Code on the screen here is identical to the function I just showed you, except that I renamed it.
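A sketch of the decorator form, assuming the same bonus logic as before. The function name bonus and the sample arrays are hypothetical.

```python
import numpy as np


@np.vectorize
def bonus(start, end):
    # Same logic as the plain function; only the name differs.
    if end > start * 1.01:
        return (end - start) * 1.1
    return end - start


# Hypothetical stand-ins for the Monday and Friday columns.
monday = np.array([100.0, 200.0, 50.0])
friday = np.array([103.0, 201.0, 45.0])

# The decorated name already refers to the vectorized version.
result = bonus(monday, friday)
print(result)
```

The trade-off is that the name bonus now refers only to the vectorized object, so the plain function is no longer directly available under that name.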

03:43 Wrapping it with the vectorize() decorator turns it into a vector operation, allowing you to skip the other step. Using a vectorized function means that Python code is getting invoked.

03:55 That means the operation is coming up out of NumPy-land and into Python-land. This is almost always slower. If you can, you want to stay down in NumPy-land.
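One way to see the trip into Python-land is to count how often the wrapped function actually runs. This is an illustrative sketch only; the counter, the function name counted_profit, and the generated data are all assumptions.

```python
import numpy as np

# A mutable counter so the function can record each invocation.
counter = {"calls": 0}


def counted_profit(start, end):
    counter["calls"] += 1
    if end > start * 1.01:
        return (end - start) * 1.1
    return end - start


vectorized = np.vectorize(counted_profit)

# Made-up data: 1,000 start values, each ending 2% higher.
monday = np.linspace(100.0, 200.0, 1000)
friday = monday * 1.02

result = vectorized(monday, friday)

# The Python function runs at least once per element.
print(counter["calls"], "Python-level calls for", monday.size, "elements")
```

Every one of those calls goes through the Python interpreter, which is where the slowdown comes from compared with an operation that stays inside NumPy.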

04:06 Depending on how complicated the operation you want to perform is, you may or may not be able to do this. The bonus example I just showed you is just an if-else clause though, and there’s a better way to do it that allows you to stay in NumPy-land.

04:20 The where() function allows you to operate on an array conditionally, performing one operation if the condition passes and a different one if it doesn’t, and that’s all without leaving NumPy-land.

04:33 One last time down into the REPL. This time I’ll show you a better way to calculate that same profit with bonus situation. Back in the same REPL with the same vectorized function and results on the screen.

04:47 The where() function takes three arguments.

04:56 The first argument is the condition, which in this case is whether the values in Friday are more than 1% bigger than Monday. Note that you're referencing the whole columns here, and where() applies the condition to each pair of items individually.

05:15 The second argument is what to do if the condition evaluates as true, which is our bonus state.

05:26 And the third argument is what to do if the condition is false. The result is the same. Depending on what you're doing, a custom vectorized function might be your only choice, but if you can use where() instead, it'll be far more performant.
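The three arguments described above could be sketched like this. The arrays are made-up stand-ins for the portfolio columns, and the 1.1 multiplier is the same assumed reading of the bonus as before.

```python
import numpy as np

# Hypothetical stand-ins for the Monday and Friday columns.
monday = np.array([100.0, 200.0, 50.0])
friday = np.array([103.0, 201.0, 45.0])

difference = friday - monday

bonuses = np.where(
    friday > monday * 1.01,  # condition, checked element by element
    difference * 1.1,        # value used where the condition is True
    difference,              # value used where the condition is False
)
print(bonuses)


# For comparison, the same logic through np.vectorize.
def profit_with_bonus(start, end):
    if end > start * 1.01:
        return (end - start) * 1.1
    return end - start


same = np.allclose(bonuses, np.vectorize(profit_with_bonus)(monday, friday))
print(same)
```

Note that where() computes both candidate arrays in full and then picks from them element by element, so every step is an ordinary array operation.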

05:42 And that’s the last of our four practical NumPy examples. The last lesson summarizes the course and points you at other courses and tutorials that might be of interest to you.
