
Manipulating Strings With map()

Here are resources and additional documentation about list comprehensions:

The quote used in the example is available here:

"""On a visit to the NASA space center, President Kennedy spoke to a man sweeping \
up in one of the buildings.  "What's your job here?" asked Kennedy.  "Well Mr. President," \
the janitor replied, "I'm helping to put a man on the moon"."""

00:00 A common task where you might find it useful to use the map() function is when you need to transform an iterable of string objects. Maybe you’re working with a file and you’re interested in taking the words, or the strings, in the file and doing some sort of processing.

00:16 Maybe you want to simply remove punctuation or you want to convert, say, certain words to uppercase or lowercase, and a place where you can find a lot of methods that are useful is in the str (string) class. The str class provides a whole bunch of useful methods for string manipulation. Here are some examples.

00:37 There’s the .capitalize() method, which converts the first character of a string to uppercase and the rest to lowercase. There’s the .lower() method, which converts all characters to lowercase. And then the .upper() method, which does the opposite: it converts all characters to uppercase.

00:56 There’s the title method, which takes the first character of each word and converts it to uppercase, and this is usually the case in, say, titles or headings.

01:07 There’s the .swapcase() method, which converts every uppercase character to lowercase and vice versa.
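Here is a small sketch of the case-related methods just described. The sample string text is made up for illustration and does not come from the video. The last line hints at how one of these methods could be handed to map().

```python
# Hypothetical sample string, chosen to mix lowercase and uppercase words.
text = "hello WORLD from python"

capitalized = text.capitalize()  # first character upper, the rest lower
lowered = text.lower()           # every character lowercase
uppered = text.upper()           # every character uppercase
titled = text.title()            # first character of each word uppercase
swapped = text.swapcase()        # upper becomes lower and vice versa

for result in (capitalized, lowered, uppered, titled, swapped):
    print(result)

# Any of these methods can be passed to map() through the str class.
shouted = list(map(str.upper, ["nasa", "moon"]))
print(shouted)  # ['NASA', 'MOON']
```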

01:13 And then probably the most useful method, or the one that’s used most often, is the strip method. And the strip method, by default, takes no input arguments, and if that happens, then what the strip method does is it removes leading and trailing whitespace. However, if you’re interested in removing—I don’t know—maybe punctuation or some other characters that could appear at the beginning or end of your string, then you pass those in as input arguments, and we’ll see some examples of that.
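As a rough illustration of the two ways to call .strip(), here is a sketch with made-up sample strings. Note that the characters are only removed from the ends of the string, never from the middle.

```python
# Hypothetical sample strings for illustration.
padded = "   hello  "
decorated = '"Well,'

# With no arguments, leading and trailing whitespace is removed.
stripped_ws = padded.strip()
print(repr(stripped_ws))  # 'hello'

# With a string argument, any of its characters are removed from both ends.
stripped_punct = decorated.strip('",')
print(repr(stripped_punct))  # 'Well'

# Characters in the middle of the string are left alone.
inner = "What's".strip("'")
print(inner)  # What's
```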

01:43 It’s important to note that because strings are immutable objects—in other words, you can’t change them—all of the methods return a transformed version of the input string and the original input string remains unchanged.
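A quick way to see this immutability in action is sketched below, using a made-up string. The method returns a new string and the original keeps its whitespace.

```python
# Hypothetical sample string with surrounding whitespace.
original = "  moon  "

cleaned = original.strip()

print(repr(cleaned))   # 'moon'
print(repr(original))  # '  moon  '

# The result is a different object; the original was not modified in place.
print(cleaned is original)  # False
```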

01:56 Let’s do a few examples for you to see some of these in action.

02:01 Let’s do a quick example that uses the .strip() method from the str class to eliminate any whitespace that may occur at the beginning or end of some words.

02:12 So, we’ll have a list of words and we’ll call this raw_words. And I have this, sort of, in the history here of my REPL. And basically, you can just either write down these words… But the thing that you want to do is you want to add some whitespace either after the word or at the beginning of the word. Or, you know what? Maybe in both cases where you have whitespace at the beginning or the end of some of the words or even all of the words.

02:40 And maybe you have one that has no whitespace. The idea is that maybe you generated this list of words by processing a file and extracting the words, and possibly some of the words have some whitespace. All right, so that’s raw_words.

02:57 And so what we want to do is clean these words up. We want to map the .strip() method from the str class and the iterable is this list of words.

03:10 And the default call to .strip() takes no arguments, and in that case the characters that are stripped away are whitespace, so str.strip is all we need to pass as the function in the map() call.

03:22 And let’s call this, say, clean_words, and that will store the iterator returned by map().

03:30 Now, maybe what you’ll want to do with this iterator is just simply loop over it and maybe use those words to create some other string that you will later write to a file, but so that we can visually see how these words have been cleaned up, let’s just call the list() function so that we get a list that we can see visually. All right, so notice that in all cases, all of the whitespace that occurs either at the beginning of the word or at the end of the word is stripped away.
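The steps above might look roughly like this. The transcript does not show the actual words typed in the REPL, so the contents of raw_words here are an assumption, and clean_list is a helper name added for this sketch.

```python
# Assumed sample data: some words padded with whitespace, one without.
raw_words = ["  apple", "banana  ", "  cherry  ", "date"]

# map() applies str.strip to each word and returns an iterator.
clean_words = map(str.strip, raw_words)

# list() consumes the iterator so the results can be displayed.
clean_list = list(clean_words)
print(clean_list)  # ['apple', 'banana', 'cherry', 'date']
```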

04:00 Now, this might be a good time to compare the map() function with using, say, a list comprehension. Now, here’s a way to think about what’s happening.

04:10 We are mapping the .strip() method onto every word in the raw_words list. All right? So a way to think about it is we’re mapping the function.

04:22 Let’s do this, say, using a list comprehension. So we’ll clear that up. We still have the raw_words list. All right, that’s not going to change, we want to compare this with the same list.

04:33 And if we were to use a list comprehension, probably the way that you would do it is you would say word, and then we would call the .strip() method on the word, so here we’re making a method call on the word, and this we want to do for each word in raw_words, right?

04:53 So, if you’re not familiar with list comprehensions, there’ll be a link in the notes that accompany the video, but basically, they are sort of a quicker way for you to write a for loop when, say, maybe the code inside the for loop is pretty short. And in this case, what we want to do is simply strip away the whitespace for each word.

05:13 So, we’ll go ahead and save this list in clean_words, just like we did with the iterator, and there we go! This is now, of course, a list, with the same cleaned-up words as before. Now, in terms of the actual number of characters in the code, when you compare the list comprehension with using map(), it doesn’t seem like there’s any real advantage to using one or the other.
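For comparison, the list comprehension version could be sketched as follows, again with assumed contents for raw_words. An explicit for loop is included to show what the comprehension abbreviates; loop_words is a name added for this sketch.

```python
# Assumed sample data, the same shape as in the map() example.
raw_words = ["  apple", "banana  ", "  cherry  ", "date"]

# List comprehension: call .strip() on each word and collect the results.
clean_words = [word.strip() for word in raw_words]
print(clean_words)  # ['apple', 'banana', 'cherry', 'date']

# The equivalent for loop that the comprehension shortens.
loop_words = []
for word in raw_words:
    loop_words.append(word.strip())

print(loop_words == clean_words)  # True
```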

05:40 I mean, there’s not a lot of code here to use a list comprehension, and if you compare this to the map() call… So, let me just bring up that map() call. You know, that is the amount of code that you’re using for generating the iterator via the map() call. But it’s not just about how much code because either way the amount of code isn’t that much—it’s more about the way to think about the problem. See, because here we are calling the method .strip() on the word object, whereas with the map() call, we are mapping the function .strip() on each of the words.

06:15 So in other words, I’m calling .strip() method on each of the words and if I were to take, say, the first word in the raw_words list…

06:25 Right? That’s essentially what we’re doing in the map() call or what’s being done in the map() call. So again, just a slightly different way to think about it.

06:33 When you write the list comprehension, you are calling the method on the word object, whereas in the map() call, you are applying the .strip() function on each of the words.
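One way to make the two viewpoints concrete is to take the first word and strip it both ways. This is a sketch with an assumed raw_words list; via_method and via_function are names introduced here for illustration.

```python
# Assumed sample data.
raw_words = ["  apple", "banana  ", "  cherry  ", "date"]
first = raw_words[0]

# List comprehension style: call the method on the word object.
via_method = first.strip()

# map() style: call the function from the str class, passing the word in.
via_function = str.strip(first)

print(via_method == via_function)  # True
```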

06:46 You know, at the end of the day, you’ll get essentially the same items either way. With the list comprehension, you generate the list right away, and with the map() function, you simply get an iterator. All right, let’s try another example.
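The practical difference between the list and the iterator shows up when the results are read more than once. Here is a sketch with assumed sample data: the map() iterator can only be consumed a single time.

```python
# Assumed sample data.
raw_words = ["  apple", "banana  ", "  cherry  ", "date"]

mapped = map(str.strip, raw_words)

first_pass = list(mapped)   # consumes the iterator
second_pass = list(mapped)  # nothing is left to consume

print(first_pass)   # ['apple', 'banana', 'cherry', 'date']
print(second_pass)  # []
```

If the cleaned words are needed several times, storing them in a list (as the comprehension does) avoids this surprise.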

07:00 I’m going to have down at the bottom as part of the video notes a quote that I’m just going to display here.

07:08 So, the purpose is to have some text that has, say, quotes in there, obviously some punctuation like periods and commas, and maybe even a question mark. And what we want to do is simply take some text, we want to extract all of the words without any punctuation.

07:26 So, this is a quote, an anonymous quote, about a visit that President Kennedy made to NASA and his interaction with somebody that was working there in one of the buildings. But, you know, you can basically grab any text that you want.

07:40 This text will be available in the notes in the video. All right, so the first thing that we want to do is we want to extract all of the words in the quote. And notice that I’m using triple-quoted string notation ("""), because it’s just easier when you have long text and if the text contains double quotes, you can embed these in there without any problems.

08:02 Let’s go ahead and extract the words. We’re going to call the .split() method on the quote string, and the splitting character will be whitespace, so this will essentially break things up where the delimiter is just whitespace.

08:18 This returns a list and it’s a pretty long list, so let me just take a look at, let’s say, the first 10 words.

08:28 And so, again, the delimiter—or the splitting character—was whitespace and so we get 'On' as our first element in the list of words, and then the word 'a', and then 'visit', and so on. Now, what we want to do is… Some of the words have punctuation, so the problem again is let’s just eliminate all of the punctuation that occurs in the words.
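The splitting step could look like the sketch below. It uses the quote from the notes and assumes .split() is called with no arguments, which splits on any run of whitespace. The variable names quote, words, and first_ten are assumptions.

```python
quote = """On a visit to the NASA space center, President Kennedy spoke to a man sweeping \
up in one of the buildings.  "What's your job here?" asked Kennedy.  "Well Mr. President," \
the janitor replied, "I'm helping to put a man on the moon"."""

# With no arguments, .split() breaks the text on runs of whitespace.
words = quote.split()

# Look at the first 10 words; note the comma still attached to 'center,'.
first_ten = words[:10]
print(first_ten)
# ['On', 'a', 'visit', 'to', 'the', 'NASA', 'space', 'center,', 'President', 'Kennedy']
```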

08:50 We know that the quote contained double quotes and it contained commas and periods and it also contained a question mark. So let’s write a quick function. Let’s say we’ll call it remove_punctuation().

09:03 This is going to be the function that we’re going to use in the map() call. It’s going to take a word string and it’s going to return the string obtained by removing—or by stripping away—any of these characters. So this is going to be, say, exclamation mark or question mark, and then any type of punctuation that you think you may want to get rid of, like semi-colons, commas, a single quote, and then we also want to eliminate double quotes, so we’re going to have to escape that, and then maybe parentheses, maybe a dash, and also whitespace.

09:38 So, notice what I’m passing over to .strip() is a string that contains all of the characters that we want to strip away from the word when they occur at the beginning or the ending of the word.

09:53 So, go ahead and run that. There’s our function. We’ve got our list of words, and what we want to do is map the remove_punctuation() function to our list of words. And then that way, what we’ll end up with is just an iterator that contains just the words all cleaned up. Now, just for us to see what’s happening, why don’t we just make a call to the list() function as well and then that way, once the iterator is returned by map(), then the list() function will create the list and we’ll be able to see it.

10:27 All right, so there’s the list. It’s got all of the words, no punctuation. We’ve stripped away all of the periods, all of the commas, the double quotes that appeared in there, and also the question mark.
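A possible version of remove_punctuation() is sketched here. The exact set of characters typed in the video is not shown in the transcript, so the string passed to .strip() is an assumption based on the description. The words list is a short excerpt of the split quote, to keep the example small.

```python
def remove_punctuation(word):
    """Return word with punctuation and whitespace stripped from both ends."""
    # Assumed character set; the double quote is escaped because the
    # string literal itself is delimited by double quotes.
    return word.strip(".!?;:,'\"()- ")


# Excerpt of the words obtained by splitting the quote.
words = ['"What\'s', "your", "job", 'here?"', "asked", "Kennedy."]

clean = list(map(remove_punctuation, words))
print(clean)  # ["What's", 'your', 'job', 'here', 'asked', 'Kennedy']
```

The apostrophe inside What's survives because .strip() only removes characters from the ends of the string.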

10:43 The remove_punctuation() function was a pretty short function, it only had one line, so this is a good candidate for a lambda function.

10:51 So, another way you could have done all of this was let’s go ahead and define the remove_punctuation() function as a lambda function.

10:59 And so the same idea is that this function will strip away—for the input string—it’s going to strip away all of the punctuation. So, if you recall, this is exclamation and the question mark and so on.

11:14 Don’t forget to escape the double quote.

11:20 All right, so this is just a one-line definition for the remove_punctuation() function as a lambda function. And then if we do the same thing, let’s call the map() function, let’s map the remove_punctuation() function, and the iterable again is the words list.

11:38 And this time, let’s use the .join() method on the iterator returned by map(). And we want to join all the words with just a single space between them.

11:50 That way we can kind of see a different way how the words have been cleaned up without any punctuation.

11:58 So, this is the exact same quote. It’s just that, again, all of the punctuation—this is just a visual thing for us to see that things got cleaned up, and so now we’ve got all the punctuation removed.
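The lambda version combined with .join() might be written like this. As before, the character set is an assumption, and quote_part is a one-sentence excerpt of the quote used to keep the sketch short.

```python
# Same stripping logic as before, written as a one-line lambda.
# The character set is assumed, not copied from the video.
remove_punctuation = lambda word: word.strip(".!?;:,'\"()- ")

# One sentence from the quote, used as sample input.
quote_part = 'the janitor replied, "I\'m helping to put a man on the moon".'
words = quote_part.split()

# " ".join() accepts the iterator returned by map() directly.
cleaned_text = " ".join(map(remove_punctuation, words))
print(cleaned_text)  # the janitor replied I'm helping to put a man on the moon
```

The lambda could also be passed to map() inline without giving it a name, which is a common style when the function is only used once.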

12:10 So, when you’re going to be calling the map() function, if the function that you’re going to want to map on each of the elements of the iterable is just a small function like this, then maybe you want to use a lambda function. All right! Well, this ends this lesson but in the next lesson, we’re going to continue with this idea of mapping a function on an iterable of strings, and we’re going to do this by implementing a Caesar cipher.


matthewmeastman on Jan. 22, 2023

When I run the map() function with remove_punctuation, why am I still getting quotes around the words I’m and What’s? I noticed in the video that they were there as well. How do I fully remove those quotes?


matthewmeastman on Jan. 22, 2023

Never mind my previous question. I just realized that because there is an apostrophe in those words, Python uses double quotes to display them. When the words are joined and printed, the output is correct. :P
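The behavior described in this comment can be checked with a short sketch. The double quotes belong to the way Python displays the string (its repr), not to the string itself.

```python
word = "I'm"

# Inside a list, strings are shown using repr(). Python picks double
# quotes as delimiters when the string contains an apostrophe.
print(repr(word))    # "I'm"
print(repr("moon"))  # 'moon'

# print() on the string itself shows no surrounding quotes.
print(word)          # I'm

# The string contains only the three characters I, ', and m.
print(len(word))     # 3
```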
