Character Classification

In this lesson, you’ll explore string methods that classify a string based on the characters it contains. Here are the character classification methods, along with one related function, iskeyword():

  • str.isalnum()
  • str.isalpha()
  • str.isdigit()
  • str.isidentifier() *
  • iskeyword(<str>) *
  • str.isprintable()
  • str.isspace()
  • str.istitle()
  • str.islower()
  • str.isupper()
  • str.isascii() *

The method .isidentifier() determines whether the target string is a valid identifier. What is a Python identifier? An identifier is a name used to define a variable, function, class, or some other type of object.

Python identifiers:

  • Must begin with an alphabetic character or underscore (_)
  • Can be a single character
  • Can be followed by any alphanumeric or the underscore
  • Cannot have other punctuation characters

One other potential pitfall to watch for in naming your own identifiers is that you can’t use an existing Python keyword. To check if the name you’re considering using is an existing keyword, you can import a function called iskeyword() from the module called keyword:

Python
>>> from keyword import iskeyword
>>> iskeyword('and')
True

Here are the Python keywords:

Python Keywords
False   break     else     if      not     while
True    class     except   import  or      with
None    continue  finally  in      pass    yield
and     def       for      is      raise   global
as      del       from     lambda  return  nonlocal
assert  elif      try      async   await
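
A printed table reflects only one Python version, and the set of keywords changes between releases. As a minimal sketch, the standard library's keyword.kwlist can report the keywords for whichever interpreter you are running (the variable names here are only illustrative):

```python
import keyword

# kwlist holds every reserved word for the interpreter running this code.
reserved = keyword.kwlist

print(len(reserved))
print(reserved)

# Every entry in kwlist is something iskeyword() recognizes.
all_recognized = all(keyword.iskeyword(word) for word in reserved)
print(all_recognized)
```

The exact count and contents depend on your Python version, so no expected output is shown.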

For more information on Python modules, check out Python Modules and Packages — An Introduction. The .isascii() method was introduced in Python 3.7.

Here’s how to use str.isalnum():

Python
>>> s = 'abc123'
>>> s.isalnum()
True

>>> ''.isalnum()
False

>>> s = 'abc$123'
>>> s.isalnum()
False
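
To make the rule more concrete, here is a rough sketch of what .isalnum() checks, character by character. The helper name looks_alnum() is hypothetical, and the per-character test follows the description in the Python documentation rather than the actual implementation:

```python
def looks_alnum(s):
    """Hypothetical helper that mirrors str.isalnum() character by character."""
    # An empty string has no characters to check, so the answer is False.
    if not s:
        return False
    return all(
        c.isalpha() or c.isdecimal() or c.isdigit() or c.isnumeric()
        for c in s
    )


for sample in ['abc123', '', 'abc$123', 'abc 123']:
    print(repr(sample), sample.isalnum(), looks_alnum(sample))
```

Note that a space is not alphanumeric either, so 'abc 123' fails the check just as 'abc$123' does.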

Here’s how to use str.isalpha():

Python
>>> s = 'ABCabc'
>>> s.isalpha()
True

>>> s = 'ABC123'
>>> s.isalpha()
False
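
One thing the examples above don't show is that .isalpha() works on Unicode letters, not only A to Z. If you need to accept only ASCII letters, one possible approach (assuming Python 3.7 or later, and using a hypothetical helper name) is to combine it with .isascii():

```python
def is_ascii_letters(s):
    """Hypothetical helper: True only for non-empty strings of A-Z and a-z."""
    # str.isascii() requires Python 3.7 or later.
    return s.isascii() and s.isalpha()


word = 'caf\u00e9'  # 'café', written with an escape for the accented letter

print(word.isalpha())              # accented letters count as alphabetic
print(is_ascii_letters(word))
print(is_ascii_letters('ABCabc'))
print(is_ascii_letters(''))
```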

Here’s how to use str.isdigit():

Python
>>> s = '123456'
>>> s.isdigit()
True

>>> s = '123abc'
>>> s.isdigit()
False
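
Keep in mind that .isdigit() answers "is every character a digit?" and not "is this a number?". The sketch below shows a few inputs where that difference matters. The helper to_int_or_none() is a hypothetical example, not a standard function:

```python
# '\u00b2' is the superscript two character.
samples = ['123456', '-5', '3.14', '', '\u00b2']

for s in samples:
    print(repr(s), s.isdigit())


def to_int_or_none(text):
    """Hypothetical helper: return int(text) for plain digit strings, else None."""
    # The ASCII check (Python 3.7+) rules out characters such as superscript
    # digits, which pass isdigit() but are rejected by int().
    if text.isascii() and text.isdigit():
        return int(text)
    return None


print(to_int_or_none('123456'))
print(to_int_or_none('-5'))
```

The minus sign and the decimal point are not digits, so strings like '-5' and '3.14' fail the test even though they look like numbers.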

Here’s how to use str.isidentifier():

Python
>>> 'spam32'.isidentifier()
True
>>> '32spam'.isidentifier()
False
>>> 'foo$32'.isidentifier()
False
>>> 'def'.isidentifier()
True

Here’s how to use iskeyword():

Python
>>> from keyword import iskeyword

>>> 'def'.isidentifier()
True
>>> 'def'.iskeyword()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    'def'.iskeyword()
AttributeError: 'str' object has no attribute 'iskeyword'

>>> iskeyword('def')
True
>>> iskeyword('and')
True
>>> iskeyword('spam32')
False
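
Since .isidentifier() returns True for keywords, the two checks are often combined. Here is a minimal sketch, with a hypothetical helper name is_usable_name(), of how that might look:

```python
from keyword import iskeyword


def is_usable_name(name):
    """Hypothetical helper: True if name can be used as a variable name."""
    # A usable name has valid identifier syntax and is not reserved.
    return name.isidentifier() and not iskeyword(name)


for candidate in ['spam32', '32spam', 'foo$32', 'def', '_private']:
    print(candidate, is_usable_name(candidate))
```

Only 'spam32' and '_private' pass both checks.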

Here’s how to use str.isprintable():

Python
>>> s = 'a\tb'
>>> s.isprintable()
False
>>> s = 'a b'
>>> s.isprintable()
True
>>> s = ''
>>> s.isprintable()
True
>>> s = 'a \n b'
>>> s.isprintable()
False
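
Because .isprintable() also works on single characters, it can be used to filter a string. This is only a sketch of one possible use, with a hypothetical helper name:

```python
def strip_nonprintable(text):
    """Hypothetical helper: drop every character that isprintable() rejects."""
    return ''.join(c for c in text if c.isprintable())


raw = 'a\tb \n c'
cleaned = strip_nonprintable(raw)

print(repr(raw))
print(repr(cleaned))
print(cleaned.isprintable())
```

The tab and the newline are removed, while the ordinary spaces are kept, since a space counts as printable.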

Here’s how to use str.isspace():

Python
>>> s = 'a \n b'
>>> s.isspace()
False
>>> s = '\t\n '
>>> s.isspace()
True
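
A common use for .isspace() is skipping blank lines. Since an empty string returns False, a helper usually needs to handle that case separately. The function name is_blank() below is hypothetical:

```python
def is_blank(line):
    """Hypothetical helper: True for empty or whitespace-only lines."""
    # ''.isspace() is False, so the empty string needs its own check.
    return not line or line.isspace()


lines = ['first', '', '   ', '\t\n', 'last']
kept = [line for line in lines if not is_blank(line)]

print(kept)
```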

Here’s how to use str.istitle():

Python
>>> s = 'The Sun Also Rises'
>>> s.istitle()
True
>>> s = "Bob's Burgers!"
>>> s.istitle()
False
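
The apostrophe result makes more sense next to .title(), which applies the same word-splitting rule when it converts a string. A short sketch:

```python
s = "Bob's Burgers!"

# title() capitalizes the first letter after any non-letter, including
# the apostrophe, so the result is what istitle() expects to see.
converted = s.title()

print(converted)
print(s.istitle())
print(converted.istitle())
```

The converted string is "Bob'S Burgers!", which passes .istitle() even though it is not how you would normally write the title.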

Here’s how to use str.islower():

Python
>>> s = 'asdlkjgadb'
>>> s.islower()
True
>>> s = "spamIsgood"
>>> s.islower()
False

Here’s how to use str.isupper():

Python
>>> s = 'SPAMBACON'
>>> s.isupper()
True
>>> s = "SPAMBACON#1!"
>>> s.isupper()
True
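
Both .isupper() and .islower() ignore digits and punctuation, but they still need at least one letter to work with. The following sketch collects the results for a few sample strings (the variable names are only illustrative):

```python
samples = ['SPAMBACON#1!', 'spam#1!', '#1!', '123', '']

# Map each sample to a pair: (isupper() result, islower() result).
results = {s: (s.isupper(), s.islower()) for s in samples}

for s, (upper, lower) in results.items():
    print(repr(s), upper, lower)
```

Strings with no letters at all, such as '#1!', '123', and the empty string, return False for both methods.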

Here’s how to use str.isascii():

Python
>>> s = 'ABCabc#$%'
>>> s.isascii()
True
>>> ''.isascii()
True
>>> ' '.isascii()
True
>>> '∑'.isascii()
False
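
If you need the same check on a Python version older than 3.7, a rough equivalent can be written with ord(). The function is_ascii() here is a hypothetical stand-in, not part of the standard library:

```python
def is_ascii(text):
    """Hypothetical stand-in for str.isascii() on Python versions before 3.7."""
    # ASCII covers code points 0 through 127.
    # all() over an empty string is True, matching ''.isascii().
    return all(ord(c) < 128 for c in text)


for s in ['ABCabc#$%', '', ' ', '\u2211']:  # '\u2211' is the '∑' character
    print(repr(s), is_ascii(s))
```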

00:00 In this video, I’m going to show you character classification. These are methods that classify a string based upon the characters that it contains. The first method is .isalnum().

00:12 This method determines whether the target string consists of alphanumeric characters. It’s going to return True if the target string is non-empty and all of its characters are alphanumeric—either a letter or a number. It will return False otherwise.

00:28 To test out the .isalnum() method, create a short string.

00:35 And to test it, bring up the method. And again, it will return True if it’s simply alphanumeric, and False otherwise. One note, again, it still has to have at least one character in the string.

00:45 So in the case of the current string s, which contains 'abc123', it returns True. If you were to run it on an empty string, it would return False, because there’s nothing there to check. Similarly, if s had some non-alphanumeric character in it, like the dollar sign ('$') in this case, it would also return False. So again, .isalnum() returns True as long as the string is not empty and contains only alphabetic or numeric characters.

01:19 The method .isalpha() determines whether the target string consists of only alphabetic characters. If the target string is non-empty, and in this case, if all the characters are alphabetic, it’ll return True.

01:32 Otherwise, it’ll return False. For .isalpha(), if you have a string

01:40 with a combination of uppercase and lowercase characters, as long as they’re all alphabetic, .isalpha() will return True.

01:50 And again, there has to be at least one character in the string. If you had a string that combined digits along with your alphabetic characters,

02:02 it would return False.

02:05 The next method, .isdigit(), can be used to determine if your string is made up of only digits. If the string is not empty and all of its characters are numeric digits, it’ll return True, and it’ll return False otherwise. For the method .isdigit(), make a new string containing just numbers.

02:24 And as you can see here, .isdigit() will return True if the string is a digit string, meaning made up of digits. So your string returns True. If it’s a mix of characters,

02:39 .isdigit() will return False.

02:44 The method .isidentifier() will look at the target string and determine if it’s a valid Python identifier. A Python identifier is a name that’s used to identify a variable, a function, a class, or some other type of object. Identifiers usually start with a letter a through z, either uppercase or lowercase, or with an underscore (_), followed by zero or more letters, underscores, and digits.

03:09 Python doesn’t allow punctuation characters within a Python identifier. So, .isidentifier() is going to return True if this string is a valid Python identifier following those formatting rules. Note that .isidentifier() will return True for a string that matches any Python keyword.

03:30 Now, this could cause some errors because you might not know that a name is already a keyword. Luckily, you can test that too, by importing iskeyword(). iskeyword() is not a string method. It’s its own function, and it’s imported from the module keyword. I’m going to show how that works.

03:51 You’ve been creating a lot of identifiers all throughout this whole tutorial. Every time you create a string object and give it the name s or t or u, and every time that you define a function, you have to give it a name.

04:06 If it’s a valid object name, then it will work. As an example, let’s try out 'spam32'. Is that an identifier? And here again, bpython is showing what the method’s going to do.

04:26 It looks like it’s True. What if you had a number at the beginning? Can that be used as an identifier? No, that would be False.

04:33 It is not a valid identifier because it starts with a number. What else is invalid? Well, if you use illegal characters in there, like the dollar sign, that’s the type of character that’s not allowed inside of an identifier, so you’d get False. Here’s one of the tricks, though: is a keyword such as the word 'and' or 'def' a valid identifier? As you can see, it is, but what’s the problem with def? Well, you can see it’s already lighting up differently here, because it’s a keyword, just as the word and is, or the word or is. Python has many keywords.

05:11 You have used a lot of them already, even within this tutorial. So instead of staring at this chart, you can follow these steps to check if something’s a keyword.

05:22 You can import a function, from keyword import iskeyword, and then you can do a test. So, if you were thinking of naming a variable def, again, .isidentifier() might come up here as True, but if you tried to use .iskeyword() in the same way that you’ve been using these methods, you’d get this error. iskeyword() is not applied to a string in the same way.

05:50 iskeyword() is its own function, and you give this function an argument of the string that you want to check. In this case, for the word 'def' that you’re thinking about using as an identifier, it would say, “Oh, actually that is a keyword.” So is 'and'. Whereas

06:14 'spam32', is that a keyword? No. That’s a safe identifier that you can use. If you’ve done some programming and you’re getting strange errors, it’s possibly because something like that might already be a keyword.

06:29 .isprintable() determines whether the target string consists entirely of printable characters. So it returns True if the string is empty or all of its characters are printable, and it returns False if the string contains at least one non-printable character.

06:46 So in the case of a string like this, where I’ve used an escape sequence to enter in a tab character ('\t'),

06:55 a check with .isprintable() is going to return False, whereas if you had 'a b', it would return True.

07:11 This is one unique case where, as you can see here at the bottom of this definition, if you had an empty string, it’s one of the few methods that actually will return True.

07:25 What other characters are not printable? Well, along with the tab ('\t'), there’s the newline ('\n'). So here is '\n', for newline.

07:36 Is that printable? False.

07:41 .isspace() returns True if the string is non-empty and all the characters are whitespace characters. That would include spaces (' '), tabs ('\t'), and newline characters ('\n').

07:52 So for .isspace(), looking at those same characters again, take s, which has the '\n' character inside of it. Does .isspace() consider that a whitespace string? It would return True only if every character were whitespace, and here it returns False.

08:06 Okay, so what’s missing? I have characters 'a' and 'b' inside of it. Well, you can fix that. What if you just had '\t\n '?

08:20 That would return True. .istitle() determines whether the target string is title cased. It’ll return True if the string is not empty and the first alphabetic character of each word is uppercase and all other alphabetic characters in each word are lowercase.

08:37 For .istitle(), in this case, that would be True. The problem with it is it uses that same kind of faulty logic where it is unfamiliar with… I’m starting with a double quote (") to be able to do apostrophes (').

08:56 Would it say that that’s a title? Well, it would think this is the beginning of a new word here, so it would say False. So just remember the logic is looking for very simple, every word having an uppercase at the beginning of it.

09:13 The method .islower() will determine whether the target string’s alphabetic characters are all lowercase. All non-alphabetic characters will be ignored. .islower() is pretty straightforward: if all the alphabetic characters in the string are lowercase, it will return True.

09:32 So even one uppercase character, and it will return False. If you have an empty string, this is another one where .islower() will also return False. It needs at least one character. .isupper(), similar to .islower(), determines whether the target string’s alphabetic characters are uppercase.

09:55 If the string is not empty and all the alphabetic characters in it are uppercase, it’ll return True, and False otherwise. Okay, well how about upper?

10:06 So very simple. Again, .isupper() is just going to return True or False. If it’s an entirely uppercase string, that would be True. Also note that .isupper() and .islower()

10:18 ignore punctuation marks and digits. So is it upper? Actually, yes. All the alphabetic characters are upper. And here’s an additional one that’s not in the written tutorial.

10:33 It’s a newer method introduced in Python 3.7. It’ll determine if all the characters in your string are part of the ASCII set. For .isascii(), as long as it’s the standard set of ASCII characters,

10:48 it will return True. It’s only going to return True if the characters have code points within this range, meaning the first 128 Unicode characters.

10:58 An empty string in this case is ASCII also. So, let’s try it out with the string here. That would be True. An empty string—is that ASCII? Well, based on what it just said there, yes.

11:09 And a space—is that ASCII? Yes. Yep. Okay, what isn’t? Well, other characters beyond the ASCII set, like the Sigma character ('∑')—again on the Mac, I held Option + W—that would fall outside of ASCII, those first 128 characters. This is new, introduced in Python 3.7.

11:28 Now that you can determine the contents of strings with character classification methods and True/False testing, next you’ll dive into string formatting.
