In this lesson, you’ll implement command-line options for your wordcount command using Python’s argparse module. This will allow you to selectively display counts for lines, words, and bytes.
Note: Completing this task involves several more steps compared to your previous tasks, which is why the solution is divided into two parts. This is part one. If you prefer to see the complete solution right away, then you may skip ahead to the second part now.
Know Your Starting Point
If you got lost while working on the previous task, then review the full source code of the expected word count implementation so far:
src/wordcount.py
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import NamedTuple


class Counts(NamedTuple):
    lines: int = 0
    words: int = 0
    bytes: int = 0

    @property
    def max_digits(self):
        return len(str(max(self)))

    def as_string(self, max_digits):
        return (
            f"{self.lines:>{max_digits}} "
            f"{self.words:>{max_digits}} "
            f"{self.bytes:>{max_digits}}"
        )

    def __add__(self, other):
        return Counts(
            lines=self.lines + other.lines,
            words=self.words + other.words,
            bytes=self.bytes + other.bytes,
        )


@dataclass(frozen=True)
class FileInfo:
    path: Path
    counts: Counts

    @classmethod
    def from_path(cls, path):
        if path.name == "-":
            raw_text = sys.stdin.buffer.read()
        elif path.is_file():
            raw_text = path.read_bytes()
        else:
            return cls(path, Counts())
        text = raw_text.decode("utf-8")
        return cls(
            path,
            Counts(
                lines=text.count("\n"),
                words=len(text.split()),
                bytes=len(raw_text)
            )
        )


def main():
    if len(sys.argv) > 1:
        file_infos = [FileInfo.from_path(Path(arg)) for arg in sys.argv[1:]]
    else:
        file_infos = [FileInfo.from_path(Path("-"))]
    total_counts = sum((info.counts for info in file_infos), Counts())
    max_digits = total_counts.max_digits
    for info in file_infos:
        line = info.counts.as_string(max_digits)
        if info.path == Path("-"):
            print(line)
        elif not info.path.exists():
            print(line, info.path, "(no such file or directory)")
        elif info.path.is_dir():
            print(line, f"{info.path}/ (is a directory)")
        else:
            print(line, info.path)
    if len(file_infos) > 1:
        print(total_counts.as_string(max_digits), "total")
As a quick word of encouragement, the code above already satisfies the first acceptance criterion for this task, which verifies your script’s default behavior.
Parse Command-Line Arguments With argparse
To allow your script to accept various command-line options, also known as flags or switches, you’ll replace the argument vector (sys.argv) with a more robust argument parser from the standard library:
src/wordcount.py
import sys
from argparse import ArgumentParser
from dataclasses import dataclass
from pathlib import Path
from typing import NamedTuple

# ...

def main():
    args = parse_args()
    if len(args.paths) > 0:
        file_infos = [FileInfo.from_path(path) for path in args.paths]
    # ...

def parse_args():
    parser = ArgumentParser()
    parser.add_argument("paths", nargs="*", type=Path)
    return parser.parse_args()
First, you import the ArgumentParser class from the argparse module, which you then instantiate in a helper function, parse_args(), placed at the bottom.
Your parser specifies that the script can accept multiple file paths as positional arguments in the command line. These paths are stored in the resulting object’s .paths attribute. The asterisk (*) in the nargs argument allows for zero or more paths to be provided, and each path is automatically converted into a Path instance.
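To see how nargs="*" and type=Path behave, you can experiment with the parser in isolation. The sketch below is only an illustration: build_parser() is a hypothetical helper that isn't part of the lesson's code, and the file names are made up. It passes explicit argument lists to .parse_args() instead of relying on the real command line:

```python
from argparse import ArgumentParser
from pathlib import Path


def build_parser():
    # Hypothetical helper, used here only for experimenting.
    # It mirrors the parser configured in parse_args().
    parser = ArgumentParser()
    parser.add_argument("paths", nargs="*", type=Path)
    return parser


parser = build_parser()

# Zero positional arguments: .paths is an empty list.
no_paths = parser.parse_args([])
print(no_paths.paths)

# Two positional arguments: each string becomes a Path instance.
# A lone dash is treated as a positional argument, not as an option.
two_paths = parser.parse_args(["file1.txt", "-"])
print(two_paths.paths)
print(all(isinstance(path, Path) for path in two_paths.paths))
```

Because an empty command line produces an empty list, the existing fallback to standard input in main() keeps working without any further changes.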
You call parse_args() at the beginning of your main() function and assign its return value to a local variable named args. This lets you replace sys.argv[1:] with args.paths on the following line. The rest of the function remains untouched.
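If you'd like to convince yourself that args.paths is a drop-in replacement for the old list built from sys.argv[1:], then a quick check along these lines may help. It's a minimal sketch that temporarily swaps in a made-up argument vector using unittest.mock.patch, which isn't used anywhere in the lesson's code:

```python
import sys
from argparse import ArgumentParser
from pathlib import Path
from unittest.mock import patch


def parse_args():
    parser = ArgumentParser()
    parser.add_argument("paths", nargs="*", type=Path)
    # Without an explicit list, parse_args() reads sys.argv[1:].
    return parser.parse_args()


# Made-up argument vector, assumed only for this demonstration.
fake_argv = ["wordcount", "notes.txt", "-"]

with patch.object(sys, "argv", fake_argv):
    # The old approach: convert each raw string by hand.
    legacy_paths = [Path(arg) for arg in sys.argv[1:]]
    # The new approach: let argparse do the conversion.
    args = parse_args()

print(args.paths == legacy_paths)
```

Both approaches produce the same list of Path objects for plain file names, which is why the rest of main() doesn't need to change.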