This is a demonstration of how to use python dataclasses to build a Zuul client that shows build information from a REST api.
Introduction
Python dataclasses provides many advantages over traditional datastructure such as dict or object. Before we use them, let's take a look at typing, immutability and parsing.
Typing
Python typing may be used to improve code readability.
def show_build(build): ...
This show_build function definition does not indicate what it does:
- Is the input a build id or a build dict ?
- Does it call print or does it return something ?
An annotated version would look like this:
def show_build(build: Build) -> str: ...
This annotated function definition tells us a lot more about its purpose.
Such functions can be checked automatically using a type checker like mypy. In another terminal, you would start the typechecker like so:
while inotifywait -e close_write *.py; do clear; mypy *.py; done
Then mypy would be acting as an assistant that ensures the codes match the signature. This process greatly reduces the early test feedback loop as we don't have to wait for a runtime execution.
Thus, it's not surprising to see companies like Dropbox adding type annotation to their code-base: Our journey to type checking 4 million lines of Python
Immutability
Immutable records augment what one should expect from an object and they reduce the number of states. Each time a mutation is made, it creates a before and after state.
For example let's consider this Build implementation:
class Build: def __init__(self): self.job_name = None self.result = None ... def fromJson(self, dict): self.job_name = dict['job_name'] self.result = dict['result']
Build has at least two states, and users need to ensure it is in the correct state before using it efficiently.
Before looking at how dataclasses can leverage typing and immutability, we'll look at one more concept: parsing
Parsing
Parse, don’t validate design is a great companion to typing and immutability.
Instead of implementing a validation layer, we can focus on parsing immutable dataclasses.
First, some functions to parse an input and produce an optional output:
from typing import Optional from datetime import datetime import math def parse_str(s: str) -> Optional[str]: if len(s) > 0: return s return None def parse_isodate(s: str) -> Optional[datetime]: try: return datetime.strptime(s, "%Y-%m-%dT%H:%M:%S") except ValueError: return None def parse_float(s: float) -> Optional[float]: if not math.isnan(s): return s return None
Then, using a bit of typelevel abstraction, a couple of functions to run the parsers:
from typing import Callable, Optional, TypeVar, List Input = TypeVar('Input') Output = TypeVar('Output') def run(parser: Callable[[Input], Optional[Output]], input_value: Input) -> Output: result = parser(input_value) if result is None: raise RuntimeError("Expected %s, got: %s" % (parser.__name__, input_value)) return result def run_many(parser: Callable[[Input], Optional[Output]], input_values: List[Input]) -> List[Output]: return [run(parser, input_value) for input_value in input_values]
We are now ready to implement the Zuul client.
Zuul build dataclass
A Zuul build dataclass can be written as:
from dataclasses import dataclass @dataclass(frozen=True) class BuildArtifact: name: str url: str @dataclass(frozen=True) class Build: job_name: str result: str duration: float start_time: datetime artifacts: List[BuildArtifact] def show_build(build: Build) -> str: return "\n".join([ "# Build: " + str(build.job_name), "result: " + build.result, "date: " + str(build.start_time), "duration: " + str(build.duration), "", "## Artifacts:" ] + list(map(show_artifacts, build.artifacts))) def show_artifacts(artifact: BuildArtifact) -> str: return "\n".join([ "* name: " + artifact.name, " url: " + artifact.url])
To create the Build dataclass, a parser can be written as:
from typing import Any, Dict def parse_artifact(json_obj: Dict[str, Any]) -> Optional[BuildArtifact]: try: return BuildArtifact( run(parse_str, json_obj['name']), run(parse_str, json_obj['url']) ) except RuntimeError: return None def parse_build(json_obj: Dict[str, Any]) -> Optional[Build]: try: return Build( run(parse_str, json_obj['job_name']), run(parse_str, json_obj['result']), run(parse_float, json_obj['duration']), run(parse_isodate, json_obj['start_time']), run_many(parse_artifact, json_obj['artifacts']), ) except RuntimeError: return None def build_from_json(json_obj: Any) -> Build: return run(parse_build, json_obj)
And the rest of the client implementation is:
import argparse import requests def read_json(url: str): import requests return requests.get(url).json() def main() -> None: parser = argparse.ArgumentParser() parser.add_argument("--build-url") parser.add_argument("--pretty", action="store_true") args = parser.parse_args() build = build_from_json(read_json(args.build_url)) print(show_build(build) if args.pretty else build) if __name__ == "__main__": main()
Using dataclasses-json and argparse-dataclass
Some convenient external libraries are available to work with dataclasses. The above implementation may be simplified like so:
from dataclasses import dataclass from datetime import datetime from typing import List from uuid import UUID from dataclasses_json import dataclass_json, Undefined # type: ignore from argparse_dataclass import ArgumentParser # type: ignore @dataclass(frozen=True) class BuildArtifact: name: str url: str @dataclass_json(undefined=Undefined.EXCLUDE) @dataclass(frozen=True) class Build: uuid: UUID job_name: str result: str duration: float artifacts: List[BuildArtifact] def show_build(build: Build) -> str: return "\n".join([ "# Build: " + str(build.uuid), "name: " + build.job_name, "duration: " + str(build.duration), "", "## Artifacts:" ] + list(map(show_artifacts, build.artifacts))) def show_artifacts(artifact: BuildArtifact) -> str: return "\n".join([ "* name: " + artifact.name, " url: " + artifact.url]) @dataclass class BuildCLI: pretty_print: bool zuul_url: str tenant: str id: str def read_json(url: str): import requests return requests.get(url).json() def build_url(args: BuildCLI) -> str: return args.zuul_url + "/api/tenant/" + args.tenant + "/build/" + args.id def main() -> None: import requests args = ArgumentParser(BuildCLI).parse_args() build = Build.from_dict(read_json(build_url(args))) # type: ignore if args.pretty_print: print(show_build(build)) else: print(build) if __name__ == "__main__": # Install these requirements first: # python3 -m pip install --user argparse-dataclass dataclasses-json requests # Demo: # python3 dataclass.py --zuul-url https://zuul.opendev.org/ --tenant zuul --id e142dd27c4554397b3cdbf8bb4f68224 main()