Setting Up Python Projects: Part II | by Johannes Schmidt




Whether you’re a seasoned developer or just getting started with 🐍 Python, it’s important to know how to build robust and maintainable projects. This tutorial will guide you through the process of setting up a Python project using some of the most popular and effective tools in the industry. You will learn how to use GitHub and GitHub Actions for version control and continuous integration, as well as other tools for testing, documentation, packaging and distribution. The tutorial is inspired by resources such as Hypermodern Python and Best Practices for a new Python project. However, this is not the only way to do things and you might have different preferences or opinions. The tutorial is intended to be beginner-friendly but also cover some advanced topics. In each section, you will automate some tasks and add badges to your project to show your progress and achievements.

The repository for this series can be found at github.com/johschmidt42/python-project-johannes

  • OS: Linux, Unix, macOS, Windows (WSL2 with e.g. Ubuntu 20.04 LTS)
  • Tools: python3.10, bash, git, tree
  • Version Control System (VCS) Host: GitHub
  • Continuous Integration (CI) Tool: GitHub Actions

It is expected that you are familiar with the version control system (VCS) git. If not, here’s a refresher for you: Introduction to Git

Commits will be based on best practices for git commits & Conventional commits. There is the conventional commit plugin for PyCharm or a VSCode Extension that help you to write commits in this format.

Overview

  • Part I (GitHub, IDE, Python environment, configuration, app)
  • Part II (Formatting, Linting, Command management, CI)
  • Part III (Testing, CI)
  • Part IV (Documentation, CI/CD)
  • Part V (Versioning & Releases, CI/CD)
  • Part VI (Containerisation, Docker, CI/CD)

Structure

  • Formatters & linters (isort, black, flake8, mypy)
  • Configurations (isort, .flake8, .mypy.ini)
  • Command management (Makefile)
  • CI (lint.yml)
  • Badge (Linting)
  • Bonus (Automatic linting in PyCharm, Create requirements.txt with Poetry)

If you’ve ever worked in a team, you know that to achieve code and style consistency, you need to agree on formatters and linters. It will help you with onboarding new members to the codebase, create fewer merge conflicts and generally save time because developers don’t have to care about formatting and style while coding.

If you don’t know the difference between a formatter & linter and/or would like to see them in action, check out this tutorial!

One option for formatting and linting Python code is wemake-python-styleguide, which claims to be the “strictest and most opinionated Python linter ever”. However, I prefer the popular combination of isort and black as formatters, flake8 as linter and mypy as static type checker. mypy adds static type checking to Python, which is one of the most exciting developments in the Python ecosystem right now.

We are going to add these tools to our project with Poetry. But since these tools are not part of the application, they should be added as dev-dependencies. With Poetry 1.2.0, we now can use dependency groups:

Poetry provides a way to organize your dependencies by groups. For instance, you might have dependencies that are only needed to test your project or to build the documentation.

When adding the dependencies, we can specify the group they should belong to with --group <name>.

> poetry add --group lint isort black flake8 mypy

Structuring the dev-dependencies in groups will make more sense later. The main idea is that we can save time and resources in CI pipelines by installing only the dependencies that are required for a specific task, such as linting.
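After running the command above, pyproject.toml gains a dependency group section along these lines (the version numbers shown here are placeholders; yours will depend on when you run the command):

```toml
# pyproject.toml (excerpt) - version constraints are placeholders
[tool.poetry.group.lint.dependencies]
isort = "^5.10.1"
black = "^22.8.0"
flake8 = "^5.0.4"
mypy = "^0.971"
```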

Because isort and black disagree on a few formatting points, we need to enforce that isort uses the black profile.

So we add the configuration in the pyproject.toml file:

# pyproject.toml
...
[tool.isort]
profile = "black"
...
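To illustrate what the black profile changes: for long import blocks, isort will then produce the parenthesised, one-name-per-line style with a trailing comma that black also emits. A small example with standard-library imports:

```python
# With profile = "black", isort wraps a long import block with parentheses,
# one name per line, and a trailing comma - exactly how black formats it:
from collections import (
    Counter,
    OrderedDict,
    defaultdict,
)

# The imports work as usual; only the layout of the block changes.
counts = Counter("abracadabra")
```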

flake8 also needs to “use the black profile”. However, flake8 has not (yet) adopted pyproject.toml as the central location for project configuration (see this heated discussion, or use the pyproject-plugin), so we add its configuration in a .flake8 file instead:

# .flake8
[flake8]
max-line-length = 88
extend-ignore = E203
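A quick note on these two settings: 88 is black’s default line length, and E203 (“whitespace before ‘:’”) is ignored because black intentionally puts spaces around the colon in slices with complex bounds, which flake8 would otherwise flag. For example:

```python
# black formats a slice with expression bounds with spaces around the colon;
# flake8's E203 would flag exactly this line, hence the extend-ignore:
items = list(range(10))
offset, limit = 2, 8
window = items[offset + 1 : limit]
```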

For mypy, we can add the configuration of the tool according to the docs:

# pyproject.toml
...
[tool.mypy]
# 3rd party import
ignore_missing_imports = true
# dynamic typing
disallow_any_unimported = true
disallow_any_expr = false
disallow_any_decorated = false
disallow_any_explicit = true
disallow_any_generics = false
disallow_subclassing_any = true
# platform
python_version = "3.10"
# untyped
disallow_untyped_calls = true
disallow_untyped_defs = true
disallow_incomplete_defs = true
disallow_untyped_decorators = true
# None and Optional
no_implicit_optional = true
# Warnings
warn_return_any = false
warn_unreachable = true
# Misc
pretty = true
...

Mypy has many settings that you can customize to suit your preferences. I won’t cover all of them here, but I encourage you to read the mypy documentation and learn how to configure the static type checker for your project!
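To get a feeling for what a setting like disallow_untyped_defs = true catches, here is a small illustrative snippet (the function names are made up for this example):

```python
# With disallow_untyped_defs = true, mypy would reject this definition,
# because its parameter and return value are not annotated:
def greet_untyped(name):
    return f"Hello, {name}!"

# This fully annotated version passes the check:
def greet_typed(name: str) -> str:
    return f"Hello, {name}!"
```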

Let’s see our new tools in action:

> isort . --check
Skipped 2 files

> black . --check
would reformat src/example_app/app.py
Oh no! 💥 💔 💥
1 file would be reformatted, 1 file would be left unchanged.

> flake8 .

> mypy .
Success: no issues found in 2 source files

Only one of the tools (black) reported an issue that we can fix. Omitting the --check flag will run the formatter black for us on our Python files.

> black .

At this point we could think of adding pre-commit hooks that run these linters every time we commit. But using mypy with pre-commit is a little fiddly, so I’ll leave it up to you if you want (and like) pre-commit hooks.
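If you do go down that road, a minimal .pre-commit-config.yaml could look like the following sketch (the rev values are placeholders; pin them to the tool versions you actually use):

```yaml
# .pre-commit-config.yaml - rev values are placeholders
repos:
  - repo: https://github.com/PyCQA/isort
    rev: "5.10.1"
    hooks:
      - id: isort
  - repo: https://github.com/psf/black
    rev: "22.8.0"
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/flake8
    rev: "5.0.4"
    hooks:
      - id: flake8
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: "v0.971"
    hooks:
      - id: mypy
```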

As we add new tools to our project, we also need to remember some commands to use them. These commands can get complicated and hard to remember over time. That’s why it’s useful to have a single file where we can store and name commands for our project. This is where the Makefile comes in. Many devs are unaware that you can use make in a Python project to automate different parts of developing a project. It is a common tool in the world of software development with languages such as C or C++. It can be used, for example, to run tests, linters, builds etc. It’s an underutilized tool, and by integrating it into your routine, you can save time and avoid errors.

GNU Make controls the generation of executables and other non-source files of a program from the program’s source file.

That way, we don’t need to remember all the commands and their arguments and options. It lets us specify a set of tasks via a common interface and allows us to run several commands sequentially.

# Makefile

format-black:
	@black .

format-isort:
	@isort .

lint-black:
	@black . --check

lint-isort:
	@isort . --check

lint-flake8:
	@flake8 .

lint-mypy:
	@mypy ./src

lint-mypy-report:
	@mypy ./src --html-report ./mypy_html

format: format-black format-isort

lint: lint-black lint-isort lint-flake8 lint-mypy

To use make, you type make in a directory that has a file called Makefile. You can also type make -f <Makefile> to use a different filename. By default, make prints out each command before it runs it, so that you can see what it’s doing. But there is a UNIX dogma saying that “success should be silent”. To silence the commands in a target, we can prefix them with a `@` character. Now we just need to run these two commands in a shell

> make format
> make lint

to run all our formatters and linters on our source code. If you want to know more about the Makefile format, how to set variables, and how to add prerequisites and phonies, I highly recommend reading python-makefile by Aniket Bhattacharyea!

If you want a well-documented Makefile, check out the bonus section at the bottom of this part!

Now that we have a few more config files and a new Makefile as a task runner, our project should resemble this:

.
├── .flake8
├── LICENSE
├── Makefile
├── README.md
├── poetry.lock
├── pyproject.toml
└── src
    └── example_app
        ├── __init__.py
        └── app.py

2 directories, 8 files

Working in a team of professional software developers brings a number of challenges. Making sure that nothing is broken and everyone is working on the same formatted code is one of them. For this we use continuous integration (CI), a software development practice that allows members of a team to integrate their work frequently. In our case, so far, new features (feature branches) that modified source files need to pass our linters to preserve style consistency. There are a lot of CI tools such as CircleCI, TravisCI, Jenkins etc., but in the scope of this tutorial we will use GitHub’s CI/CD workflow solution GitHub Actions.

Now that we can run our formatters and linters locally, let’s set up our first workflow that will run on a GitHub server. To do this, we will create a new feature branch called feat/lint-ci and add the file .github/workflows/lint.yml
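The file itself is not reproduced in this excerpt; a sketch of what it contains, reconstructed from the steps described below, might look like this (action versions, step names and the exact trigger syntax are assumptions, not the author’s verbatim file):

```yaml
# .github/workflows/lint.yml - a sketch, not the author's exact file
name: Linting

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.head_ref }}

      - name: Install poetry
        run: pipx install poetry

      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
          cache: "poetry"

      - name: Install dependencies
        run: poetry install --only lint

      - name: Run linters
        run: poetry run make lint
```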

Let’s break it down to make sure we understand each part. GitHub Actions workflows must be created in the .github/workflows directory of the repository as .yaml or .yml files. If you’re seeing these for the first time, you can check them out here to better understand them. In the upper part of the file, we give the workflow a name (name: Linting) and define which events trigger it (on: ...). Here, we want it to run when new commits are pushed to a pull request targeting the main branch, or when commits are pushed to the main branch directly. The job runs in an ubuntu-latest* (runs-on) environment and executes the following steps:

  • checkout the repository using the branch name that is stored in the default environment variable ${{ github.head_ref }} . GitHub action: checkout@v3
  • install Poetry with pipx because it’s pre-installed on all GitHub runners. If you have a self-hosted runner in e.g. Azure, you’d need to install it yourself or use an existing GitHub action that does it for you.
  • Setup the python environment and caching the virtualenv based on the content in the poetry.lock file. GitHub action: setup-python@v4
  • Install only the requirements that are needed to run the different linters with poetry install --only lint **
  • Run the linters with the make command: poetry run make lint. Please note that running the tools is only possible inside the virtualenv, which we can access through poetry run.

*We could also run this in a container (docker) but containerisation will be covered in Part VI

**We used poetry install --only lint to install just the dependencies in the group lint. You might wonder: how can we check if these dependencies are enough to run the tools locally? Well, in Poetry 1.2.0, the environment depends on both the Python interpreter and the pyproject.toml file. So we would need to delete the existing environment with poetry env remove <env name> or poetry env remove --all, then create a new clean environment with poetry env use python3 and run poetry install --only lint. This seems like a hassle, right? I agree, but that’s how it works for now. You can read more about this issue in this Stack Overflow post.

Now that we have our first workflow, how can we see it in action? Or better yet: How can we test it before pushing it to GitHub? There are two ways to do that:

  1. We can push our changes and see the results on GitHub
  2. We can use the tool act, which lets us run GitHub actions locally and avoid the trial-and-error approach.

Let’s try the first option and push our changes to our feature branch. When we open a pull request, we can see that the workflow has started running.

First running workflow in GitHub actions (linting) — Image by author

And we can also see that it actually failed:

First failed workflow in GitHub actions (linting) — Image by author

The reason for this error is that we didn’t run this command

> poetry install

/home/runner/work/python-project-johannes/python-project-johannes/example_app does not contain any element

beforehand to check whether our app was installed correctly into the site-packages directory, or whether the name or mapping was wrong. We can solve this by making sure that the name attribute in our pyproject.toml matches the name of our src directory, and by removing the packages attribute for now:

# pyproject.toml
[tool.poetry]
name = "example_app"
...

Running the pipeline a second time, we see that … it fails again!

Failing workflow — Image by author

This time, our static type checker mypy reported errors because of unfollowed imports. We can reproduce this by running the same commands from the workflow locally (installing only the lint packages). It turns out that mypy tries to follow the imports in a file, but if it can’t (because the application dependencies were not installed by poetry install --only lint), they become Any types! This is described in the mypy documentation. We can solve this by installing our application dependencies AND the lint dependencies with

> poetry install --with lint

This time, we see that it succeeded, Hallelujah!

Successful workflow — Image by author

And to summarise, here’s what our repository tree looks like now:

.
├── .flake8
├── .github
│   └── workflows
│       └── lint.yml
├── LICENSE
├── Makefile
├── README.md
├── poetry.lock
├── pyproject.toml
└── src
    └── example_app
        ├── __init__.py
        └── app.py

4 directories, 9 files

When we merge our PR to the main branch, the workflow will run again. We can display the status of our CI pipeline on the homepage of our repository by adding a badge to the README.md file.

To get the badge, we need to click on a workflow run (main branch) and copy the lines

Create a status badge from workflow file on GitHub — Image by author
Copy the badge markdown — Image by author

The badge markdown can be copied and added to the README.md:
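For our repository, the copied markdown would have roughly this shape (this follows GitHub’s standard badge URL pattern; copy the exact line from the GitHub UI rather than typing it by hand):

```markdown
[![Linting](https://github.com/johschmidt42/python-project-johannes/actions/workflows/lint.yml/badge.svg)](https://github.com/johschmidt42/python-project-johannes/actions/workflows/lint.yml)
```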

Our landing page of the GitHub now looks like this ❤:

Image by author

If you want to know how this magically shows the current status of the last pipeline run on main, have a look at the commit statuses API on GitHub.

