7. Packaging#
So, what’s the deal with packaging our project? We already have our code in a folder—can’t we just zip it up and send it to someone?
Sure, you can do that! But packaging goes beyond simply zipping up your code. It’s about making your code easy to install and use for others. When you package your code, you create a distributable version that can be installed on other machines. Now, what’s this new term: distributable?
A distributable is a version of your code that’s ready for others to use. Think of it like a cake that only needs to be taken out of the box and served—no baking or frosting required. In this analogy, the baking and frosting represent the installation and setup that a user would have to handle if you just sent them a zipped folder. By packaging your code, you make it so that the user only needs to run a few simple commands, like pip install your-package.
The pyproject.toml file#
When you package your code, you create a distributable version known as a package. A package is a collection of Python files that can be easily installed using a package manager like pip. pip is the Python package manager that helps you install and manage Python packages.
So, how does pip know what to do with your package? How does it know what to install and where? That’s where the pyproject.toml file comes in. It gives pip all the necessary information about your project, such as the package name, version, and dependencies, as well as instructions on how to build it. With this file, you ensure that your package is straightforward to install and use!
What does the pyproject.toml file look like? Here’s an example:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
Create a new file in your project’s root directory and name it exactly pyproject.toml (do not change the name). Copy the above content into the file and save it.
Note
A TOML file is not a Python file, so don’t add the .py extension to the file name. The file name should be exactly pyproject.toml. TOML is a configuration file format designed to be easy for humans to read and write while still being easy for tools like pip to parse, which is how they learn the structure of your project.
This is the most basic pyproject.toml file that you can have. It tells pip that your project uses setuptools as the build system. The requires key lists the packages that must be available before the build can start, here setuptools, a package that helps you build and distribute Python packages. The build-backend key names the object that a build tool such as pip calls to actually build your package. For setuptools, that is setuptools.build_meta. But that is not something you need to worry about right now. Just know that this is the minimum you need to have in your pyproject.toml file and that setuptools is a popular example of a build system, not the only one.
Adding metadata to your package#
The pyproject.toml file is not just for specifying the build system. It can also contain metadata about your package. Metadata is information about your package, such as its name, description, version, and author. This information helps users understand what your package does and how it can be used.
Here’s an example of how you can add metadata to your pyproject.toml file:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "my-package"
description = "A simple package that does something cool"
authors = [
{name = "Your Name", email = "you@yourdomain.com"},
]
In this example, we’ve added a new section called [project] to the pyproject.toml file. This section contains metadata about your package, such as its name, description, and authors.
Adding the version of your package#
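We should also add the version of the package. In fact, the packaging specification treats the [project] section as incomplete without one: the version must either be written out as a string or be declared as dynamic (more on that below). Here’s how you can add the version to the pyproject.toml file: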
[project]
name = "my-package"
version = "0.1.0"
description = "A simple package that does something cool"
authors = [
{name = "Your Name", email = "you@yourdomain.com"},
]
Here, we’ve hardcoded the version by typing out a string. This is called static versioning: you update the string by hand for every release. The alternative is dynamic versioning, where you leave version out of the [project] section, list it under a dynamic key instead, and let the build system work it out at build time, for example from a variable in your code. This is useful if you want the version to update automatically when you release a new version of your package.
You can add more metadata to this section, such as the license, keywords, and classifiers. Package indexes like PyPI display this information, which helps users find your package and judge whether it fits their needs.
Adding dependencies to your package#
Another essential part of packaging your code is specifying its dependencies. Dependencies are other packages that your package relies on to work correctly. By specifying your dependencies, you ensure that pip installs all the necessary packages when someone installs your package. This makes it easier for users to use your package without having to worry about installing dependencies manually.
Here’s an example of how you can add dependencies to your pyproject.toml file:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "my-package"
version = "0.1.0"
description = "A simple package that does something cool"
authors = [
{name = "Your Name", email = "you@yourdomain.com"},
]
dependencies = ["numpy", "pandas", "matplotlib"]
In this example, we’ve added a new key called dependencies to the [project] section of the pyproject.toml file. This key lists the packages your package depends on, such as numpy and pandas. When someone installs your package using pip, pip will also install these dependencies to ensure that your package works correctly. If you have other third-party dependencies, such as scikit-learn, you can add them to the list as well. Modules from the Python standard library, such as tkinter, come with Python rather than being installed through pip, so they should not be listed.
Often, it is a good idea to specify which versions of the dependencies your package requires, because different versions of a package can have different features or bugs. Specifying versions reduces the risk of compatibility issues and conflicts between the dependencies, although it cannot rule them out entirely. Here’s an example of how you can specify versions of dependencies:
dependencies = ["numpy>=1.0", "pandas>=1.0", "matplotlib==3.0.2"]
In this example, we’ve specified that our package requires numpy version 1.0 or higher, pandas version 1.0 or higher, and matplotlib to be exactly version 3.0.2. This ensures that pip installs the correct versions of the dependencies when someone installs your package.
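To get a feel for what >= and == mean here, the following is a deliberately simplified sketch of the comparison. It is not how pip does it: it only understands plain numeric versions and the two operators used above, and the function names are invented for this example.

```python
import re

# Matches strings such as "numpy>=1.0" or "matplotlib==3.0.2" only.
REQUIREMENT_PATTERN = re.compile(
    r"\s*([A-Za-z0-9_.-]+)\s*(>=|==)\s*([0-9]+(?:\.[0-9]+)*)\s*"
)


def parse_version(text):
    """Turn a plain numeric version such as '1.26.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed_version, requirement):
    """Check an installed version against a 'name>=X' or 'name==X' string."""
    match = REQUIREMENT_PATTERN.fullmatch(requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    _name, operator, wanted_version = match.groups()
    have = parse_version(installed_version)
    want = parse_version(wanted_version)
    # Pad with zeros so that 3.0 and 3.0.0 compare as equal.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    if operator == ">=":
        return have >= want
    return have == want


print(satisfies("1.26.4", "numpy>=1.0"))        # True
print(satisfies("3.0.3", "matplotlib==3.0.2"))  # False
```

Notice how strict == is: even a bug-fix release such as 3.0.3 is rejected, which is why exact pins are usually reserved for cases where you really need one specific version.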
Building your package#
Note
Your project only needs to have the pyproject.toml file in the root directory. The final project assessment does not require you to build your package. However, building is a simple step once you have created your pyproject.toml file, and it will help you understand the packaging process better, including what actually happens when you install a package using pip.
What you have done so far is to create a pyproject.toml file that specifies the build system, metadata, and dependencies of your package. After this, you need to build your package so that it can be installed using pip. As explained earlier, building your package involves creating distributable versions of your code that can be easily installed on other machines.
Warning
Make sure you have the build package installed before running this command, because python -m build will not work without it. You can install it, or upgrade to the latest version, by running pip install --upgrade build. It is also good to have the setuptools and wheel packages installed. You can install them using pip install setuptools wheel.
If you have multiple folders at the top level of your project, setuptools might have trouble building your package, because it cannot tell which folders belong to the package. In such cases, you can create a src folder and, inside it, a folder with your package’s name. Move all your Python files into this directory. Your project structure will then look something like this:
my-package/
|--- pyproject.toml
|--- src/
     |--- your_project_name/
          |--- data_folder/
          |--- reference_folder/
          |--- some_functions.py
          |--- main.py
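If you would rather not move the files by hand, the restructuring can be sketched in a few lines of Python. The function name move_into_src_layout is made up for this example, and the sketch only moves .py files that sit directly in the project root. Data folders and anything else would still need to be moved separately.

```python
import shutil
from pathlib import Path


def move_into_src_layout(project_root, package_name):
    """Move top-level .py files into <project_root>/src/<package_name>/."""
    root = Path(project_root)
    package_dir = root / "src" / package_name
    package_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() builds the full list first, so moving files does not
    # interfere with the directory listing.
    for source_file in sorted(root.glob("*.py")):
        shutil.move(str(source_file), str(package_dir / source_file.name))
        moved.append(source_file.name)
    return moved
```

Calling move_into_src_layout(".", "your_project_name") from the project root would create src/your_project_name/ and move main.py and some_functions.py into it, leaving pyproject.toml where it is. Try it on a copy of your project first.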
Now, to build your package, you need to run the following command in your terminal in the same directory as your pyproject.toml file:
python -m build
When you run this command, the terminal will print out a bunch of messages as it builds your package. If everything goes well, the last line should say something like Successfully built my_package-0.1.0.tar.gz and my_package-0.1.0-py3-none-any.whl. (The hyphen in the project name is written as an underscore in the file names.) This means that the build tool has successfully created distributable versions of your package.
You’ll see that a dist folder has appeared in your project directory. This folder contains the distributable versions of your package, namely a .tar.gz file and a .whl file. So, what are these files? The .tar.gz file is a source distribution of your package: it contains the source files, and pip has to build the package from them before installing it. The .whl file (a “wheel”) is a built distribution: the build step has already been done, so pip only needs to unpack it into place on the target machine.
For us, it doesn’t matter which file we use, because pip can handle both types of distributions. That said, the .whl file is faster to install because it skips the build step, so it is the preferred distribution format. I highly recommend going through this tutorial if you want to understand more about the differences between source and wheel distributions.
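Both files are ordinary archives, so you can peek inside them with the Python standard library. The sketch below uses a made-up helper name, list_distribution_contents, and relies only on the fact that a wheel is a zip archive and a source distribution is a gzip-compressed tar archive.

```python
import tarfile
import zipfile
from pathlib import Path


def list_distribution_contents(path):
    """Return the file names stored inside a .whl or .tar.gz distribution."""
    path = Path(path)
    if path.suffix == ".whl":
        # A wheel is a zip archive with a different file extension.
        with zipfile.ZipFile(path) as archive:
            return sorted(archive.namelist())
    if path.name.endswith(".tar.gz"):
        with tarfile.open(path, "r:gz") as archive:
            return sorted(archive.getnames())
    raise ValueError(f"not a wheel or source distribution: {path}")
```

After a build, you could loop over Path("dist").iterdir() and print the result for each file to compare what the two formats contain.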
Installing your package#
Now that you have built your package, you can install it on your machine using pip. To install your package, run the following command in your terminal in the same directory as your pyproject.toml file. Use the exact file name that appears in your dist folder, since older versions of setuptools keep the hyphen in the .tar.gz name:
pip install dist/my_package-0.1.0.tar.gz
Alternatively, you can install the .whl file (wheel file names use an underscore in place of the hyphen in the project name):
pip install dist/my_package-0.1.0-py3-none-any.whl
When you run one of these commands, pip will install your package on your machine. You should see a message that says something like Successfully installed my-package-0.1.0. This means that pip has successfully installed your package.
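You can also confirm the installation from Python itself. The sketch below uses the standard-library importlib.metadata module. The helper name describe_installed is made up for this example, and "my-package" is the project name from the pyproject.toml above.

```python
from importlib import metadata


def describe_installed(distribution_name):
    """Return version and declared requirements, or None if not installed."""
    try:
        distribution = metadata.distribution(distribution_name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "version": distribution.version,
        # requires is None when the package declares no dependencies.
        "requires": distribution.requires or [],
    }


print(describe_installed("my-package"))
```

If the package is installed, this prints its version together with the dependencies you declared. If it is not, it prints None.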
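Now, you can use your package in your Python scripts by importing it like any other package. Note that the import uses the name of the folder inside src (your_project_name), not the project name from pyproject.toml. For example, if you have a function called my_function available at the top level of your package, you can import it in your script as shown below. If the function is defined in a module such as some_functions.py and is not re-exported, import it from your_project_name.some_functions instead.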
from your_project_name import my_function
And that’s it! You’ve successfully packaged your code and installed it on your machine. You can now share your package with others by uploading it to a package repository like PyPI. This way, others can install your package using pip and use it in their projects.
Note
TestPyPI is a good starting point to test your package before uploading it to PyPI. It is a separate instance of the Python Package Index that allows you to test your package before making it public. The link provides a detailed guide on how to get started with TestPyPI. Once you are ready, you can upload your package to PyPI using the twine package and then, anyone can install your package using pip install my-package.