
Creating a Python Package

Tips

If you want to ship sample datasets for users to try out with your library, you can host them on GitHub in the same repository as your package and share the raw.githubusercontent.com URL. pandas can read a CSV directly from it.

prism_df = pd.read_csv(
    "https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/prism.csv"
)
prism_df.head()

Flow

  1. Need a Source Tree
  2. pyproject.toml
  3. Need to create build artifacts
    • sdist (source distributions)
    • wheels (built distributions) ← Often just one for a pure Python package
  4. Upload to package distribution service (PDS)
Using the Package

The user runs pip install, which will

  1. Download one of the package's build artifacts from the PDS
  2. Install it in their Python environment
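After pip install finishes, you can confirm what landed in the environment. A small sketch using the stdlib importlib.metadata (Python 3.8+); "pip" is used here only because it is certain to be installed:

```python
# Query the installed version of a distribution, if any.
from importlib.metadata import PackageNotFoundError, version

def installed_version(name):
    """Return the installed version of a distribution, or None."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None

print(installed_version("pip"))              # e.g. "24.0"
print(installed_version("no-such-package"))  # None
```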

Config File

  • The config depends on the tool used to create the build artifacts
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
  • I used Setuptools as the build backend (for no particular reason).
  • A build frontend, e.g. pip, can run the chosen backend1

Use this guide for writing pyproject.toml: https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#writing-pyproject-toml

Build Artifacts

python3 -m build --sdist source-tree-directory
  • Allows users to build the package from source.
python3 -m build --wheel source-tree-directory
  • Contains only the files needed in the end user's Python environment.
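A wheel's filename itself encodes compatibility as {name}-{version}-{python tag}-{abi tag}-{platform tag}.whl. A quick sketch for the simple case (no build tag, no hyphens inside fields); the ganaka-0.0.1 name is just the example package from later in these notes:

```python
# Split a wheel filename into its compatibility tags.
def parse_wheel_filename(filename):
    stem = filename[: -len(".whl")]
    name, version, py_tag, abi_tag, plat_tag = stem.split("-")
    return {"name": name, "version": version,
            "python": py_tag, "abi": abi_tag, "platform": plat_tag}

tags = parse_wheel_filename("ganaka-0.0.1-py3-none-any.whl")
print(tags["python"], tags["abi"], tags["platform"])  # py3 none any
```

The py3-none-any combination is what marks a pure Python wheel: any Python 3, no compiled ABI, any platform. That is why a pure Python package usually needs just one wheel.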

Upload

twine upload dist/package-name-version.tar.gz dist/package-name-version-py3-none-any.whl

After this you can install the package using pip, the build frontend.

Steps

Create a new isolated Python environment for all your build dependencies.

conda create --name <environ-name> python=3.10
conda activate <environ-name>
  • Press y to install the base packages.
pip install setuptools wheel twine

After creating the required __init__.py file, build the distributions.

python setup.py sdist bdist_wheel

Notes

  • The __init__.py file in the package directory tells Python that the whole directory should be treated as a Python package.
    • It controls what can be imported through the package. Whatever is imported into this init file will be available directly through the package.
  • Configure the setup.py file to bundle the package (if you use one).
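The re-export behaviour of __init__.py can be demonstrated end to end. A self-contained sketch that builds a throwaway package (hypothetical name demo_pkg) in a temp directory and imports it:

```python
# An __init__.py controls what a package re-exports at its top level.
import sys
import tempfile
from pathlib import Path

pkg_root = Path(tempfile.mkdtemp())
pkg = pkg_root / "demo_pkg"
pkg.mkdir()
(pkg / "core.py").write_text("def answer():\n    return 42\n")
# Re-export core.answer so it is available directly through the package:
(pkg / "__init__.py").write_text("from demo_pkg.core import answer\n")

sys.path.insert(0, str(pkg_root))
import demo_pkg

print(demo_pkg.answer())  # 42, without touching demo_pkg.core
```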

Publishing for Private Repositories

Project structure:

.git/
pyproject.toml #(1)
  1. Should be at the same level as the .git folder

The pyproject.toml file:

[build-system]
# 1. Specify the required build dependencies
# Setuptools (the build backend) and setuptools-scm (for dynamic versioning) are essential.
requires = ["setuptools>=61.0.0", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"

# --- PROJECT METADATA ---
[project]
# 2. Package Identity
name = "ganaka"
authors = [
    {name = "Hursh Gupta", email = "hurshgupta1@gmail.com"},
]
description = "Bridging the gap."
readme = "README.md"
license = {file = "LICENSE"}
keywords = ["actuarial"]

# 3. Dynamic Versioning (The key for auto-versioning)
# The version will be determined at build time by setuptools_scm based on Git tags.
dynamic = ["version"]

# 4. Standard Python Requirements
requires-python = ">=3.8"
classifiers = [
    "Development Status :: 3 - Alpha",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
]

# 5. Core Dependencies
dependencies = [
    "requests>=2.28.1",
    "numpy",
]

# 6. Optional Dependencies (e.g., for development, testing, or specific features)
[project.optional-dependencies]
dev = [
    "pytest>=7.0",
    "coverage",
    "flake8",
    "black",
    "isort",
]

# 7. Entry Points (If your package provides command-line scripts)
[project.scripts]
my-cli-command = "ganaka.cli:main"


# --- BUILD TOOL CONFIGURATION ---

# 8. Setuptools Configuration (Optional but recommended for src layout)
[tool.setuptools]
# Tells setuptools to look for packages inside the 'lib' directory.
package-dir = {"" = "lib"} #(1)

# 9. Setuptools SCM Configuration (For versioning)
[tool.setuptools_scm]
# Optional: Writes the dynamically generated version to an internal file.
# You can then access it in your code as: from my_package_name._version import version as __version__
write_to = "lib/_version.py"


# --- CODE FORMATTING & LINTING TOOLS (Optional) ---

# 10. Black (Code Formatter)
[tool.black]
line-length = 88
target-version = ['py38', 'py39', 'py310', 'py311']
  1. Specifies where in the (mono)repo the library's source code lives, for the case where the pyproject.toml file sits in a parent directory of the source.
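The [project.scripts] table above maps the command my-cli-command to ganaka.cli:main, meaning the build expects a main() callable in ganaka/cli.py (here under lib/, given the package-dir mapping). A hedged sketch of what such an entry-point module might look like; the behaviour is purely illustrative:

```python
# Sketch of an entry-point module for [project.scripts].
# The console script calls main() with no arguments and uses its
# return value as the process exit code.
import sys

def main(argv=None):
    """Entry point: parse args and return an exit code."""
    argv = sys.argv[1:] if argv is None else argv
    print("ganaka:", " ".join(argv) if argv else "no args")
    return 0

if __name__ == "__main__":
    raise SystemExit(main())
```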

After this, we need to upload the files.

Setup.py

We didn't have to create a setup.py file because we don't really need one. With my limited understanding, I think people don't use it much for basic use cases and instead rely directly on build and the pyproject.toml file.

Now, note that PyPI enforces a standard for version numbers. Since we have already set the versioning to dynamic, build will take care of it. What we do need to do is create a Git tag that tells the build which versions are stable and which are not. When we create a Git tag, we are indicating that the last commit can be released to PyPI. So, let's do that first:

git tag -a v0.0.1 -m "Release notes here..."
python -m build

Now the next build will take version v0.0.1. And if we make new builds again just to test out our package and accidentally run the twine upload command, PyPI simply won't accept them (which is a good thing), so only stable, tagged builds get released.

twine upload ./dist/*

Automating this process

Still figuring this out using GitHub workflows. One downside I found is that it doesn't work on private repositories.


  1. The build backend takes the source tree and builds an sdist or wheel. All backends expose a standardized interface.