5 - Extras
In this section we briefly highlight some tools and libraries that are useful when working with PyTorch.
5.1 - HuggingFace
HuggingFace is a popular ecosystem of machine learning libraries. It provides a wide range of pre-trained models and tools for fine-tuning them on your own data.
Useful links:
- HuggingFace
- HuggingFace Hub
- HuggingFace Transformers
- PyTorch Image Models
- HuggingFace Datasets
- HuggingFace Tokenizers
- HuggingFace Accelerate
- HuggingFace Gradio
- HuggingFace Optimum
5.2 - Lightning
PyTorch Lightning is a framework built on top of PyTorch that provides a high-level interface for building and training neural networks. It simplifies many common tasks, such as data loading, logging, and checkpointing, so you can focus on the core aspects of your research.
5.3 - Git Hooks and CI/CD
Git hooks and CI/CD pipelines automate parts of the development process and help keep your code consistent and free of common errors. Git hooks run tasks, such as tests or linters, whenever you commit or push changes to your repository. CI/CD automates building, testing, and deploying your code, and helps ensure that it runs correctly across different environments.
Here we show how to use Git hooks together with pre-commit to run a series of checks on your code whenever you commit changes to your repository.
- Create a file called .pre-commit-config.yaml in the root directory of your repository and add the following lines:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: check-ast
      - id: check-yaml
      - id: check-json
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: 'v0.0.253'
    hooks:
      - id: ruff
  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black
  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.2
    hooks:
      - id: codespell
        exclude: >
          (?x)^(
              .*\.ipynb
          )$
- Create a file called ruff.toml in the root directory of your repository with the following contents:
# Enable pycodestyle (`E`) and Pyflakes (`F`) codes by default.
select = ["E", "F"]
ignore = [
    "E731",
    "F405",
    "E501",
    "E722",
    "E741",
]
# Allow autofix for all enabled rules (when `--fix` is provided).
unfixable = []
# Exclude a variety of commonly ignored directories.
exclude = [
    ".bzr",
    ".direnv",
    ".eggs",
    ".git",
    ".hg",
    ".mypy_cache",
    ".nox",
    ".pants.d",
    ".pytype",
    ".ruff_cache",
    ".svn",
    ".tox",
    ".venv",
    "__pypackages__",
    "_build",
    "buck-out",
    "build",
    "dist",
    "node_modules",
    "venv",
]
# Same as Black.
line-length = 88
# Allow unused variables when underscore-prefixed.
dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"
# Assume Python 3.9.
target-version = "py39"
[mccabe]
# Unlike Flake8, default to a complexity level of 10.
max-complexity = 10
- Then run the following commands:
# Install pre-commit
pip install pre-commit
# Install pre-commit hooks
pre-commit install
# Run pre-commit hooks
pre-commit run --all-files
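To make the mechanism concrete, here is a minimal sketch of the kind of check a hook performs. It is not the implementation used by the trailing-whitespace hook from pre-commit-hooks; the function names `lines_with_trailing_whitespace` and `check_files` and the file `sample.py` are hypothetical and chosen only for illustration. The one assumption carried over from Git itself is that a pre-commit hook aborts the commit when it exits with a non-zero status.

```python
import tempfile
from pathlib import Path


def lines_with_trailing_whitespace(text):
    """Return the 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def check_files(paths):
    """Return 0 if every file is clean, 1 otherwise.

    A real hook would pass this value to sys.exit(); Git aborts the
    commit when the hook exits with a non-zero status.
    """
    status = 0
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for number in lines_with_trailing_whitespace(text):
            print(f"{path}:{number}: trailing whitespace")
            status = 1
    return status


# Demo on a temporary file instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.py"
    sample.write_text("x = 1  \ny = 2\n", encoding="utf-8")
    exit_code = check_files([sample])

print("exit code:", exit_code)
```

In a hand-written setup, a script like this would live at .git/hooks/pre-commit and be run against the staged files. Running pre-commit install writes that hook script for you, and it then runs every check listed in .pre-commit-config.yaml.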
5.4 - GitHub Copilot
GitHub Copilot is an AI-powered code assistant that suggests code completions and generates code based on the context of your project. To use it for free, you need to sign up for the GitHub Student Developer Pack.
5.5 - Profiling Tools
PyTorch provides several profiling tools, such as torch.utils.bottleneck and torch.profiler, which can help you identify performance bottlenecks in your code, especially when the problem is related to the GPU. Other tools that can be useful for profiling include:
- cProfile - Python's built-in profiler
- pyinstrument - Call stack profiler for Python with a web interface
- SnakeViz - Visualizer for Python profiler output
- PyCharm Profiler - Python profiler integrated with PyCharm
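As a starting point, the sketch below profiles a small pure-Python workload with cProfile and prints the functions sorted by cumulative time. The functions `slow_sum` and `workload` are hypothetical stand-ins for your own code; in practice you would profile a data loading step or one training iteration instead.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately wasteful: builds a full list before summing it."""
    return sum([i * i for i in range(n)])


def workload():
    """Stand-in for the code you actually want to profile."""
    return [slow_sum(20_000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the report to a string, sorted by cumulative time, at most 10 rows.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

The same data can be saved with profiler.dump_stats("out.prof") (the file name is arbitrary) and opened in SnakeViz for a graphical view. Note that cProfile only measures time spent on the Python side, so for GPU-bound code torch.profiler is usually the more informative tool.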
5.6 - Debug Tools
When debugging PyTorch code, it is useful to know how to use the following tools:
- pdb - Python debugger
- Visual Studio Code Debug mode - Code editor debugger
- PyCharm - IDE debugger
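A common way to enter pdb is the built-in breakpoint() function, which pauses execution at the call site and opens the debugger. The sketch below guards the call behind a `debug` flag so the function runs normally unless you opt in. The function `running_mean` and the NaN check are hypothetical examples, not part of any library.

```python
import math


def running_mean(values, debug=False):
    """Return the running mean of `values`, optionally pausing on NaN input."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        if debug and math.isnan(value):
            # Opens pdb here. Inspect `count`, `value`, and `total`,
            # then type `c` to continue or `q` to quit.
            breakpoint()
        total += value
        means.append(total / count)
    return means


# With debug=False (the default) the debugger is never entered.
print(running_mean([1.0, 2.0, 3.0]))
```

Calling running_mean(data, debug=True) on input that contains a NaN would stop at the offending element. Setting the environment variable PYTHONBREAKPOINT=0 disables all breakpoint() calls without editing the code.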