| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180 |
- ================
- Vendoring Policy
- ================
- * Vendored libraries **MUST** not be modified except as required to
- successfully vendor them.
- * Vendored libraries **MUST** be released copies of libraries available on
- PyPI.
- * Vendored libraries **MUST** be available under a license that allows
- them to be integrated into ``pip``, which is released under the MIT license.
- * Vendored libraries **MUST** be accompanied with LICENSE files.
- * The versions of libraries vendored in pip **MUST** be reflected in
- ``pip/_vendor/vendor.txt``.
- * Vendored libraries **MUST** function without any build steps such as ``2to3``
- or compilation of C code, practically this limits to single source 2.x/3.x and
- pure Python.
- * Any modifications made to libraries **MUST** be noted in
- ``pip/_vendor/README.rst`` and their corresponding patches **MUST** be
- included ``tools/vendoring/patches``.
- * Vendored libraries should have corresponding ``vendored()`` entries in
- ``pip/_vendor/__init__.py``.
- Rationale
- =========
- Historically pip has not had any dependencies except for ``setuptools`` itself,
- choosing instead to implement any functionality it needed to prevent needing
- a dependency. However, starting with pip 1.5, we began to replace code that was
- implemented inside of pip with reusable libraries from PyPI. This brought the
- typical benefits of reusing libraries instead of reinventing the wheel like
- higher quality and more battle tested code, centralization of bug fixes
- (particularly security sensitive ones), and better/more features for less work.
- However, there are several issues with having dependencies in the traditional
- way (via ``install_requires``) for pip. These issues are:
- **Fragility**
- When pip depends on another library to function then if for whatever reason
- that library either isn't installed or an incompatible version is installed
- then pip ceases to function. This is of course true for all Python
- applications, however for every application *except* for pip the way you fix
- it is by re-running pip. Obviously, when pip can't run, you can't use pip to
- fix pip, so you're left having to manually resolve dependencies and
- installing them by hand.
- **Making other libraries uninstallable**
- One of pip's current dependencies is the ``requests`` library, for which pip
- requires a fairly recent version to run. If pip depended on ``requests`` in
- the traditional manner, then we'd either have to maintain compatibility with
- every ``requests`` version that has ever existed (and ever will), OR allow
- pip to render certain versions of ``requests`` uninstallable. (The second
- issue, although technically true for any Python application, is magnified by
- pip's ubiquity; pip is installed by default in Python, in ``pyvenv``, and in
- ``virtualenv``.)
- **Security**
- This might seem puzzling at first glance, since vendoring has a tendency to
- complicate updating dependencies for security updates, and that holds true
- for pip. However, given the *other* reasons for avoiding dependencies, the
- alternative is for pip to reinvent the wheel itself. This is what pip did
- historically. It forced pip to re-implement its own HTTPS verification
- routines as a workaround for the Python standard library's lack of SSL
- validation, which resulted in similar bugs in the validation routine in
- ``requests`` and ``urllib3``, except that they had to be discovered and
- fixed independently. Even though we're vendoring, reusing libraries keeps
- pip more secure by relying on the great work of our dependencies, *and*
- allowing for faster, easier security fixes by simply pulling in newer
- versions of dependencies.
- **Bootstrapping**
- Currently most popular methods of installing pip rely on pip's
- self-contained nature to install pip itself. These tools work by bundling a
- copy of pip, adding it to ``sys.path``, and then executing that copy of pip.
- This is done instead of implementing a "mini installer" (to reduce
- duplication); pip already knows how to install a Python package, and is far
- more battle-tested than any "mini installer" could ever possibly be.
- Many downstream redistributors have policies against this kind of bundling, and
- instead opt to patch the software they distribute to debundle it and make it
- rely on the global versions of the software that they already have packaged
- (which may have its own patches applied to it). We (the pip team) would prefer
- it if pip was *not* debundled in this manner due to the above reasons and
- instead we would prefer it if pip would be left intact as it is now.
- In the longer term, if someone has a *portable* solution to the above problems,
- other than the bundling method we currently use, that doesn't add additional
- problems that are unreasonable then we would be happy to consider, and possibly
- switch to said method. This solution must function correctly across all of the
- situation that we expect pip to be used and not mandate some external mechanism
- such as OS packages.
- Modifications
- =============
- * ``setuptools`` is completely stripped to only keep ``pkg_resources``.
- * ``pkg_resources`` has been modified to import its dependencies from
- ``pip._vendor``, and to use the vendored copy of ``platformdirs``
- rather than ``appdirs``.
- * ``packaging`` has been modified to import its dependencies from
- ``pip._vendor``.
- * ``CacheControl`` has been modified to import its dependencies from
- ``pip._vendor``.
- * ``requests`` has been modified to import its other dependencies from
- ``pip._vendor`` and to *not* load ``simplejson`` (all platforms) and
- ``pyopenssl`` (Windows).
- * ``platformdirs`` has been modified to import its submodules from ``pip._vendor.platformdirs``.
- Automatic Vendoring
- ===================
- Vendoring is automated via the `vendoring <https://pypi.org/project/vendoring/>`_ tool from the content of
- ``pip/_vendor/vendor.txt`` and the different patches in
- ``tools/vendoring/patches``.
- Launch it via ``vendoring sync . -v`` (requires ``vendoring>=0.2.2``).
- Tool configuration is done via ``pyproject.toml``.
- To update the vendored library versions, we have a session defined in ``nox``.
- The command to upgrade everything is::
- nox -s vendoring -- --upgrade-all --skip urllib3 --skip setuptools
- At the time of writing (April 2025) we do not upgrade ``urllib3`` because the
- next version is a major upgrade and will be handled as an independent PR. We also
- do not upgrade ``setuptools``, because we only rely on ``pkg_resources``, and
- tracking every ``setuptools`` change is unnecessary for our needs.
- Managing Local Patches
- ======================
- The ``vendoring`` tool automatically applies our local patches, but updating,
- the patches sometimes no longer apply cleanly. In that case, the update will
- fail. To resolve this, take the following steps:
- 1. Revert any incomplete changes in the revendoring branch, to ensure you have
- a clean starting point.
- 2. Run the revendoring of the library with a problem again: ``nox -s vendoring
- -- --upgrade <library_name>``.
- 3. This will fail again, but you will have the original source in your working
- directory. Review the existing patch against the source, and modify the patch
- to reflect the new version of the source. If you ``git add`` the changes the
- vendoring made, you can modify the source to reflect the patch file and then
- generate a new patch with ``git diff``.
- 4. Now, revert everything *except* the patch file changes. Leave the modified
- patch file unstaged but saved in the working tree.
- 5. Re-run the vendoring. This time, it should pick up the changed patch file
- and apply it cleanly. The patch file changes will be committed along with the
- revendoring, so the new commit should be ready to test and publish as a PR.
- Debundling
- ==========
- As mentioned in the rationale, we, the pip team, would prefer it if pip was not
- debundled (other than optionally ``pip/_vendor/requests/cacert.pem``) and that
- pip was left intact. However, if you insist on doing so, we have a
- semi-supported method (that we don't test in our CI) and requires a bit of
- extra work on your end in order to solve the problems described above.
- 1. Delete everything in ``pip/_vendor/`` **except** for
- ``pip/_vendor/__init__.py`` and ``pip/_vendor/vendor.txt``.
- 2. Generate wheels for each of pip's dependencies (and any of their
- dependencies) using your patched copies of these libraries. These must be
- placed somewhere on the filesystem that pip can access (``pip/_vendor`` is
- the default assumption).
- 3. Modify ``pip/_vendor/__init__.py`` so that the ``DEBUNDLED`` variable is
- ``True``.
- 4. Upon installation, the ``INSTALLER`` file in pip's own ``dist-info``
- directory should be set to something other than ``pip``, so that pip
- can detect that it wasn't installed using itself.
- 5. *(optional)* If you've placed the wheels in a location other than
- ``pip/_vendor/``, then modify ``pip/_vendor/__init__.py`` so that the
- ``WHEEL_DIR`` variable points to the location you've placed them.
- 6. *(optional)* Update the ``pip_self_version_check`` logic to use the
- appropriate logic for determining the latest available version of pip and
- prompt the user with the correct upgrade message.
- Note that partial debundling is **NOT** supported. You need to prepare wheels
- for all dependencies for successful debundling.
|