================
Vendoring Policy
================

* Vendored libraries **MUST** not be modified except as required to
  successfully vendor them.
* Vendored libraries **MUST** be released copies of libraries available on
  PyPI.
* Vendored libraries **MUST** be available under a license that allows
  them to be integrated into ``pip``, which is released under the MIT license.
* Vendored libraries **MUST** be accompanied by LICENSE files.
* The versions of libraries vendored in pip **MUST** be reflected in
  ``pip/_vendor/vendor.txt``.
* Vendored libraries **MUST** function without any build steps such as ``2to3``
  or compilation of C code. In practice, this limits vendoring to single-source
  2.x/3.x, pure-Python libraries.
* Any modifications made to libraries **MUST** be noted in
  ``pip/_vendor/README.rst`` and their corresponding patches **MUST** be
  included in ``tools/vendoring/patches``.
* Vendored libraries should have corresponding ``vendored()`` entries in
  ``pip/_vendor/__init__.py``.

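
As a rough illustration of the ``vendor.txt`` rule, the sketch below reads pinned
versions out of a requirements-style file. It assumes each entry has the form
``name==version``, optionally indented and followed by a ``#`` comment. The
helper ``parse_pins`` and the sample library names are hypothetical and are not
part of pip.

```python
import re

# Assumption: every pinned entry looks like "name==version"; anything else
# (comments, blank lines) is ignored.
PIN_RE = re.compile(r"^[ \t]*([A-Za-z0-9_.\-]+)==([^\s#]+)")


def parse_pins(text):
    """Return a {name: version} dict for every pinned line in ``text``."""
    pins = {}
    for line in text.splitlines():
        match = PIN_RE.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# pinned libraries
examplelib==1.2.3
    helperlib==0.4.0  # dependency of examplelib
"""

pins = parse_pins(SAMPLE)
print(pins)
```

A check like this could be compared against the versions actually present in
the vendored tree to see whether the two have drifted apart.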

Rationale
=========

Historically pip has not had any dependencies except for ``setuptools`` itself,
choosing instead to implement any functionality it needed to prevent needing
a dependency. However, starting with pip 1.5, we began to replace code that was
implemented inside of pip with reusable libraries from PyPI. This brought the
typical benefits of reusing libraries instead of reinventing the wheel like
higher quality and more battle tested code, centralization of bug fixes
(particularly security sensitive ones), and better/more features for less work.

However, there are several issues with having dependencies in the traditional
way (via ``install_requires``) for pip. These issues are:

**Fragility**
   When pip depends on another library to function then if for whatever reason
   that library either isn't installed or an incompatible version is installed
   then pip ceases to function. This is of course true for all Python
   applications, however for every application *except* for pip the way you fix
   it is by re-running pip. Obviously, when pip can't run, you can't use pip to
   fix pip, so you're left having to manually resolve dependencies and
   installing them by hand.

**Making other libraries uninstallable**
   One of pip's current dependencies is the ``requests`` library, for which pip
   requires a fairly recent version to run. If pip depended on ``requests`` in
   the traditional manner, then we'd either have to maintain compatibility with
   every ``requests`` version that has ever existed (and ever will), OR allow
   pip to render certain versions of ``requests`` uninstallable. (The second
   issue, although technically true for any Python application, is magnified by
   pip's ubiquity; pip is installed by default in Python, in ``pyvenv``, and in
   ``virtualenv``.)

**Security**
   This might seem puzzling at first glance, since vendoring has a tendency to
   complicate updating dependencies for security updates, and that holds true
   for pip. However, given the *other* reasons for avoiding dependencies, the
   alternative is for pip to reinvent the wheel itself. This is what pip did
   historically. It forced pip to re-implement its own HTTPS verification
   routines as a workaround for the Python standard library's lack of SSL
   validation, which resulted in similar bugs in the validation routine in
   ``requests`` and ``urllib3``, except that they had to be discovered and
   fixed independently. Even though we're vendoring, reusing libraries keeps
   pip more secure by relying on the great work of our dependencies, *and*
   allowing for faster, easier security fixes by simply pulling in newer
   versions of dependencies.

**Bootstrapping**
   Currently most popular methods of installing pip rely on pip's
   self-contained nature to install pip itself. These tools work by bundling a
   copy of pip, adding it to ``sys.path``, and then executing that copy of pip.
   This is done instead of implementing a "mini installer" (to reduce
   duplication); pip already knows how to install a Python package, and is far
   more battle-tested than any "mini installer" could ever possibly be.

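
The bootstrapping approach described above (bundle a copy, add it to
``sys.path``, execute it) can be sketched with the standard library alone. This
is only an illustration of the mechanism, not the code of any actual installer:
the archive name, the ``toyinstaller`` module, and the ``run_bundled`` helper
are all hypothetical stand-ins.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def run_bundled(archive_path, module_name, func_name):
    """Put the archive first on sys.path, import the module, call a function."""
    sys.path.insert(0, archive_path)
    try:
        module = importlib.import_module(module_name)
        return getattr(module, func_name)()
    finally:
        # Leave sys.path as we found it once the bundled code has run.
        sys.path.remove(archive_path)


# Build a stand-in "bundled" archive. A wheel is a zip file, and Python can
# import pure-Python modules directly from zip archives on sys.path.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "toyinstaller-1.0-py3-none-any.whl")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(
        "toyinstaller.py",
        "def main():\n    return 'installed from bundle'\n",
    )

result = run_bundled(archive, "toyinstaller", "main")
print(result)
```

Because the archive needs nothing beyond the interpreter, this only works when
the bundled code carries all of its own dependencies, which is the property
vendoring provides.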
Many downstream redistributors have policies against this kind of bundling, and
instead opt to patch the software they distribute to debundle it and make it
rely on the global versions of the software that they already have packaged
(which may have its own patches applied to it). We (the pip team) would prefer
that pip *not* be debundled in this manner, for the reasons above, and that it
instead be left intact as it is now.

In the longer term, if someone has a *portable* solution to the above problems,
other than the bundling method we currently use, that doesn't add unreasonable
additional problems, then we would be happy to consider, and possibly switch
to, said method. This solution must function correctly across all of the
situations in which we expect pip to be used and must not mandate some external
mechanism such as OS packages.


Modifications
=============

* ``setuptools`` is completely stripped to only keep ``pkg_resources``.
* ``pkg_resources`` has been modified to import its dependencies from
  ``pip._vendor``, and to use the vendored copy of ``platformdirs``
  rather than ``appdirs``.
* ``packaging`` has been modified to import its dependencies from
  ``pip._vendor``.
* ``CacheControl`` has been modified to import its dependencies from
  ``pip._vendor``.
* ``requests`` has been modified to import its other dependencies from
  ``pip._vendor`` and to *not* load ``simplejson`` (all platforms) and
  ``pyopenssl`` (Windows).
* ``platformdirs`` has been modified to import its submodules from
  ``pip._vendor.platformdirs``.

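
To make "import its dependencies from ``pip._vendor``" concrete, here is a
minimal sketch of that kind of import rewrite. It is a simplified regex-based
illustration and makes no claim to match how the patches or tooling actually
perform the change. The ``rewrite_imports`` helper is hypothetical, and it only
handles plain ``import x`` and ``from x... import y`` lines.

```python
import re


def rewrite_imports(source, library, namespace="pip._vendor"):
    """Point ``import library`` / ``from library ...`` lines at ``namespace``."""
    name = re.escape(library)
    # "from library[.sub] import x" -> "from namespace.library[.sub] import x"
    source = re.sub(
        rf"^([ \t]*)from {name}(\.|\s)",
        rf"\1from {namespace}.{library}\2",
        source,
        flags=re.MULTILINE,
    )
    # "import library" -> "from namespace import library"
    source = re.sub(
        rf"^([ \t]*)import {name}[ \t]*$",
        rf"\1from {namespace} import {library}",
        source,
        flags=re.MULTILINE,
    )
    return source


SAMPLE = "import urllib3\nfrom urllib3.util import Retry\nimport os\n"
rewritten = rewrite_imports(SAMPLE, "urllib3")
print(rewritten)
```

Imports of other modules (``import os`` in the sample) are left untouched, so
only the dependency being redirected changes.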

Automatic Vendoring
===================

Vendoring is automated via the `vendoring <https://pypi.org/project/vendoring/>`_ tool from the content of
``pip/_vendor/vendor.txt`` and the different patches in
``tools/vendoring/patches``.
Launch it via ``vendoring sync . -v`` (requires ``vendoring>=0.2.2``).
Tool configuration is done via ``pyproject.toml``.

To update the vendored library versions, we have a session defined in ``nox``.
The command to upgrade everything is::

    nox -s vendoring -- --upgrade-all --skip urllib3 --skip setuptools

At the time of writing (April 2025) we do not upgrade ``urllib3`` because the
next version is a major upgrade and will be handled as an independent PR. We also
do not upgrade ``setuptools``, because we only rely on ``pkg_resources``, and
tracking every ``setuptools`` change is unnecessary for our needs.


Managing Local Patches
======================

The ``vendoring`` tool automatically applies our local patches, but when
updating, the patches sometimes no longer apply cleanly. In that case, the
update will fail. To resolve this, take the following steps:

1. Revert any incomplete changes in the revendoring branch, to ensure you have
   a clean starting point.
2. Run the revendoring of the library with a problem again: ``nox -s vendoring
   -- --upgrade <library_name>``.
3. This will fail again, but you will have the original source in your working
   directory. Review the existing patch against the source, and modify the patch
   to reflect the new version of the source. If you ``git add`` the changes the
   vendoring made, you can modify the source to reflect the patch file and then
   generate a new patch with ``git diff``.
4. Now, revert everything *except* the patch file changes. Leave the modified
   patch file unstaged but saved in the working tree.
5. Re-run the vendoring. This time, it should pick up the changed patch file
   and apply it cleanly. The patch file changes will be committed along with the
   revendoring, so the new commit should be ready to test and publish as a PR.


Debundling
==========

As mentioned in the rationale, we, the pip team, would prefer it if pip was not
debundled (other than optionally ``pip/_vendor/requests/cacert.pem``) and that
pip was left intact. However, if you insist on doing so, we have a
semi-supported method (that we don't test in our CI) that requires a bit of
extra work on your end in order to solve the problems described above.

1. Delete everything in ``pip/_vendor/`` **except** for
   ``pip/_vendor/__init__.py`` and ``pip/_vendor/vendor.txt``.
2. Generate wheels for each of pip's dependencies (and any of their
   dependencies) using your patched copies of these libraries. These must be
   placed somewhere on the filesystem that pip can access (``pip/_vendor`` is
   the default assumption).
3. Modify ``pip/_vendor/__init__.py`` so that the ``DEBUNDLED`` variable is
   ``True``.
4. Upon installation, the ``INSTALLER`` file in pip's own ``dist-info``
   directory should be set to something other than ``pip``, so that pip
   can detect that it wasn't installed using itself.
5. *(optional)* If you've placed the wheels in a location other than
   ``pip/_vendor/``, then modify ``pip/_vendor/__init__.py`` so that the
   ``WHEEL_DIR`` variable points to the location you've placed them.
6. *(optional)* Update the ``pip_self_version_check`` logic to use the
   appropriate logic for determining the latest available version of pip and
   prompt the user with the correct upgrade message.

Note that partial debundling is **NOT** supported. You need to prepare wheels
for all dependencies for successful debundling.
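
The steps above imply two mechanisms: wheels found in ``WHEEL_DIR`` become
importable, and globally importable modules are exposed under the vendored
namespace. The sketch below shows one way such a mechanism could work. It is
an assumption-laden illustration, not a copy of ``pip/_vendor/__init__.py``:
the helpers ``add_wheels_to_path`` and ``alias_module`` and the ``toy_vendor``
namespace are hypothetical.

```python
import glob
import importlib
import os
import sys
import tempfile
import types


def add_wheels_to_path(wheel_dir):
    """Prepend every ``*.whl`` found in ``wheel_dir`` to sys.path."""
    wheels = sorted(glob.glob(os.path.join(wheel_dir, "*.whl")))
    sys.path[:] = wheels + sys.path
    return wheels


def alias_module(namespace, modulename):
    """Expose the global ``modulename`` as ``namespace.modulename``.

    Returns False (and changes nothing) when the module cannot be imported.
    """
    try:
        module = importlib.import_module(modulename)
    except ImportError:
        return False
    sys.modules[f"{namespace}.{modulename}"] = module
    setattr(sys.modules[namespace], modulename, module)
    return True


# Demo: an empty directory stands in for WHEEL_DIR, and the standard-library
# "json" module stands in for a debundled dependency.
wheels = add_wheels_to_path(tempfile.mkdtemp())
sys.modules["toy_vendor"] = types.ModuleType("toy_vendor")
aliased = alias_module("toy_vendor", "json")

import toy_vendor.json as vendored_json  # resolved through the alias

print(aliased, vendored_json.dumps({"debundled": True}))
```

With aliases like these in place, code written against the vendored import
path can keep working unchanged. A dependency without a wheel would simply
fail to import, which is consistent with the note that partial debundling is
not supported.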