Abstract

On Windows, building numpy yourself removes the 36 MiB openblas dependency, and the build is not as hard as you might expect.
Preface

I recently needed to package a tool that does a bit of numpy computation. The first pyinstaller run showed one enormous openblas dependency:
```
 36.4 MiB [##########] libopenblas64__v0.3.23-246-g3d31191b-gcc_10_3_0.dll
 10.6 MiB [##        ] main.exe
  5.5 MiB [#         ] python311.dll
  4.9 MiB [#         ] /numpy
  4.8 MiB [#         ] /pydantic_core
...
```
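I have a soft spot for the powerful tools of the past that did a lot in just a few megabytes, and this program does not need much performance from numpy, so I tried to get rid of the dependency. There is plenty of discussion online about packaging and shrinking packages, but it mostly boils down to swapping mkl for openblas, and openblas is large enough already.

A search turned up a third-party source of builds without openblas or mkl, but it was last updated in 2022 and is still at version 1.22.4, which is a bit old. Since a third-party build already exists, I figured doing it myself should not be too hard either.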
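Preparation

Compiler

To build numpy on Windows, just download the Microsoft C++ Build Tools, launch the installer, tick "Desktop development with C++", and wait for it to finish. There is no need to set any environment variables; numpy's build takes care of that later.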
Source code

```
git clone --recurse-submodules https://github.com/numpy/numpy.git
cd numpy
git checkout maintenance/1.24.x
```
Disabling openblas

Add a file named site.cfg under numpy/distutils/ with the following content:
```
[openblas]
libraries =
library_dirs =
include_dirs =
```
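A full description of this file can be found in site.cfg.example in the repository root.

According to the official documentation, openblas can also be disabled by setting environment variables.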
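The site.cfg step can also be scripted. Below is a minimal sketch using the standard-library configparser, assuming it is run from the root of the numpy checkout. The helper name `write_site_cfg` is hypothetical and not part of numpy.

```python
import configparser
from pathlib import Path


def write_site_cfg(target_dir: Path) -> Path:
    """Write a site.cfg whose [openblas] section has only empty values."""
    cfg = configparser.ConfigParser()
    # Empty values mirror the hand-written file: no libraries, no search paths.
    cfg["openblas"] = {"libraries": "", "library_dirs": "", "include_dirs": ""}
    path = target_dir / "site.cfg"
    with path.open("w", encoding="utf-8") as fh:
        cfg.write(fh)
    return path


# Example (assumes the current directory is the numpy source root):
# write_site_cfg(Path("numpy") / "distutils")
```

Reading the file back with configparser is a quick way to confirm that all three keys are present and empty before starting a long build.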
Virtual environment and dependencies

```
python -m venv .#env
./.
pip install -r build_requirements.txt
```

From here on, commands are assumed to run inside the .#env virtual environment.
Build

Up to and including 1.24.x, numpy can be built with bare MSVC, like this:

```
python setup.py build -j 16
```
From 1.25 onward, however, this approach generates a single compile command longer than 32768 characters on Windows and then fails silently.
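To make the length problem concrete: Windows documents a limit of 32,767 characters (including the terminating null) on the command line passed to CreateProcess. The sketch below checks whether an argument list would exceed it. The helper name `exceeds_windows_limit` is hypothetical, and `subprocess.list2cmdline` only approximates how a particular build tool joins its arguments, so treat the result as a rough diagnostic.

```python
import subprocess

# CreateProcess limit: 32,767 characters, including the terminating null.
WINDOWS_CMDLINE_LIMIT = 32767


def exceeds_windows_limit(args: list[str]) -> bool:
    """Return True if the joined command line would not fit in CreateProcess."""
    # list2cmdline quotes and joins arguments the way subprocess does on Windows.
    cmdline = subprocess.list2cmdline(args)
    # Add 1 for the terminating null character.
    return len(cmdline) + 1 > WINDOWS_CMDLINE_LIMIT


# Example with made-up object file names: thousands of inputs on one line.
# exceeds_windows_limit(["link.exe"] + [f"obj{i:05d}.obj" for i in range(4000)])
```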
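A workable alternative is to switch to cibuildwheel.

Edit pyproject.toml, find the [tool.cibuildwheel] table, and put a # in front of the before-build, before-test, and test-command lines (deleting them works too). The result looks like this: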
```
[tool.cibuildwheel]
skip = "cp36-* cp37-* pp37-* *-manylinux_i686 *_ppc64le *_s390x *-musllinux_aarch64"
build-verbosity = "3"
```
The before-build script downloads openblas, and I could not get the two test entries to run on my machine.
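The pyproject.toml edit is easy to do by hand, but it can also be scripted if the build is repeated often. This is a minimal sketch that assumes each of the three keys sits on a single line, as described above. The helper name `comment_out_keys` is hypothetical, and the function works on plain text, not on a parsed TOML document.

```python
def comment_out_keys(toml_text: str, section: str, keys: set[str]) -> str:
    """Prefix `key = ...` lines inside one TOML table with '# '."""
    out = []
    in_section = False
    for line in toml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A table header: only the exact table name counts, so
            # sub-tables such as [tool.cibuildwheel.linux] are left alone.
            in_section = stripped == f"[{section}]"
        elif in_section and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in keys:
                line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


# Example (run from the numpy source root):
# from pathlib import Path
# path = Path("pyproject.toml")
# text = path.read_text(encoding="utf-8")
# wanted = {"before-build", "before-test", "test-command"}
# path.write_text(comment_out_keys(text, "tool.cibuildwheel", wanted), encoding="utf-8")
```

Lines that are already commented out are skipped, so running it twice does not stack up extra # characters.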
Then build from PowerShell:

```
$env:CIBW_BUILD="cp311-win_amd64"
$env:CIBW_ENVIRONMENT="NPY_USE_BLAS_ILP64=0"
cibuildwheel --platform windows
```
If all goes well, a whl file should appear in the wheelhouse directory; that is the one we want.

```
-rwxrwxrwx 1 root root 6.1M Aug  7 23:37 wheelhouse/numpy-1.25.2-cp310-cp310-win_amd64.whl
-rwxrwxrwx 1 root root 6.1M Aug  7 23:44 wheelhouse/numpy-1.25.2-cp311-cp311-win_amd64.whl
```
Testing

After installing the whl, take a look at the debug output:

```
> python -c "import numpy as np; np.show_config(); print(np.__version__)"
blas_armpl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
blas_ssl2_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
accelerate_info:
  NOT AVAILABLE
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
atlas_blas_threads_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
blas_info:
  NOT AVAILABLE
blas_src_info:
  NOT AVAILABLE
blas_opt_info:
  NOT AVAILABLE
lapack_armpl_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
lapack_ssl2_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
openblas_clapack_info:
  NOT AVAILABLE
flame_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
atlas_threads_info:
  NOT AVAILABLE
atlas_info:
  NOT AVAILABLE
lapack_info:
  NOT AVAILABLE
lapack_src_info:
  NOT AVAILABLE
lapack_opt_info:
  NOT AVAILABLE
numpy_linalg_lapack_lite:
    language = c
    define_macros = [('HAVE_BLAS_ILP64', None), ('BLAS_SYMBOL_SUFFIX', '64_')]
Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
    not found = AVX512F,AVX512CD,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
1.25.2
```
Good, no openblas at all. Now run the tests:

```
> python runtests.py -v --no-build
...
=============== 35028 passed, 1100 skipped, 1308 deselected, 29 xfailed, 2 xpassed in 345.30s (0:05:45) ===============
```
For a regular binary install, the output is instead:

```
> python -c "import numpy as np; np.show_config(); print(np.__version__)"
openblas64__info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['openblas\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['openblas\\lib']
blas_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['openblas\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['openblas\\lib']
openblas64__lapack_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['openblas\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['openblas\\lib']
lapack_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['openblas\\lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['openblas\\lib']
Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
    not found = AVX512F,AVX512CD,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
1.25.2
```
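Besides reading the show_config() output, it can be reassuring to look for the DLL on disk. This is a minimal sketch, assuming that any bundled openblas library has "openblas" somewhere in its file name, as in the listing at the top of this post. The helper name `find_openblas` is hypothetical.

```python
from pathlib import Path


def find_openblas(root: Path) -> list[Path]:
    """Return every file under root whose name contains 'openblas'."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and "openblas" in p.name.lower()
    )


# Example: scan the directory that contains the installed numpy package.
# import numpy
# print(find_openblas(Path(numpy.__file__).parent.parent))
```

Pointed at the site-packages directory or at a pyinstaller output directory, this should return an empty list for the openblas-free build. Other packages that ship their own openblas would show up in the result as well.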
Packaging

```
import numpy as np

print(np.random.rand(5, 5))

np.show_config()

print(np.__version__)
```

With the openblas-free install, pyinstaller test.py produces:

```
    6.9 MiB [##########] /numpy
    5.5 MiB [#######   ] python311.dll
    3.9 MiB [#####     ] test.exe
    3.3 MiB [####      ] libcrypto-1_1.dll
    1.7 MiB [##        ] base_library.zip
    1.1 MiB [#         ] unicodedata.pyd
  996.0 KiB [#         ] ucrtbase.dll
...
 Total disk usage:  26.1 MiB  Apparent size:  25.9 MiB  Items: 79
```
With the default install, the result is:

```
   36.4 MiB [##########] libopenblas64__v0.3.23-246-g3d31191b-gcc_10_3_0.dll
    5.5 MiB [#         ] python311.dll
    4.9 MiB [#         ] /numpy
    3.9 MiB [#         ] test.exe
    3.3 MiB [          ] libcrypto-1_1.dll
    1.7 MiB [          ] base_library.zip
    1.1 MiB [          ] unicodedata.pyd
  996.0 KiB [          ] ucrtbase.dll
...
 Total disk usage:  60.6 MiB  Apparent size:  60.5 MiB  Items: 81
```
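That is a difference of roughly 35 MiB, which is substantial. Without openblas, my tool fits in a 15 MiB single-file executable. That is still some way from the few megabytes I had in mind, but this is Python after all, and I am quite satisfied.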
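For a rough equivalent of the size listings without a dedicated disk-usage tool, the sketch below sums file sizes for each top-level entry of a directory and sorts them, largest first. The helper names `entry_size` and `top_entries` are hypothetical, and the numbers are apparent sizes, so they may differ slightly from a tool that reports actual disk usage.

```python
from pathlib import Path


def entry_size(path: Path) -> int:
    """Apparent size in bytes of a file, or of all files below a directory."""
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def top_entries(root: Path, n: int = 10) -> list[tuple[str, int]]:
    """Return the n largest top-level entries of root as (name, bytes) pairs."""
    sizes = [(child.name, entry_size(child)) for child in root.iterdir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:n]


# Example (the path is an assumption; pyinstaller writes to dist/<name> by default):
# for name, size in top_entries(Path("dist") / "test"):
#     print(f"{size / 2**20:8.1f} MiB  {name}")
```

Running it on the two output directories side by side makes it easy to see which entries account for the difference.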
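Odds and ends

- When I built 1.25.x, HEAD was ea677928332c37e8052b4d599bf6ee52cf363cf9. If anything behaves differently for you, you can git reset ea677928332c37e8052b4d599bf6ee52cf363cf9 to match.
- My Windows version is 22H2 19045.3271.
- You will need a reliable proxy or otherwise unobstructed internet access: cibuildwheel downloads a fresh Python from the official Python website.
- On my second-hand surplus E5-2678, the compilation itself takes about two minutes.
- The sizes above are as reported by ncdu.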