Faiss is a library for efficient similarity search and clustering of dense vectors, often used for similarity-based anomaly detection. It has a GPU-accelerated version, but that is officially supported only on Linux. This article covers building faiss-GPU on Windows and installing it as a Python package.
In theory I should add a Windows+CUDA GitHub Actions workflow and get it merged upstream, but I have no intention of wrestling with that for now.
Preparation
- Install Winget, used to download some of the tools. Windows should come with it pre-installed; if not, visit getwinget or Microsoft Docs.
- VS 2019 or 2022 Build Tools. As of now (2025/09/23), Intel oneAPI does not support VS 2026. Run winget install Microsoft.VisualStudio.2022.BuildTools, or open the Visual Studio Installer and select the VS 2022 Build Tools.
- CMake: winget install Kitware.CMake
- Ninja, much faster than MSBuild: winget install Ninja-build.Ninja
- SWIG, which faiss uses to generate the Python extension: winget install SWIG.SWIG
- BLAS, per the official recommendation: on an Intel platform choose Intel oneAPI MKL, otherwise use OpenBLAS
  - Intel oneAPI: winget install Intel.OneAPI.BaseToolkit. It is very large (~2.8 GB). After installation, make sure you can find a file named setvars.bat; with the default install path it is at C:\Program Files (x86)\Intel\oneAPI\setvars.bat
  - OpenBLAS? I haven't tried it; I only succeeded with MKL
- CUDA 12.x: visit https://developer.nvidia.com/cuda-toolkit-archive and pick a version. I tested 12.4 and 12.9 successfully.
- [Optional] Python; if you don't need the Python package, you can skip it.

A quick check that these tools are all on PATH is sketched right after this list.
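This snippet is only an illustration of mine (not part of the faiss repo); it assumes the default executable names and should be run in a fresh terminal so PATH changes are picked up:
import shutil

# Tools the later build steps call; nvcc comes with the CUDA toolkit
for tool in ("git", "cmake", "ninja", "swig", "nvcc", "python"):
    path = shutil.which(tool)
    print(f"{tool:>8} -> {path or 'NOT FOUND'}")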
Install gflags
faiss depends on gflags; just run the following commands in order to install it:
git clone https://github.com/gflags/gflags.git
cd gflags
cmake -B build-out . -G "Ninja" -DCMAKE_INSTALL_PREFIX=C:/opt/gflags # directory will be auto-created and registered
cmake --build build-out --config Release
cmake --install build-out
[Optional] Configure Python
Choose a suitable directory, create a virtual environment, activate it, and install three packages; the later build steps assume this environment is active when you enter the Faiss source directory:
python -m venv .venv
.venv\Scripts\activate
pip install setuptools packaging numpy
Activate environment
First find x64 Native Tools Command Prompt for VS 2022 in the Start menu, open it, and then run C:\Program Files (x86)\Intel\oneAPI\setvars.bat inside it. The result looks like:
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.14.14
** Copyright (c) 2025 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools>"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
:: initializing oneAPI environment...
Initializing Visual Studio command-line environment...
Visual Studio version 17.14.14 environment configured.
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\"
Visual Studio command-line environment initialized for: 'x64'
: advisor -- latest
: compiler -- latest
: dal -- latest
: debugger -- latest
: dev-utilities -- latest
: dnnl -- latest
: dpcpp-ct -- latest
: dpl -- latest
: ipp -- latest
: ippcp -- latest
: mkl -- latest
: ocloc -- latest
: pti -- latest
: tbb -- latest
: umf -- latest
: vtune -- latest
:: oneAPI environment initialized ::
If I run C:\Program Files (x86)\Intel\oneAPI\setvars.bat directly in a plain cmd, it complains that it cannot find the Visual Studio environment.
Clone faiss
I made some modifications to the upstream code; until they are merged upstream, you can clone my branch:
git clone https://github.com/myuanz/faiss.git
cd faiss
From here on, all commands are run in the Faiss source directory.
Configure
# If you need the Python extension, run
cmake . -B build -DFAISS_ENABLE_PYTHON=ON -G "Ninja" -DBUILD_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DFAISS_ENABLE_GPU=ON
# If you don't need the Python extension, run
cmake . -B build -DFAISS_ENABLE_PYTHON=OFF -G "Ninja" -DBUILD_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DFAISS_ENABLE_GPU=ON
If SWIG is not found by CMake, run this in PowerShell to resolve the winget link to the real swig.exe:
' -DSWIG_EXECUTABLE=' + (Get-Item (where.exe swig | Select-Object -First 1) -Force).Target
One possible output is:
-DSWIG_EXECUTABLE=C:\Users\pc\AppData\Local\Microsoft\WinGet\Packages\SWIG.SWIG_Microsoft.Winget.Source_8wekyb3d8bbwe\swigwin-4.3.1\swig.exe
Append this string to the end of the cmake configure command above and run it again.
Build C++ part
cmake --build build -j
cmake --install build --prefix install
I have also uploaded this exported install directory to the releases.
Build Python part
cd build/faiss/python
python setup.py install
Modify loader.py
I’m not sure why, but I needed to modify loader.py:
- from .swigfaiss import *
+ from swigfaiss import *
to import successfully.
Check correctness
I provide an extra try_import_faiss_python.py in the repo. After installing the faiss Python package and modifying the loader, running this file prints some GPU info and performance comparisons. Since the runtime also needs several DLLs, the file adds a few paths at the top:
import os, sys, ctypes
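# NOTE: adjust the CUDA and oneAPI version segments below (v12.9, 2025.2) to match your installation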
for p in (
r'./build/faiss/python',
r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin',
r'C:\Program Files (x86)\Intel\oneAPI\compiler\2025.2\bin',
r'C:\Program Files (x86)\Intel\oneAPI\mkl\2025.2\bin'
):
p = os.path.abspath(p)
os.add_dll_directory(p)
sys.path.append(p)
import time
import numpy as np
import faiss  # must be imported only after the DLL directories above are registered

# The code below was generated by an LLM:
# GPU self-check
ngpu = faiss.get_num_gpus()
print(f"[INFO] Num GPUs detected by FAISS: {ngpu}")
if ngpu == 0:
raise RuntimeError("Can't find GPU")
# %%
# Parameters (adjust as needed; Flat complexity ~ nb*nq*d)
d = 64
nb = 200_000
nq = 40_000 # Increase batch size to avoid launch/scheduling overhead dominating
k = 10
seed = 123
rs = np.random.RandomState(seed)
xb = rs.randn(nb, d).astype('float32')
xq = rs.randn(nq, d).astype('float32')
# Give the vectors a small amount of structure for the sanity check
xb[:100, :] += 3
xq[:5, :] += 3
# ---- CPU baseline ----
cpu = faiss.IndexFlatL2(d)
t0 = time.perf_counter()
cpu.add(xb)
t1 = time.perf_counter()
Dcpu, Icpu = cpu.search(xq, k)
t2 = time.perf_counter()
print(f"[CPU] add {1e3*(t1-t0):.1f} ms search {1e3*(t2-t1):.1f} ms")
... # truncated
Running the script, you will first see the CPU at full load, then the GPU. The code only proves that the build works; it is not meant to be elegant. Maybe in the future I will figure out how to publish a wheel.
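For reference, the truncated part of the script handles the GPU-side comparison. The snippet below is not that code; it is only a minimal sketch of the typical pattern, using faiss.StandardGpuResources and faiss.index_cpu_to_gpu and mirroring the CPU baseline above:
import time
import numpy as np
import faiss  # assumes the DLL directories have been registered as shown earlier

d, nb, nq, k = 64, 200_000, 40_000, 10
rs = np.random.RandomState(123)
xb = rs.randn(nb, d).astype('float32')
xq = rs.randn(nq, d).astype('float32')

# Move a flat L2 index onto GPU 0 and time add/search, like the CPU baseline
res = faiss.StandardGpuResources()  # scratch/temporary GPU memory for this device
gpu = faiss.index_cpu_to_gpu(res, 0, faiss.IndexFlatL2(d))

t0 = time.perf_counter()
gpu.add(xb)
t1 = time.perf_counter()
Dgpu, Igpu = gpu.search(xq, k)
t2 = time.perf_counter()
print(f"[GPU] add {1e3*(t1-t0):.1f} ms  search {1e3*(t2-t1):.1f} ms")
If more than one GPU is available, faiss also provides index_cpu_to_all_gpus, which clones the index across all visible devices.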