# Fast Algorithms for Finding Pairwise Euclidean Distance (Distance Matrix)

Well, I couldn’t resist playing around. I created a Matlab mex C file called `pdistc` that implements pairwise Euclidean distance for single and double precision. On my machine, using Matlab R2012b and R2015a, it’s 20–25% faster than `pdist` (and the underlying `pdistmex` helper function) for large inputs (e.g., 60,000-by-300).

As has been pointed out, this problem is fundamentally bounded by memory and you’re asking for a lot of it. My mex C code uses minimal memory beyond that needed for the output. In comparing its memory usage to that of `pdist`, it looks like the two are virtually the same. In other words, `pdist` is not using lots of extra memory. Your memory problem is likely in the memory used up before calling `pdist` (can you use `clear` to remove any large arrays?) or simply because you’re trying to solve a big computational problem on tiny hardware.
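To put rough numbers on the memory point, here is a back-of-the-envelope sketch for the 60,000-by-300 example size mentioned above. The variable names are just for illustration; the only assumption is that the output holds one distance per unordered pair of rows, as `pdist` does.

```matlab
% Rough memory estimate for the condensed pairwise-distance vector.
m = 6e4;                        % number of observations (rows of X)
n = 3e2;                        % number of dimensions (columns of X)

numPairs    = m*(m-1)/2;        % one distance per unordered pair of rows
bytesDouble = 8*numPairs;       % double precision: 8 bytes per distance
bytesSingle = 4*numPairs;       % single precision: 4 bytes per distance
bytesInput  = 8*m*n;            % the input X itself, in double precision

fprintf('pairs:           %d\n', numPairs);
fprintf('output (double): %.1f GB\n', bytesDouble/1e9);
fprintf('output (single): %.1f GB\n', bytesSingle/1e9);
fprintf('input  (double): %.3f GB\n', bytesInput/1e9);
```

For these sizes the output alone is about 14.4 GB in double precision (half that in single), while the input is only about 0.14 GB, so the output dominates no matter how the distances are computed.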

So, my `pdistc` function likely won’t be able to save you memory overall, but you may be able to use another feature I built in. You can calculate chunks of your overall pairwise distance vector. Something like this:

```
m = 6e3;
n = 3e2;
X = rand(m,n);
sz = m*(m-1)/2;

for i = 1:m:sz-m
    D = pdistc(X', i, i+m); % mex C function, X is transposed relative to pdist
    ...                     % Process chunk of pairwise distances
end
```

This is considerably slower (roughly 10 times) and this part of my C code is not well optimized, but it uses much less memory – assuming that you don’t need the entire array at one time. Note that you could do the same thing much more efficiently with `pdist` (or `pdistc`) by writing a loop that passes in subsets of `X` directly, rather than all of it.
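As a rough illustration of the "subsets of `X`" idea, here is a minimal sketch in plain Matlab (no toolbox, no mex file). It is not how `pdist` or `pdistc` work internally. The block size `blk` and the running sum are hypothetical placeholders for whatever per-chunk processing you need. Note that distances between rows in different blocks must be covered too, which is why each block is compared against all later rows rather than only against itself.

```matlab
m = 500;  n = 30;                 % small illustrative sizes
X = rand(m, n);                   % rows are observations, as for pdist
blk = 100;                        % hypothetical block size (rows per block)

sqn   = sum(X.^2, 2);             % squared row norms, reused by every block
total = 0;                        % example reduction: sum of all distances
count = 0;                        % number of pairs processed so far

for r0 = 1:blk:m
    r    = r0:min(r0+blk-1, m);   % row indices in this block
    cols = r0:m;                  % only rows at or after the block start

    % Squared distances via ||a||^2 + ||b||^2 - 2*a.b (blk-by-numel(cols))
    D2 = bsxfun(@plus, sqn(r,1), sqn(cols,1)') - 2*(X(r,:)*X(cols,:)');
    D  = sqrt(max(D2, 0));        % clamp tiny negatives caused by round-off

    % Keep each unordered pair exactly once: row index < column index
    mask = bsxfun(@lt, r', cols);

    total = total + sum(D(mask)); % process this chunk, then discard it
    count = count + nnz(mask);
end
```

Only one `blk`-by-`m` block (at most) is held in memory at a time. The expanded formula used here is fast because it relies on a matrix product, but it is somewhat less accurate than subtracting the rows directly, particularly for points that are very close together.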

If you have a 64-bit Intel Mac, you won’t need to compile as I’ve included the `.mexmaci64` binary, but otherwise you’ll need to figure out how to compile the code for your machine. I can’t help you with that. It’s possible that you may not be able to get it to compile or that there will be compatibility issues that you’ll need to solve by editing the code yourself. It’s also possible that there are bugs and the code will crash Matlab. Also, note that you may get slightly different outputs relative to `pdist` with differences between the two in the range of machine epsilon (`eps`). `pdist` may or may not do fancy things to avoid overflows for large inputs and other numeric issues, but be aware that my code does not.
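If you want to check how closely two implementations agree, a relative comparison scaled by `eps` is one option. The sketch below is only an illustration: so that it runs without the toolbox or the mex file, it compares two plain-Matlab ways of computing the same distances. In practice you would put the outputs of `pdist` and `pdistc` in place of `dA` and `dB`.

```matlab
X = rand(50, 300);                 % small illustrative input
m = size(X, 1);
dA = zeros(1, m*(m-1)/2);          % stand-in for one implementation
dB = zeros(1, m*(m-1)/2);          % stand-in for the other

k = 0;
for j = 1:m-1
    for i = j+1:m
        k = k + 1;
        v = X(i,:) - X(j,:);       % difference between rows i and j
        dA(k) = sqrt(sum(v.^2));   % explicit sum of squares
        dB(k) = norm(v);           % built-in 2-norm
    end
end

% Largest relative difference, also expressed in multiples of eps
relErr = max(abs(dA - dB) ./ dB);
fprintf('max relative difference: %.2g (%.1f x eps)\n', relErr, relErr/eps);
```

A result within a small multiple of `eps` suggests that the two differ only in rounding, not in the algorithm.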

Additionally, I created a simple pure Matlab implementation. It is massively slower than the mex code, but still faster than a naïve implementation or the code found in `pdist`.
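For reference, here is one common way to write a vectorized pure Matlab version. This is a sketch under my own assumptions and not necessarily the code in the archive. It builds the full `m`-by-`m` matrix first, so it needs roughly twice the memory of the condensed output and is only suitable for moderate `m`.

```matlab
X = rand(200, 10);                  % m-by-n, rows are observations (as for pdist)
m = size(X, 1);

G  = X*X';                          % Gram matrix of inner products
sq = diag(G);                       % squared row norms (column vector)
D2 = bsxfun(@plus, sq, sq') - 2*G;  % squared distances between all row pairs
D  = sqrt(max(D2, 0));              % clamp tiny negatives caused by round-off

% Strict lower triangle read column by column gives the order
% (2,1), (3,1), ..., (m,1), (3,2), ..., (m,m-1), i.e. the pdist ordering.
d = D(tril(true(m), -1))';          % 1-by-m*(m-1)/2 row vector
```

The matrix product does most of the work, which is why this tends to beat a double loop. The trade-off is some loss of accuracy from cancellation when two rows are nearly identical.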

All of the files can be found here; the ZIP archive includes everything. It’s BSD licensed. Feel free to optimize (I tried BLAS calls and OpenMP in the C code to no avail – maybe some pointer magic or GPU/OpenCL could further speed it up). I hope that it can be helpful to you or someone else.