matrix-multiplication
How to get faster code than numpy.dot for matrix multiplication?
np.dot dispatches to BLAS when NumPy has been compiled against BLAS, a BLAS implementation is available at run time, your data has one of the dtypes float32, float64, complex64, or complex128, and the data is suitably aligned in memory. Otherwise, it falls back to its own, slow, matrix-multiplication routine. Checking your BLAS linkage is … Read more
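A quick way to check the BLAS linkage mentioned above from Python itself (a minimal sketch; the exact output of `show_config` depends on how your NumPy was built):

```python
import numpy as np

# Print which BLAS/LAPACK libraries this NumPy build is linked against.
# The output lists library names and paths (e.g. OpenBLAS, MKL) if a
# BLAS implementation was found at build time.
np.show_config()

# A float64 array with default (C-contiguous, aligned) layout is a dtype
# that can take the fast BLAS path in np.dot.
a = np.ones((2, 2), dtype=np.float64)
print(np.dot(a, a))  # each entry is 1*1 + 1*1 = 2
```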
Is there any fast method of matrix exponentiation?
You could factor the matrix into eigenvalues and eigenvectors. Then you get M = V * D * V^-1, where V is the eigenvector matrix and D is a diagonal matrix. To raise this to the Nth power, you get something like: M^n = (V * D * V^-1) * (V * D * V^-1) … Read more
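The factor-then-power idea above can be sketched in NumPy as follows (the function name `matrix_power_eig` is mine; this assumes M is diagonalizable, since the inner products of V * D * V^-1 telescope so that M^n = V * D^n * V^-1, and D^n just raises each eigenvalue to the n-th power):

```python
import numpy as np

def matrix_power_eig(M, n):
    """Compute M^n via the eigendecomposition M = V D V^-1."""
    eigvals, V = np.linalg.eig(M)
    D_n = np.diag(eigvals ** n)          # powering a diagonal matrix is elementwise
    return V @ D_n @ np.linalg.inv(V)

M = np.array([[2.0, 0.0],
              [1.0, 3.0]])               # eigenvalues 2 and 3, so diagonalizable
P = matrix_power_eig(M, 5)

# Cross-check against repeated multiplication.
assert np.allclose(P, np.linalg.matrix_power(M, 5))
```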
Matrix multiplication using arrays
You can try this code: public class MyMatrix { Double[][] A = { { 4.00, 3.00 }, { 2.00, 1.00 } }; Double[][] B = { { -0.500, 1.500 }, { 1.000, -2.0000 } }; public static Double[][] multiplicar(Double[][] A, Double[][] B) { int aRows = A.length; int aColumns = A[0].length; int bRows = B.length; … Read more
Why list comprehension is much faster than numpy for multiplying arrays?
Creation of numpy arrays is much slower than creation of lists: In [153]: %timeit a = [[2,3,5],[3,6,2],[1,3,2]] 1000000 loops, best of 3: 308 ns per loop In [154]: %timeit a = np.array([[2,3,5],[3,6,2],[1,3,2]]) 100000 loops, best of 3: 2.27 µs per loop There can also be fixed costs incurred by NumPy function calls before the meat of … Read more
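The same comparison can be reproduced outside IPython with the standard-library `timeit` module (a sketch; absolute times vary by machine, the point is the fixed per-call overhead of `np.array` on tiny inputs):

```python
import timeit

# Time building a small nested list vs. the equivalent small NumPy array.
list_time = timeit.timeit("[[2,3,5],[3,6,2],[1,3,2]]",
                          number=100_000)
array_time = timeit.timeit("np.array([[2,3,5],[3,6,2],[1,3,2]])",
                           setup="import numpy as np",
                           number=100_000)

print(f"list:  {list_time:.4f} s for 100k constructions")
print(f"array: {array_time:.4f} s for 100k constructions")
```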
Why is there huge performance hit in 2048×2048 versus 2047×2047 array multiplication?
This probably has to do with conflicts in your L2 cache. Cache misses on matice1 are not the problem because they are accessed sequentially. However, for matice2, if a full column fits in L2 (i.e. when you access matice2[0, 0], matice2[1, 0], matice2[2, 0] … etc, nothing gets evicted), then there is no problem with cache … Read more
bsxfun implementation in matrix multiplication
Send x to the third dimension, so that singleton expansion would come into effect when bsxfun is used for multiplication with A, extending the product result to the third dimension. Then, perform the bsxfun multiplication – val = bsxfun(@times,A,permute(x,[3 1 2])) Now, val is a 3D matrix and the desired output is expected to be … Read more
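For readers coming from NumPy, the singleton-expansion trick that `bsxfun` performs corresponds to broadcasting. A minimal analogy (shapes assumed for illustration: A is m-by-n and x is a length-n vector, not necessarily the shapes in the original question):

```python
import numpy as np

# Broadcasting is NumPy's automatic equivalent of bsxfun's singleton
# expansion: multiplying an (m, n) array by a length-n vector expands
# the vector across the rows of A.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([10.0, 100.0])

val = A * x              # elementwise product, x broadcast along rows
out = val.sum(axis=1)    # summing the products recovers the matrix-vector product

assert np.allclose(out, A @ x)
```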
Multiply a 3D matrix with a 2D matrix
As a personal preference, I like my code to be as succinct and readable as possible. Here’s what I would have done, though it doesn’t meet your ‘no-loops’ requirement: for m = 1:C Z(:,:,m) = X(:,:,m)*Y; end This results in an A x D x C matrix Z. And of course, you can always pre-allocate … Read more
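For the NumPy equivalent of the page-by-page loop above, `einsum` can express the same product without an explicit loop (a sketch under the same shape assumptions: X is A-by-B-by-C with pages along the last axis, Y is B-by-D, and the result Z is A-by-D-by-C):

```python
import numpy as np

# Multiply each page X[:, :, m] of a 3-D array by a 2-D matrix Y.
rng = np.random.default_rng(0)
A_, B_, C_, D_ = 2, 3, 4, 5
X = rng.standard_normal((A_, B_, C_))
Y = rng.standard_normal((B_, D_))

# 'abc,bd->adc' contracts the shared B axis, keeping the page axis c last.
Z = np.einsum('abc,bd->adc', X, Y)

# Cross-check against the explicit per-page loop.
for m in range(C_):
    assert np.allclose(Z[:, :, m], X[:, :, m] @ Y)
```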
Minimizing overhead due to the large number of Numpy dot calls
It depends on the size of the matrices. Edit: For larger n×n matrices (approx. size 20 and up), a BLAS call from compiled code is faster; for smaller matrices, custom Numba or Cython kernels are usually faster. The following method generates custom dot functions for given input shapes. With this method it is also possible to benefit … Read more
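The shape-specialisation idea can be illustrated in plain Python (a sketch only; the answer itself uses Numba/Cython for real speed, and the factory name `make_dot_2x2` is mine):

```python
import numpy as np

def make_dot_2x2():
    """Return a dot function hard-coded (fully unrolled) for 2x2 inputs.

    Because the shape is known in advance, there is no loop, no shape
    checking, and no dispatch overhead inside the kernel.
    """
    def dot_2x2(a, b):
        return np.array([
            [a[0, 0]*b[0, 0] + a[0, 1]*b[1, 0],
             a[0, 0]*b[0, 1] + a[0, 1]*b[1, 1]],
            [a[1, 0]*b[0, 0] + a[1, 1]*b[1, 0],
             a[1, 0]*b[0, 1] + a[1, 1]*b[1, 1]],
        ])
    return dot_2x2

dot_2x2 = make_dot_2x2()
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])
assert np.allclose(dot_2x2(a, b), a @ b)
```

In pure Python this unrolling is not faster than `np.dot`; compiled with Numba or Cython, the same specialisation is what removes the per-call overhead for tiny matrices.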
Why is matrix multiplication faster with numpy than with ctypes in Python?
NumPy uses a highly optimized, carefully tuned BLAS method for matrix multiplication (see also: ATLAS). The specific function in this case is GEMM (GEneral Matrix Multiply). You can look up the original by searching for dgemm.f (it’s in Netlib). The optimization, by the way, goes beyond compiler optimizations. Above, Philip mentioned Coppersmith–Winograd. If I remember correctly, … Read more