# Python vectorizing nested for loops

Approach #1

Here’s a vectorized approach –

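```python
import numpy as np

# Dense coordinate meshes, one per axis, each of shape (m, n, r)
m,n,r = volume.shape
x,y,z = np.mgrid[0:m,0:n,0:r]
X = x - roi[0]
Y = y - roi[1]
Z = z - roi[2]
# Boolean mask: True where the voxel lies within `radius` of `roi`
mask = X**2 + Y**2 + Z**2 < radius**2
```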

Possible improvement: we can probably speed up the final `X**2 + Y**2 + Z**2 < radius**2` step with the `numexpr` module –

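```python
import numexpr as ne

# X, Y, Z and radius are picked up from the surrounding scope
mask = ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
```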
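For comparison, here is a minimal sketch of the kind of nested-loop code this approach replaces. The original loops are not reproduced in this answer, so `loop_mask` is a hypothetical name and its body is an assumption: it marks every voxel whose squared distance from `roi` is below `radius**2`. `mgrid_mask` is likewise just an illustrative wrapper around the meshgrid steps, and the input values are made up.

```python
import numpy as np

def loop_mask(volume, roi, radius):
    # Hypothetical loop-based reference (assumed, not the original code).
    m, n, r = volume.shape
    mask = np.zeros((m, n, r), dtype=bool)
    for i in range(m):
        for j in range(n):
            for k in range(r):
                d2 = (i - roi[0])**2 + (j - roi[1])**2 + (k - roi[2])**2
                mask[i, j, k] = d2 < radius**2
    return mask

def mgrid_mask(volume, roi, radius):
    # Vectorized version using dense coordinate meshes from np.mgrid.
    m, n, r = volume.shape
    x, y, z = np.mgrid[0:m, 0:n, 0:r]
    return (x - roi[0])**2 + (y - roi[1])**2 + (z - roi[2])**2 < radius**2

# Small illustrative inputs (values chosen arbitrarily).
volume = np.zeros((6, 7, 8))
roi = np.array([2.5, 3.0, 4.25])
radius = 2.0

# Check that both versions select exactly the same voxels.
same = np.array_equal(loop_mask(volume, roi, radius),
                      mgrid_mask(volume, roi, radius))
print(same)
```

The triple loop visits every voxel in Python, whereas the vectorized version does the same arithmetic inside NumPy on whole arrays at once.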

Approach #2

We can also build the three ranges corresponding to the shape parameters and subtract the matching element of `roi` from each one on the fly, without actually creating the meshes as done earlier with `np.mgrid`. `broadcasting` then combines the three small arrays into the full-size result, which is where the efficiency comes from. The implementation would look like this –

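```python
m,n,r = volume.shape
# Squared offsets per axis, shaped (m,1,1), (n,1) and (r,), broadcast to (m,n,r)
vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
       ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
mask = vals < radius**2
```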
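To see why no mesh is needed, it can help to look at the shapes involved. The sketch below uses made-up sizes and `roi` values, and the names `dx2`, `dy2` and `dz2` are introduced only for illustration. The three squared-offset arrays have shapes `(m,1,1)`, `(n,1)` and `(r,)`, and NumPy broadcasting expands their sum to `(m,n,r)`.

```python
import numpy as np

# Illustrative sizes and center (assumed values, not from the original data).
m, n, r = 4, 5, 6
roi = np.array([1.5, 2.0, 3.5])

dx2 = ((np.arange(m) - roi[0])**2)[:, None, None]  # shape (m, 1, 1)
dy2 = ((np.arange(n) - roi[1])**2)[:, None]        # shape (n, 1)
dz2 = (np.arange(r) - roi[2])**2                   # shape (r,)

vals = dx2 + dy2 + dz2                             # broadcasts to (m, n, r)
print(dx2.shape, dy2.shape, dz2.shape, vals.shape)

# Each entry is the squared distance of voxel (i, j, k) from roi.
i, j, k = 3, 1, 4
expected = (i - roi[0])**2 + (j - roi[1])**2 + (k - roi[2])**2
print(np.isclose(vals[i, j, k], expected))
```

Only `m + n + r` squared offsets are computed before the final sum, instead of three full `(m,n,r)` arrays.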

Simplified version: thanks to @Bi Rico for suggesting an improvement here. We can use `np.ogrid` to perform those operations a bit more concisely, like so –

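```python
m,n,r = volume.shape
x,y,z = np.ogrid[0:m,0:n,0:r]
# Subtract per axis; `np.ogrid[...] - roi` in one step can fail on recent NumPy versions
mask = (x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2 < radius**2
```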
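The difference between `np.mgrid` and `np.ogrid` is easiest to see from what they return. A small sketch, with sizes chosen only for illustration: `np.mgrid` produces dense coordinate arrays covering the whole volume, while `np.ogrid` produces three thin arrays that broadcast against each other.

```python
import numpy as np

m, n, r = 4, 5, 6  # illustrative sizes

dense = np.mgrid[0:m, 0:n, 0:r]   # stacked dense coordinates, shape (3, m, n, r)
sparse = np.ogrid[0:m, 0:n, 0:r]  # three open grids, one per axis

print(dense.shape)
print([g.shape for g in sparse])

# Total number of stored elements in each representation.
dense_elems = dense.size
sparse_elems = sum(g.size for g in sparse)
print(dense_elems, sparse_elems)

# Once broadcast, the open grids describe the same coordinates.
xs, ys, zs = np.broadcast_arrays(*sparse)
same_coords = (np.array_equal(dense[0], xs)
               and np.array_equal(dense[1], ys)
               and np.array_equal(dense[2], zs))
print(same_coords)
```

The dense grids store `3*m*n*r` elements against `m + n + r` for the open ones, and the gap widens as the volume grows.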

Runtime test

Function definitions –

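```python
import numpy as np
import numexpr as ne

def vectorized_app1(volume, roi, radius):
    m,n,r = volume.shape
    x,y,z = np.mgrid[0:m,0:n,0:r]
    X = x - roi[0]
    Y = y - roi[1]
    Z = z - roi[2]
    return X**2 + Y**2 + Z**2 < radius**2

def vectorized_app1_improved(volume, roi, radius):
    m,n,r = volume.shape
    x,y,z = np.mgrid[0:m,0:n,0:r]
    X = x - roi[0]
    Y = y - roi[1]
    Z = z - roi[2]
    return ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')

def vectorized_app2(volume, roi, radius):
    m,n,r = volume.shape
    vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
           ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
    return vals < radius**2

def vectorized_app2_simplified(volume, roi, radius):
    m,n,r = volume.shape
    x,y,z = np.ogrid[0:m,0:n,0:r]
    # Subtract per axis; `np.ogrid[...] - roi` in one step can fail on recent NumPy versions
    return (x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2 < radius**2
```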

Timings –

```
In [106]: # Setup input arrays
     ...: volume = np.random.rand(90,110,100) # Half of original input sizes
     ...: roi = np.random.rand(3)
     ...:

1 loops, best of 3: 41.4 s per loop

In [108]: %timeit vectorized_app1(volume, roi, radius)
10 loops, best of 3: 62.3 ms per loop

In [109]: %timeit vectorized_app1_improved(volume, roi, radius)
10 loops, best of 3: 47 ms per loop

In [110]: %timeit vectorized_app2(volume, roi, radius)
100 loops, best of 3: 4.26 ms per loop

In [139]: %timeit vectorized_app2_simplified(volume, roi, radius)
100 loops, best of 3: 4.36 ms per loop
```
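The `%timeit` lines require IPython. Outside of it, a rough equivalent can be put together with the standard `timeit` module. The sketch below is a stand-in rather than the benchmark used for the numbers in this answer: the function names `mesh_mask` and `broadcast_mask`, the `radius` value and the repeat counts are all assumptions chosen for illustration, and absolute timings will differ from machine to machine.

```python
import timeit
import numpy as np

def mesh_mask(volume, roi, radius):
    # Dense-mesh variant (np.mgrid).
    m, n, r = volume.shape
    x, y, z = np.mgrid[0:m, 0:n, 0:r]
    return (x - roi[0])**2 + (y - roi[1])**2 + (z - roi[2])**2 < radius**2

def broadcast_mask(volume, roi, radius):
    # Broadcasting variant: only three 1D ranges are ever built.
    m, n, r = volume.shape
    vals = ((np.arange(m) - roi[0])**2)[:, None, None] \
         + ((np.arange(n) - roi[1])**2)[:, None] \
         + (np.arange(r) - roi[2])**2
    return vals < radius**2

rng = np.random.default_rng(0)
volume = rng.random((90, 110, 100))
roi = rng.random(3)
radius = 3.4  # assumed value; the setup above does not show one

# Sanity check before timing: both variants should select the same voxels.
agree = np.array_equal(mesh_mask(volume, roi, radius),
                       broadcast_mask(volume, roi, radius))
print("results agree:", agree)

timings = {}
for fn in (mesh_mask, broadcast_mask):
    # Best of 3 repeats, 3 calls each, reported per call.
    runs = timeit.repeat(lambda: fn(volume, roi, radius), number=3, repeat=3)
    timings[fn.__name__] = min(runs) / 3
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e3:.2f} ms per call")
```

Taking the minimum over repeats mirrors the "best of 3" figure that `%timeit` reports.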

So, as always, `broadcasting` shows its magic: an almost `10,000x` speedup over the original code (41.4 s vs. 4.26 ms), and more than `10x` over the mesh-based versions (62.3 ms and 47 ms), by using on-the-fly broadcasted operations instead of creating meshes!
