One reason memoryviews are useful is that they can be sliced without copying the underlying data, unlike bytes/str, where every slice allocates a new object and copies the data into it.
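A minimal sketch of what "without copying" means in practice. The names `buf`, `tail` and `copy` are made up for illustration, and a bytearray is used only so that the underlying data can be changed after the slices are taken:

```python
buf = bytearray(b'hello world')

tail = memoryview(buf)[6:]   # slice of a memoryview: still refers to buf's memory
copy = bytes(buf)[6:]        # slice of bytes: a new object holding its own copy

buf[6] = ord('W')            # change the underlying data in place

print(bytes(tail))           # the memoryview slice sees the change
print(copy)                  # the bytes slice does not
print(tail.obj is buf)       # the slice still points at the original object
```

The memoryview slice only records a new offset and length into the same buffer, so taking it costs the same no matter how large the data is.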
To see the difference, take the following toy benchmark, which repeatedly slices off the first byte until nothing is left.
    import time

    # bytes: every b[1:] copies all remaining bytes into a new object
    for n in (100000, 200000, 300000, 400000):
        data = b'x'*n
        start = time.time()
        b = data
        while b:
            b = b[1:]
        print(f'     bytes {n} {time.time() - start:0.3f}')

    # memoryview: every b[1:] is a new view onto the same buffer, no copy
    for n in (100000, 200000, 300000, 400000):
        data = b'x'*n
        start = time.time()
        b = memoryview(data)
        while b:
            b = b[1:]
        print(f'memoryview {n} {time.time() - start:0.3f}')
On my computer, I get:

         bytes 100000 0.211
         bytes 200000 0.826
         bytes 300000 1.953
         bytes 400000 3.514
    memoryview 100000 0.021
    memoryview 200000 0.052
    memoryview 300000 0.043
    memoryview 400000 0.077
You can clearly see the quadratic complexity of the repeated bytes slicing: doubling n from 100000 to 200000 roughly quadruples the running time. Even with only 400000 iterations, it already takes several seconds. Meanwhile, the memoryview version has linear complexity and is lightning fast.
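The quadratic behaviour can also be checked without timing anything. The sketch below (the helper name `bytes_copied_by_slicing` is hypothetical) counts how many bytes the bytes loop copies in total, assuming each `b[1:]` copies exactly the remaining `len(b) - 1` bytes, and compares the count with n(n-1)/2:

```python
def bytes_copied_by_slicing(n):
    """Count the bytes copied by `while b: b = b[1:]` on a bytes object of length n."""
    b = b'x' * n
    copied = 0
    while b:
        b = b[1:]          # new bytes object, one byte shorter
        copied += len(b)   # that many bytes had to be copied into it
    return copied

for n in (1000, 2000, 4000):
    # measured count next to the closed form n(n-1)/2
    print(n, bytes_copied_by_slicing(n), n * (n - 1) // 2)
```

Doubling n roughly quadruples the count, which matches the timings for bytes. The memoryview loop copies none of the data; it only creates n small view objects, hence the linear running time.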
Edit: Note that this was done in CPython. There was a bug in PyPy up to 4.0.1 that caused memoryviews to have quadratic performance.