Previous answers are obsolete. Currently tf.matmul()
support tensors with rank > 2:
The inputs must be matrices (or tensors of rank > 2, representing
batches of matrices), with matching inner dimensions, possibly after
transposition.
Also tf.batch_matmul()
was removed and tf.matmul()
is the right way to do batch multiplication. The main idea can be understood from the following code:
import tensorflow as tf
batch_size, n, m, k = 10, 3, 5, 2
A = tf.Variable(tf.random_normal(shape=(batch_size, n, m)))
B = tf.Variable(tf.random_normal(shape=(batch_size, m, k)))
tf.matmul(A, B)
Now you will receive a tensor of the shape (batch_size, n, k)
. Here is what is going on here. Assume you have batch_size
of matrices nxm
and batch_size
of matrices mxk
. Now for each pair of them you calculate nxm X mxk
which gives you an nxk
matrix. You will have batch_size
of them.
Notice that something like this is also valid:
A = tf.Variable(tf.random_normal(shape=(a, b, n, m)))
B = tf.Variable(tf.random_normal(shape=(a, b, m, k)))
tf.matmul(A, B)
and will give you a shape (a, b, n, k)