Some instruction sets are limited to one bit shift per instruction. And some instruction sets allow you to specify any number of bits to shift in one instruction, which usually takes one clock cycle on modern processors (modern being an intentionally vague word). See dan04’s answer about a barrel shifter, a circuit that shifts more than one bit in one operation.

It all boils down to the logic algorithm. Each bit in the result is a logic function based on the input. For a single right shift, the algorithm would be something like:

- If the instruction is [shift right] and bit 1 of the input is 1, then bit 0 of the result is 1, else bit 0 is 0.
- If the instruction is [shift right], then bit 1 = bit 2.
- etc.

But the logic equation could just as easily be:

- If the instruction is [shift right] and the amount operand is 1, then result bit 0 = shifted input bit 1.
- if the amount is 2 then bit 0 = bit 2.
- and so on.

The logic gates, being asynchronous, can do all of this in one clock cycle. Yet it is true the single shift allows for a faster clock cycle and less gates to settle, if all you are comparing is these two flavors of an instruction. Or the alternative is making it take longer to settle, so the instruction takes 2 or 3 clocks or whatever, and the logic counts to 3 then latches the result.

The MSP430, for example, only has single bit rotate right instructions (because you can perform a single bit shift or a rotate left with another instruction, which I will leave to the reader to figure out).

The ARM instruction set allows both immediate and register based multi-bit rotates, arithmetic shifts and logical shifts. I think there is only one actual rotate instruction and the other is an alias, because rotate left 1 is the same as a rotate right 32, you only need an one direction barrel shifter to implement a multi bit rotate.

SHL in the x86 allows more than one bit per instruction, but it used to take more than one clock.

and so on, you can easily examine any of the instruction sets out there.

The answer to your question is that it is not fixed. Sometimes it is one operation, one cycle, one instruction. Sometimes it is one instruction multiple clock cycles. Sometimes it is multiple instructions, multiple clock cycles.

The compilers often optimize for these sorts of things. Say you have a 16 bit register instruction set with a swap byte instruction and an AND instruction with immediate, but only a single bit shift. You may think shifting 8 bits would require 8 shift instruction cycles, but you could just swap bytes (one instruction) and then AND the lower half to zeros (which might take two instructions, or might be a variable word length instruction of two words, or it might encode into a single instruction) so it only takes 2 or 3 instruction/clock cycles instead of 8. For a shift of 9 bits, you can do the same thing and add a shift, making it 9 clocks vs 3 or 4. Also, on some architectures, it is faster to multiply by 256 than to shift by 8, etc, etc. Each instruction set has its own limitations and tricks.

It is not even the case that either most instruction sets provide multi bit or most limit to single bit. The processors that fall into the “computer” category, like X86, ARM, PowerPC, and MIPS, would lean toward one operation to shift. Expand to all processors but not necessarily “computers” commonly used today, and it shifts the other way, I would say more of them are single bit than multi bit, so multiple operations are needed to perform a multi-bit shift.