Your understanding is mostly correct. You use select_related when the object that you’re going to be selecting is a single object, that is, a OneToOneField or a ForeignKey. You use prefetch_related when you’re going to get a “set” of things, that is, ManyToManyFields, as you stated, or reverse ForeignKeys. Just to clarify what I mean by “reverse ForeignKeys”, here’s an example:
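    from django.db import models

    class ModelA(models.Model):
        pass

    class ModelB(models.Model):
        # on_delete is required since Django 2.0; CASCADE was the old implicit default
        a = models.ForeignKey(ModelA, on_delete=models.CASCADE)

    ModelB.objects.select_related('a').all()  # Forward ForeignKey relationship
    ModelA.objects.prefetch_related('modelb_set').all()  # Reverse ForeignKey relationship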
The difference is that select_related does an SQL join and therefore gets the related data back as extra columns in the same result set from the SQL server. prefetch_related, on the other hand, executes another query and therefore reduces the redundant columns in the original object (ModelA in the above example). You may use prefetch_related for anything that you can use select_related for.
The tradeoffs are that prefetch_related has to create and send a list of IDs to select back to the server, which can take a while. I’m not sure if there’s a nice way of doing this in a transaction, but my understanding is that Django always just sends a list and basically says SELECT … WHERE pk IN (…,…,…). In this case, if the prefetched data is sparse (let’s say U.S. State objects linked to people’s addresses), this can be very good; however, if it’s closer to one-to-one, this can waste a lot of communication. If in doubt, try both and see which performs better.
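To make the two query shapes concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django itself. The person and state tables and both helper functions are made up for illustration (following the U.S. State example above), and the SQL only approximates what the ORM would emit.

```python
import sqlite3

# Hypothetical schema: many people point at a few states (sparse related data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE state (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                         state_id INTEGER REFERENCES state(id));
    INSERT INTO state VALUES (1, 'Ohio'), (2, 'Utah');
    INSERT INTO person VALUES (1, 'Ann', 1), (2, 'Bob', 1),
                              (3, 'Cy', 1), (4, 'Di', 2);
""")


def fetch_joined(conn):
    """select_related-style: one query, state columns repeated on every row."""
    return conn.execute(
        "SELECT p.id, p.name, s.id, s.name "
        "FROM person p JOIN state s ON s.id = p.state_id ORDER BY p.id"
    ).fetchall()


def fetch_in_two_queries(conn):
    """prefetch_related-style: a second query using WHERE id IN (...)."""
    people = conn.execute(
        "SELECT id, name, state_id FROM person ORDER BY id"
    ).fetchall()
    state_ids = sorted({row[2] for row in people})  # de-duplicated ID list
    placeholders = ", ".join("?" for _ in state_ids)
    states = conn.execute(
        f"SELECT id, name FROM state WHERE id IN ({placeholders}) ORDER BY id",
        state_ids,
    ).fetchall()
    return people, states


joined = fetch_joined(conn)
people, states = fetch_in_two_queries(conn)
print(len(joined), "joined rows, each carrying the state columns")
print(len(people), "person rows plus", len(states), "state rows")
```

In this toy data the join sends 'Ohio' three times, while the two-query version sends each state once at the cost of an extra round trip and an ID list. If every person had a distinct state, the ID list would be as long as the person table and the second approach would save nothing.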
Everything discussed above is basically about communication with the database. On the Python side, however, prefetch_related has the extra benefit that a single object is used to represent each object in the database. With select_related, duplicate objects will be created in Python for each “parent” object. Since objects in Python have a decent bit of memory overhead, this can also be a consideration.
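The Python-side difference can be illustrated without a database. The sketch below is not Django code: State, Person, and the two builder functions are hypothetical stand-ins that mimic building one related object per joined row versus reusing one object per primary key.

```python
from dataclasses import dataclass


@dataclass
class State:
    id: int
    name: str


@dataclass
class Person:
    name: str
    state: State


# Rows as a JOIN would return them: (person name, state id, state name).
JOINED_ROWS = [("Ann", 1, "Ohio"), ("Bob", 1, "Ohio"), ("Di", 2, "Utah")]


def build_per_row(rows):
    """select_related-style: a fresh State object for every row."""
    return [Person(name, State(sid, sname)) for name, sid, sname in rows]


def build_shared(rows):
    """prefetch_related-style: one State object per primary key, reused."""
    cache = {}
    people = []
    for name, sid, sname in rows:
        if sid not in cache:
            cache[sid] = State(sid, sname)
        people.append(Person(name, cache[sid]))
    return people


per_row = build_per_row(JOINED_ROWS)
shared = build_shared(JOINED_ROWS)
print(per_row[0].state is per_row[1].state)  # False: equal values, separate objects
print(shared[0].state is shared[1].state)    # True: the same object is reused
```

With three rows the saving is trivial, but when thousands of rows point at a handful of related objects, the shared version holds far fewer instances in memory.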