How can I avoid the Warning “firstResult/maxResults specified with collection fetch; applying in memory!” when using Hibernate?

Although you are getting valid results, the SQL query fetches all data and it’s not as efficient as it should.

So, you have two options.

Fixing the issue with two SQL queries that can fetch entities in read-write mode

The easiest way to fix this issue is to execute two queries:

. The first query will fetch the root entity identifiers matching the provided filtering criteria.
. The second query will use the previously extracted root entity identifiers to fetch the parent and the child entities.

This approach is very easy to implement and looks as follows:

List<Long> postIds = entityManager
.createQuery(
    "select p.id " +
    "from Post p " +
    "where p.title like :titlePattern " +
    "order by p.createdOn", Long.class)
.setParameter(
    "titlePattern",
    "High-Performance Java Persistence %"
)
.setMaxResults(5)
.getResultList();
 
List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.id in (:postIds) "  +
    "order by p.createdOn", Post.class)
.setParameter("postIds", postIds)
.setHint(
    "hibernate.query.passDistinctThrough", 
    false
)
.getResultList();

Fixing the issue with one SQL query that can only fetch entities in read-only mode

The second approach is to use SDENSE_RANK over the result set of parent and child entities that match our filtering criteria and restrict the output for the first N post entries only.

The SQL query can look as follows:

@NamedNativeQuery(
    name = "PostWithCommentByRank",
    query =
        "SELECT * " +
        "FROM (   " +
        "    SELECT *, dense_rank() OVER (ORDER BY \"p.created_on\", \"p.id\") rank " +
        "    FROM (   " +
        "        SELECT p.id AS \"p.id\", " +
        "               p.created_on AS \"p.created_on\", " +
        "               p.title AS \"p.title\", " +
        "               pc.id as \"pc.id\", " +
        "               pc.created_on AS \"pc.created_on\", " +
        "               pc.review AS \"pc.review\", " +
        "               pc.post_id AS \"pc.post_id\" " +
        "        FROM post p  " +
        "        LEFT JOIN post_comment pc ON p.id = pc.post_id " +
        "        WHERE p.title LIKE :titlePattern " +
        "        ORDER BY p.created_on " +
        "    ) p_pc " +
        ") p_pc_r " +
        "WHERE p_pc_r.rank <= :rank ",
    resultSetMapping = "PostWithCommentByRankMapping"
)
@SqlResultSetMapping(
    name = "PostWithCommentByRankMapping",
    entities = {
        @EntityResult(
            entityClass = Post.class,
            fields = {
                @FieldResult(name = "id", column = "p.id"),
                @FieldResult(name = "createdOn", column = "p.created_on"),
                @FieldResult(name = "title", column = "p.title"),
            }
        ),
        @EntityResult(
            entityClass = PostComment.class,
            fields = {
                @FieldResult(name = "id", column = "pc.id"),
                @FieldResult(name = "createdOn", column = "pc.created_on"),
                @FieldResult(name = "review", column = "pc.review"),
                @FieldResult(name = "post", column = "pc.post_id"),
            }
        )
    }
)

The @NamedNativeQuery fetches all Post entities matching the provided title along with their associated PostComment child entities. The DENSE_RANK Window Function is used to assign the rank for each Post and PostComment joined record so that we can later filter just the amount of Post records we are interested in fetching.

The SqlResultSetMapping provides the mapping between the SQL-level column aliases and the JPA entity properties that need to be populated.

Now, we can execute the PostWithCommentByRank @NamedNativeQuery like this:

List<Post> posts = entityManager
.createNamedQuery("PostWithCommentByRank")
.setParameter(
    "titlePattern",
    "High-Performance Java Persistence %"
)
.setParameter(
    "rank",
    5
)
.unwrap(NativeQuery.class)
.setResultTransformer(
    new DistinctPostResultTransformer(entityManager)
)
.getResultList();

Now, by default, a native SQL query like the PostWithCommentByRank one would fetch the Post and the PostComment in the same JDBC row, so we will end up with an Object[] containing both entities.

However, we want to transform the tabular Object[] array into a tree of parent-child entities, and for this reason, we need to use the Hibernate ResultTransformer.

The DistinctPostResultTransformer looks as follows:

public class DistinctPostResultTransformer
        extends BasicTransformerAdapter {
 
    private final EntityManager entityManager;
 
    public DistinctPostResultTransformer(
            EntityManager entityManager) {
        this.entityManager = entityManager;
    }
 
    @Override
    public List transformList(
            List list) {
             
        Map<Serializable, Identifiable> identifiableMap =
            new LinkedHashMap<>(list.size());
             
        for (Object entityArray : list) {
            if (Object[].class.isAssignableFrom(entityArray.getClass())) {
                Post post = null;
                PostComment comment = null;
 
                Object[] tuples = (Object[]) entityArray;
 
                for (Object tuple : tuples) {
                    if(tuple instanceof Identifiable) {
                        entityManager.detach(tuple);
 
                        if (tuple instanceof Post) {
                            post = (Post) tuple;
                        }
                        else if (tuple instanceof PostComment) {
                            comment = (PostComment) tuple;
                        }
                        else {
                            throw new UnsupportedOperationException(
                                "Tuple " + tuple.getClass() + " is not supported!"
                            );
                        }
                    }
                }
 
                if (post != null) {
                    if (!identifiableMap.containsKey(post.getId())) {
                        identifiableMap.put(post.getId(), post);
                        post.setComments(new ArrayList<>());
                    }
                    if (comment != null) {
                        post.addComment(comment);
                    }
                }
            }
        }
        return new ArrayList<>(identifiableMap.values());
    }
}

The DistinctPostResultTransformer must detach the entities being fetched because we are overwriting the child collection and we don’t want that to be propagated as an entity state transition:

post.setComments(new ArrayList<>());

Leave a Comment