Why Should You Care Which Layer Processes Your Data?

Let's say you have a REST endpoint that returns a list of users based on certain criteria. For example, if a client asks for inactive users, it returns only the users whose accounts are inactive.

The service-layer code for this, written in Spring Boot, looks like this:

public List<User> getInactiveUsersProgrammatically() {
    // 1. Fetch ALL users from the database
    List<User> allUsers = userRepository.findAll();

    // 2. Filter the list in Java code (Stream API)
    List<User> inactiveUsers = allUsers.stream()
        .filter(user -> !user.isActive()) // Keep users where active is false
        .collect(Collectors.toList());

    return inactiveUsers;
}

Do you see a problem with this code?

The obvious answer is no, correct? The code compiles fine and gives the desired result. But there is a subtle issue: it loads every user into the server's memory and only then filters out the inactive ones. If the list of users grows into the millions, this can crash the system. Now imagine AI generates this code and someone uses it as is. Everyone is happy in the dev environment, until they discover that things behave very differently in pre-prod or production.
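To make the cost concrete, here is a minimal sketch in plain Java. It is a toy simulation, not a real database: `FakeUserTable`, `findInactiveAtSource()`, and `rowsTransferred()` are hypothetical names invented for this illustration, and the counter simply records how many rows would cross from the "database" into application memory under each approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Toy stand-in for a users table (an assumption for illustration only).
// It counts how many rows are handed over to the application.
class FakeUserTable {
    static final class User {
        final long id;
        final boolean active;
        User(long id, boolean active) { this.id = id; this.active = active; }
        boolean isActive() { return active; }
    }

    private final List<User> rows = new ArrayList<>();
    private long rowsTransferred = 0;

    // Every "inactiveEvery"-th user is inactive, all others are active.
    FakeUserTable(int totalUsers, int inactiveEvery) {
        for (int i = 1; i <= totalUsers; i++) {
            rows.add(new User(i, i % inactiveEvery != 0));
        }
    }

    // Mimics findAll(): every row ends up in application memory.
    List<User> findAll() {
        rowsTransferred += rows.size();
        return new ArrayList<>(rows);
    }

    // Mimics a filtered query: the filter runs at the source,
    // so only matching rows are handed over.
    List<User> findInactiveAtSource() {
        List<User> matches = rows.stream()
            .filter(user -> !user.isActive())
            .collect(Collectors.toList());
        rowsTransferred += matches.size();
        return matches;
    }

    long rowsTransferred() { return rowsTransferred; }
}

class FilterPlacementDemo {
    public static void main(String[] args) {
        FakeUserTable appSide = new FakeUserTable(1_000_000, 100);
        long filteredInApp = appSide.findAll().stream()
            .filter(user -> !user.isActive())
            .count();

        FakeUserTable dbSide = new FakeUserTable(1_000_000, 100);
        long filteredAtSource = dbSide.findInactiveAtSource().size();

        System.out.println("Filter in app:    " + filteredInApp
            + " results, " + appSide.rowsTransferred() + " rows loaded");
        System.out.println("Filter at source: " + filteredAtSource
            + " results, " + dbSide.rowsTransferred() + " rows loaded");
    }
}
```

Both approaches return the same inactive users, but the first one has to hold every row in memory to get there. In a real system those rows also have to travel over the network and be mapped into objects, which this toy does not model.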

How do you solve this type of issue?

As a rule of thumb, always try to filter data at the DB layer instead of bringing it into the application layer and filtering it there. This sounds easy, but imagine you have nested objects like

Course → Student → User

and you are tasked with finding the courses that are abandoned or not completed by inactive users. It is tempting to think about one entity at a time:

Get the courses and the students enrolled in them
Get the users associated with the student role
Finally, keep only those whose status is inactive

This algorithm looks neat, but it has the same issue: on a system with lots of records, it will either slow things down or break the system itself.
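One way to avoid the entity-by-entity walk is to let the database do the joins and the filtering in a single query. The sketch below uses plain JDBC so that it compiles with nothing but the JDK. Everything about the schema is an assumption made for illustration: the table names (`courses`, `students`, `users`), the column names, and the status values `ABANDONED` and `IN_PROGRESS` are hypothetical, as is the class name `UnfinishedCourseQuery`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class UnfinishedCourseQuery {

    // Hypothetical schema: students links a course to a user and carries
    // the completion status. The joins and both filters run in the database.
    static final String SQL =
          "SELECT DISTINCT c.id, c.title "
        + "FROM courses c "
        + "JOIN students s ON s.course_id = c.id "
        + "JOIN users u ON u.id = s.user_id "
        + "WHERE u.active = ? "
        + "AND s.status IN (?, ?)";

    // Returns the titles of courses left unfinished by inactive users.
    // Only the matching course rows are sent back to the application.
    static List<String> findUnfinishedCourseTitles(Connection connection)
            throws SQLException {
        List<String> titles = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setBoolean(1, false);          // inactive users only
            statement.setString(2, "ABANDONED");     // assumed status value
            statement.setString(3, "IN_PROGRESS");   // assumed status value
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    titles.add(rows.getString("title"));
                }
            }
        }
        return titles;
    }
}
```

In a Spring Data JPA project the same join would typically be written as a query on a repository method rather than as raw JDBC. The point is the same either way: one query, with the filtering done by the database.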

For the inactive-users problem discussed above, performant code looks like this:

// Service layer code
public List<User> getInactiveUsers() {
    // Calls the derived query method from the repository
    return userRepository.findByActiveFalse();
}

// Repository layer code (separate file)
import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends JpaRepository<User, Long> {

    /**
     * Spring Data JPA derives the query from the method name. The resulting
     * SQL is roughly: SELECT * FROM users WHERE active = false
     * (the actual table and column names depend on the entity mapping).
     *
     * @return A list of User objects where the 'active' column is false.
     */
    List<User> findByActiveFalse();

    // Alternatively, you could use:
    // List<User> findByActive(boolean active);
    // and call it with findByActive(false);
}

Another point worth understanding: processing the same data in the application layer can easily take 3 to 4 times longer, because a lot of time is spent moving rows across the JDBC bridge, a cost that is usually hidden from application developers. So load only the data that the application layer really needs.
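Loading selectively applies to columns as well as rows. As a rough sketch, again in plain JDBC and again under assumptions, the query below asks only for the two columns the caller needs instead of the whole row. The `email` column, the `users` table name, and the `InactiveUserContacts` and `UserContact` classes are hypothetical and exist only for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class InactiveUserContacts {

    // Small value class holding only what the caller needs.
    static final class UserContact {
        final long id;
        final String email;
        UserContact(long id, String email) { this.id = id; this.email = email; }
    }

    // Hypothetical table and columns: two named columns instead of SELECT *,
    // and the row filter is applied by the database.
    static final String SQL = "SELECT id, email FROM users WHERE active = ?";

    static List<UserContact> load(Connection connection) throws SQLException {
        List<UserContact> contacts = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setBoolean(1, false); // inactive users only
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    contacts.add(new UserContact(rows.getLong("id"),
                                                 rows.getString("email")));
                }
            }
        }
        return contacts;
    }
}
```

Fewer columns means less data crossing the JDBC bridge for each row, on top of the savings from fetching fewer rows.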

Beware of copy-pasting AI-generated code as is. These systems do not yet reliably produce the best code for a given problem, so you need to know the subtleties of your own domain to judge what they give you.


Why you care ? Which layer should process your data was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.
