Challenging Established Norms: Making Component Fetching the Exception

Published: 03 September 2023

Over the years, I've tried many ways of getting data into my apps—XMLHttpRequest, fetch, Apollo GraphQL, React Query, Remix loaders, React Server Components and more. These diverse perspectives have evolved my data handling strategies. I used to be a strong advocate of querying within components, but my stance has shifted.

Initially, integrating data concerns into our components seemed intuitive to me and improved the developer experience. However, I eventually found reasons to treat it as a last resort. Primarily:

Request waterfalls.
Every corner of the app is tightly coupled to API shape.

I'd like to share why I consider these things problematic and how my efforts to minimise them have led to improvements in my overall architecture.

Request waterfalls

Waterfall issues arise when a child component can't start fetching its data until its parent has completed its own data retrieval and rendered the child. This means data requests can't happen simultaneously, leading to longer page load times and what the Remix team humorously coined 'spinnageddon' for users.
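
To make the ordering concrete, here is a minimal sketch outside of React. `fakeFetch`, `loadWaterfall` and `loadParallel` are hypothetical names invented for this illustration, and the 10ms delay is an arbitrary stand-in for network latency. Each fake request records when it starts and finishes.

```typescript
type Log = string[];

// Hypothetical stand-in for a network request: records when it starts and ends.
function fakeFetch(name: string, log: Log): Promise<string> {
  log.push(`start ${name}`);
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(`end ${name}`);
      resolve(name);
    }, 10)
  );
}

// Waterfall: the child's request only starts once the parent's has finished.
async function loadWaterfall(log: Log): Promise<string[]> {
  const chat = await fakeFetch("chat", log);
  const messages = await fakeFetch("messages", log);
  return [chat, messages];
}

// Parallel: both requests are in flight before either one is awaited.
async function loadParallel(log: Log): Promise<string[]> {
  return Promise.all([fakeFetch("chat", log), fakeFetch("messages", log)]);
}
```

In the waterfall version the log reads start chat, end chat, start messages, end messages. In the parallel version both start entries appear before either end entry, so the total wait is roughly that of the slowest request rather than the sum of both.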

Thankfully, there have been significant improvements in addressing this problem. For instance, Relay encourages querying data in response to an event like a button click before the component mounts. When the component mounts it waits for that promise to resolve instead of initiating the fetch, shortening the time it takes to render the view.
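
The general idea can be sketched as follows. This is not Relay's API: `preload`, `read`, `request` and the in-memory map are hypothetical names used only to show a fetch being started in an event handler and then reused when the view mounts.

```typescript
// Hypothetical in-memory store of in-flight requests, keyed by URL.
const inFlight = new Map<string, Promise<unknown>>();

// Counts how many requests were actually started.
let requestCount = 0;

// Hypothetical stand-in for a real network call.
function request(url: string): Promise<unknown> {
  requestCount += 1;
  return Promise.resolve({ url });
}

// Called from an event handler (for example a click) before the view mounts.
function preload(url: string): void {
  if (!inFlight.has(url)) {
    inFlight.set(url, request(url));
  }
}

// Called by the view when it mounts: reuses the in-flight promise if one
// exists, and only starts a new request when nothing was preloaded.
function read(url: string): Promise<unknown> {
  preload(url);
  return inFlight.get(url)!;
}
```

Calling `preload('/chat/1')` in a click handler and `read('/chat/1')` on mount starts a single request; the view simply waits on the promise that is already in flight.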

The NextJS team similarly propose lifting fetch initiation up out of your components when using nested React Server Components that request differing data.

Other solutions such as Remix, Solid Start and SvelteKit solve this by making the router wholly responsible for initiating the requests for data.

The earlier you initiate a fetch, the better, because the sooner it starts, the sooner it can finish.
— @TkDodo

They all recommend this render-as-you-fetch approach. However, React Server Components encourage colocating your data dependencies with your components by default. If the request for a component's data isn't initiated ahead of time, we'll have a waterfall.

With sequential data fetching, requests in a route are dependent on each other and therefore create waterfalls. There may be cases where you want this pattern [...]. However, this behavior can also be unintentional and lead to longer loading times.
— NextJS Sequential Data Fetching

NextJS addresses this issue to some extent by executing route fetchers in parallel and monkey patching fetch to cache requests for the same data. However, it's important to remember that anything beyond these specific scenarios can still cause a waterfall. Additionally, it's worth noting that Suspense will show its fallback for a minimum of 500ms to prevent a sudden flash of fallback content.

While unintentionally introducing waterfall in NextJS isn't a major concern, solutions like Remix, in my view, enhance developer experience. Focusing queries solely at the route level minimises waterfall concerns and promotes a more straightforward component architecture. Before diving deeper into how, let's explore why tightly coupling our apps to the API structure may have drawbacks.

Tightly coupled data dependencies

A component with a closely tied data dependency is one that relies on a prop whose type definition matches the return type of an API, or a component that queries its own data. Using either of these methods, your component depends on the structure of the API response to display properly:

type User = {
  id: string,
  name: string,
  avatarUrl: string,
};

type MessageApiResponse = {
  id: string,
  createdAt: number,
  body: string,
  userId: User['id'],
  user: User,
};

const Message = ({ messageId }) => {
  const message: MessageApiResponse = useQuery('/message/' + messageId);
  return (
    <div>
      <img src={message.user.avatarUrl} alt={message.user.name} />
      <p>{message.body}</p>
    </div>
  );
};

// or

const Message = ({ message }: { message: MessageApiResponse }) => {
  return (
    <div>
      <img src={message.user.avatarUrl} alt={message.user.name} />
      <p>{message.body}</p>
    </div>
  );
};

If any changes are made to the API, we must update every part of our codebase that references the areas that were changed:

type User = {
    id: string;
-   name: string;
+   name: string; // @deprecated
+   firstName: string;
+   lastName: string;
    avatarUrl: string;
};

type MessageApiResponse = {
    id: string;
    createdAt: number;
    body: string;
-   userId: User['id'];
-   user: User;
+   userId: User['id']; // @deprecated
+   user: User; // @deprecated
+   authorId: User['id'];
+   author: User;
};

const Message = ({ message }: { message: MessageApiResponse }) => {
  return (
    <div>
-      <img src={message.user.avatarUrl} alt={message.user.name} />
+      <img
+        src={message.author.avatarUrl}
+        alt={`${message.author.firstName} ${message.author.lastName}`}
+      />
      <p>{message.body}</p>
    </div>
  );
};

And that might include Storybook stories or unit tests:

// construct story data in the `MessageApiResponse` shape
const message = {
-  user: { avatarUrl: '/boo.png', name: 'Jenna Smith' },
+  author: { avatarUrl: '/boo.png', firstName: 'Jenna', lastName: 'Smith' },
   body: 'hello',
};

export const MessageStory = () => <Message message={message} />;

If numerous components rely on an API structure that has changed, the result is a larger task, a more extensive pull request (and therefore a slower review process), and a more thorough regression test, because the change affects multiple parts of the codebase.

As applications grow, APIs often accumulate long-lived deprecated fields because the more components we introduce, the more dependencies we create on the existing API structure. This makes it more challenging to remove or modify things.

Having been on projects that transitioned from a RESTful API to an entirely new GraphQL API, this tight coupling became a noticeable pain point. While I can appreciate such transitions are rare, anything that affects delivery efficiency or our ability to make new choices always triggers the explorer in me.

I wondered if there could be an alternative approach. Is it possible to maintain a good developer experience without tightly coupling everything to our API structure? How would that look? To explore this, I decided to see how my architecture would be affected if I treated component querying as a last resort.

Refactoring queries out of components

I did something similar to the following for many years (although Suspense wasn't really a thing back then so each component had its own branched <Loading /> logic). It is an example of the sort of code I often see recommended as the way to architect components with data dependencies:

export default function Page() {
  const params = useParams();
  return (
    <Suspense fallback={<Loading />}>
      <Chat userId={params.userId} />
    </Suspense>
  );
}

function Chat({ userId }) {
  const chat = useQuery('/chat/' + userId);
  return (
    <div>
      <h2>
        <Suspense fallback={<AvatarFallback />}>
          <Avatar userId={userId} />
        </Suspense>
        {chat.owner.firstName}'s chat
      </h2>
      <div>
        <Suspense fallback={<FriendsListFallback />}>
          <FriendsList userId={userId} />
        </Suspense>
        <Messages chatId={chat.id} />
      </div>
    </div>
  );
}

function Messages({ chatId }) {
  const messages = useQuery('/messages/' + chatId);
  return (
    <div>
      {messages?.map((message) => (
        <Message key={message.id} chatId={chatId} messageId={message.id} />
      ))}
    </div>
  );
}

function Message({ chatId, messageId }) {
  const message = useQuery(`/message/${chatId}/${messageId}`);
  return (
    <p>
      <Avatar userId={message.authorId} />: {message.body}
    </p>
  );
}

function FriendsList({ userId }) {
  const friends = useQuery('/friends/' + userId);
  return (
    <ul>
      {friends?.map((friend) => (
        <li key={friend.id}>{friend.firstName}</li>
      ))}
    </ul>
  );
}

function Avatar({ userId }) {
  const user = useQuery('/user/' + userId);
  return <img src={user.avatarUrl} alt="" />;
}

Besides the fact that introducing any state in Chat will re-render our entire application and that reasoning about this page requires heavy use of "Go to definition" 🙃, it also brings about the waterfall and tight coupling problems mentioned earlier that I'd like to try and avoid.

Step one - remove waterfall

If we're treating querying in components as a last resort, one option is to pass the data they need via props, an approach known as prop drilling.

export default function Page() {
  // userId comes from the route params, as in the previous example
  const { userId } = useParams();
  const chat = useQuery('/chat/' + userId);
  const friends = useQuery('/friends/' + userId);
  return <Chat chat={chat} friends={friends} />;
}

function Chat({ chat, friends }) {
  return (
    <div>
      <h2>
        <Avatar user={chat.owner} /> {chat.owner.firstName}'s chat
      </h2>
      <div>
        <FriendsList friends={friends} />
        <Messages messages={chat.messages} />
      </div>
    </div>
  );
}

function Messages({ messages }) {
  return (
    <div>
      {messages?.map((message) => (
        <Message key={message.id} message={message} />
      ))}
    </div>
  );
}

function Message({ message }) {
  return (
    <p>
      <Avatar user={message.author} />: {message.body}
    </p>
  );
}

function FriendsList({ friends }) {
  return (
    <ul>
      {friends?.map((friend) => (
        <li key={friend.id}>{friend.firstName}</li>
      ))}
    </ul>
  );
}

function Avatar({ user }) {
  return <img src={user.avatarUrl} alt="" />;
}

With this approach, data must be loaded upfront. The advantages include:

No waterfall.
Don't have to worry about n+1 problem because we are encouraged to query collections instead of individual Message items.
A reduction in data requests (from 5+ to just 2), as we can utilise data already available on the chat and friends endpoints.

In previous examples, we could have employed techniques like queryClient.setQueryData from React Query to manually update the cache and reduce subsequent data requests. However, I hope I'm not alone in appreciating the simplicity of the approach described above. There's no need to massage a cache, and no new APIs to learn.

On the downside:

Our entire app is still tightly coupled to the API response structure.
Changing state in Chat would trigger a full app re-render.
We're prop drilling.

This multi-layer plumbing is often considered a necessity with the route loader approach, which is not the ideal developer experience. Fortunately, there's a simpler way, aligning with how Dan Abramov has previously described React Server Component architecture—the hole pattern.

Step two - remove prop drilling and data coupling

Let's refactor to use composition/compound components to minimise dependencies on the API shape and avoid prop drilling:

function Page() {
  // userId comes from the route params, as in the previous examples
  const { userId } = useParams();
  const chat = useQuery('/chat/' + userId);
  const friends = useQuery('/friends/' + userId);

  return (
    <Chat>
      <ChatTitle avatarUrl={chat.owner.avatarUrl} name={chat.owner.firstName} />
      <div>
        <FriendsList>
          {friends?.map((friend) => (
            <FriendsListUser key={friend.id}>
              {friend.firstName}
            </FriendsListUser>
          ))}
        </FriendsList>
        <ChatMessages>
          {chat.messages?.map((message) => (
            <ChatMessage
              key={message.id}
              avatarUrl={message.author.avatarUrl}
              name={message.author.firstName}
            >
              {message.body}
            </ChatMessage>
          ))}
        </ChatMessages>
      </div>
    </Chat>
  );
}

function Chat({ children }) {
  return <div>{children}</div>;
}

function ChatTitle({ avatarUrl, name }) {
  return (
    <h2>
      <Avatar src={avatarUrl} />
      {name}'s chat
    </h2>
  );
}

function ChatMessages({ children }) {
  return <div>{children}</div>;
}

function ChatMessage({ avatarUrl, name, children }) {
  return (
    <div>
      <Avatar src={avatarUrl} alt={name} />: {children}
    </div>
  );
}

function FriendsList({ children }) {
  return <ul>{children}</ul>;
}

function FriendsListUser({ children }) {
  return <li>{children}</li>;
}

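// alt defaults to an empty string so decorative avatars stay hidden from screen readers
function Avatar({ src, alt = '' }) {
  return <img src={src} alt={alt} />;
}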

By maintaining a flat owner tree and keeping our components dumb, we enjoy several benefits:

No waterfall.
Don't have to worry about n+1 problem.
Only two network requests without cache massaging.
No more prop drilling.
Any changes to state in our leaf components, such as Chat, will only affect those specific components, rather than triggering a full app re-render.
Reduced dependencies on API structure, minimising the risk of API changes affecting the entire system.
Ability to gradually incorporate API updates on a per-page basis, reducing testing scope and enhancing code review efficiency.
Centralised visibility of the API fields in use, all in one place (Page).
Clearer understanding of the page's likely appearance because most of the UI is rendered in Page.

Furthermore, if we want to display a custom Message in Storybook, we can do so easily without needing to understand the API structure in advance:

return (
  <ChatMessage avatarUrl="/boo.png" name="Jenna">
    hello
  </ChatMessage>
);
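
To illustrate the reduced coupling, here is a minimal sketch without JSX. `ChatMessageProps` and `toChatMessageProps` are hypothetical names that are not part of the examples above; they stand in for the mapping that Page performs inline. The API types follow the `author` shape from the earlier diff.

```typescript
// API shape, following the `author` version of the response shown earlier.
type User = {
  id: string;
  firstName: string;
  lastName: string;
  avatarUrl: string;
};

type MessageApiResponse = {
  id: string;
  createdAt: number;
  body: string;
  authorId: User["id"];
  author: User;
};

// Props the presentational component accepts; it knows nothing about the API.
type ChatMessageProps = { avatarUrl: string; name: string; body: string };

// Hypothetical page-level mapper: the only place that reads API field names.
function toChatMessageProps(message: MessageApiResponse): ChatMessageProps {
  return {
    avatarUrl: message.author.avatarUrl,
    name: message.author.firstName,
    body: message.body,
  };
}
```

If the API renames `author` again, only this mapping (or the equivalent inline code in Page) has to change. The component, its stories and its tests keep working against `ChatMessageProps`.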

I like this, but there are a few burning questions...

What if we need to reuse a component in multiple places, like the avatar?

Connecting the same data every time we use Avatar may not seem efficient, but in practice, components aren't always wired up exactly the same way when architecting this way. For instance, the src for the avatar comes from different sources depending on where it's used: one from chat.owner.avatarUrl and another from message.author.avatarUrl.

If we were to fetch data directly within the Avatar component, it could lead to an assumption that chat.owner.id and message.author.id will always be identifiers for a User type, resulting in queries to the /user endpoint for all avatars. In my view, scalability benefits from avoiding these assumptions.

What if we wanted to allow users to specify different avatars when displayed as a chat owner versus a message author? While this example is contrived, I believe the back end would ideally be responsible for defining data relationships as much as possible, not the API consumer, if we want to minimise the impact of future changes.

This approach might appear to involve a lot of copying with minor adjustments, but a bit of boilerplate is perfectly fine, I promise. I recommend resisting the urge to DRY in this context.

Like with copy paste, we are duplicating parts of code to avoid introducing dependencies, gain flexibility, and pay for it in verbosity.
— Write code that is easy to delete, not easy to extend

What if we need to transform data to wire a reused component? That's a lot to copy/paste.

I highly recommend watching Type safe Backend-For-Frontends by Maciek Pękala. Specifically the points about shifting your business logic and data transformations into a BFF that provides data in the exact format required by your views.

For example, your view would not be responsible for computing something like const isRecent = data.timestamp > Date.now() - (7 * 24 * 60 * 60 * 1000) to determine if an item is less than 7 days old. Instead, the BFF would handle that computation and provide a data.isRecent value.

In Remix, your loaders serve as the BFF, enabling you to create reusable functions that loaders can call when necessary. If you don't have a BFF, using hooks is a good way to maintain reusable logic.
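
As a rough sketch of that kind of reusable function, assuming a hypothetical item shape with a millisecond `timestamp` (the type and function names here are invented for illustration):

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical shape returned by the upstream API.
type ItemApiResponse = { id: string; timestamp: number };

// Hypothetical shape the view receives: the computation is already done.
type ItemView = { id: string; isRecent: boolean };

// BFF-side helper. `now` is a parameter so the function stays pure and testable.
function toItemView(item: ItemApiResponse, now: number = Date.now()): ItemView {
  return {
    id: item.id,
    // An item is recent when it is less than 7 days old.
    isRecent: now - item.timestamp < SEVEN_DAYS_MS,
  };
}
```

A loader could call `toItemView` on each item before returning, so the view only ever reads `isRecent` and never sees the raw timestamp.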

What if querying all data upfront slows performance?

If you encounter performance issues with a specific endpoint, consider exploring options like a caching service, CDN cache, or utilising HTML streaming for specific components.

React Router V6 offers a defer API for streaming, which doesn't require data querying in components. It might also be possible to create a React Server Component (RSC) that functions similarly to Remix's Await component for streaming within NextJS, eliminating the need for querying in NextJS components.
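
The underlying idea can be sketched in plain TypeScript. This is not the React Router or Remix API: `loadPage`, `fetchChatTitle` and `fetchFriends` are hypothetical names, and the 20ms delay stands in for a slow endpoint. The route-level loader awaits the critical data and hands the slow data to the view as a promise that is still pending.

```typescript
type Friend = { id: string; firstName: string };

// Critical data is resolved; slow data is passed along as a pending promise.
type PageData = { chatTitle: string; friends: Promise<Friend[]> };

// Hypothetical fast endpoint.
const fetchChatTitle = (): Promise<string> => Promise.resolve("Jenna's chat");

// Hypothetical slow endpoint.
const fetchFriends = (): Promise<Friend[]> =>
  new Promise((resolve) =>
    setTimeout(() => resolve([{ id: "1", firstName: "Sam" }]), 20)
  );

// Route-level loader: both requests start immediately, only one is awaited.
async function loadPage(): Promise<PageData> {
  const friends = fetchFriends(); // started, deliberately not awaited
  const chatTitle = await fetchChatTitle();
  return { chatTitle, friends };
}
```

The view can render the title straight away and reveal the friends list when its promise settles, while every request is still initiated at the route level.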

If none of these alternatives are viable, and I can demonstrate a substantial performance improvement by querying in the component, then I will consider it as a last resort.

What if we need to poll/subscribe to data changes?

Remix provides a convenient useRevalidator hook for this purpose. However, if that's not available, this is one scenario where I would consider querying in a component more readily to limit the extent of revalidation and re-rendering on screen.

Conclusion

Considering data fetching in components as an optimisation, to be employed only when problems arise, has enhanced the scalability and maintainability of my application architectures.

Consequently, while React Server Components (RSCs) offer the option to query within components, I'll be prioritising route-based queries in NextJS whenever feasible.

This approach, which encourages keeping views flat, has not only made codebases much easier for me to reason about but has also contributed to enhanced performance.

This perspective may not align with everyone's views, but I'm hoping it's beneficial to some 💛