Next.js DataLoader
What GraphQL Got Right About Data Fetching and How to Steal It for Next.js
React Server Components changed how we think about data fetching in Next.js. Every component can be async. Every component can hit the database directly. No more prop drilling massive objects through the tree.
But there's a catch that isn't discussed enough: this pattern has a serious performance problem hiding in plain sight.
The beauty of async components
Consider a page that renders a list of articles, each showing its author. With Server Components, the natural approach is to let each component fetch what it needs:
async function Page() {
const articleIds = await db.getArticleIds(); // 1000 IDs
return articleIds.map((id) => (
<Suspense key={id} fallback={<Skeleton />}>
<Article id={id} />
</Suspense>
));
}
async function Article({ id }: { id: string }) {
const article = await db.getArticle(id);
return (
<article>
<h2>{article.title}</h2>
<Suspense fallback={<Skeleton />}>
<Author id={article.authorId} />
</Suspense>
</article>
);
}
async function Author({ id }: { id: string }) {
const author = await db.getAuthor(id);
return <span>{author.name}</span>;
}This reads well. Each component is self-contained. Article doesn't need the page to pre-fetch anything — it just needs an id and takes care of the rest. Author works the same way. You can move these components around, compose them differently, and they keep working.
Now count the queries: 1 for the IDs, 1000 for the articles, 1000 for the authors. 2001 database round-trips to render a single page.
This is the async N+1 problem. Every component independently fetches its own data without coordinating with its siblings. React renders them concurrently behind Suspense boundaries, so you get a flood of parallel queries hitting your database all at once.
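Stripped of React and Suspense, the same access pattern can be counted directly. A minimal sketch with a stubbed db that tallies round-trips (the names and data here are illustrative, not a real schema):

```typescript
type Article = { id: string; title: string; authorId: string };
type Author = { id: string; name: string };

let roundTrips = 0;

// Stub db: every call stands in for one database round-trip.
const db = {
  async getArticleIds(): Promise<string[]> {
    roundTrips++;
    return ["a1", "a2", "a3", "a4", "a5"];
  },
  async getArticle(id: string): Promise<Article> {
    roundTrips++;
    return { id, title: `Article ${id}`, authorId: `u-${id}` };
  },
  async getAuthor(id: string): Promise<Author> {
    roundTrips++;
    return { id, name: `Author ${id}` };
  },
};

// The component tree's access pattern, minus React: one query for the
// IDs, then one per article, then one per author — all concurrent.
async function renderPage() {
  const ids = await db.getArticleIds();
  await Promise.all(
    ids.map(async (id) => {
      const article = await db.getArticle(id);
      await db.getAuthor(article.authorId);
    }),
  );
}

await renderPage();
console.log(roundTrips); // 1 + 5 + 5 = 11 round-trips for 5 articles
```

Scale the stub to 1000 articles and the count becomes the 2001 from above.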
The conventional fix — and why it hurts
The standard advice is to hoist data fetching to the page level. Fetch everything upfront and pass it down:
async function Page() {
const articles = await db.getAllArticles();
const authorIds = [...new Set(articles.map((a) => a.authorId))];
const authors = await db.getAuthors(authorIds);
const authorsById = new Map(authors.map((a) => [a.id, a]));
return articles.map((article) => (
<Article
key={article.id}
article={article}
author={authorsById.get(article.authorId)!}
/>
));
}
function Article({ article, author }: { article: Article; author: Author }) {
return (
<article>
<h2>{article.title}</h2>
<Author author={author} />
</article>
);
}
function Author({ author }: { author: Author }) {
return <span>{author.name}</span>;
}Two queries. Problem solved. But look at what happened to the architecture.
The page is now tightly coupled to the entire component tree. It needs to know that Article renders an Author, which requires author data. If Author starts displaying the author's company, the page query needs to change. If you add a Comments section inside Article, the page has to fetch comments too. Every new data requirement in a leaf component bubbles up to the page.
Components are no longer self-contained. Article takes { article, author } as props — it doesn't declare its own data needs, it receives everything pre-fetched from above. Move it to a different page and you have to replicate the same fetching logic there.
In a real app with deep component trees, this creates a maintenance problem. The page becomes a god component that knows the data requirements of every descendant. Refactoring a nested component means updating the page query. The whole point of component-driven architecture — encapsulation — is gone.
How GraphQL solved this
This problem isn't new. It's the exact tension that led Facebook to build GraphQL.
In a GraphQL architecture, each component declares its data requirements using fragments:
// Page composes fragments into a single query
function Page() {
const data = useQuery(graphql`
query PageQuery {
articles {
...ArticleCard
}
}
`);
return data.articles.map((article) => (
<Article key={article.id} article={article} />
));
}
// Article declares what it needs + spreads Author's fragment
function Article({ article }: { article: ArticleCardFragment$key }) {
const data = useFragment(graphql`
fragment ArticleCard on Article {
title
author {
...AuthorDisplay
}
}
`, article);
return (
<article>
<h2>{data.title}</h2>
<Author author={data.author} />
</article>
);
}
// Author declares what it needs
function Author({ author }: { author: AuthorDisplayFragment$key }) {
const data = useFragment(graphql`
fragment AuthorDisplay on Author {
name
avatar
}
`, author);
return <span>{data.name}</span>;
}The page composes these fragments into a single query. Each component declares what data it needs — locally, right next to the UI that uses it — but the actual fetching happens in one optimized request. Add a field to Author and only the Author fragment changes. The page query automatically picks it up.
On the server side, GraphQL uses DataLoader to solve the N+1 problem. When resolvers independently call loader.load(id) for their data, DataLoader collects all the IDs within the same microtask and dispatches a single batched query. Thousands of individual fetches become a handful of WHERE id IN (...) calls.
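The batching mechanism itself is small enough to sketch. This is a simplified toy, not the real dataloader package: load() queues a key and schedules one flush on the microtask queue, so every call made in the same tick joins a single batch:

```typescript
// Toy batcher illustrating DataLoader's core trick (illustration only).
class MiniLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  constructor(private batchFn: (keys: K[]) => Promise<V[]>) {}

  load(key: K): Promise<V> {
    return new Promise<V>((resolve) => {
      if (this.queue.length === 0) {
        // First load() this tick: flush once the microtask queue drains.
        queueMicrotask(() => this.flush());
      }
      this.queue.push({ key, resolve });
    });
  }

  private async flush() {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map((e) => e.key));
    batch.forEach((e, i) => e.resolve(values[i]));
  }
}

// Three concurrent load() calls become one batched "query".
const batches: string[][] = [];
const loader = new MiniLoader<string, string>(async (ids) => {
  batches.push(ids); // stand-in for: WHERE id IN (...)
  return ids.map((id) => `row:${id}`);
});

const rows = await Promise.all([
  loader.load("a"),
  loader.load("b"),
  loader.load("c"),
]);
console.log(batches.length); // 1 — all three keys collected into one batch
```

The real library adds caching, maxBatchSize, and error handling on top, but the microtask-collection idea is exactly this.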
This is a powerful combination: fragments for colocating data requirements, DataLoader for efficient batching. But it comes with the full weight of a GraphQL layer — schema definitions, resolvers, a query language, a client library.
What if we could get the batching benefit directly in Next.js?
DataLoader without GraphQL
It turns out DataLoader isn't tied to GraphQL at all. It's a generic utility: collect .load(id) calls within a microtask, dispatch them as a batch. And React Server Components behind Suspense boundaries create the exact same execution pattern as GraphQL resolvers — multiple async functions running concurrently, each requesting their own data.
First, create request-scoped loaders using React's cache():
import DataLoader from "dataloader";
import { cache } from "react";
const getArticleLoader = cache(
() =>
new DataLoader<string, Article>(async (ids) => {
const articles = await db.getArticles([...ids]);
return ids.map((id) => articles.find((a) => a.id === id)!);
}, { maxBatchSize: 100 }),
);
const getAuthorLoader = cache(
() =>
new DataLoader<string, Author>(async (ids) => {
const authors = await db.getAuthors([...ids]);
return ids.map((id) => authors.find((a) => a.id === id)!);
}, { maxBatchSize: 100 }),
);

cache() is the key piece. It memoizes per request — call getArticleLoader() a thousand times during the same render and you get the same instance every time. It's a request-scoped singleton. Without it, each component would create its own DataLoader and there'd be nothing to batch.
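To see why per-request memoization matters, here is a rough approximation of the semantics cache() provides, built on Node's AsyncLocalStorage purely for illustration (React's actual implementation differs):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// One memo table per "request": the same factory returns the same
// instance within a request, and a fresh one across requests.
const requestStore = new AsyncLocalStorage<Map<Function, unknown>>();

function perRequest<T>(factory: () => T): () => T {
  return () => {
    const memo = requestStore.getStore();
    if (!memo) return factory(); // outside a request: no memoization
    if (!memo.has(factory)) memo.set(factory, factory());
    return memo.get(factory) as T;
  };
}

const getLoader = perRequest(() => ({ createdAt: Math.random() }));

// Two calls inside one simulated request share an instance...
const [a, b] = requestStore.run(new Map(), () => [getLoader(), getLoader()]);
// ...while a second request gets its own.
const [c] = requestStore.run(new Map(), () => [getLoader()]);

console.log(a === b, a === c); // true false
```

That second property — a fresh loader per request — is what keeps DataLoader's cache from leaking data between users.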
Now the components stay almost identical to the original version — self-contained, each loading its own data — but they call loader.load(id) instead of hitting the database directly:
async function Page() {
const articleIds = await db.getArticleIds();
return articleIds.map((id) => (
<Suspense key={id} fallback={<Skeleton />}>
<Article id={id} />
</Suspense>
));
}
async function Article({ id }: { id: string }) {
const article = await getArticleLoader().load(id);
return (
<article>
<h2>{article.title}</h2>
<Suspense fallback={<Skeleton />}>
<Author id={article.authorId} />
</Suspense>
</article>
);
}
async function Author({ id }: { id: string }) {
const author = await getAuthorLoader().load(id);
return <span>{author.name}</span>;
}Same structure as the first example. Components receive IDs as props, not pre-fetched objects. Each component loads its own data and passes IDs down. But behind the scenes, DataLoader collects all those .load() calls and batches them: 1 query for IDs, ~10 batched queries for articles, ~10 for authors. ~21 queries total instead of 2001.
Components stay independent. No coupling between the page and its descendants. And no GraphQL layer required.
See it in action
The demo below simulates 300 components each fetching an article and then its author. Each query runs against a simulated connection pool with configurable latency and jitter. In sequential mode, each component fires its own query — but only as many can run in parallel as there are pool slots, so dots advance a few at a time. In DataLoader mode, all IDs are collected into batches of 100 and each batch uses a single pool slot, so entire waves resolve at once.
Adjust the parameters and compare — try cranking the pool size up to see sequential get faster, then notice DataLoader barely changes because it was already making very few queries.
Each dot represents a component: gray (pending) → orange (article loaded) → green (complete).
When does this make sense?
This pattern fits naturally when:
- Components at the same tree depth need similar data
- You have batch-capable data access (most ORMs support WHERE id IN (...))
- You want components to own their data dependencies without coupling the page to the tree
If you know exactly what data a page needs and can express it in one query, do that. But for component-driven architectures where the page shouldn't dictate what every descendant needs, DataLoader gives you batching without sacrificing encapsulation.
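One detail of the batch function is worth spelling out: DataLoader expects the returned array to match the keys in length and order, with an Error (not undefined) for any missing row — databases return rows in arbitrary order and silently drop absent IDs. A sketch against a stubbed query (the table contents and names are hypothetical):

```typescript
type Author = { id: string; name: string };

// Stand-in for e.g. SELECT * FROM authors WHERE id = ANY($1):
// rows come back in any order, and missing IDs are simply absent.
async function queryAuthors(ids: readonly string[]): Promise<Author[]> {
  const table: Author[] = [
    { id: "u2", name: "Ada" },
    { id: "u1", name: "Linus" },
  ];
  return table.filter((a) => ids.includes(a.id));
}

// Re-align rows to the requested key order, one entry per key.
async function batchAuthors(
  ids: readonly string[],
): Promise<(Author | Error)[]> {
  const rows = await queryAuthors(ids);
  const byId = new Map(rows.map((a) => [a.id, a])); // O(1) lookup per key
  return ids.map((id) => byId.get(id) ?? new Error(`Author ${id} not found`));
}

const result = await batchAuthors(["u1", "u2", "u3"]);
console.log(result.map((r) => (r instanceof Error ? "missing" : r.name)));
// → [ 'Linus', 'Ada', 'missing' ]
```

The Map-based re-alignment also avoids the O(n²) cost of calling find() per key, which matters once batches approach the maxBatchSize of 100.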
It's the same philosophy as GraphQL fragments — declare what you need where you need it — without the GraphQL layer.
The full experiment is at github.com/barodeur/nextjs-dataloader.