From 6 GB to 2 GB: How We Tamed a Memory Beast in Node.js
A deep dive into how a single algorithmic decision silently inflated our app's memory, and the three-phase journey that brought it back under control.

A Number That Made Us Stop
During a routine load test, we watched our Node.js application climb to 6 GB of RAM at peak. No memory leak in the traditional sense. No runaway loop we forgot to close. Just... the app doing exactly what we built it to do, and eating memory for breakfast because of how we built it.
By the end of this story, peak memory sat at 2 GB. Bootstrap time dropped from **4 minutes to ~40 seconds**. CPU load dropped noticeably. GC pauses shortened. All without rewriting the business logic.
Here's every decision that got us there.
First, Let Me Explain the App's Anatomy
Before we get to the fix, you need to understand the problem space, because without that context, the solution looks like magic.
Our application manages a complex configuration system. Think of it as a mirror of the database: every table in our DB has a corresponding representation in the app. On top of that representation, we layer customization, filtering, access rules, and relationship logic.
These configurations don't live in isolation. They form a graph: nodes connected to other nodes, representing how database tables relate to each other. One configuration node can depend on three others. Those three can each depend on five more. And so on. It's a deep, wide, interconnected web.
We support 15+ data sources, each with many tables. Total nodes in the graph: 100+. And these data sources also reference each other, so the graph isn't just deep within one source; it spans across all of them.
This design is actually a win for the team: adding a new data schema or table is just a matter of writing a JSON config. No new code. The graph handles the rest. It's expressive, fast to iterate on, and easy for engineers to reason about in isolation.
The problem is what happens when you boot the whole thing up at once.
The Bootstrap Phase: Where It All Starts
When the app starts, it needs to do a significant amount of CPU-intensive work to make the graph usable:
Link all nodes together across data sources
Resolve dependency order: if Table A depends on Table B, B must be ready before A
Detect and handle circular dependencies: if A depends on B and B depends on A, we need a strategy, not an infinite loop
Validate existence: make sure every referenced table actually exists in the schema
Build traversal paths, so runtime queries can walk the graph efficiently
This is unavoidable work. We need to do it. The question is how we do it.
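To make the ordering, cycle-detection, and validation steps concrete, here is a minimal sketch. It is not our production code: the table names, the `resolveOrder` function, and the choice to report cycles rather than throw are all assumptions made for illustration.

```javascript
// Hypothetical config graph: each table maps to the tables it depends on.
const graph = new Map([
  ["orders", ["customers", "products"]],
  ["customers", ["regions"]],
  ["products", []],
  ["regions", []],
]);

// Returns tables ordered so that every dependency comes before its dependents.
// Throws if a referenced table does not exist; records cycles instead of looping forever.
function resolveOrder(graph) {
  const order = [];
  const cycles = [];
  const state = new Map(); // table -> "visiting" | "done"

  function visit(table, path) {
    if (!graph.has(table)) {
      throw new Error(`Unknown table referenced: ${table}`);
    }
    if (state.get(table) === "done") return;
    if (state.get(table) === "visiting") {
      // We came back to a table that is still on the current path: a cycle.
      cycles.push([...path, table].join(" -> "));
      return;
    }
    state.set(table, "visiting");
    for (const dep of graph.get(table)) visit(dep, [...path, table]);
    state.set(table, "done");
    order.push(table);
  }

  for (const table of graph.keys()) visit(table, []);
  return { order, cycles };
}

const { order, cycles } = resolveOrder(graph);
console.log(order);  // [ 'regions', 'customers', 'products', 'orders' ]
console.log(cycles); // []
```

Tracking a "visiting" state separately from "done" is what lets the sketch tell a true cycle apart from a table that is simply shared by two parents.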
The Root Cause: A DFS Algorithm With an Expensive Habit
To resolve dependencies and build the full graph, we used a Depth-First Search (DFS) algorithm. DFS is a natural fit here: you start at a root node, explore all its children, then their children, then backtrack. It handles circular dependency detection cleanly with a "visited" set.
The algorithm roughly worked like this:
resolve(node):
    if node is in visitedSet → skip (circular dep detected)
    mark node as visited
    for each child of node:
        resolvedChild = cloneDeep(child)   ← 🚨 here's the problem
        attach resolvedChild to node
        resolve(resolvedChild)
See that cloneDeep? That was the catastrophic habit.
Why We Were Using cloneDeep
The intent was reasonable. Each node in the graph can carry a different object shape depending on its relationship with its parent. Node B as a child of A might have different metadata attached than B as a child of D. If we just passed around the same reference, mutating the object for one parent would silently corrupt it for another.
So we reached for cloneDeep from Lodash to create a full independent copy of each node before attaching it.
What cloneDeep Actually Does Under the Hood
A deep clone is not a simple copy. It's a full recursive traversal of an object graph. Here's the mental model:
cloneDeep(obj):
    result = {}
    for each key in obj:
        if obj[key] is a primitive → result[key] = obj[key]
        if obj[key] is an object → result[key] = cloneDeep(obj[key])   ← recursion
    return result
It's literally another DFS, inside each node, to recreate every nested value. For a small object, this is trivial. For a node in our graph that might carry nested configuration objects, metadata arrays, nested relation descriptors, and more? It becomes very expensive, very fast.
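A runnable version of that mental model makes the allocation cost visible. This is a simplified sketch, not Lodash's implementation: it handles only plain objects and arrays, and the `stats` counter and the node shape are invented for illustration.

```javascript
// Minimal deep clone for plain objects and arrays (no cycles, Dates, Maps, etc.).
// stats.created counts how many new containers the clone has to allocate.
function deepClone(value, stats) {
  if (value === null || typeof value !== "object") return value; // primitives are copied as-is
  stats.created += 1;
  if (Array.isArray(value)) return value.map((item) => deepClone(item, stats));
  const result = {};
  for (const key of Object.keys(value)) {
    result[key] = deepClone(value[key], stats); // recursion into every nested value
  }
  return result;
}

// Hypothetical node shape, for illustration only.
const node = {
  name: "tableA",
  config: { filters: [{ field: "status", op: "eq" }], access: { roles: ["admin"] } },
  relations: [{ target: "tableB", meta: { kind: "one-to-many" } }],
};

const stats = { created: 0 };
const copy = deepClone(node, stats);
console.log(stats.created);               // 9 new objects/arrays for one small node
console.log(copy.config !== node.config); // true: nothing is shared with the original
```

Even this small node needs nine fresh allocations per clone; real configuration nodes carry far more nested structure.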
The Compounding Problem
Now consider that we were calling cloneDeep inside a DFS that traverses 100+ nodes, each with multiple children, which each have multiple children. Every level of the tree triggered deep clones at every level below it.
The result:
Memory exploded because we were holding thousands of redundant object copies simultaneously
The GC (garbage collector) had to work constantly to clean up intermediate clones that were no longer needed
Bootstrap took ~4 minutes because between the DFS traversal and the recursive cloning, the CPU was saturated and the event loop was effectively blocked
The larger our configs grew, the worse it got: this was a scaling time bomb
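The compounding effect can be reproduced on a toy graph. The sketch below assumes each node nests its children directly. It uses the built-in `structuredClone` (Node.js 17+) as a stand-in for Lodash's `cloneDeep`, and a simple chain of nodes as the worst case. The helper names are hypothetical.

```javascript
// Counts the objects and arrays a deep clone of `value` would have to allocate.
function countObjects(value) {
  if (value === null || typeof value !== "object") return 0;
  let total = 1;
  for (const inner of Object.values(value)) total += countObjects(inner);
  return total;
}

// Hypothetical chain: node0 -> node1 -> ... where each node nests its single child.
function buildChain(length) {
  let node = null;
  for (let i = length - 1; i >= 0; i--) {
    node = { name: `node${i}`, children: node ? [node] : [] };
  }
  return node;
}

// The pattern from the pseudocode: deep-clone each child, attach it, recurse into the clone.
function resolveWithDeepClone(node, stats) {
  node.children = node.children.map((child) => {
    stats.allocated += countObjects(child); // everything below `child` gets copied again
    const resolvedChild = structuredClone(child);
    resolveWithDeepClone(resolvedChild, stats);
    return resolvedChild;
  });
}

for (const length of [10, 20, 40]) {
  const stats = { allocated: 0 };
  resolveWithDeepClone(buildChain(length), stats);
  console.log(length, stats.allocated); // 10 90, then 20 380, then 40 1560
}
```

Doubling the number of nodes roughly quadruples the number of allocations, which is the quadratic growth that stayed hidden while the graph was small.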
Fix #1 - Shallow Copy + Targeted Key Mutation
The key insight was: we don't need to clone the entire node. We just need to change a few specific keys on it before attaching it to a parent.
If we know exactly which keys differ per parent-child relationship, we can:
Create a shallow copy of the node (cost proportional to the number of top-level keys, regardless of how deeply the node is nested; it just copies top-level references)
Override only the specific keys that need to differ
// Before: deep clone of the entire object graph under each node
const resolvedChild = cloneDeep(child);

// After: shallow copy (top-level keys only) + targeted overrides
const resolvedChild = {
  ...child, // reuse all nested references
  parentRef: currentNode, // only override what needs to differ
  relationMeta: computedMeta,
};
What Is a Shallow Copy?
A shallow copy creates a new top-level container but reuses references for all nested objects. In JavaScript, the spread operator { ...obj } does exactly this.
Original object:
    { name: "tableA", config: { → points to Config Object X } }
Shallow copy:
    { name: "tableA", config: { → still points to Config Object X } }
    → new object, but inner config is shared, not duplicated
For our use case, this was completely safe. We weren't mutating nested objects; we were only replacing top-level keys. The shared references were fine to keep.
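The sharing behavior, and the one caveat that comes with it, can be checked directly. The node shape and the `attachTo` helper below are hypothetical and exist only for this sketch.

```javascript
// Hypothetical node with a nested config object.
const sharedConfig = { filters: ["active"], access: { roles: ["admin"] } };
const child = { name: "tableB", config: sharedConfig, relationMeta: null };

// Assumed helper: shallow copy plus per-parent overrides of top-level keys.
function attachTo(parentName, node, relationMeta) {
  return { ...node, parentRef: parentName, relationMeta };
}

const underA = attachTo("tableA", child, { kind: "one-to-many" });
const underD = attachTo("tableD", child, { kind: "many-to-many" });

console.log(underA !== underD);               // true: separate top-level objects
console.log(underA.config === underD.config); // true: nested config is shared
console.log(underA.relationMeta.kind);        // one-to-many
console.log(child.relationMeta);              // null: the original is untouched

// The caveat: mutating a shared nested object is visible through every copy.
underA.config.filters.push("archived");
console.log(underD.config.filters.length);    // 2
```

The last two lines are the reason this only works when nested objects are treated as read-only. Calling `Object.freeze` on shared nested objects during development is one way to surface accidental mutation early, keeping in mind that `Object.freeze` is itself shallow.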
The impact was dramatic. What was previously a deep recursive clone for every node in a 100+ node graph became a single object spread per node. The memory footprint of the bootstrap phase dropped significantly, and because we were no longer allocating thousands of temporary deep-clone objects, the GC had far less to clean up.
After this change: ~3.43 GB peak. Bootstrap: ~40 seconds.
Still room to improve. But already a massive win.
A Side Note on GC: Why Fewer Objects Matters
Node.js uses V8's garbage collector, which runs in two modes:
Minor GC (Scavenge): Fast, frequent. Cleans up short-lived objects in the "young generation" heap space. Runs in milliseconds.
Major GC (Mark-and-Sweep): Slow, occasional. Traverses the entire live object graph to find and free unreachable memory. Can pause the event loop for tens of milliseconds.
When you allocate thousands of cloneDeep objects that are only used briefly and then discarded, you're putting pressure on both GC modes. Minor GC has to scavenge constantly. Some objects "survive" long enough to get promoted to the old generation, then Major GC has to walk a much larger graph during mark-and-sweep.
By switching to shallow copies, we drastically reduced allocation volume. The GC now runs less frequently, less intensely, and with a smaller graph to traverse. That means more consistent latency, more headroom for the event loop to do real work, and a noticeably more stable CPU profile.
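A rough way to see the difference in allocation volume is to compare heap usage around each approach. The sketch below uses `process.memoryUsage()` and the built-in `structuredClone` as a stand-in for `cloneDeep`. The `measureHeap` helper and the template object are made up for illustration. The numbers vary from run to run, because the GC can kick in mid-measurement.

```javascript
// Assumed helper: runs `work` and reports the change in used heap around it.
function measureHeap(label, work) {
  const before = process.memoryUsage().heapUsed;
  const result = work(); // keep the result referenced so it is not collected early
  const after = process.memoryUsage().heapUsed;
  const deltaMb = (after - before) / 1024 / 1024;
  console.log(`${label}: ${deltaMb.toFixed(1)} MB heap delta`);
  return { result, deltaMb };
}

// Hypothetical node with a moderately large nested config.
const template = {
  name: "tableA",
  config: { columns: Array.from({ length: 200 }, (_, i) => ({ id: i, type: "text" })) },
};

const deep = measureHeap("deep clones", () =>
  Array.from({ length: 2000 }, () => structuredClone(template))
);
const shallow = measureHeap("shallow copies", () =>
  Array.from({ length: 2000 }, () => ({ ...template }))
);
```

For a closer look at what the collector is doing, running the process with `node --trace-gc` prints a line for each scavenge and mark-sweep cycle.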
Fix #2 - WeakMap Instead of Map for Intermediate Caches
During the DFS traversal, we maintained a cache of intermediate resolution results, essentially memoizing nodes we'd already resolved so we wouldn't process them twice.
We were using a regular Map for this. That seems reasonable at first glance. But there's a subtle problem.
How Map Can Cause Memory Leaks
A regular Map holds strong references to its keys. As long as the Map exists, every key it holds is considered "reachable" by the GC, even if nothing else in the application references that key anymore.
In our case, the intermediate cache was a module-level variable that persisted for the life of the bootstrap process. Nodes that had been resolved and were no longer needed by any active code path were still being kept alive by the Map. The GC couldn't collect them.
// Strong reference: node is kept alive by the Map even if nothing else needs it
const cache = new Map();
cache.set(nodeObject, resolvedResult);
// nodeObject cannot be GC'd as long as cache exists
WeakMap: Hold Without Blocking Collection
A WeakMap holds weak references to its keys. If the only reference to a key object is the WeakMap itself, the GC is free to collect it. The entry disappears from the WeakMap automatically.
// Weak reference: node can be GC'd if nothing else references it
const cache = new WeakMap();
cache.set(nodeObject, resolvedResult);
// nodeObject CAN be GC'd when no other code holds a reference
For an intermediate processing cache, where objects are only needed during traversal and can be released once their subtree is resolved, WeakMap is the semantically correct choice. It lets the GC reclaim intermediate nodes progressively during the bootstrap, rather than holding onto everything until the entire process completes.
Important caveat: WeakMap only accepts objects as keys (not primitives), and you can't iterate over it or check its size. For a lookup cache keyed on node objects, it's a perfect fit. For anything else, evaluate carefully.
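As a sketch of how such a memoization cache might look, the example below keys a `WeakMap` on node objects. The `resolveNode` function, the node shape, and the `resolveCalls` counter are assumptions for illustration. Garbage collection itself is not demonstrated, because it cannot be triggered deterministically from plain JavaScript.

```javascript
// Intermediate cache keyed on node objects; entries do not keep their keys alive.
const resolutionCache = new WeakMap();
let resolveCalls = 0; // only here to show the cache being hit

// Hypothetical resolution step: summarizes a node's direct dependencies.
function resolveNode(node) {
  if (resolutionCache.has(node)) return resolutionCache.get(node);
  resolveCalls += 1;
  const resolved = { name: node.name, dependsOn: node.deps.map((dep) => dep.name) };
  resolutionCache.set(node, resolved);
  return resolved;
}

const regions = { name: "regions", deps: [] };
const customers = { name: "customers", deps: [regions] };

const first = resolveNode(customers);
const second = resolveNode(customers);
console.log(first === second); // true: the second call is served from the cache
console.log(resolveCalls);     // 1

// Primitives are rejected as keys, so a cache keyed on table names would need a Map.
let rejected = false;
try {
  resolutionCache.set("customers", first);
} catch (error) {
  rejected = error instanceof TypeError;
}
console.log(rejected); // true
```

Because the cache is keyed on object identity, two separately created objects with the same contents get separate entries.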
Fix #3 - node-caged: Pointer Compression at the Runtime Level
After the algorithmic fixes, we had one more tool to reach for: node-caged.
This is not a library. It's not a code change. It's Node.js, compiled from source with a single extra build-time configure flag:
--experimental-enable-pointer-compression
You swap your Dockerfile base image, and that's it.
Why Node.js Uses So Much Memory for "Small" Objects
To understand why this helps, you need a quick detour into how modern computers address memory.
Modern servers use a 64-bit architecture. This means memory addresses are 64 bits long: 8 bytes per pointer. Every object in JavaScript (inside V8) carries many internal pointers:
A pointer to its prototype
A pointer to its property map (also called a "shape" or "hidden class")
Pointers to each property that holds an object value
Pointers to internal V8 bookkeeping structures
A small JavaScript object with a few properties might carry 10 to 20 internal pointers. At 8 bytes each, that's 80 to 160 bytes of overhead just for the pointers, before you've stored a single byte of your actual data.
Scale that across hundreds of thousands of live objects in a complex graph, and pointer overhead becomes a meaningful fraction of total heap size.
What Pointer Compression Does
V8's pointer compression (used in Chrome, now optionally available in Node.js via node-caged) works by defining a fixed "memory cage": a contiguous block of heap memory with a known base address.
Instead of storing full 64-bit addresses, V8 stores 32-bit offsets from the base of the cage:
Without compression:
    prototype pointer: 0x00007FFF1234ABCD → 8 bytes
With compression (base = 0x00007FFF00000000):
    prototype pointer: 0x1234ABCD → 4 bytes (just the offset)
Half the bytes per pointer. Since pointers make up a large share of V8 heap usage, the real-world result is typically a 35-50% reduction in heap size, without touching a single line of your application code.
The tradeoff: the memory cage has a maximum size of 4 GB (because a 32-bit offset can only address 4 GB of range). For most services, especially microservices, this is plenty. For monolithic applications or services that genuinely need more than 4 GB of JS heap, this won't work. Know your limits before adopting it.
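The arithmetic behind the compression and the 4 GB limit can be worked through with BigInt. This only illustrates the idea, using the example addresses from the text. V8 performs the real compression inside the engine, and the `pointerBytes` helper is made up for this sketch.

```javascript
// Illustration only: V8 does this in C++ inside the engine, not in JavaScript.
const CAGE_BASE = 0x00007fff00000000n;   // example cage base address
const fullPointer = 0x00007fff1234abcdn; // example full 64-bit address

// Compress: keep only the 32-bit offset from the cage base.
const offset = fullPointer - CAGE_BASE;
console.log("0x" + offset.toString(16).toUpperCase()); // 0x1234ABCD

// Decompress: add the base back to recover the full address.
console.log(CAGE_BASE + offset === fullPointer); // true

// A 32-bit offset can address 2^32 bytes, which is where the 4 GB cage limit comes from.
const cageBytes = 2n ** 32n;
console.log(cageBytes / 1024n ** 3n); // 4n

// Pointer overhead for an object carrying `count` internal pointers.
function pointerBytes(count, bytesPerPointer) {
  return count * bytesPerPointer;
}
console.log(pointerBytes(20, 8), pointerBytes(20, 4)); // 160 80
```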
| Metric | Before | After |
|---|---|---|
| Peak memory (load) | ~6 GB | ~2 GB |
| Bootstrap time | ~4 minutes | ~40 seconds |
| GC pressure | High (major GC frequent) | Low |
| CPU during bootstrap | Saturated | Comfortable |
Each fix contributed a distinct layer of improvement:
Shallow copy eliminated the core algorithmic waste: the biggest single win
WeakMap removed structural memory leaks in intermediate processing
node-caged applied a runtime-level compression that reduced heap overhead across the board
Lessons Learned
Understand your tools before you scale with them. cloneDeep is correct and useful. But it's a full object graph traversal. Using it inside another traversal, on a large graph, at boot time, is a recipe for compounding cost. Know what your utilities do internally, not just what they promise in the docs.
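The cost of an algorithm depends on what runs inside each step, not just on the outer traversal. Our DFS visited each node once, which looks like O(n). But the cloneDeep inside was itself O(n) per node in the worst case, because it walked everything beneath that node. The real complexity was closer to O(n²), and we didn't see it until the graph grew large enough to make it visible.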
Shallow copy is almost always enough if you control mutation. The instinct to reach for cloneDeep often comes from fear of accidental mutation. But if you know which keys you're mutating and when, a shallow copy with targeted overrides is both safer to reason about and orders of magnitude cheaper.
WeakMap is not just an optimization; it's the semantically correct choice for ephemeral caches. If your cache exists to support a computation (not to serve as a long-lived data store), your keys should not prevent GC. WeakMap encodes that intent directly in the data structure.
Infra-level wins exist, and they're worth knowing about. node-caged required zero application code changes. It's just a different binary. Not every optimization lives in your business logic; sometimes it lives in how the runtime itself manages memory. Stay curious about the lower layers of your stack.
Memory problems compound silently. We didn't wake up one day to 6 GB. The graph grew by tens of nodes over months, each addition adding proportionally more clones. The signal was subtle until it wasn't. If your memory grows faster than linearly with configuration complexity, ask yourself why; the answer might surprise you.
