I have an application that writes a large array of big-but-not-huge JSON objects to an output file. The objects are created, written, and then discarded (i.e. I don’t keep them all around). I am using JSONStream in an attempt to make memory usage a non-issue, but it isn’t working.
Here is a simple example that shows the issue I’m having:
let fs = require('fs');
let JSONStream = require('JSONStream');

const testfile = 'testfile.json';
const entcount = 70000;     // number of objects to write
const hacount = 10 * 1024;  // 'ha' repetitions => ~20 kB of data per object

console.log(`opening ${testfile}`);

// Stringify the objects into one big JSON array and pipe the text to the file.
let outputTransform = JSONStream.stringify();
let outputStream = fs.createWriteStream(testfile);
outputStream.on('finish', () => console.log('finished'));
outputTransform.pipe(outputStream);

console.log(`writing ${entcount} entries, data size ${hacount * 2}`);
for (let n = 0; n < entcount; ++n) {
    let thing = {
        index: n,
        data: 'ha'.repeat(hacount)
    };
    outputTransform.write(thing);
}

console.log('finishing');
outputTransform.end();
This example uses JSONStream to stream 70000 objects, each roughly 20 kB, to a file (in the same ballpark as my actual application). However, it runs out of memory around the 45,000th object (full output at the end of this post):
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0092D8BA v8::internal::Heap::PageFlagsAreConsistent+3050
I’ve also noticed that while I’m calling outputTransform.write, the output file’s size stays at 0 (and it’s still 0 after the OOM above). The file doesn’t start growing until outputTransform.end is called. So I’m assuming the output data is being buffered somewhere, and that buffer is what’s eating the heap.
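One way to confirm where the data piles up (a sketch; it assumes a Node version where Writable streams expose writableLength, which the streams from fs.createWriteStream do in Node 9.4+) is to instrument the write loop from the example:

for (let n = 0; n < entcount; ++n) {
    let thing = {
        index: n,
        data: 'ha'.repeat(hacount)
    };
    let ok = outputTransform.write(thing);
    // Periodically report whether write() is signalling backpressure
    // and how many bytes the fs WriteStream is holding in memory.
    if (n % 10000 === 0) {
        console.log(`n=${n}: write() -> ${ok}, fs buffer: ${outputStream.writableLength} bytes`);
    }
}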
The behavior I expected was that outputTransform.write would cause output to be written immediately, or at least buffered in a bounded amount of memory, so that I could write as many objects as I want without worrying about an OOM.
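I know the usual Node pattern is to check write()’s return value and wait for 'drain' before continuing, roughly like the sketch below (this assumes JSONStream’s stringify stream behaves like a standard stream, i.e. returns false from write() under backpressure and emits 'drain'; events.once requires Node 11.13+):

let { once } = require('events');

async function writeAll() {
    for (let n = 0; n < entcount; ++n) {
        let thing = {
            index: n,
            data: 'ha'.repeat(hacount)
        };
        // If write() signals backpressure, pause until the stream
        // drains. Awaiting also yields to the event loop, giving the
        // fs stream a chance to actually flush its buffer to disk.
        if (!outputTransform.write(thing)) {
            await once(outputTransform, 'drain');
        }
    }
    outputTransform.end();
}

writeAll();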
So my question is: is there some way to get JSONStream to not hold all of this data in memory? Increasing the heap size (e.g. with --max-old-space-size) isn’t really an option, because it just raises the ceiling; the whole output would still have to fit in memory.
Full output:
$ node index.js
opening testfile.json
writing 70000 entries, data size 20480
<--- Last few GCs --->
[22256:022DA970]     4589 ms: Mark-sweep 918.8 (924.6) -> 918.3 (921.9) MB, 30.6 / 0.0 ms  (+ 69.8 ms in 33 steps since start of marking, biggest step 9.7 ms, walltime since start of marking 104 ms) (average mu = 0.116, current mu = 0.082) finalize increm
[22256:022DA970]     4593 ms: Scavenge 920.2 (921.9) -> 918.1 (926.1) MB, 2.3 / 0.0 ms  (average mu = 0.116, current mu = 0.082) allocation failure

<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 00DDCB97]
Security context: 0x1c200469 <JSObject>
    1: /* anonymous */ [0D080429] [index.js:~1] [pc=203C4C90](this=0x0d0804c5 <Object map = 1ED0021D>,0x0d0804c5 <Object map = 1ED0021D>,0x0d0804a5 <JSFunction require (sfi = 3025687D)>,0x0d08045d <Module map = 1ED204A5>,0x3024f3c1 <String[#59]: index.js>,0x0d080449 <...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0092D8BA v8::internal::Heap::PageFlagsAreConsistent+3050