I have a highly detailed SVG (around 100 MB) of a world map, and I want to scale it up, divide it into tiles (8K resolution max), and save them as PNGs. The SVG is 1024×1024, and I repeat this process 7 times, doubling the resolution each time.
For this example I'm at level 5, which yields 16 tiles at 8K resolution. Generating the tiles and saving them to disk takes around 6 minutes with the following script:
const i = 5
const maxChunkSize = 8192 // 8K tiles
const image = sharp(imagePath, { density: 72 * 2 ** i, limitInputPixels: false, unlimited: true })
const { width } = await image.metadata()
const chunkCount = Math.ceil(width / maxChunkSize)
const chunkSize = Math.ceil(width / chunkCount)
const writes = []
for (let j = 0; j < chunkCount; j++) {
  for (let k = 0; k < chunkCount; k++) {
    const left = j * chunkSize
    const top = k * chunkSize
    writes.push(
      image
        .clone()
        .extract({ left, top, width: chunkSize, height: chunkSize })
        .png({ effort: 1 })
        .toFile(`${destination}/${j}-${k}.png`)
    )
  }
}
// toFile() returns a promise; without awaiting, errors are silently dropped
await Promise.all(writes)
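One way to keep the single-threaded version but control how many crops are in flight at once is a generic concurrency limiter. This is a sketch of mine; `mapWithConcurrency` is not a sharp or Node API:

```javascript
// Run `fn` over `items` with at most `limit` promises in flight at once.
// Hypothetical helper, sketch only.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length)
  let next = 0
  async function runner() {
    // Each runner pulls the next index; JS is single-threaded between
    // awaits, so `next++` needs no locking.
    while (next < items.length) {
      const idx = next++
      results[idx] = await fn(items[idx], idx)
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, runner))
  return results
}
```

With the tiling loop, `items` would be the `{ left, top, j, k }` tuples and `fn` the `clone().extract().png().toFile()` chain.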
I tried using Node's worker_threads API:
const workers = []
const maxWorkers = 16
let activeWorkers = 0
const i = 5
const maxChunkSize = 8192 // 8K tiles
const image = sharp(imagePath, { density: 72 * 2 ** i, limitInputPixels: false, unlimited: true })
const { width } = await image.metadata()
const chunkCount = Math.ceil(width / maxChunkSize)
const chunkSize = Math.ceil(width / chunkCount)
for (let j = 0; j < chunkCount; j++) {
  for (let k = 0; k < chunkCount; k++) {
    const left = j * chunkSize
    const top = k * chunkSize
    workers.push({
      worker: new Worker('./workers/processChunk.js'),
      data: {
        x: j,
        y: k,
        left,
        top,
        size: chunkSize,
        imagePath,
        scale: 2 ** i,
        destination: `${destination}/chunks`,
      },
    })
  }
}
while (workers.length > 0) {
  if (activeWorkers < maxWorkers) {
    const worker = workers.shift()
    if (worker) {
      activeWorkers++
      console.log('Starting worker', worker.data.x, worker.data.y)
      worker.worker.postMessage(worker.data)
      worker.worker.on('message', (chunk) => {
        console.log('Created chunk', chunk.x, chunk.y, 'at', chunk.path)
        worker.worker.terminate() // free the thread once its tile is written
        activeWorkers--
      })
    }
  } else {
    await new Promise((resolve) => setTimeout(resolve, 100))
  }
}
// Worker script
// Has to load the image with sharp again, because the Sharp object is not serializable
await sharp(imagePath, { density: 72 * scale, limitInputPixels: false, unlimited: true })
  .extract({ left, top, width: size, height: size })
  .png({ effort: 1 })
  .toFile(`${destination}/${x}-${y}.png`)
This cuts the time down to around 4 minutes. I expected a larger gain; the fact that each worker has to load and upscale the SVG again slows things down a lot. At scale levels 6 and 7 this takes hours.
With both examples my CPU sits at 20–30% usage. I tried os.setPriority, but it only helped a little. Is there a way to give the script higher priority?
I can't load the image once and pass it to the workers, because a Sharp object isn't serializable. I can't upscale it first and send the result either, because the decoded image would exceed sharp's pixel limit.
What can I do to optimize the process?