I was looking at different ways to compress and upload a file to S3, and I recently came across the CompressionStream API. Uploading a file in one go with it seems straightforward:
const compressedStream = file.stream().pipeThrough(new CompressionStream('gzip'));

const response = await fetch(presignedUrl, {
  method: "PUT",
  body: compressedStream,
  // fetch requires duplex: "half" when the request body is a stream
  duplex: "half",
  headers: {
    "Content-Type": contentType,
    "Content-Encoding": "gzip"
  },
});
Since I'm targeting large files (1-3 GB), I was going for a multipart upload. But CompressionStream, being a Streams API, can pipe data through as it arrives, as far as I understand (hopefully correctly).
So I wanted to combine the advantages of both: since S3 doesn't support directly streamed uploads, I want to upload the compressed chunks as parts of a multipart upload instead.
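For reference, I create the multipart upload up front, roughly along these lines (the client setup is simplified and the region is just a placeholder; bucketName, key and contentType are the same values used above):

import { S3Client, CreateMultipartUploadCommand } from "@aws-sdk/client-s3";

// Placeholder client setup; the region is an assumption.
const s3Client = new S3Client({ region: "us-east-1" });

// Start the multipart upload; uploadId is what the later
// UploadPartCommand calls refer to.
const { UploadId: uploadId } = await s3Client.send(
  new CreateMultipartUploadCommand({
    Bucket: bucketName,
    Key: key,
    ContentType: contentType,
    ContentEncoding: "gzip",
  }),
);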
However, I haven't been able to figure out how to do this. It might look something like this:
// Here I've tried to use a TransformStream, but a better approach is very welcome.
// This is pseudocode only.
// file = event.target.files[0], the user-selected file:
file
  .stream()
  .pipeThrough(new CompressionStream('gzip'))
  .pipeThrough(new TransformStream({
    transform(chunk, controller) {
      uploadPromises.push(
        s3Client.send(
          new UploadPartCommand({
            Bucket: bucketName,
            Key: key,
            UploadId: uploadId,
            Body: some_chunk_of_5mb_size, // THIS IS THE CONFUSION
            PartNumber: i + 1,
          }),
        ),
      );
    },
  }));
- What I don't understand is how to get a chunk of size >= 5 MB, since that's S3's minimum part size for a multipart upload.
- What is the data type of this chunk even? In the TransformStream docs it's compared to all sorts of data types. Can I check the size of each chunk and concatenate chunks until they reach 5 MB for the part upload (roughly what I sketch below)?
- If a chunk has to be further converted, for example into a Buffer, does uploading it this way affect the integrity of the uploaded file?
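For what it's worth, this is roughly the kind of buffering I have in mind: accumulate the Uint8Array chunks coming out of CompressionStream until they add up to at least 5 MB, upload that as one part, and flush whatever is left as the final part. It's only a sketch; PART_SIZE, concatChunks, uploadPart and the WritableStream sink (instead of the TransformStream above) are my own guesses, and I don't know if this is the right way to do it:

const PART_SIZE = 5 * 1024 * 1024; // my assumed minimum part size
const uploadPromises = [];
let pendingChunks = [];
let pendingBytes = 0;
let partNumber = 1;

// Merge an array of Uint8Arrays into one Uint8Array
function concatChunks(chunks, totalBytes) {
  const merged = new Uint8Array(totalBytes);
  let offset = 0;
  for (const c of chunks) {
    merged.set(c, offset);
    offset += c.byteLength;
  }
  return merged;
}

function uploadPart(body) {
  uploadPromises.push(
    s3Client.send(
      new UploadPartCommand({
        Bucket: bucketName,
        Key: key,
        UploadId: uploadId,
        Body: body, // a Uint8Array of roughly 5 MB (or the smaller final part)
        PartNumber: partNumber++,
      }),
    ),
  );
}

await file
  .stream()
  .pipeThrough(new CompressionStream('gzip'))
  .pipeTo(new WritableStream({
    write(chunk) {
      // chunk is a Uint8Array produced by CompressionStream
      pendingChunks.push(chunk);
      pendingBytes += chunk.byteLength;
      if (pendingBytes >= PART_SIZE) {
        uploadPart(concatChunks(pendingChunks, pendingBytes));
        pendingChunks = [];
        pendingBytes = 0;
      }
    },
    close() {
      // Whatever is left becomes the last (possibly smaller) part
      if (pendingBytes > 0) {
        uploadPart(concatChunks(pendingChunks, pendingBytes));
      }
    },
  }));

await Promise.all(uploadPromises);

I assume I'd still need to collect the ETag from each UploadPartCommand response and call CompleteMultipartUploadCommand at the end, but I've left that out of the sketch.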