
What is the most efficient way to copy selected image files from a production EC2 instance (Ubuntu 20.04) to an S3 bucket, checking first that each file exists on the instance?

This is a one-time operation. The S3 bucket will store user-uploaded images that are resized on the fly, so I no longer need to pre-process each image into multiple files of different sizes. I only need to copy the original files to S3. The folder will be deleted from EC2 later.

I have a table with the original file names. For each name, I need to check whether the file exists on EC2 and, if so, copy it to the S3 bucket. The image folder is about 20 GB in total, and there are around 40k file names in the table.

I thought about downloading the whole image folder (~20 GB) to my local machine over SFTP or SSH and running a function in my Laravel 9 API on a local server to select the files. After that, I would need to upload the processed folder to S3.

Would this be the most cost-effective solution that does not overload the production server? What is the best way to upload the folder to S3? Its final size should be around 10 GB, so I assume I could not upload it through the AWS console. Maybe run a function to upload it in batches?

The S3 bucket is not in production yet, and the API can connect to it in dev mode.

Edit: I also realized that downloading files through SCP/SFTP is slow (300-400 KB/s in WinSCP). Is there a faster way?

  • Have you considered using the AWS CLI? docs.aws.amazon.com/cli/latest/reference/s3/cp.html Commented Oct 20, 2023 at 19:54
  • I did. I have it installed on both the Linux server and my local machine. I was able to transfer files between EC2 and S3, but I haven't yet figured out how to perform the operation I need. Commented Oct 20, 2023 at 21:36
  • I ended up using AWS CLI to upload the whole image directory to a temp S3 bucket (upload speed very fast BTW). Now my API function will only move files between S3 buckets. Commented Oct 21, 2023 at 16:15

1 Answer


One possible option is to use the AWS CLI. Its copy command has a lot of options, and IMHO the speed should be close to the maximum possible because the source, the tool, and the target all come from the same vendor.
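As a rough sketch of how the "check that the file exists, then copy" step could be done directly on the EC2 instance: the functions below read one file name per line from a list exported from the table, keep only the names that exist on disk, and hard-link them into a staging directory that can then be uploaded in a single pass. The function names, the list file, the paths, and the bucket name are all hypothetical, and the sketch assumes the names in the table are paths relative to the image folder.

```shell
# Print every name from NAMES_FILE that exists as a regular file in IMAGE_DIR.
# Usage: filter_existing NAMES_FILE IMAGE_DIR
filter_existing() (
  names_file=$1; image_dir=$2
  while IFS= read -r name || [ -n "$name" ]; do
    if [ -n "$name" ] && [ -f "$image_dir/$name" ]; then
      printf '%s\n' "$name"
    fi
  done < "$names_file"
)

# Hard-link each existing file into STAGE_DIR (no extra disk space when both
# directories are on the same filesystem); fall back to a copy otherwise.
# Usage: stage_existing NAMES_FILE IMAGE_DIR STAGE_DIR
stage_existing() (
  names_file=$1; image_dir=$2; stage_dir=$3
  mkdir -p "$stage_dir"
  filter_existing "$names_file" "$image_dir" | while IFS= read -r name; do
    mkdir -p "$(dirname "$stage_dir/$name")"
    ln -f "$image_dir/$name" "$stage_dir/$name" 2>/dev/null ||
      cp -p "$image_dir/$name" "$stage_dir/$name"
  done
)

# Example run (hypothetical paths and bucket name):
#   stage_existing names.txt /var/www/images /tmp/originals
#   aws s3 sync /tmp/originals s3://my-originals-bucket/
```

Staging first means the upload is one `aws s3 sync` call rather than one CLI invocation per file, which should matter with around 40k names.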

BTW, you can use the same AWS CLI to copy files between buckets:

aws s3 cp s3://.... s3://.... 
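If only the files named in the table should end up in the final bucket, the bucket-to-bucket copy can be driven by the same list. This is a minimal sketch, not a tuned solution: `copy_listed`, the list file, and the bucket names are hypothetical, and it assumes each line of the list is an object key in the source bucket. Keys that cannot be copied (for example, because they do not exist) are reported on stderr.

```shell
# Copy each key listed in NAMES_FILE from SRC_BUCKET to DST_BUCKET.
# Usage: copy_listed NAMES_FILE SRC_BUCKET DST_BUCKET
copy_listed() (
  names_file=$1; src=$2; dst=$3
  while IFS= read -r key || [ -n "$key" ]; do
    [ -n "$key" ] || continue
    # </dev/null keeps the CLI from consuming the rest of the list on stdin
    if ! aws s3 cp "s3://$src/$key" "s3://$dst/$key" --only-show-errors </dev/null; then
      printf 'failed: %s\n' "$key" >&2
    fi
  done < "$names_file"
)

# Example run (hypothetical bucket names):
#   copy_listed names.txt my-temp-bucket my-originals-bucket 2> failed.log
```

Each iteration starts a new CLI process, so for tens of thousands of keys this will be noticeably slower than a single recursive copy; it trades speed for an explicit per-file check.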
