Create A Static Copy of Your Website

Let's create a static version of our website using the magical wget.

wget -k -K -E -r -l 10 -p -N -F --restrict-file-names=windows -nH https://korays.com/

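-k : convert the links in the downloaded pages so they work for local viewing (links to downloaded files become relative)

-K : keep the original version of each converted file, saved with a .orig suffix, next to the copy rewritten by wget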

-E : rename html files to .html (if they don’t already have an htm(l) extension)

-r : recursive… of course we want to make a recursive copy

-l 10 : the maximum recursion depth. A really big website may need a higher number, but 10 levels should be enough.

-p : download all necessary files for each page (css, js, images)

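-N : turn on time-stamping, so that on a repeated run wget only re-downloads files that are newer on the server than the local copy.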

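-F : when input is read from a file, force it to be treated as an HTML file. This only matters together with -i (a list of URLs read from a file), so it has no effect on the command above.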

-nH : by default, wget puts files in a directory named after the site’s hostname. This option disables the creation of those hostname directories and puts everything in the current directory.

--restrict-file-names=windows : may be useful if you want to copy the files to a Windows PC. (source: http://blog.jphoude.qc.ca/2007/10/16/creating-static-copy-of-a-dynamic-website/)
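For readability, the same download can be written with wget's long option names. The sketch below wraps it in a hypothetical helper called mirror_site. The DRY_RUN switch is an assumption of this example, not a wget feature, and -F is left out because it only applies when URLs are read from a file.

```shell
# mirror_site is a hypothetical helper name; DRY_RUN is an assumption of this sketch.
mirror_site() {
  url=$1
  # Long-form equivalents of: -k -K -E -r -l 10 -p -N --restrict-file-names=windows -nH
  set -- wget \
    --convert-links \
    --backup-converted \
    --adjust-extension \
    --recursive \
    --level=10 \
    --page-requisites \
    --timestamping \
    --restrict-file-names=windows \
    --no-host-directories \
    "$url"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"    # print the command instead of running it
  else
    "$@"         # run wget with the arguments built above
  fi
}

# Usage:
#   DRY_RUN=1 mirror_site https://korays.com/   # show the command only
#   mirror_site https://korays.com/             # perform the download
```

Printing the command first is a cheap way to double-check the options before starting a long recursive download.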

Once that is done, we will use the AWS CLI's sync command to copy all the static content to our S3 bucket.

Let's make a new bucket (reminder: bucket names must be globally unique)

aws s3 mb s3://newbucketweb --region us-west-1
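Because the name has to be accepted by S3 before the bucket can be created, it can help to check it locally first. The following is a minimal sketch that covers only the basic naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit; no adjacent dots). It does not check whether the name is already taken, and is_valid_bucket_name is a hypothetical helper name.

```shell
# is_valid_bucket_name is a hypothetical helper; it checks only the basic S3 naming rules.
# The body runs in a subshell so that LC_ALL=C does not leak into the calling shell.
is_valid_bucket_name() (
  LC_ALL=C          # make a-z match ASCII lowercase letters only
  name=$1
  # Length must be between 3 and 63 characters.
  [ "${#name}" -ge 3 ] || exit 1
  [ "${#name}" -le 63 ] || exit 1
  case $name in
    *[!a-z0-9.-]*) exit 1 ;;            # only lowercase letters, digits, dots, hyphens
    [!a-z0-9]*|*[!a-z0-9]) exit 1 ;;    # must start and end with a letter or digit
    *..*) exit 1 ;;                     # no two adjacent dots
  esac
  exit 0
)

# Usage:
#   if is_valid_bucket_name newbucketweb; then
#     aws s3 mb s3://newbucketweb --region us-west-1
#   fi
```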

Let's sync up all content to the new bucket

aws s3 sync . s3://newbucketweb
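Since wget was run with -K, the current directory may also contain .orig backups of the converted pages, and a plain sync would upload those too. The sketch below, which uses hypothetical helper names, lists those backups and skips them during the sync. The --exclude and --dryrun options belong to aws s3 sync; everything else is an assumption of this example.

```shell
# list_backup_files and sync_site are hypothetical helper names.

# Print the *.orig backups that wget -K left in a directory (default: current directory).
list_backup_files() {
  find "${1:-.}" -type f -name '*.orig'
}

# Sync the current directory to the given bucket, skipping the *.orig backups.
# Any extra arguments (for example --dryrun) are passed through to aws s3 sync.
sync_site() {
  bucket=$1
  shift
  aws s3 sync . "s3://$bucket" --exclude '*.orig' "$@"
}

# Usage:
#   list_backup_files .               # see which backups exist
#   sync_site newbucketweb --dryrun   # preview the uploads without performing them
#   sync_site newbucketweb            # upload for real
```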

The sync might take some time, depending on the number and size of the objects we are uploading to our bucket.

Now the tricky part is to enable public access to the new bucket we created.

aws s3api put-public-access-block \
    --bucket newbucketweb \
    --public-access-block-configuration "BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false"

If you go to AWS console, you should see the bucket settings as below:

[Image: bucket settings]

Now we need to allow public read access to the objects in the bucket using this JSON bucket policy.

aws s3api put-bucket-policy --bucket newbucketweb --policy "{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"PublicReadGetObject\", \"Effect\": \"Allow\", \"Principal\": \"*\", \"Action\": \"s3:GetObject\", \"Resource\": \"arn:aws:s3:::newbucketweb/*\" } ] }"
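Escaping every quote in an inline JSON string is easy to get wrong, so one alternative is to write the policy to a file and pass it with the AWS CLI's file:// prefix. The sketch below generates the same policy for any bucket name; bucket_policy_json and policy.json are hypothetical names chosen for this example.

```shell
# bucket_policy_json is a hypothetical helper name.
# It prints a public-read policy for the bucket given as the first argument.
bucket_policy_json() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}

# Usage:
#   bucket_policy_json newbucketweb > policy.json
#   aws s3api put-bucket-policy --bucket newbucketweb --policy file://policy.json
```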

This sets the rule above as the bucket policy, replacing any policy the bucket already had. You can check it in the AWS console under the bucket policy.

Let's enable static website hosting, with index.html as the index page

aws s3 website "s3://newbucketweb" --index-document index.html --error-document index.html

Following the S3 conventions, the publicly accessible (as per our configuration) website endpoint of the bucket looks like this:

http://<bucketname>.s3-website-<region>.amazonaws.com
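As a small illustration, the endpoint can be assembled from the bucket name and region. This is a sketch with a hypothetical helper name, website_endpoint. The optional separator argument is an assumption added here because some regions use a dot instead of a dash between s3-website and the region name.

```shell
# website_endpoint is a hypothetical helper name.
# Usage: website_endpoint <bucket> <region> [separator]
# The separator defaults to "-" (the pattern shown above); pass "." for regions
# whose website endpoint is written as s3-website.<region>.
website_endpoint() {
  sep=${3:-"-"}
  printf 'http://%s.s3-website%s%s.amazonaws.com\n' "$1" "$sep" "$2"
}

# Usage:
#   website_endpoint newbucketweb us-west-1
#   curl -I "$(website_endpoint newbucketweb us-west-1)"   # check that the site responds
```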

Congratulations! You now have a static copy of your site running from an S3 bucket. You can use this static copy for many purposes, such as a static DR (disaster recovery) site.