6 Creating an AMI for AWSAC from scratch

This chapter shows how to create an AWSAC AMI: an Amazon Machine Image (AMI) for ATLAS Computing, as it is used by the job system described in the previous chapter. The whole process is described in detail and from scratch, so that it can be reproduced easily.

The AMI will contain a specialized Linux system that fulfils the following requirements:

  • it runs properly on EC2
  • an ATLAS Software Release runs properly on it

Note

At this point, we take the opportunity to refer to http://cernvm.cern.ch - a project with the aim to «provide a baseline Virtual Software Appliance for use by LHC experiments at CERN». Perhaps this will be important for the AWSAC project in the future.

6.1 Prerequisites

First, the operating system to run on EC2 has to be chosen, as well as the way to create the AMI. The decisions I made are briefly discussed below.

6.1.1 Choosing Scientific Linux 4

In general, EC2 is compatible with any Linux distribution. But to keep things easy, it is recommended to use only recent distributions with recent software versions; this raises the fewest problems. One reason is that most of the tools needed to control EC2 require recent software versions. There are more reasons, and you will see the emerging problems while reading this documentation. It is important to know that EC2 is guaranteed to run properly with recent Fedora systems. ATLAS Software, on the other hand, is only guaranteed to run properly on Scientific Linux 4. With such old software, one has to expect problems with EC2. It seemed that nobody else had tried to get it running on EC2, so I could not expect any support.

Hence I first tried to get ATLAS Software running properly on Scientific Linux 5 and Fedora 7, 8 and 9. These attempts showed that the effort to obtain a running ATLAS Software Release on these distributions is very large. I abandoned these approaches and decided to get Scientific Linux 4 running on EC2, since «ATLAS Software runs properly only on Scientific Linux 4» seemed to be the strongest constraint.

6.1.2 Two different ways to create an AMI

Amazon EC2 AMI Tools provide two commands to create an Amazon Machine Image.

The first command, ec2-bundle-image, creates an AMI from a loopback image file: from another running system on your home PC, you push the system you want to run on EC2 into an image file, step by step.
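To give an impression of this first way (which we will not use), here is a rough sketch of the loopback approach; the file name, size and mount point are only illustrative:

$ dd if=/dev/zero of=SL47.img bs=1M count=10240     (create an empty 10 GB image file)
$ mke2fs -F -j SL47.img                             (format it with ext3)
$ mount -o loop SL47.img /mnt/img                   (attach it via a loopback device)
(... install/copy the system into /mnt/img step by step ...)
$ umount /mnt/img
$ ec2-bundle-image -i SL47.img -k $EC2_PRIVATE_KEY -c $EC2_CERT -u $EC2_UID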

The second command, ec2-bundle-vol, creates an AMI from a running system. Using this, you can e.g. set up the system you want to run on EC2 as a virtual machine on your home PC. Then you edit and configure this system “from inside” (the conventional way). When you think it is ready to run on EC2, you bundle it into an AMI with ec2-bundle-vol at runtime. For this, the system itself needs the AMI Tools installed.

The second possibility is more convenient for our purposes. So I decided to set up a virtual machine (VM) with Scientific Linux using VMware Player. In the following I assume that VMware Player is installed on your system.

6.2 Set up a Scientific Linux VM for an AMI

In this part I will show you the way to a new Scientific Linux (SL) VM that is ready to become an Amazon Machine Image. The process is described using VMware Player for virtualization and 32-bit SL 4.7 as the Linux distribution.

6.2.1 Preparation

At first create a directory to hold all needed files, including the virtual disk for the Scientific Linux system. Make sure there is enough free disk space (at least three times the space required by the Scientific Linux system we plan to set up; 10 GB should be enough). In the following I assume that this new directory is /SL47ami.
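For example:

$ mkdir /SL47ami
$ cd /SL47ami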

There are essentially two options to get the SL packages you need during the SL installation. If you download the CD iso files, you can install SL from these local files; but then every iso file needs to be attached to the VM as an IDE drive. I decided to install the new operating system directly from the online repository, which is more convenient than the many-iso-files way. Of course, we still need one small iso file. Download the SL installer image (~6 MB) to the new directory:

$ wget http://ftp.scientificlinux.org/linux/scientific/47/i386/images/SL/boot.iso

A VMware virtual machine’s hard disk drive reads from and writes to a special file in the host filesystem. This file must be in vmdk format. The open source machine emulator and virtualizer QEMU ships qemu-img, which can create files of a specific size and format them as vmdk. After downloading and installing QEMU, you can use qemu-img to create a new vmdk file. For our purposes a virtual hard disk drive with 10 GB of storage is big enough:

$ qemu-img create -f vmdk /SL47ami/SL47ami.vmdk 10G

Note

The size of the so-called instance storage on a running EC2 instance is not affected by this choice later on. At this point it is sufficient to ensure that there is enough space for Scientific Linux itself.

If you want to start a virtual machine with VMware Player, you have to hand a configuration file - a so-called vmx file - to the player. Create a new configuration file, e.g. SL47ami.vmx. In this file - in essence - the virtual hardware must be configured. For our purposes the most important hardware the VM needs is a cdrom-image drive with the SL installer “inside”, a hard disk drive (corresponding to the created vmdk file) and an ethernet adapter. So something like the following lines should be put into the new configuration file (this worked for me):

config.version = "8"
virtualHW.version = "4"

displayName = "Scientific Linux 4.7 - minimal for AMI"
guestOS = "rhel4"

memsize = "1024"

floppy0.present = "FALSE"

ide0:0.present = "TRUE"
ide0:0.filename = "/mnt/scratch/gehrcke/virtual-disks/SL47ami.vmdk"

ide1:0.present = "TRUE"
ide1:0.deviceType = "cdrom-image"
ide1:0.startConnected = "TRUE"
ide1:0.fileName = "/home/iwsatlas1/gehrcke/virtual_SL47ami/boot.iso"

ethernet0.present = "TRUE"
ethernet0.connectionType = "nat"
ethernet0.addressType = "generated"
ethernet0.generatedAddress = "00:0c:29:fa:b6:cb"
ethernet0.generatedAddressOffset = "0"

A large part of this should be self-explanatory. guestOS describes the operating system class you want to run on your VM; in this case this is Red Hat Enterprise Linux 4. You should adjust the memsize (in MB) of your VM to your real hardware and to the needs of the applications running on it. If you want to run several VMs at the same time, you should vary the generatedAddressOffset; this avoids identical MAC addresses.
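If you e.g. create a second vmx file for a second VM, an illustrative line avoiding a MAC clash would be:

ethernet0.generatedAddressOffset = "10"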

Now the VM is ready to start up:

$ vmplayer SL47ami.vmx

Note

The VM boots up like a normal computer. Since the hard disk drive is still empty, the virtual BIOS looks for a bootable disk in the cdrom drive. Hence the SL installer boots up!

6.2.2 Installing Scientific Linux

The SL installer needs to know the installation method; I chose linux text installation. The installation menu is self-explanatory. Nevertheless I will mention every step, because some settings are really important or a bit tricky.

At first choose your language.

In the next step you can decide whether you want to install from an http/ftp online repository or e.g. from local iso files. As described above, I chose the online repository. http is convenient, so choose http.

Now you have to set up TCP/IP. Try choosing DHCP configuration. For me this sometimes resulted in endless waiting. The reason for this is not clear to me; I guess there are problems with the VMware DHCP server managing the “virtual subnet” for virtual machines on the host system. If you encounter the same problem, there is an easy workaround: check the dhcpd.conf file corresponding to the VMnet that should provide the internet connection. For me this was /etc/vmware/vmnet8/dhcpd/dhcpd.conf. There you can find all the information you need to configure TCP/IP manually. This was fast for me, even when DHCP configuration did not work properly.
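For illustration only (the actual values depend on your host), the relevant block in such a dhcpd.conf looks roughly like this; from it you can derive a free IP address (from the range), the netmask, the gateway and the name server for the manual TCP/IP setup:

subnet 172.16.30.0 netmask 255.255.255.0 {
    range 172.16.30.128 172.16.30.254;
    option broadcast-address 172.16.30.255;
    option domain-name-servers 172.16.30.2;
    option routers 172.16.30.2;
}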

In the http setup menu you have to submit the SL location on an http server omitting “http://”. Enter the following:

website name: ftp.scientificlinux.org
Scientific Linux directory: linux/scientific/47/i386

Since you want to configure the installation yourself, use the Custom installation type.

You also want to partition the hard disk yourself: choose Disk Druid. And yes, you know that all data will be lost!

Now Disk Druid wants to know what to do. The configuration I chose is described schematically in this summary:

Add Partition:
    mount point: /
    ext 3
    9500 MB
    force primary

Add Partition:
    swap
    fill all available space
    force primary

In the following steps I specified to use Grub, not to pass any boot options to the kernel and not to use a Grub password. The Boot Loader Configuration was OK. I chose to install the boot loader to the MBR and acknowledged the IP and hostname configuration.

In terms of security I chose not to use a firewall and to disable SELinux. I think these things are not necessary for our extremely specialized EC2 instances and may cause problems; the EC2 Network Security regulations should provide a sufficient amount of security. This was a fast decision and maybe I am wrong.

Now decide on additional Languages, the Time Zone and the Root Password.

In the following Package Group Selection you have to select the components your new system should consist of. Select what you need. My minimal configuration is the following:

Main Tree:
    select:
        YUM
        APT
        Development Tools:
            select:
                all gcc-options
            deselect:
                Emacs
        Administration Tools:
            select:
                all
        System Tools:
            select:
                all
    deselect:
        everything else (really! we don't need any graphics etc.)

Note

Selecting all gcc options ensures that compiling ATLAS Software Release applications (including KitValidation’s MooEvent) works correctly.

Begin installation and reboot when it is done.

Done! Your new Scientific Linux VM is up and running. You can now log in as root.

As a first action do a YUM update to get the latest security updates:

$ yum update

6.2.3 Configure the system for EC2

The next big objective is to bundle and upload an AMI of our new system. For this you have to clear some hurdles. First, the Amazon EC2 AMI Tools have to be installed on the new system. They provide the command ec2-bundle-vol, which we want to use to bundle the running system into an AMI, and they provide the possibility to upload the AMI to Amazon’s Simple Storage S3 using ec2-upload-bundle. Getting the AMI Tools running on an old system like Scientific Linux 4 requires some newer software. The Amazon EC2 AMI Tools do not provide the possibility to register an AMI on EC2, so you have to install and configure the Amazon EC2 Command-Line Tools (also called API Tools), too. But this is not everything you have to configure: the system’s hardware detection tool kudzu needs some modification as well, so that the system is able to boot up correctly on EC2.

Note

If you download the AMI Tools as an RPM and try to install it with rpm -i ec2-ami-tools.noarch.rpm, you will notice that some requirements/dependencies are not met by Scientific Linux 4. I needed a newer version of tar (greater than or equal to 1.15) and a newer version of Ruby (greater than or equal to 1.8.2). After some investigation I decided to install both of them from source and then to install the RPM without regarding dependencies (see below). Going around the package manager (instead of using e.g. SRPMs) is maybe not the cleanest way, but it is easy and it works well, as you can see in the following parts.

Install a newer tar:

Download the latest source tarball, extract it and cd to the source directory. This will look like

$ wget http://ftp.gnu.org/gnu/tar/tar-1.20.tar.bz2
$ tar xjf tar-1.20.tar.bz2
$ cd tar-1.20

We want to brutally overwrite the original tar. The executable’s path is /bin/tar. So run configure with --prefix=/, compile with make and then install the new tar:

$ ./configure --prefix=/
$ make
$ make install

/bin/tar should now be the newer version. Test it:

$ which tar
$ tar --version

Install Ruby:

Download the latest source tarball, extract it and cd to the source directory. This will look like

$ wget ftp://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.7.tar.bz2
$ tar xjf ruby-1.8.7.tar.bz2
$ cd ruby-1.8.7

We want to place the executable in /usr/bin. So start configure with --prefix=/usr, compile with make and then install Ruby:

$ ./configure --prefix=/usr
$ make
$ make install

/usr/bin/ruby should now exist. Check it out and test your new Ruby installation:

$ which ruby
$ ruby --version

Install the AMI Tools:

Download and install the RPM without regarding dependencies:

$ wget http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.noarch.rpm
$ rpm -i --nodeps ec2-ami-tools.noarch.rpm

Test it! Type ec2- and then press TAB to see the new commands available. Try e.g.

$ ec2-upload-bundle --help

The following error is expected:

/usr/lib/site_ruby/aes/amiutil/uploadbundle.rb:1:in `require':
no such file to load -- aes/amiutil/crypto (LoadError)
from /usr/lib/site_ruby/aes/amiutil/uploadbundle.rb:1

See also

Amazon’s Developer Guide - Bundling an AMI: «If you receive a load error when running one of the AMI utilities, Ruby might not have found the path. To fix this, add /usr/lib/site_ruby to Ruby’s library path, which is set in the RUBYLIB environment variable.»

So, before using the AMI Tools, we have to add /usr/lib/site_ruby to $RUBYLIB. This should work:

$ export RUBYLIB=$RUBYLIB:/usr/lib/site_ruby
$ ec2-upload-bundle --help

We will set the environment variable $RUBYLIB automatically later on. The AMI Tools are now installed properly.

Note

The API Tools need Java installed. A version of at least 1.5 is required. We will use Yum to install it.

Install Java:

Use Yum to install Java:

$ yum install java

Acknowledge the installation of java-1.5.0-sun-compat (in my case) and jdk.

Install and configure the API Tools:

Download the API Tools as zip file and extract it:

$ cd /root
$ wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
$ unzip ec2-api-tools.zip

The tools are extracted to a new directory, e.g. /root/ec2-api-tools-1.3-24159 (with the API Tools version in the name). There is no “installation” needed, but some further configuration.

Most of the commands provided by the tools need to know the path to “the user’s PEM encoded RSA public key certificate file” and the path to “the user’s PEM encoded RSA private key file”. You can download these files from the AWS web interface after successfully signing up for EC2. The AMI you will bundle will be your own AMI, you will be the only one using instances of it, and you will almost surely need these key files in running instances more often. So it is no problem - and even convenient - to store these files in the virtual system and to bundle them into the AMI. scp them from wherever they are (here called user@host:/path/to/.ec2) to /root/.ec2 on the SL virtual machine:

$ mkdir /root/.ec2
$ scp user@host:/path/to/.ec2/* /root/.ec2
    user@hosts's password:
    cert-***********************.pem     100%  916     0.9KB/s   00:00
    pk-***********************.pem       100%  926     0.9KB/s   00:00

Note

Why choosing /root as the directory for all these files? EC2 instances are created and terminated rapidly. If you make a mistake on an instance, you can just terminate it and try again. For this reason, working as root is not as dangerous as on a “normal” system. At first I worked with different users on EC2 instances, but I realized that there was no need to. In the end, the Linux systems I used on EC2 knew only one user: root. Everything special that would normally be stored in a home directory, I store in /root.

The API Tools need some environment variables to be set, as explained in Amazon’s EC2 Getting Started Guide:

See also

Amazon’s Getting Started Guide - Prerequisites: «The command line tools depend on an environment variable (JAVA_HOME) to locate the Java runtime. This environment variable should be set to the full path of the directory that contains a sub-directory named bin which in turn contains the java (on Linux/Unix) or the java.exe (on Windows) executable. You might want to simplify things by adding this directory to your path before other versions of Java.»

See also

Amazon’s Getting Started Guide - Setting up the Tools: «The command line tools depend on an environment variable (EC2_HOME) to locate supporting libraries. You’ll need to set this environment variable before you can use the tools. This should be set to the path of the directory into which the command line tools were unzipped. This directory is named ec2-api-tools-A.B-rrrr (A, B and r are version/release numbers), and contains sub-directories named bin and lib. [...] The environment variable EC2_PRIVATE_KEY should reference your private key file, and EC2_CERT should reference your X509 certificate.»

The best thing is to create a new file /root/AWS_SET_ENV_VARS.sh with the following content (customize it with your file and directory names):

export JAVA_HOME=/usr
export EC2_HOME=/root/ec2-api-tools-*.*-*****
export EC2_PRIVATE_KEY=/root/.ec2/pk-*************************.pem
export EC2_CERT=/root/.ec2/cert-*************************.pem
export PATH=$PATH:$EC2_HOME/bin

Then just source the file and the needed variables are set:

$ source /root/AWS_SET_ENV_VARS.sh

Cleanup:

Keep the system clean and slim, because you have to pay for the amount of data you store on S3. Remove the archives, and the directories they were extracted to, that are no longer needed (tar archive and directory, Ruby archive and directory, API Tools archive).
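Assuming everything was downloaded and extracted in /root, as in the steps above, the cleanup boils down to something like:

$ cd /root
$ rm -rf tar-1.20 tar-1.20.tar.bz2
$ rm -rf ruby-1.8.7 ruby-1.8.7.tar.bz2
$ rm ec2-api-tools.zip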

Note

In one of the early tests I bundled this system state into an AMI and tried to run it on EC2. But I could not connect to the running EC2 instance. Using the Amazon EC2 Query API call GetConsoleOutput, I could read the boot log. The graphical hardware discovery utility kudzu recognized a hardware change when booting my system on EC2, but it did not reconfigure automatically. So eth0 could not be brought up, and for this reason there was no way to connect to the EC2 instance. The two essential lines were:

Hardware configuration timed out.Run '/usr/sbin/kudzu' from the command line to re-detect.

and:

Bringing up interface eth0: Device eth0 has different MAC address than expected, ignoring.
[FAILED]

VMware emulates different hardware than Xen, the virtualization software running on EC2. So the change of the MAC address makes sense. kudzu is able to reconfigure this, but it needs someone to press a key within 30 seconds. Nobody can interact with a graphical boot application on a remote machine. This is adverse. After reading here and there and some testing, I could solve this problem, as you can read in the following part.

Modify kudzu and hardware configuration:

  • Remove the complete HWADDR-line from /etc/sysconfig/network-scripts/ifcfg-eth0:

    $ cd /etc/sysconfig/network-scripts
    $ mv ifcfg-eth0 backup_ifcfg-eth0
    $ cat backup_ifcfg-eth0 | grep --invert-match HWADDR > ifcfg-eth0
    
  • Remove the complete class: NETWORK block from /etc/sysconfig/hwconf with e.g. vi. I deleted the following lines:

    -
    class: NETWORK
    bus: PCI
    detached: 0
    device: eth0
    driver: pcnet32
    desc: "Advanced Micro Devices [AMD] 79c970 [PCnet32 LANCE]"
    network.hwaddr: 00:0C:29:FA:B6:CB
    vendorId: 1022
    deviceId: 2000
    subVendorId: 1022
    subDeviceId: 2000
    pciType: 1
    pcidom:    0
    pcibus:  0
    pcidev: 10
    pcifn:  0

Note

What will happen on the next reboot (e.g. on EC2)? kudzu will recognize network hardware that is not listed in /etc/sysconfig/hwconf, the listing of currently installed hardware. kudzu will try to configure it, but 30 seconds will pass without anything happening. After that, the ethernet hardware itself works very well, even without being configured by kudzu. The MAC address was deleted from ifcfg-eth0, so the prior problem will disappear:

Bringing up interface eth0:  [  OK  ]

For this reason, connecting to the remote machine using ssh will be possible. After logging in, manually invoking kudzu in quiet mode will update the hardware listing without any (graphical) problems:

$ kudzu -q

After this, another reboot will be kudzu-free, with a working device eth0. So this is the way we plan to do it.

Warning

To ensure that the device eth0 is brought up properly on EC2 instance boot, this state has to be bundled before the next reboot (as explained in the preceding note).

6.3 Bundle, upload and register the AMI

In the following part I will describe how to bundle the Scientific Linux virtual machine. After bundling, we will upload the new Amazon Machine Image to Amazon’s Simple Storage S3 using the AMI Tools. To tell EC2 that there is a new AMI stored in an S3 bucket, we will have to register the uploaded AMI using the API Tools.

6.3.1 Bundle

As discussed in 6.1.2 Two different ways to create an AMI, we will use ec2-bundle-vol to bundle the AMI. You should read the help first:

$ export RUBYLIB=$RUBYLIB:/usr/lib/site_ruby
$ ec2-bundle-vol --help

Note

You do not want to keep setting environment variables like RUBYLIB manually? Append to /root/AWS_SET_ENV_VARS.sh and source it again:

$ echo 'export RUBYLIB=$RUBYLIB:/usr/lib/site_ruby' >> /root/AWS_SET_ENV_VARS.sh
$ source /root/AWS_SET_ENV_VARS.sh

Build the command line options for ec2-bundle-vol:

I will show you the needed command line options step by step. The AMI will be encrypted using information from your EC2 private key and certificate. So the first three options of every ec2-bundle-vol invocation are:

-k $EC2_PRIVATE_KEY     (path to private key file)
-c $EC2_CERT            (path to certificate file)
-u ************         (EC2 user id without hyphens)

Note

Set an EC2_UID environment variable. It will be useful! Append to /root/AWS_SET_ENV_VARS.sh:

$ echo 'export EC2_UID=************' >> /root/AWS_SET_ENV_VARS.sh
$ source /root/AWS_SET_ENV_VARS.sh

The bundle command needs to know the system architecture (one of i386, x86_64); our VM for ATLAS Computing is a 32-bit system. You have to submit the directory to save the image to and the name of the image. The image will exist twice in the image directory (once as one big image file and once split into ~10 MB parts). I chose /mnt because it is automatically excluded from bundling. The name should be expressive, because it will be the most important identifier later on.

-r i386                 (32bit arch)
-d /mnt                 (image directory)
-p SL47-AWSAC-base      (image prefix/name)

Now it is important to know that the following bundling will not happen on EC2. As stated in the help, we have to deactivate inheriting instance metadata (in fact, I do not know the benefit of inheriting). And the bundle program must generate an fstab file; this is necessary to boot up properly on EC2.

--no-inherit            (do not inherit instance metadata)
--generate-fstab        (inject a generated EC2 fstab)

Invoke ec2-bundle-vol:

Warning

Before invoking the command, this is the last chance to check things and to change something before the system state is bundled.

At first check whether you really cleaned up installation files like archives and extracted archives.

Then let me tell you that setting some other environment variables is convenient:

Note

To upload files to S3 - as will happen when you invoke ec2-upload-bundle in the next step - you need the so-called AWS Access Key ID and the AWS Secret Access Key. Get them from your AWS web interface and store them as environment variables in /root/AWS_SET_ENV_VARS.sh:

$ echo 'export AWS_ACCESS_KEY_ID=******************' >> /root/AWS_SET_ENV_VARS.sh
$ echo 'export AWS_SECRET_ACCESS_KEY=**************' >> /root/AWS_SET_ENV_VARS.sh
$ source /root/AWS_SET_ENV_VARS.sh

Now call ec2-bundle-vol with the command line options derived above:

$ ec2-bundle-vol -k $EC2_PRIVATE_KEY -c $EC2_CERT -u $EC2_UID
                 --generate-fstab --no-inherit -r i386 -d /mnt -p SL47-AWSAC-base

Note

This special set of command line options is necessary only when you bundle an AMI at home. As soon as the image is stored on S3 and rebundled from an EC2 instance, the bundle command looks a bit different.

You will see an output like this:

Copying / into the image file /mnt/SL47-AWSAC-base...
Excluding:
     /var/lib/nfs/rpc_pipefs
     /sys
     /proc
     /proc/sys/fs/binfmt_misc
     /dev/pts
     /dev
     /media
     /mnt
     /proc
     /sys
     /mnt/SL47-AWSAC-base
     /mnt/img-mnt
1+0 records in
1+0 records out
mke2fs 1.35 (28-Feb-2004)
NOTE: rsync with preservation of extended file attributes failed.
Retrying rsyncwithout attempting to preserve extended file attributes...
/etc/fstab:
     # Legacy /etc/fstab
     # Supplied by: ec2-ami-tools-1.3-21885
     /dev/sda1 /     ext3    defaults 1 1
     /dev/sda2 /mnt  ext3    defaults 0 0
     /dev/sda3 swap  swap    defaults 0 0
     none      /proc proc    defaults 0 0
     none      /sys  sysfs   defaults 0 0
Bundling image file...
Splitting /mnt/SL47-AWSAC-base.tar.gz.enc...
Created SL47-AWSAC-base.part.00
...
Generating digests for each part...
Digests generated.
Creating bundle manifest...
ec2-bundle-vol complete.

When bundling is complete, you have an Amazon Machine Image of your Scientific Linux system in your local virtual file system. It consists of several part files like SL47-AWSAC-base.part.00 and one manifest file like SL47-AWSAC-base.manifest.xml. In the next step these files will simply be uploaded to Amazon’s Simple Storage S3 using a command line tool provided by Amazon.
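A quick look into the image directory should show something like this (shortened):

$ ls /mnt
SL47-AWSAC-base  SL47-AWSAC-base.manifest.xml  SL47-AWSAC-base.part.00  SL47-AWSAC-base.part.01  ...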

6.3.2 Upload

Use ec2-upload-bundle to upload the part files and the manifest file. You should read the --help message first. I will briefly explain the needed command line options. To interact with Simple Storage S3, the AWS Access Key ID and the AWS Secret Access Key are needed; submit them with -a and -s. As described in the S3 introduction, files uploaded to S3 are stored as objects in buckets. You have to submit the bucket you want to store the AMI in with -b. Since bucket names are shared among all AWS users, you have to find a bucket name that does not exist yet. I got “ATLAS” ;). Finally, ec2-upload-bundle needs to know which AMI to upload: submit the path to the manifest file with -m.

My command looks like:

$ ec2-upload-bundle -a $AWS_ACCESS_KEY_ID -s $AWS_SECRET_ACCESS_KEY
                    -b ATLAS -m /mnt/SL47-AWSAC-base.manifest.xml

You get an error message?

Server.RequestTimeTooSkewed(403):
The difference between the request time and the current time is too large.
Bundle upload failed.

Every HTTP request (such as a call of an Amazon Web Services API command, e.g. an upload to S3) contains a timestamp. If this client GMT differs from the server GMT by more than 15 minutes, the server declines the request with error 403. Why does that happen to us? Because Scientific Linux’s kernel base interrupt rate has been increased. Together with virtualization, this results in the clock running much too slow.

See also

http://www.gossamer-threads.com/lists/linux/kernel/494604 - there you can learn many more details.

I tried setting the time once before starting the upload. But then the error reappeared in the middle of the upload, because the VM’s clock really runs very slow (note: my upload was very fast!). I did not want to modify the system by installing something like ntp. I needed a quick (and dirty) solution, because uploading an AMI from “a VM at home” happens only once or seldom. So I wrote settimeloop.sh:

#!/bin/sh
# settimeloop.sh: Get a timestring from a web server and set date using this
# string. Repeat this endlessly.
while true ;
do
    wget -m -nd http://gehrcke.de/awsac/permstuff/time.php
    date --set="`cat time.php`"
    sleep 10
done

using the following time.php:

<?php
echo gmstrftime("%a %b %d %H:%M:%S GMT %Y");
?>

Note

time.php will stay on my server - so you can use it! Since it delivers GMT, it works for all timezones. The date --set command should process the delivered time string successfully on every English Linux system.
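If you only want to correct the clock once (e.g. to check whether the time skew really is your problem), a single shot is enough - a sketch using the same time.php:

$ date --set="`wget -q -O - http://gehrcke.de/awsac/permstuff/time.php`"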

Now create (download) /root/settimeloop.sh, make it executable and run it in a new shell. Then re-invoke ec2-upload-bundle. Step by step:

Note

You can also run settimeloop.sh in the background using &. Then you can skip the following new-shell step, but you should redirect the output to a file or to /dev/null. Run it like this: $ ./settimeloop.sh > looplog  2>&1 &

  • Find out the inet addr (IP address) of device eth0 on your VM using ifconfig. Open a new shell connection to your VM from the virtualizing host system using this IP address. For me this was (typed in a shell of my virtualizing HOST system - not in a shell of the VM):

    $ ssh root@172.16.30.137
    
  • Download settimeloop.sh. Make it executable and run it:

    $ cd /root
    $ wget http://gehrcke.de/awsac/permstuff/settimeloop.sh
    $ chmod u+x /root/settimeloop.sh
    $ ./settimeloop.sh
    
  • Re-invoke ec2-upload-bundle from the primary shell of your VM (the same command as before).

In my case the output looks like:

Uploading bundled image parts to https://s3.amazonaws.com:443/ATLAS ...
Uploaded SL47-AWSAC-base.part.00 to https://s3.amazonaws.com:443/ATLAS/SL47-AWSAC-base.part.00.
...
Uploading manifest ...
Uploaded manifest to https://s3.amazonaws.com:443/ATLAS/SL47-AWSAC-base.manifest.xml.
Bundle upload completed.

Depending on your internet connection this may take a very long time.

Note

Sometimes you may get other errors while uploading. If the connection breaks, you do not have to start from the beginning.

While uploading several times, I got two different errors, and I could not track down their reason:

Uploaded **** to https://s3.amazonaws.com:443/****
Error: failed to upload "****", Curl.Error(56):
SSL read: error:00000000:lib(0):func(0):reason(0), errno 104.
Bundle upload failed.
Uploaded ********* to https://s3.amazonaws.com:443/********
Error: failed to upload "***********", Server.InternalError(500)
: We encountered an internal error. Please try again.
Bundle upload failed.

In any case you can resume the upload. Check ec2-upload-bundle --help: you will see that there is something like a “resume” option: «--part PART: Upload the specified part and upload all subsequent parts.» This mostly worked for me instantly. But then you have to make sure yourself that every part file really is stored on S3 successfully. You can recheck with the Firefox S3 extension S3Fox.

Let’s say the last part file uploaded was Uploaded *****.part.31. Then just copy the upload command from before and append --part 32:

ec2-upload-bundle -a $AWS_ACCESS_KEY_ID -s $AWS_SECRET_ACCESS_KEY -b ** -m ** --part 32
[...]
Skipping **.31.
Uploaded **.part.32 to https://s3.amazonaws.com:443/**.part.32.
[...]
Uploaded manifest to https://s3.amazonaws.com:443/**.manifest.xml.
Bundle upload completed.

When the last file, the manifest file, is uploaded, the new AMI on S3 can be registered for use with EC2.

6.3.3 Register

To register the image you can use the API Tools command ec2-register or Elasticfox, which was first mentioned in Manage and monitor EC2. In both cases you need to know where your AMI manifest file is stored on S3 (bucketname/manifestfilename). This is the command I invoked using the API Tools:

$ ec2-register -K $EC2_PRIVATE_KEY -C $EC2_CERT ATLAS/SL47-AWSAC-base.manifest.xml

EC2 checks whether the checksums listed in the manifest file correspond to the part files. On success, EC2 assigns a unique identification string (AMI ID) to the AMI you requested to register. The command line tool returns this AMI ID:

IMAGE       ami-33d1355a
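To double-check the registration, you can list your own AMIs with the API Tools (Elasticfox shows the same information in its “Machine Images” list):

$ ec2-describe-images -o self
IMAGE   ami-33d1355a    ATLAS/SL47-AWSAC-base.manifest.xml   ...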

Now you have a registered AMI of your Scientific Linux system stored on S3. It is ready to start up in Amazon’s Elastic Compute Cloud! You can shut down your local virtual machine using shutdown -h now.

6.4 Run, optimize and rebundle the AMI on EC2

In this part I will describe how to run an EC2 instance of the new AMI. Then I will carry out some optimizations/modifications of the AMI, and finish with rebundling, so that the modified system is saved as a second AMI.

6.4.1 Run and connect

In a few moments you can feel like a manager of your own AMIs and EC2 instances. For this you need a “management environment”. I think you have seen enough of the API Tools. If you do not have any other possibility, you can use them; they are convenient if you e.g. want to run an EC2 command from within an instance - for this you installed and bundled them into the AMI. You now want to run an EC2 instance of your new AMI. Of course, you can use the API Tools to run instances. But using them for every run, terminate, check, register, ... would soon get annoying.

I suppose that you have a graphical desktop environment (to read this documentation) and that you are able to use Firefox as a browser. So I recommend using Elasticfox, as I described in Manage and monitor EC2. Then you do not need to set up the API Tools on your local system, because Elasticfox supports all essential EC2 API calls. In the following I assume that you use Elasticfox (S3Fox for S3 Simple Storage is not bad either) as your “management environment”.

In Elasticfox‘s “Machine Images” list you will see all public AMIs and your own ones. Use the filter field to filter out most of them (e.g. look for the bucket name you stored the new AMI in). Right-click the new AMI and choose Launch instance(s) of this AMI. The default settings (e.g. security group default and instance type m1.small) are okay - so click Launch.

Refresh the “Your instances” list - the instance State should be pending. This means the AMI is being processed and the virtual machine will start booting in some time (1-5 minutes). Use this time to set up EC2 Network Security to allow external requests (from “the internet”) on port 22 to your instances. By default every port is blocked. Do you remember the security group setting before launching the instance? This was the default group. Change it in the Security Groups tab of Elasticfox: for group default, grant permission to traffic from CIDR 0.0.0.0/0, from port 22 to port 22, for TCP/IP. With this setting you will be able to connect to running instances in the default group using ssh.
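If you prefer the command line over Elasticfox for this, the API Tools provide the same functionality; the equivalent call is something like:

$ ec2-authorize default -P tcp -p 22 -s 0.0.0.0/0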

When the instance State changes to running, the virtual machine of your AMI has started booting. After waiting another 1-3 minutes, you can try to connect to the instance using the public DNS name. Right-click the instance in the instances list and choose Copy Public DNS Name to clipboard. Then try connecting to your instance as root:

$ ssh root@ec2-75-101-217-23.compute-1.amazonaws.com

This should result in:

ssh root@ec2-75-101-217-23.compute-1.amazonaws.com
The authenticity of host [...]
Are you sure you want to continue connecting (yes/no)? yes
[...]
root@ec2-75-101-217-23.compute-1.amazonaws.com's password:

Enter your root password - congratulations, you are now logged in on a virtual machine of your own Scientific Linux system on EC2.

Note

If you cannot connect to your instance for a long time, use the Show console output option for your instance in Elasticfox. It allows you to see the console output of an instance. The output does not reach you in real time; there is some delay. Show console output is very useful for debugging the instance boot (in fact, it is the only way).

6.4.2 Optimize

Show console output lists some errors and warnings while booting the new AMI. In this paragraph I will describe how to remove their causes and how to make using the new AMI more comfortable. Some of the things I show you are required for special functionality, some clean up the system, and some are for comfort only. I recommend carrying out each step.

Reconfigure hardware:

As described here, there is one last step missing to solve the Kudzu-problem completely. Execute Kudzu in quiet mode:

$ kudzu -q

Add kernel modules:

After optimizing the running instance, we want to bundle it into a new AMI using ec2-bundle-vol. But right now this would result in:

Could not find any loop device.
Maybe this kernel does not know about the loop device? (If so, recompile or `modprobe loop'.)

After some web searching I found a blog post by Scott Parkerson solving the problem: download the modules of the specific EC2 kernel build 2.6.16-xenU from Amazon’s Fedora Core 4 AMI:

$ wget http://people.rpath.com/~scott/cabinet/ec2/2.6.16-xenU-modules.tar.bz2

I mirrored the file: http://gehrcke.de/awsac/permstuff/2.6.16-xenU-modules.tar.bz2

Extract the archive to / (it then places all files into /lib/modules/2.6.16-xenU) and remove it:

$ tar xvjf 2.6.16-xenU-modules.tar.bz2 -C /
$ rm 2.6.16-xenU-modules.tar.bz2

Add the loop module:

$ modprobe loop

Deactivate Thread-Local Storage:

During the boot, the following warning appears:

** WARNING: Currently emulating unsupported memory accesses  **
**          in /lib/tls glibc libraries. The emulation is    **
**          slow. To ensure full performance you should      **
**          install a 'xen-friendly' (nosegneg) version of   **
**          the library, or disable tls support by executing **
**          the following as root:                           **
**          mv /lib/tls /lib/tls.disabled                    **
** Offending process: init (pid=1)                           **

So it makes sense to deactivate tls:

$ mv /lib/tls /lib/tls.disabled

Modify runlevel 4:

As you can see in the console output, EC2 instances start up with runlevel 4. The current instance is loaded with services that I think you will not need. I deactivated the following:

$ cd /etc/rc.d/rc4.d
$ mv S05kudzu backup_S05kudzu
$ mv S80sendmail backup_S80sendmail
$ mv S09isdn backup_S09isdn
$ mv S09pcmcia backup_S09pcmcia
$ mv S40smartd backup_S40smartd
$ mv S18rpcidmapd backup_S18rpcidmapd

I could have deactivated more services. Feel free to extend this list (if you know what you are doing).
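Renaming the S-links works; alternatively, chkconfig can switch off the same services for runlevel 4 (a sketch, with the service list from above):

$ chkconfig --level 4 kudzu off
$ chkconfig --level 4 sendmail off
$ chkconfig --level 4 isdn off
$ chkconfig --level 4 pcmcia off
$ chkconfig --level 4 smartd off
$ chkconfig --level 4 rpcidmapd off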

Autoset environment:

In the previous paragraphs various environment variables were needed, and they will be needed in the future, too. So we will configure the system to automatically set the environment variables listed in /root/AWS_SET_ENV_VARS.sh at login. First, let us check whether your /root/AWS_SET_ENV_VARS.sh is complete. It should define the following variables (with partly different values):

export JAVA_HOME=/usr
export EC2_HOME=/root/ec2-api-tools-1.3-24159
export EC2_PRIVATE_KEY=/root/.ec2/pk-***********.pem
export EC2_CERT=/root/.ec2/cert-***********.pem
export PATH=$PATH:$EC2_HOME/bin
export RUBYLIB=$RUBYLIB:/usr/lib/site_ruby
export EC2_UID=*******************
export AWS_ACCESS_KEY_ID=*******************
export AWS_SECRET_ACCESS_KEY=*******************

Then arrange for this file to be sourced when logging in as root:

$ echo source AWS_SET_ENV_VARS.sh >> /root/.bashrc

I like ll to show all files in a special way, so this is the right moment to set this alias and/or other aliases:

$ echo "alias ll='ls -lah --color'" >> /root/.bashrc

You may now check whether all the changes you made on the running instance so far work. Reboot the instance:

$ reboot

Note

Invoking the standard reboot command keeps the current EC2 instance running. This is like restarting a computer, so all changes you made in the running instance will not get lost. The AMI is not loaded again; that would happen when you shut down/terminate the instance and launch a new one from the AMI. But keep in mind that if there are any problems while rebooting, modified data will be lost.

After some time, check the console output using Elasticfox. kudzu should say nothing, the TLS warning should be gone, some services should not boot up, and the modprobe: FATAL: Could not load /lib/modules/2.6.16-xenU/modules.dep error should be gone. Reconnect to the instance as root. By entering env you will see that the special environment variables from AWS_SET_ENV_VARS.sh were set automatically.

Install useful scripts:

When you often rebundle an image on EC2, typing the whole ec2-bundle-vol and ec2-upload-bundle commands gets annoying. I will give you support for this with small Python scripts which use the automatically set environment variables:

$ wget http://gehrcke.de/awsac/permstuff/AMIutils/root/ami_bundle
$ wget http://gehrcke.de/awsac/permstuff/AMIutils/root/ami_upload
$ wget http://gehrcke.de/awsac/permstuff/AMIutils/root/ami_delete
$ chmod u+x ami_bundle ami_delete ami_upload

I will describe the usage after the next step.

Cleanup:

In the next step the instance state will be bundled into a new AMI. So this is the right moment to look for leftovers like archives that are no longer needed. Delete everything that is not needed anymore.

6.4.3 Rebundle, upload, register

First invoke the Python script ami_bundle, which I provide for bundling:

$ ./ami_bundle

The usage should be self-explanatory. Enter an expressive AMI name; it should contain a version number, because you will almost surely change the AMI in the future. Bundle the image to /mnt; this directory is excluded from bundling. For me it looks like:

================= a script to invoke ec2-bundle-vol =================
     make sure that $EC2_PRIVATE_KEY, $EC2_CERT, $EC2_UID are set
=====================================================================

image name: SL47-AWSAC-v01
image dir: /mnt

command:
ec2-bundle-vol -k $EC2_PRIVATE_KEY -c $EC2_CERT -u $EC2_UID
               -r i386 --no-inherit -d /mnt -p SL47-AWSAC-v01

execute? (y/n): y
Copying / into the image file /mnt/SL47-AWSAC-v01...
[...]
Created SL47-AWSAC-v01.part.56
Generating digests for each part...
Digests generated.
Creating bundle manifest...
ec2-bundle-vol complete.

Then invoke the Python script I provide for uploading. The script needs to know the S3 bucket to store the AMI in; submit it with -b. I recommend using the same bucket as for the first AMI. Additionally, the script needs the path to the manifest file of the new AMI in the local file system of your current instance. Use TAB so that you do not have to type the whole path (this is the reason why I decided to use command line options for ami_upload). For me this looks like:

$ ./ami_upload -b ATLAS -m /mnt/SL47-AWSAC-v01.manifest.xml
=================== a script to invoke ec2-upload-bundle ======================
make sure that $EC2_HOME, $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY are set

          usage: -b S3bucketName -m PathToImageManifest
===============================================================================


command:
ec2-upload-bundle -a $AWS_ACCESS_KEY_ID -s $AWS_SECRET_ACCESS_KEY
                  -b ATLAS -m /mnt/SL47-AWSAC-v01.manifest.xml

execute? (y/n): y
Uploading bundled image parts to https://s3.amazonaws.com:443/ATLAS ...
[...]
Uploaded manifest to https://s3.amazonaws.com:443/ATLAS/SL47-AWSAC-v01.manifest.xml.
Bundle upload completed.

Note

If the connection breaks, you do not have to start from the beginning. As described above, there is a way to resume an upload.

The modified AMI is now stored on S3. After registering it, you can use it. With Elasticfox, registering is really easy: right-click a free area in Elasticfox‘s “Machine Images” list and click Register a new AMI. Enter bucket/manifestPath; in my case this is ATLAS/SL47-AWSAC-v01.manifest.xml. After confirming, the new AMI will appear in the list.

Now you can shut down the instance of the “old” AMI:

$ shutdown -h now

6.5 Modify the AMI for AWSACtools

In the next steps the AMI gets prepared for the job system that the AWSACtools deliver. So run an instance of your latest AMI (the one bundled in the paragraph before) and log in. AWSACtools mostly consist of Python scripts. Some of them use features of a newer Python version than the one that comes with SL47, so we will install a second, newer Python. AWSACtools use the Python module boto to invoke AWS API calls, as described in 4.1.3 Using the API for own applications. We will install subversion to get the latest version of boto. After this, the server autorun components of AWSACtools will be copied to the instance, and AWSACtools will be injected into the system startup using rc.local. Then the instance will be bundled into a new AMI.

Install newer Python:

Changing the Python delivered by the distribution is not a good idea. So let us install Python 2.5.2 as an alternative installation (no hard links and no manual pages) to /opt/bin/python:

$ wget http://www.python.org/ftp/python/2.5.2/Python-2.5.2.tar.bz2
$ tar xjf Python-2.5.2.tar.bz2
$ cd Python-2.5.2
$ ./configure --prefix=/opt
$ make
$ make altinstall
$ cd ..
$ rm -rf Python-2.5.2
$ rm Python-2.5.2.tar.bz2
$ ln -s /opt/bin/python2.5 /opt/bin/python

Test the new Python:

$ /opt/bin/python

should result in something like this:

Python 2.5.2 (r252:60911, Oct 13 2008, 14:33:49)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-10)] on linux2

Install subversion:

I will keep it short:

$ yum install subversion

Install boto:

boto should be installed as a site package of the new Python. Download the latest revision of boto to /root/boto and then create a symbolic link to this directory in the site-packages directory of the new Python:

$ svn checkout http://boto.googlecode.com/svn/trunk/ /root/boto
$ ln -s /root/boto/boto /opt/lib/python2.5/site-packages

Note

/root/boto/boto is used because the linked directory must contain the __init__.py, and boto‘s SVN tree has another subdirectory boto.

Now you can test boto. Run /opt/bin/python and enter:

>>> import boto
>>> conn = boto.connect_ec2()
>>> images = conn.get_all_images()
>>> print images

This will print a list of objects describing all public AMIs and your own ones.

Note

boto uses the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to authenticate with AWS.

Install AWSACtools autorun:

There is not much of AWSACtools that has to be bundled into the AMI, since AWSACtools are designed to be flexible. It is only the beginning of a chain of events that will happen when a job session is started. At first download a modified rc.local and overwrite the original one:

$ wget http://gehrcke.de/awsac/permstuff/AWSACtools/awsac-autorun_rev001_for_chap6/etc/rc.d/rc.local
$ mv rc.local /etc/rc.d/rc.local
$ chmod u+x /etc/rc.d/rc.local

The purpose of this new rc.local and of the two files downloaded in the next step is explained in how AWSACtools work.

Two more files are needed:

$ mkdir /root/awsac
$ cd /root/awsac
$ wget http://gehrcke.de/awsac/permstuff/AWSACtools/awsac-autorun_rev001_for_chap6/root/awsac/awsac-autorun.sh
$ wget http://gehrcke.de/awsac/permstuff/AWSACtools/awsac-autorun_rev001_for_chap6/root/awsac/getsessionarchive

That was all.
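Schematically, the chain these files establish at boot time looks like the following sketch; the real logic lives in the downloaded files, which are authoritative (see how AWSACtools work):

# idea behind the new rc.local (sketch, not the real file):
# at boot, hand control to the AWSAC autorun script and log its output
/root/awsac/awsac-autorun.sh > /root/awsac/awsac-autorun.log 2>&1 &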

Rebundle, upload, register:

Cleaned everything up? Now bundle the instance state into a new AMI:

$ cd /root
$ ./ami_bundle

I chose:

image name: SL47-AWSAC-v02
image dir: /mnt

Every modification of an instance that you want to keep in a new AMI must be followed by 6.4.3 Rebundle, upload, register... So, do it! :)
