Archive for the ‘CentOS’ Category

Monitoring file modifications with auditd with exceptions

Wednesday, November 19th, 2014

Playing with auditd, I had a need to monitor file modifications for all files recursively underneath a given directory. According to the auditctl(8) man page there are two ways of writing a rule to do this:

-w /directory/ -p wa
-a exit,always -F dir=/directory/ -F perm=wa

The former rule is basically a shortcut for the latter; the latter is also potentially more expressive with the addition of extra -F conditions. Ideally, I also needed to exclude certain files and/or sub-directories in the directory from triggering the audit rule, and it turns out you do it like this:

-a exit,never -F dir=/directory/directory-to-exclude/
-a exit,never -F path=/directory/file-to-exclude
-a exit,always -F dir=/directory/ -F perm=wa

According to this post, order is important: list the exceptions before the main rule.
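A quick way to sanity-check the rules once loaded is to list them with auditctl and then trigger them; the paths here just follow the illustrative rules above. Touching a file under the excluded directory should produce no events, while touching one elsewhere under /directory/ should show up in ausearch:

# auditctl -l
# touch /directory/foo
# touch /directory/directory-to-exclude/bar
# ausearch -i -f /directory/foo
# ausearch -i -f /directory/directory-to-exclude/bar

The first ausearch should return the matching SYSCALL/PATH events; the second should report no matches.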

Building mod_proxy_wstunnel for CentOS 6

Wednesday, August 6th, 2014

I had a need to be able to put an Apache-based reverse proxy in front of an install of Uchiwa, which is a Node.js-based dashboard for Sensu. The only problem is that it uses WebSockets, which means it doesn’t work with the regular mod_proxy_http module. From version 2.4.5 onwards there is mod_proxy_wstunnel which fills the gap; however, CentOS 6 only has a 2.2.15 (albeit heavily patched) package.

There are various instructions on how to backport the module for 2.2.x (mostly for Ubuntu) but these involve compiling the whole of Apache from source again with the module added via an additional patch. I don’t want to maintain my own Apache packages, but more importantly Apache provides apxs, a.k.a. the APache eXtenSion tool, to compile external modules without requiring the whole source tree to be available.

So, I have created a standalone RPM package for CentOS 6 that just installs the mod_proxy_wstunnel module alongside the standard httpd RPM package. In order to do this I took the original patch, removed the alterations to the various build files, and flattened the source into a single file (the code changes were basically adding whole new functions so they were fine to just inline together). The revised source file and accompanying RPM spec file are available in this GitHub gist.
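For what it’s worth, once the module is installed, wiring it up looks something like the sketch below; the backend port and WebSocket path are assumptions on my part, so check what your Uchiwa version actually uses:

LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so

# WebSocket traffic must be matched before the plain HTTP ProxyPass
ProxyPass /websocket ws://127.0.0.1:3000/websocket
ProxyPass / http://127.0.0.1:3000/
ProxyPassReverse / http://127.0.0.1:3000/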

HP Microserver TPM

Friday, January 10th, 2014

I’ve been dabbling with DNSSEC, which involves creating a few zone- and key-signing keys, and it became immediately apparent that my headless HP Microserver has very poor entropy generation for /dev/random. After poking and prodding it also became clear there’s no dormant hardware RNG that I can just enable to fix it.

Eventually I stumbled on this post which suggests you can install and make use of the optional TPM as a source of entropy.

I picked one up cheaply and followed the above instructions to install and configure it. I found I only needed to remove the power cord for safety’s sake; the TPM connector on the motherboard is right at the front, so I didn’t need to pull the tray out.

Also, since that blog post, the rng-tools package on RHEL/CentOS 6.x now includes an init script, so it’s just a case of doing the following final step:

# chkconfig rngd on
# service rngd start
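You can check it’s actually helping by watching the kernel’s entropy pool; with rngd feeding from the TPM it should sit comfortably in the thousands rather than draining towards zero during key generation:

# cat /proc/sys/kernel/random/entropy_avail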

It should then be possible to pass this up to any KVM guests using the virtio-rng.ko module.

HP Microserver Remote Management Card

Saturday, January 28th, 2012

I recently acquired the Remote Management card for my HP Microserver, which allows remote KVM & power control, IPMI management and hardware monitoring through temperature & fan sensors.

Note the extra connector on the card, in addition to the standard PCI-e x1 connector, which matches the dedicated slot on the Microserver motherboard. This presented a bit of a problem, as I was using the space for the battery backup module for the RAID controller in the neighbouring slot.

Thankfully the long ribbon cable meant I could route the battery up to the space behind the DVD burner, freeing the slot again. Once the card was installed and everything screwed back together, I booted straight back into CentOS. Given IPMI is touted as a feature, I figured that was the first thing to try, so I installed OpenIPMI:

# yum -y install OpenIPMI ipmitool
...
# service ipmi start
Starting ipmi drivers:                                     [  OK  ]
# ipmitool chassis status
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Error sending Chassis Status command

Hmm, not good. Looking at dmesg shows the following output when the IPMI drivers get loaded:

ipmi message handler version 39.2
IPMI System Interface driver.
ipmi_si: Adding SMBIOS-specified kcs state machine
ipmi_si: Adding ACPI-specified smic state machine
ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 0
ipmi_si: Interface detection failed
ipmi_si: Trying ACPI-specified smic state machine at mem address 0x0, slave address 0x0, irq 0
Could not set up I/O space
ipmi device interface

The PDF manual states that the IPMI KCS interface is at I/O address 0xCA2, not the 0xCA8 that the kernel is trying to probe. Looking at the output from dmidecode shows where this value is probably coming from:

# dmidecode --type 38
# dmidecode 2.11
SMBIOS 2.6 present.
 
Handle 0x001B, DMI type 38, 18 bytes
IPMI Device Information
	Interface Type: KCS (Keyboard Control Style)
	Specification Version: 1.5
	I2C Slave Address: 0x10
	NV Storage Device: Not Present
	Base Address: 0x0000000000000CA8 (I/O)
	Register Spacing: Successive Byte Boundaries

This suggests a minor bug in the BIOS.

Querying the ipmi_si module with modinfo shows it can be persuaded to use a different I/O address so I created /etc/modprobe.d/ipmi.conf containing the following:

options ipmi_si type=kcs ports=0xca2

Then bounce the service to reload the modules and try again:

# service ipmi restart
Stopping all ipmi drivers:                                 [  OK  ]
Starting ipmi drivers:                                     [  OK  ]

As of CentOS 6.4, the above won’t work as the ipmi_si module is now compiled into the kernel. Instead, you need to edit /etc/grub.conf and append the following to your kernel parameters:

ipmi_si.type=kcs ipmi_si.ports=0xca2

Thanks to this post for the info. Instead of bouncing the service you’ll need to reboot, then try again:

# ipmitool chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
# ipmitool sdr
Watchdog         | 0x00              | ok
CPU_THEMAL       | 32 degrees C      | ok
NB_THERMAL       | 35 degrees C      | ok
SEL Rate         | 0 messages        | ok
AMBIENT_THERMAL  | 20 degrees C      | ok
EvtLogDisabled   | 0x00              | ok
System Event     | 0x00              | ok
SYS_FAN          | 1000 RPM          | ok
CPU Thermtrip    | 0x00              | ok
Sys Pwr Monitor  | 0x00              | ok

Success! With that sorted, you can now use ipmitool to further configure the management card, although not all of the settings are accessible this way (IPv6 network settings, for example), so you have to use the BIOS or web interface for some of it.
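For example, the card’s basic IPv4 network settings can be inspected and changed with the standard ipmitool lan commands. A sketch, assuming the settings live on LAN channel 1 (check with ipmitool channel info if yours differ):

# ipmitool lan print 1
# ipmitool lan set 1 ipsrc static
# ipmitool lan set 1 ipaddr 192.0.2.10
# ipmitool lan set 1 netmask 255.255.255.0
# ipmitool lan set 1 defgw ipaddr 192.0.2.254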

Overall, I’m fairly happy with the management card. It has decent IPv6 support and the Java KVM client works okay on OS X should I ever need it, but I couldn’t coax the separate virtual media client into working; I guess only Windows is supported.

Forcing GUID Partition Table on a disk with CentOS 6

Saturday, August 6th, 2011

CentOS 6 is now out so I can finally build up the new HP ProLiant Microserver that I purchased for a new home server. Amongst many new features, CentOS 6 ships a GRUB bootloader that can boot from a disk with a GUID Partition Table (GPT for short). Despite not having EFI, the HP BIOS can boot from GPT disks, so there’s no limit on the disk sizes I can use.

However, before I tore down my current home server I wanted to test all of this really did work, so I popped a “small” 500 GB disk in the HP. The trouble is the CentOS installer won’t use GPT if the disk is smaller than 2 TB. Normally this wouldn’t be a problem with a single disk, apart from the fact that I specifically want to test GPT functionality, but it can cause complications if your disk is in fact a hardware RAID volume, because most RAID controllers allow the following scenario:

  1. You have a RAID 5 volume comprising 4x 500 GB disks, so the OS sees ~ 1.5 TB raw space.
  2. Pull one 500 GB disk, replace with a 1 TB disk, wait for RAID volume to rebuild.
  3. Rinse and repeat until all disks are upgraded.
  4. Grow RAID volume to use the extra space, OS now sees ~ 3 TB raw space.
  5. Resize partitions on the volume, grow filesystems, etc.

You’ll find your volume has a traditional MBR partition scheme because it started out below the 2 TB limit, so you can’t resize the partitions to fully use the new space. While it might be possible to non-destructively convert MBR to GPT, I wouldn’t want to risk it with terabytes of data. It would be far better to just use GPT from the start and save painting myself into a corner later.

If you’re doing a normal install, run the installer until you get to what’s known as stage 2, at which point you have a working shell on (ctrl+)alt+F2. From here, you can use parted to create an empty GPT on the disk:

# /usr/sbin/parted -s /dev/sda mklabel gpt

Return to the installer and keep going until you reach the partitioning choices. Pick the “use empty space” option and the installer will create new partitions within the new GPT. If you choose the “delete everything” option, the installer will replace the GPT with MBR again.

If, like me, you’re using kickstart to automate the install process, you can do something similar in your kickstart file with the %pre section, something like the following:

%pre
/usr/sbin/parted -s /dev/sda mklabel gpt
%end

Then make sure your kickstart contains no clearpart instructions so it will default to just using the empty space. The only small nit I found after the install is that the minimal install option includes fdisk but not parted, so if you want to manage the disk partitions you’ll need to add parted either at install time or afterwards, as fdisk doesn’t support GPT.
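For instance, pulling parted in at install time is just a case of listing it in the %packages section of the kickstart (the @core group here just stands in for whatever package set you’re already installing):

%packages
@core
parted
%end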

Storage controller probe order & Kickstart

Tuesday, June 14th, 2011

I’ve been configuring some Dell servers today that have one of Dell’s MD3220 storage arrays attached. The servers were completely blank so they needed kickstarting with CentOS 5.x. Not a problem, just use the standard one that’s used for 99% of the servers.

The problem that arises is that the standard kickstart assumes the first disk it sees is one of the disks internal to the server and partitions accordingly, but annoyingly the installer loads the mpt2sas driver for the HBAs used to hook up the storage array first, and then loads the megaraid_sas driver for the built-in RAID controller used for the internal disks. This means the internal disks could be anywhere from /dev/sdc onwards, depending on what state the array is in.

An undocumented boot option is driverload=, which seems to force the supplied list of drivers to be loaded before anything else that’s detected. I found reference to this option in this post, along with use of the nostorage boot option. However, when I used the following options:

nostorage driverload=sd_mod:megaraid_sas

The installer still loaded the mpt2sas driver, but it did force the megaraid_sas driver to be loaded first. This could just be a bug where mpt2sas isn’t on the list of “storage” drivers, but the net result is acceptable in that /dev/sda now pointed to the internal disks. After the server rebooted, /etc/modprobe.conf also had the correct scsi_hostadapterX aliases to keep things in the right order.

Holy Exploding Batteries Batman

Sunday, February 27th, 2011

If you have a UPS, it’s probably worth paying attention to the temperature reading if it reports such a thing. My APC Smart-UPS 1000VA suffered a failed battery which wasn’t totally unexpected given I had last replaced it about two years ago.

While I was awaiting the replacement RBC6 battery, I had a poke about with apcupsd and noticed that it reckoned the internal temperature of the UPS was around 39-40 ℃ which seemed a bit high.

New battery arrived, so I pulled the front panel off the UPS and tried to use that stupid plastic flap APC put on the battery to help you pull it out; surprise, it tore off like it’s done on every damn battery. Okay, so I’d have to reach in and get my hand behind the battery to push it out, which, due to the nature of what a UPS generally does, felt like that bit in the Flash Gordon movie when Peter Duncan sticks his arm in the tree stump and gets fatally stung. Nope, still not coming out, and I could feel the battery was quite hot, so I pulled the power and left it to cool down for an hour or two. With a bit of help from gravity it became apparent why the battery was being so stubborn:

Bursting Battery

The photo probably doesn’t illustrate just how distorted each side of the battery was. According to apcupsd the temperature is now around 25 ℃ with the new battery, so it’s probably worth sticking some monitoring on that. Grabbing the temperature is easy if apcupsd(8) is running:

# /usr/sbin/apcaccess | grep -i temp
ITEMP    : 25.9 C Internal

It should be easy enough to wrap that in a simple Nagios test and/or graph it with Cacti.
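Something like this minimal sketch would do for the Nagios side; the warning and critical thresholds are plucked out of the air based on the two readings above, so adjust to taste:

#!/bin/sh
# check_ups_temp: warn/crit on the internal temperature reported by apcupsd
WARN=35
CRIT=40

# apcaccess prints a line like "ITEMP    : 25.9 C Internal"
TEMP=$(/usr/sbin/apcaccess | awk '/^ITEMP/ {print int($3)}')

if [ -z "$TEMP" ]; then
    echo "UNKNOWN - could not read ITEMP from apcaccess"
    exit 3
fi

if [ "$TEMP" -ge "$CRIT" ]; then
    echo "CRITICAL - UPS internal temperature ${TEMP}C"
    exit 2
elif [ "$TEMP" -ge "$WARN" ]; then
    echo "WARNING - UPS internal temperature ${TEMP}C"
    exit 1
fi

echo "OK - UPS internal temperature ${TEMP}C"
exit 0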

This blog is IPv6-enabled

Sunday, December 5th, 2010

I’ve been quite keen on getting IPv6 connectivity to my hosts, although I’m currently lacking either a home internet package or a hosted server with native connectivity. Thankfully there are enough free tunnel providers available which work fine as a workaround.

I’m using SixXS for two tunnels; one to my home network and another to the VPS provided by Linode which amongst other things hosts this blog.

SixXS operate a system of credits whereby tasks such as creating tunnels, obtaining subnets and setting up reverse DNS delegation cost varying amounts of credits, and the only way to earn credits is to keep your tunnel up and running on a week-by-week basis. In the interests of promoting reliable and stable connectivity this is a Good Thing.

Now, what must have happened about a year ago was that I created the two tunnels, but because SixXS deliberately don’t give you enough credits to request subnets at the same time, you have to wait a week or two to earn sufficient credits to do that. So I got my home network sorted first, as that was more important, then likely got distracted by something shiny and forgot to sort out a subnet for the server.

Now, you don’t necessarily need a subnet when you have just the one host at the end of a tunnel; you can just use the IPv6 address allocated to the local endpoint of the tunnel. However, you can’t change the reverse DNS of that address, meaning it will be stuck with a name like cl-123.pop-01.us.sixxs.net. If you’re planning on running a service which can be picky about the reverse DNS (SMTP is a good example), then it’s probably best to get a subnet so you have full control over it.

By this point, requesting a subnet isn’t a problem credits-wise as the tunnels have been up for over a year. My existing tunnel configuration stored under /etc/sysconfig/network-scripts/ifcfg-sit1 looked something like this:

DEVICE=sit1
BOOTPROTO=none
ONBOOT=yes
IPV6INIT=yes
IPV6_TUNNELNAME=SixXS
IPV6TUNNELIPV4=1.2.3.4
IPV6TUNNELIPV4LOCAL=5.6.7.8
IPV6ADDR=2001:dead:beef:7a::2/64
IPV6_MTU=1280
TYPE=sit

Once the subnet has been allocated and routed to your tunnel, it’s simply a case of picking an address from that and altering the configuration like so:

DEVICE=sit1
BOOTPROTO=none
ONBOOT=yes
IPV6INIT=yes
IPV6_TUNNELNAME=SixXS
IPV6TUNNELIPV4=1.2.3.4
IPV6TUNNELIPV4LOCAL=5.6.7.8
IPV6ADDR=2001:dead:beef:7a::2/64
IPV6ADDR_SECONDARIES="2001:feed:beef::1/128"
IPV6_MTU=1280
TYPE=sit

Note the IPV6ADDR_SECONDARIES addition. Now your host should be accessible through both addresses and, thanks to #199862, any outbound connections initiated from your host should use the subnet address rather than the tunnel address. You can test this with ip(8) and some known IPv6-connected hosts.

# host -t aaaa www.sixxs.net
www.sixxs.net is an alias for nginx.sixxs.net.
nginx.sixxs.net has IPv6 address 2001:1af8:1:f006::6
nginx.sixxs.net has IPv6 address 2001:838:2:1::30:67
nginx.sixxs.net has IPv6 address 2001:960:800::2
# ip -6 route get 2001:1af8:1:f006::6
2001:1af8:1:f006::6 via 2001:1af8:1:f006::6 dev sit1  src 2001:feed:beef::1  metric 0 
    cache  mtu 1280 advmss 1220 hoplimit 4294967295
# host -t aaaa ipv6.google.com
ipv6.google.com is an alias for ipv6.l.google.com.
ipv6.l.google.com has IPv6 address 2001:4860:800f::63
# ip -6 route get 2001:4860:800f::63
2001:4860:800f::63 via 2001:4860:800f::63 dev sit1  src 2001:feed:beef::1  metric 0 
    cache  mtu 1280 advmss 1220 hoplimit 4294967295

Trying again for the remote endpoint address of the tunnel gets a different result:

# ip -6 route get 2001:dead:beef:7a::1
2001:dead:beef:7a::1 via :: dev sit1  src 2001:dead:beef:7a::2  metric 256  expires 21315606sec mtu 1280 advmss 1220 hoplimit 4294967295

The kernel decides that in this case using the local address of the tunnel is a better choice, which I think is due to RFC 3484 and how Linux chooses a source address. If that expiry counter ever hits zero and the behaviour changes, I’ll let you know…

Issues with iptables stateful filtering

Wednesday, September 29th, 2010

I hit a weird issue today. I have Apache configured as a reverse proxy using mod_proxy_balancer, which is a welcome addition in Apache 2.2.x. This forwards selected requests to more Apache instances running mod_perl, although it could be any sort of application layer; pretty standard stuff.
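The proxy side of the configuration amounts to something like this sketch; the member addresses are illustrative (matching the logs below) and the balancer name is my own invention:

<Proxy balancer://app>
    BalancerMember http://192.0.2.1:80
    BalancerMember http://192.0.2.2:80
</Proxy>
ProxyPass /app balancer://app/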

With ProxyStatus enabled and a reasonable amount of traffic pushed through the proxy, I started to notice that the application-layer Apache instances would consistently get marked as being in an error state every so often, removing them from the pool of available servers until their cooldown period expired and the proxy enabled them again.

Investigating the logs showed up this error:

[Tue Sep 28 00:24:39 2010] [error] (113)No route to host: proxy: HTTP: attempt to connect to 192.0.2.1:80 (192.0.2.1) failed
[Tue Sep 28 00:24:39 2010] [error] ap_proxy_connect_backend disabling worker for (192.0.2.1)

A spot of the ol’ google-fu turned up descriptions of similar problems, some not even related to Apache. The problem looked related to iptables.

All the servers are running CentOS 5 and anyone who runs this is probably aware of the stock Red Hat iptables ruleset. With HTTP access enabled, it looks something like this:

 1  -A INPUT -j RH-Firewall-1-INPUT
 2  -A FORWARD -j RH-Firewall-1-INPUT
 3  -A RH-Firewall-1-INPUT -i lo -j ACCEPT
 4  -A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
 5  -A RH-Firewall-1-INPUT -p 50 -j ACCEPT
 6  -A RH-Firewall-1-INPUT -p 51 -j ACCEPT
 7  -A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
 8  -A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
 9  -A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
10  -A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
11  -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
12  -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
13  -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

Almost all of it is boilerplate apart from line 12, which I added and which is identical to the line above it granting access to SSH. Analysing the live hit counts against each of these rules showed a large number hitting the catch-all REJECT rule on line 13, and indeed this is what was causing the Apache errors.

Analysing the traffic with tcpdump/wireshark showed that the frontend Apache server only gets as far as sending the initial SYN packet, which fails to match either the dedicated rule on line 12 for HTTP traffic or the rule on line 10 matching any related or previously established traffic, although I wouldn’t really expect it to match the latter.

Adding a rule before the last one to match and log any HTTP packets considered to be in the state INVALID showed that, for some strange reason, iptables was deciding that an initial SYN is somehow invalid.
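The logging rule was something along these lines, inserted just before the final REJECT:

-A RH-Firewall-1-INPUT -m state --state INVALID -m tcp -p tcp --dport 80 -j LOG --log-prefix "INVALID HTTP: "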

More information about why it might be invalid can be coaxed from the kernel by issuing the following:

# echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid

Although all this gave me was some extra text saying “invalid state” and the same output you get from the standard logging target:

ip_ct_tcp: invalid state IN= OUT= SRC=192.0.2.2 DST=192.0.2.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=21158 DF PROTO=TCP SPT=57351 DPT=80 SEQ=1402246598 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT (020405B40402080A291DC7600000000001030307)

From searching the netfilter bugzilla for any matching bug reports I found a knob that relaxes the connection tracking, enabled with the following:

# echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal

This didn’t fix things, and it’s recommended to only use this in extreme cases with a broken router or firewall anyway. As all of these machines are on the same network segment with just a switch connecting them, there shouldn’t be anything mangling the packets.

With those avenues exhausted, the suggested workaround of adding a rule similar to the following:

-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 --syn -j REJECT --reject-with tcp-reset

before your last rule doesn’t really work, as the proxy still drops the servers from the pool as before, just logging a different reason instead:

[Tue Sep 28 13:59:23 2010] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 192.0.2.1:80 (192.0.2.1) failed
[Tue Sep 28 13:59:23 2010] [error] ap_proxy_connect_backend disabling worker for (192.0.2.1)

This isn’t really acceptable to me anyway, as it requires your proxy to retry in response to a server politely telling it to go away, and that’s going to be a performance hit. The whole point of using mod_proxy_balancer for me was so it could legitimately remove servers that are dead or administratively stopped; how is it supposed to tell the difference between that and a dodgy firewall?

The only solution that worked and was deemed acceptable was to simply remove the state requirement on matching the HTTP traffic, like so:

-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 -j ACCEPT

This will match both new and invalid packets; however, it’s left me with a pretty low opinion of iptables. Give me pf any day.

Building CentOS 5 images for EC2

Thursday, September 16th, 2010

I had a need to create some CentOS 5 hosts on Amazon’s EC2 platform, and while there’s nothing stopping you from reusing a pre-built AMI, it’s always handy to know how these things are built from scratch.

I had a few basic requirements:

  • I’ll be creating various sizes of EC2 instances, so both i386 and x86_64 AMIs are required.
  • Preferably boot the native CentOS kernel rather than use the generic EC2 kernel, as I know the CentOS-provided Xen kernel JFW.

You’ll need the following:

  • An existing Intel/AMD Linux host; this should be running CentOS, Fedora, RHEL, or anything else, as long as it ships a usable yum(8). It should also be an x86_64 host if you’re planning on building for both architectures, and it should have around 6 GB of free disk space.
  • Amazon AWS account with working Access Key credentials for S3 and a valid X.509 certificate & private key pair for EC2.
  • The EC2 AMI tools and the EC2 API tools installed and available in your $PATH.
  • A flask of weak lemon drink.

Let’s assume you’re working under /scratch. You’ll need to first create a directory to hold your root filesystem, along with a couple of directories within that, ahead of installing anything:

# mkdir -p /scratch/ami/{dev,etc,proc,sys}

The /dev directory needs a handful of devices creating:

# MAKEDEV -d /scratch/ami/dev -x console
# MAKEDEV -d /scratch/ami/dev -x null
# MAKEDEV -d /scratch/ami/dev -x zero

A minimal /etc/fstab needs to be created:

/dev/sda1               /                       ext3    defaults        1 1
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0

There are other partitions that will be available to your instance when it is up and running, but they vary between instance types; this is the bare minimum that is required and should work for all instance types. If you want to add the additional partitions here, refer to the Instance Storage Documentation. I will instead use Puppet to set up any additional partitions after the instance is booted.

/proc and /sys should also be mounted inside your AMI root:

# mount -t proc proc /scratch/ami/proc
# mount -t sysfs sysfs /scratch/ami/sys

Create a custom /scratch/yum.cfg which will look fairly similar to the one your host system uses:

[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
obsoletes=1
gpgcheck=0
plugins=1
reposdir=/dev/null
 
# Note: yum-RHN-plugin doesn't honor this.
metadata_expire=1h
 
# Default.
# installonly_limit = 3
 
[centos-5]
name=CentOS 5 - Base
baseurl=http://msync.centos.org/centos-5/5/os/$basearch/
enabled=1
 
[centos-5-updates]
name=CentOS 5 - Updates
baseurl=http://msync.centos.org/centos-5/5/updates/$basearch/
enabled=1
 
[centos-5-epel]
name=Extra Packages for Enterprise Linux 5 - $basearch
baseurl=http://download.fedora.redhat.com/pub/epel/5/$basearch/
enabled=1

Notably, the gpgcheck directive is disabled, and reposdir is pointed somewhere no .repo files are located so that no additional repositories are picked up; otherwise you’ll scoop up any repositories configured on your host system. By making use of the $basearch variable in the URLs, this configuration should work for both i386 and x86_64.

If you have local mirrors of the package repositories, alter the file to point at them and be a good netizen. You will need to make sure that your base repository has the correct package groups information available. Feel free to also add any additional repositories.

You’re now ready to install the bulk of the Operating System. If the host architecture and target architecture are the same, you can just do:

# yum -c /scratch/yum.cfg --installroot /scratch/ami -y groupinstall base core

If however you’re creating an i386 AMI on an x86_64 host, you need to use the setarch(8) command to prefix the above command like so:

# setarch i386 yum -c /scratch/yum.cfg --installroot /scratch/ami -y groupinstall base core

This mostly fools yum and any child commands into thinking the host is i386 and without it, you’ll just get another x86_64 image. Sadly you can’t do the reverse to build an x86_64 AMI on an i386 host.

This should give you a fairly minimal yet usable base; however, it won’t have the right kernel installed, so do the following to remedy that:

# yum -c /scratch/yum.cfg --installroot /scratch/ami -y install kernel-xen
# yum -c /scratch/yum.cfg --installroot /scratch/ami -y remove kernel

(Remember to use setarch(8) again if necessary)

You can also use variations of the above commands to add or remove additional packages as you see fit.

All that’s required now is a bit of manual tweaking here and there. Firstly, you need to set up the networking, which on EC2 is simple: one interface using DHCP. Create /etc/sysconfig/network-scripts/ifcfg-eth0:

DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
USERCTL=yes
PEERDNS=yes
IPV6INIT=no

And also /etc/sysconfig/network:

NETWORKING=yes

The networking still won’t work without the correct kernel module(s) being loaded, so create /etc/modprobe.conf with the following:

alias eth0 xennet
alias scsi_hostadapter xenblk

The first alias fixes the networking; the second means the instance can also see the various block devices. The ramdisk for the kernel now needs to be updated so it knows to pull in these two modules and load them at boot. For this you need to know the version of the kernel installed; you can find this a number of ways, but the easiest is to just look at the /boot directory:

# ls -1 /scratch/ami/boot
config-2.6.18-164.15.1.el5xen
grub
initrd-2.6.18-164.15.1.el5xen.img
message
symvers-2.6.18-164.15.1.el5xen.gz
System.map-2.6.18-164.15.1.el5xen
vmlinuz-2.6.18-164.15.1.el5xen
xen.gz-2.6.18-164.15.1.el5
xen-syms-2.6.18-164.15.1.el5

In this case the version is “2.6.18-164.15.1.el5xen”. Using this, we need to run mkinitrd(8), but we also need to use chroot(1) to run the command as installed in your new filesystem, using that filesystem as its /; otherwise it will attempt to overwrite bits of your host system. So, something like the following:

# chroot /scratch/ami mkinitrd -f /boot/initrd-2.6.18-164.15.1.el5xen.img 2.6.18-164.15.1.el5xen

No /etc/hosts file is created so it’s probably a good idea to create one of those:

127.0.0.1	localhost.localdomain localhost

SELinux will be enabled by default and, although your instance will boot, you won’t be able to log in, so the easiest thing is to just disable it entirely by editing /etc/selinux/config so it looks like this:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#	enforcing - SELinux security policy is enforced.
#	permissive - SELinux prints warnings instead of enforcing.
#	disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#	targeted - Only targeted network daemons are protected.
#	strict - Full SELinux protection.
#	mls - Multi Level Security protection.
SELINUXTYPE=targeted 
# SETLOCALDEFS= Check local definition changes
SETLOCALDEFS=0

You could also disable it with the selinux=0 kernel parameter at boot time. There may be a way to allow SELinux to work; it may just be that the filesystem needs relabelling, which you can force on the first boot by creating an empty /scratch/ami/.autorelabel file. I’ll leave that as an exercise for the reader, or for myself when I’m bored enough.
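If you do want to experiment with the relabelling route, it’s just a case of:

# touch /scratch/ami/.autorelabel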

Now we need to deal with how to boot the native CentOS kernel. Amazon don’t allow you to upload your own kernels or ramdisks to boot with your instances, so how do you do it? Apart from their own kernels, they now provide a PV-GRUB kernel image which, when it boots, behaves just like the regular GRUB bootloader: it reads a grub.conf from your instance filesystem and uses that to select the kernel, loading it from your instance filesystem along with the accompanying ramdisk.

We don’t need to install any boot blocks but we will need to create a simple /boot/grub/grub.conf using the same kernel version we used when recreating the ramdisk:

default=0
timeout=5
title CentOS (2.6.18-164.15.1.el5xen)
	root (hd0)
	kernel /boot/vmlinuz-2.6.18-164.15.1.el5xen ro root=/dev/sda1
	initrd /boot/initrd-2.6.18-164.15.1.el5xen.img

If we install any updated kernels, they should automatically manage this file for us and insert their own entries; we just need to do this once.

To match what you normally get on a regular CentOS host, a couple of symlinks should also be created:

# ln -s grub.conf /scratch/ami/boot/grub/menu.lst
# ln -s ../boot/grub/grub.conf /scratch/ami/etc/grub.conf

When you create an EC2 instance you have to specify an existing SSH keypair created within EC2, which you should be able to use to log into the instance. This is accomplished by the usual practice of copying the public part of the key into /root/.ssh/authorized_keys. I initially thought that was magic that Amazon did for you, but they don’t; you need to do it yourself.

When the instance is booted, the public part of the key (as well as various other bits of metadata) is available at the URL http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key, so the easiest thing to do is add the following to /etc/rc.d/rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
 
touch /var/lock/subsys/local
 
if [ ! -d /root/.ssh ] ; then
        mkdir -p /root/.ssh
        chmod 700 /root/.ssh
fi
 
/usr/bin/curl -f http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key > /root/.ssh/authorized_keys
 
chmod 600 /root/.ssh/authorized_keys

You can be more elaborate if you want, but this is enough to allow you to log in with the SSH key. One thing I found was that the firstboot service started at boot and sat for a while asking if you wanted to do any firstboot-y things, which delayed the rc.local hack from running until it timed out, so Amazon would say your instance was running but you couldn’t SSH in for a minute or two. The easiest thing is to disable firstboot:

# chroot /scratch/ami chkconfig firstboot off

You can also use this to disable more services; there are a few enabled by default that are arguably useless in an EC2 instance, but they won’t break anything if you leave them enabled. You can also enable other services if you installed any additional packages.
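For example (a sketch; which of these exist depends on the package set you installed):

# chroot /scratch/ami chkconfig kudzu off
# chroot /scratch/ami chkconfig yum-updatesd off
# chroot /scratch/ami chkconfig sshd on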

Finally, if you need to play about with the image you can just do the following:

# chroot /scratch/ami

(Remember to use setarch(8) again if necessary)

This gives you a shell inside your new filesystem if you need to tweak anything else. One thing I found necessary was to set your /etc/passwd file up correctly and optionally set a root password, which you could use instead of the SSH key.

# pwconv
# passwd
Changing password for user root.
New UNIX password: ********
Retype new UNIX password: ********
passwd: all authentication tokens updated successfully.

If you want to use any RPM-based commands while you’re inside this chroot and you created an i386 image on an x86_64 host, you may get the following error:

# rpm -qa
rpmdb: Program version 4.3 doesn't match environment version
error: db4 error(-30974) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db3 -  (-30974)
error: cannot open Packages database in /var/lib/rpm

This is because some of the files are architecture-dependent and, despite using setarch(8), they still get written in x86_64 format. It’s a simple fix:

# rm -f /scratch/ami/var/lib/rpm/__db*

If you query the RPM database inside the chroot and then want to install or remove more packages with yum outside the chroot, you will need to do the above again.

Before you start to package up your new image there’s one small bit of clean-up: for some reason yum creates some transaction files that you’ll notice are under /scratch/ami/scratch/ami/var/lib/yum. I couldn’t work out how to stop it making those, so you just need to blow that directory away:

# rm -rf /scratch/ami/scratch

You should also unmount the /proc and /sys filesystems you mounted before installing packages:

# umount /scratch/ami/proc
# umount /scratch/ami/sys

Right, you’re now ready to package up your new image.

The first thing is to bundle the filesystem, which will create one big image file, chop it up into small ~10 MB pieces, and then create an XML manifest file that ties it all together. You will need your AWS user ID for this part, which you can find in your AWS account:

# ec2-bundle-vol -c <certificate_file> -k <private_keyfile> -v /scratch/ami -p centos5-x86_64 -u <user_id> -d /scratch -r x86_64 --no-inherit

This can take a few minutes to run. Remember to also set the architecture appropriately.

The next step is to upload the manifest and image pieces to either an existing S3 bucket that you own or a new bucket that will be created:

# ec2-upload-bundle -m /scratch/centos5-x86_64.manifest.xml -b <bucket> -a <access_key> -s <secret_key>

This part has the longest wait, depending on how fast your internet connection is; you’ll be uploading around 330 MB per image. If you seriously made a flask of weak lemon drink, I’d drink most of it now.

Once that finishes, the final step is to register the uploaded files as an AMI ready to create instances from. Before we do that, though, we need to find the correct AKI to boot it with. There should be four AKIs available in each region, two for each architecture, which differ in how they try to find the grub.conf on the image: one treats the image as one big filesystem with no partitioning; the other assumes the image is partitioned and that grub.conf is on the first partition.

List all of the available AKIs with the following:

# ec2-describe-images -C <certificate_file> -K <private_keyfile> -o amazon | grep pv-grub
IMAGE	aki-407d9529	ec2-public-images/pv-grub-hd0-V1.01-i386.gz.manifest.xml	amazon	available	public		i386	kernel				instance-store
IMAGE	aki-427d952b	ec2-public-images/pv-grub-hd0-V1.01-x86_64.gz.manifest.xml	amazon	available	public		x86_64	kernel				instance-store
IMAGE	aki-4c7d9525	ec2-public-images/pv-grub-hd00-V1.01-i386.gz.manifest.xml	amazon	available	public		i386	kernel				instance-store
IMAGE	aki-4e7d9527	ec2-public-images/pv-grub-hd00-V1.01-x86_64.gz.manifest.xml	amazon	available	public		x86_64	kernel				instance-store

Assuming this is still an x86_64 image, the AKI we want is aki-427d952b. More documentation about these is available here.

Now we can register the AMI like so:

# ec2-register -C <certificate_file> -K <private_keyfile> -n centos5-x86_64 -d "CentOS 5 x86_64" --kernel aki-427d952b <bucket>/centos5-x86_64.manifest.xml
IMAGE   ami-deadbeef

The output is our AMI id, which we’ll use for creating instances. If you haven’t already created an SSH keypair, do that now:

# ec2-add-keypair -C <certificate_file> -K <private_keyfile> <key_id>

This returns the private key portion of the SSH keypair, which you need to save and keep safe; there’s no way of retrieving it if you lose it.

Finally create an instance using the AMI id we got from ec2-register along with the id of a valid SSH keypair:

# ec2-run-instances -C <certificate_file> -K <private_keyfile> -t m1.large -k <key_id> ami-deadbeef

This will return information about your new instance, which will initially be in the pending state. Periodically run the following:

# ec2-describe-instances -C <certificate_file> -K <private_keyfile>

Once your instance is in the running state, you should be able to see the hostname and IP that have been allocated, and you can now SSH in using the private key you saved from ec2-add-keypair.
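Logging in is then just a case of pointing SSH at the saved key; the placeholders here follow the convention used above:

# ssh -i <saved_private_key> root@<instance_hostname>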

When you’ve finished with the instance, you can terminate it as usual with ec2-terminate-instances.

If you no longer need the AMI image or wish to change it in some way, you need to first deregister it with:

# ec2-deregister -C <certificate_file> -K <private_keyfile> ami-deadbeef

Then remove the bundle from the S3 bucket:

# ec2-delete-bundle -b <bucket> -a <access_key> -s <secret_key> -m /scratch/centos5-x86_64.manifest.xml

Then in the case of making changes, repeat the steps from ec2-bundle-vol onwards.