## Wednesday, December 8, 2010

### More fractals!

I can't resist the pleasure of posting more pictures I made. To get continuous coloring, I used a trick called "fractal renormalization". In short, at the iteration where the orbit leaves the escape radius, I choose the color based on the number of iterations minus log(log(|Zn|))/log(2). More information here.
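As a rough illustration of the formula above, here is a minimal Python sketch. It is not the program that produced these pictures; the function name, the iteration limit and the escape radius of 2 are assumptions chosen for the example.

```python
import math


def smooth_iteration_count(z0, c, max_iter=200, bailout=2.0):
    """Iterate z -> z*z + c and return a fractional escape count.

    Returns None when the orbit has not escaped after max_iter steps.
    The bailout radius must be greater than 1 so that log(log|z|) exists.
    """
    z = z0
    for n in range(max_iter):
        if abs(z) > bailout:
            # Number of iterations minus log(log(|Zn|)) / log(2).
            return n - math.log(math.log(abs(z))) / math.log(2)
        z = z * z + c
    return None
```

The fractional result can then be mapped to a color gradient instead of a fixed palette index, which is what removes the visible bands.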

You can find the high-resolution versions of these pictures here:

First image

Second image (detail)

## Sunday, December 5, 2010

### Fractals!

I recently watched Nova's excellent documentary on fractals. This reminded me of my first programs to render a Julia set, on a 486DX33 running MS-DOS 6.0!

After a bit of reading on the Internet, I decided to do it again, but this time on my mighty 2GHz Intel quad-core, running Linux Fedora 14.

First hurdle: I had never written an application whose output is an image. So be it, I'll write a small program that generates a PNG file.

A few attempts later - and discovering that RGB PNG expects you to have 3 bytes per pixel, kind of logical - I did it.
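For what it's worth, the "3 bytes per pixel" layout can be sketched in a few lines of Python. This is not the program from this post, only a minimal illustration that assumes an 8-bit RGB image without interlacing and uses the standard zlib and struct modules; the function names are mine.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_rgb_png(width, height, pixels):
    """Encode raw RGB bytes (3 bytes per pixel, row by row) as a PNG."""
    if len(pixels) != width * height * 3:
        raise ValueError("expected exactly 3 bytes per pixel")
    stride = width * 3
    # Each scanline is prefixed with a filter byte; 0 means "no filter".
    raw = b"".join(
        b"\x00" + pixels[y * stride:(y + 1) * stride] for y in range(height)
    )
    # IHDR: width, height, bit depth 8, color type 2 (RGB), then
    # compression, filter and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(raw))
        + png_chunk(b"IEND", b"")
    )
```

Writing the returned bytes to a file opened in binary mode gives a picture that any image viewer should open.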

My program is neither smart nor elaborate, but it does the job for Julia sets based on z^2+c, where z and c belong to the set of complex numbers.
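The core of such a renderer is the escape-time loop. Here is a minimal sketch in Python, not my original program: the grid size, the viewing window of [-2, 2] on both axes and the escape radius of 2 are assumptions made for the example.

```python
def julia_escape_counts(c, width=80, height=40, max_iter=100, extent=2.0):
    """Return a height x width grid of escape counts for z -> z*z + c.

    Each pixel is mapped to a starting point z in the square
    [-extent, extent] x [-extent, extent]. A count equal to max_iter
    means the orbit stayed bounded for the whole run.
    """
    rows = []
    for py in range(height):
        y = extent - 2.0 * extent * py / (height - 1)
        row = []
        for px in range(width):
            x = -extent + 2.0 * extent * px / (width - 1)
            z = complex(x, y)
            n = 0
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            row.append(n)
        rows.append(row)
    return rows
```

Each count can then be turned into a color and written out as three bytes per pixel.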

The following example is for c=-1.6

And for c=0.256

Going further: if I have a program that can render a Julia set picture for a value of c, I should be able to generate a movie for a path in the set of complex numbers.

Yes. Using the multiple PNGs generated and mencoder (from mplayer), I assembled a couple of animations.

fractal1: from -2 to 2 (works using VLC)

c moves on the line from -2 to 2.

fractal2: sliding a parabola (works using VLC)

c moves along the parabola 2z^2-1.
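Generating the frames boils down to sampling c along a path and rendering one picture per sample. The sketch below is only an illustration: the function names are hypothetical, and the parabola is read as the curve y = 2x^2 - 1 in the complex plane for x between -1 and 1, which is an assumption on my part.

```python
def frame_parameters(path, n_frames):
    """Sample a path (a function of t in [0, 1]) at n_frames points."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    return [path(i / (n_frames - 1)) for i in range(n_frames)]


def line_path(t):
    """c moves on the real axis from -2 to 2."""
    return complex(-2.0 + 4.0 * t, 0.0)


def parabola_path(t):
    """c moves along y = 2x^2 - 1 (assumed reading, x from -1 to 1)."""
    x = -1.0 + 2.0 * t
    return complex(x, 2.0 * x * x - 1.0)
```

Each value of c gives one PNG, and the numbered files are then handed to mencoder.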

## Wednesday, December 1, 2010

### Fedora 14 - Anjuta crashes or is killed by signal 6 (SIGABRT)

A few days ago, Anjuta started to crash on me. A quick trip to the CLI showed this happened due to not being able to open a library (libsqlite3.so in my case).

Setting LD_LIBRARY_PATH worked around the issue.

`export LD_LIBRARY_PATH=/lib:/usr/lib:/usr/lib64`

Anjuta would now run from the CLI.

## Sunday, November 28, 2010

### A Clustered Samba Fileserver

I started this series a few weeks ago. Lack of time, the many mistakes I noticed along the way and a move from Ubuntu 10.10/virtualbox to Fedora 14/kvm+qemu made me put it aside for a while.

Time to get back to it!

The test setup is made of:

• 2 virtual machines running on kvm+qemu
• Fedora 14

1. The need for a clustered CIFS Fileserver

Sad to say, but Windows file sharing is probably the most widely used way to share files over a network. Most consumer-grade NASes offer little else, and a lot of clients have only netbios/CIFS available to access a network share.

But netbios/CIFS also comes at a price: a certain amount of CPU and memory per client. That in itself is not an issue, but in environments where hundreds or thousands of clients access a share simultaneously, it can lead to serious performance degradation.

There are at least two options:
1. Buy and install a beefier file server
2. Add multiple nodes accessing the same filesystems, each presenting the same share

Solution 1 has an obvious limit: as the number of simultaneous clients grows again, another beefier server will be needed, and another ... until the hardware cost prohibits any further upgrade.

Solution 2, on the other hand, while technically more complex, can scale much further. Of course, at a certain point, the limiting factor will most likely be the I/O operations per second on the backend storage.

This solution comes in several variants: a single file server as the backend shared among all the nodes, centralized backend storage (iSCSI, Fibre Channel, ...) shared among all nodes, or local storage replicated to the other nodes.

All have advantages and disadvantages. In this document, I'll look at the third option, as it's the most likely choice for SMEs, although certain vendors, such as Promise or Compellent, have really cheap SANs.

I also find this solution more scalable: if you need more space - or performance - you can add a disk in each node, replicate it and add it to the Volume Group you share.

2. Installing and configuring DRBD

As mentioned, we need a mechanism to replicate a local block device (i.e., a partition or a physical device) to an identical device in another computer.

DRBD replicates a block device local to a computer to a block device on another computer. Put differently, DRBD makes sure that all write requests issued to the local block device are mirrored to a block device located on another computer, based on its configuration.

In itself, DRBD is not a cluster product (I repeat: DRBD is not a cluster product!) but it's a piece in the puzzle.

DRBD's home page [1] has an illustration of how it works. It also has some very fine documentation.

Most distributions have a package available to install DRBD. Also, DRBD has been in the mainline kernel since 2.6.33, so recent kernels already support it.

DRBD works by creating resources, which specify the local device to use, the peer(s) that take part in the replication, and the device to use on the peer. Once this is done, the resource can be activated and will start to work as a local device.

Initially, this worked by having one node as the primary and the other as the secondary. However, it's now possible to have both nodes be primary, meaning they are both read-write. This has serious implications in terms of sensitivity to split-brain situations: if a write is issued on both nodes while network connectivity is lost, which one should be propagated? Both? What if they both concern the same location on the physical device?

In an active-active scenario, we don't have much choice: both nodes have to be primary.

In my setup, the resource is r0 and the local device to back it is vda3. Here is my resource file.

resource r0 {
  on linux-nas01 {
    device    /dev/drbd1;
    disk      /dev/vda3;
    meta-disk internal;
  }
  on linux-nas02 {
    device    /dev/drbd1;
    disk      /dev/vda3;
    meta-disk internal;
  }
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    after-sb-2pri disconnect;
  }
}

The startup and net directives were taken from DRBD's website. In the documentation, there are sections specific to Red Hat clusters, which I encourage you to read [2].

On Fedora, an important step is to either open the ports configured in the resources on the firewall, or to disable the firewall. As these are test hosts, I opted for the latter.

service iptables stop
chkconfig iptables off

And make sure DRBD is on, and will start at boot time.

chkconfig drbd on
service drbd start

In my case, I had to take an extra step, as DRBD wouldn't activate the resource because no peer was formally the primary and none had the UpToDate status.

After that, vda3 started replicating from one node to the other. The command "drbd-overview" will indicate that a sync is in progress. Let it complete.

3. Installation of the cluster and tools

Next: we will take care of the cluster components on both nodes.

Fedora uses the Red Hat Cluster Suite, composed of various pieces such as cman, corosync and so on.

Important note: if you don't want to edit a few configuration files, make sure that the node names you give match the hostnames AND the DNS entries.

First, let's create the cluster, named NASCluster.

ccs_tool create NASCluster

And let's create a fence mechanism [3]. This guarantees that if a node misbehaves, it will be prevented from being part of the cluster or accessing cluster resources.

And let's add our nodes to the cluster.

ccs_tool addnode linux-nas01 -n 1 -v 1 -f linux-nas01
ccs_tool addnode linux-nas02 -n 2 -v 1 -f linux-nas02

Last, let's check that everything is as we expect it to be.

ccs_tool lsnode
ccs_tool lsfence

Replicate /etc/cluster/cluster.conf to the other node, e.g. using scp.

And start cman on both nodes.

service cman start

Congratulations. At this point, the commands cman_tool status and clustat should report some information about your cluster being up.

4. Configuring clvmd, cluster LVM daemon

This one is a sneaky b....d. It took me almost a day to figure out what was wrong.

Every time I started clvmd, cman would terminate, and I could neither restart it nor unfence the node.

It appears that the default mechanism clvmd uses to ensure cluster integrity is "auto", which tries to detect whether corosync, cman or another tool should be used. I assume that my corosync configuration was incomplete. But it's possible to force cman to be used.

In /etc/sysconfig/clvmd - you may have to create the file - place the following line.

CLVMDOPTS="-I cman"

Next, edit /etc/lvm/lvm.conf and change the locking_type to 3. This is a recommendation from DRBD's documentation.

Also, make sure lvm knows it's a clustered implementation.

lvmconf --enable-cluster

And you should be all set to start clvmd.

service clvmd start

If everything goes OK, clvmd should return, saying that nothing was found. This is normal.

Let's finish the cluster install by starting rgmanager.

service rgmanager start

Don't forget to do the clvmd configuration and to start the daemons on both nodes.

5. Creation of the LVM configuration

Our goal, after all, is to make the device available on the cluster. So, let's do it after a quick recap of what LVM is.

Logical Volume Management allows one to group different "physical volumes" (physical devices such as hard drives, logical devices such as partitions or iSCSI/FC targets, or even virtual devices such as DRBD devices) into volume groups, from which logical volumes can be carved as needed.

A big advantage is the ability to quickly add space to a logical volume by giving it more extents, and growing the filesystem.

If down the road you notice that your /var partition is running out of space, either you have some free extents in the Volume Group that you can give to that Logical Volume, or you can add a new disk, add it as a Physical Volume to the Volume Group and allocate the newly available extents to the Logical Volume.

In our case, the drbd resource, accessible as /dev/drbd1, will be treated as a physical volume and added to a volume group called drbd_vg, and 900MB will be allocated to the Logical Volume drbd_lv.

pvcreate /dev/drbd1
vgcreate drbd_vg /dev/drbd1
lvcreate -n drbd_lv -L 900M /dev/drbd_vg

If everything went right, you can issue pvscan, vgscan and lvscan on the second node, and they should list the volumes you just created on the first node. It may be necessary to refresh the clvmd service.

Side note: an issue I got with the default start order ...

At one point, I stopped both nodes and restarted them, only to discover (horrified) that the actual partition, vda3, was now being used rather than the drbd device. The reason is simple: the lvm2-monitor service starts, by default, before the drbd service.

I still have to go through the documentation, but as my setup didn't use lvm for anything other than the clustered file system, I got away with making sure drbd started before lvm. HOWEVER ... lvm2-monitor also starts at run-level 2, which drbd is not supposed to. So I disabled lvm2-monitor in run-level 2.

6. Creation of GFS2 filesystem

The end is close. Let's now create a GFS2 file system. GFS (Global File System) is a clustered file system developed by Red Hat. In short, clustered file systems have mechanisms to ensure the integrity (no two nodes should write to the same place, and no node should consider some space free while another node is writing to it) and the consistency of the file system. This seems kind of obvious but, trust me, the details are really gory.

The file system is created using the well-known command mkfs.

mkfs -t gfs2 -p lock_dlm -j 2 -t NASCluster:opt /dev/drbd_vg/drbd_lv

Pay special attention to the second '-t' parameter. It specifies the lock table to use, and should be of the form <clustername>:<fsname>. If the part before the colon doesn't match your cluster name, you won't be able to mount the file system.
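Since a mismatch only shows up at mount time, it can be worth checking the name before running mkfs. The following Python sketch is purely illustrative: the helper names are hypothetical, and it only checks the <clustername>:<fsname> shape described above, not any length limits the GFS2 tools may enforce.

```python
def gfs2_lock_table(cluster_name, fs_name):
    """Build the <clustername>:<fsname> value passed to the second -t."""
    for part in (cluster_name, fs_name):
        if not part or ":" in part:
            raise ValueError("names must be non-empty and contain no colon")
    return cluster_name + ":" + fs_name


def matches_cluster(lock_table, cluster_name):
    """Check that the part before the colon is the expected cluster name."""
    prefix, sep, fs_name = lock_table.partition(":")
    return bool(sep) and bool(fs_name) and prefix == cluster_name
```

With the names used in this post, gfs2_lock_table("NASCluster", "opt") gives the value used in the mkfs command above.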

If everything goes right, let's mount the file system on both nodes.

mount -t gfs2 /dev/drbd_vg/drbd_lv /opt

Try creating a file on node 2; you should see it on node 1.

The GFS2 service depends on an entry present in fstab. When you created the file system, a UUID was displayed. Use it to add a line in /etc/fstab:

UUID=cdf6fd4a-4cb2-7883-e207-5477e44d688e /opt              gfs2      defaults      0 0

This will mount my file system, with type gfs2, under /opt, with the default options, no dump and no fsck needed.
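To make the six fields of that line explicit, here is a small Python sketch that splits an fstab entry into named parts. The class and function names are hypothetical, and the sketch assumes all six whitespace-separated fields are present, as in the line above.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    spec: str           # device, label or UUID=... to mount
    mount_point: str    # where it gets mounted
    fs_type: str        # filesystem type, gfs2 here
    options: List[str]  # comma-separated mount options
    dump: int           # 0 means the dump tool skips it
    fsck_pass: int      # 0 means no fsck at boot


def parse_fstab_line(line):
    """Split one fstab line with all six fields into an FstabEntry."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected six whitespace-separated fields")
    spec, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(
        spec, mount_point, fs_type, options.split(","), int(dump), int(fsck_pass)
    )
```

Feeding it the line above yields /opt as the mount point, gfs2 as the type, a single "defaults" option and zeros for the last two fields.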

And the last step: let's start the gfs2 service.

service gfs2 start

7. SAMBA

In /etc/samba/smb.conf, we only have to present the mount point as a share, and restart the daemon on both nodes.

[opt]
comment=Test Cluster
path=/opt
public=yes
writable=yes
printable=no

You may have to adjust a few other options, such as authentication or workgroup name.

After the restart, you should be able to access the same files through either the first or the second node.

Side note: samba and clustering.

There is however a catch: samba is not meant to be a cluster application.

When accessing a file on a samba share, a lock is created and stored in a local database, in a tdb file. This file is local to each node and not shared, which means that a node has absolutely no idea of what the other nodes have as far as locks are concerned.

There are a few options to do a clustered install of the samba services, presented in [4].

8. Accessing the resource through a unique name

And the last piece. If we were to ask every user to choose between node 1 and node 2, they would probably either complain or all use the same node.

A small trick is needed to make sure the load is spread across both nodes.

The easiest is to publish multiple A records in your DNS.

cluster IN A ip1
cluster IN A ip2

Other approaches are possible, such as a homemade script that returns the list of currently active nodes minus the ones that are already too loaded, or that reports the least loaded node, and so on.
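The selection logic of such a script could look like the Python sketch below. It is only a sketch under assumptions: the function name is hypothetical, the load figures are supposed to come from some monitoring of your own, and a load of None stands for a node that is down.

```python
def eligible_nodes(node_loads, max_load):
    """Return active nodes whose load is acceptable, least loaded first.

    node_loads maps a node name to its current load, or to None when
    the node is not answering.
    """
    usable = {
        name: load
        for name, load in node_loads.items()
        if load is not None and load <= max_load
    }
    # Sort by load, then by name so that ties give a stable order.
    return sorted(usable, key=lambda name: (usable[name], name))
```

The first element of the returned list is the least loaded node, and the whole list is what would be published to the clients.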

Bibliography

[2] DRBD's documentation on Red Hat Clusters
[3] Explanation on cluster fencing
[4] Clustered samba

Thanks

Special thanks to the LINBIT team, to both the Fedora and Red Hat teams and to everyone involved in Linux and clustering.

As usual, drop me a line if you have any questions or comments.

## Saturday, November 27, 2010

### You can avoid becoming a victim of fraud

A few weeks ago, I received a leaflet from the US Postal Inspection Service, with a few reminders on safe behavior on the Internet. A useful list of things everybody knows, but that bear repeating every now and then.

If it looks too good to be true, then it isn't true.

Yes, Mr. X died a few months ago in very strange circumstances, he has no heirs and his lawyer is looking for someone to whose account he could transfer an insane amount of money. Of course, you would get a cut, say 10 or 15% of the multi-billion loot.

Right, seems yummy.

Using that pretext, scammers will either ask for some personal information, including bank information, or for an "advance", because there are unexpected difficulties.

The latter "only" costs you some money - and it has amounted to five or six digits in certain cases - but there is nothing more to it.

The former is actually more of a problem: with this information, which may include passport numbers and so on, the scammers can forge your identity, make fake travel documents (think "terrorism") or, in certain cases, impersonate you and rip you off.

Also, never forget that even "improbable" countries have rules for money for which no heir can be found.

A friend is stuck in an improbable country or is extremely sick abroad

Another classic. Someone you know just sent you an email: he/she had all his/her belongings stolen while traveling and asks for some money. A common variant is that the friend fell sick during a trip and the "nasty, evil" hospital demands payment before treating him/her. Again, money is needed.

There, the scammers target a sensitive part of ourselves: our compassion and the fact that we wouldn't let down a friend in need.

Only one solution: call the friend in question, especially if you don't know his/her whereabouts. In most cases, you'll end up with a very surprised person on the phone ... who will discover that his/her email account was compromised.

My lovely friend from abroad starts to need money

A few weeks ago, you started chatting with that cute girl from a poor country. And, gradually, she mentioned that her lack of money was preventing her from doing what she wanted to: continuing school, purchasing a shop ... or even taking a flight to come and see you. But she would not let you come and see her.

At some point, she asks if you could lend her a certain amount. It starts with small sums, and she sends a couple of pictures of her school and so on. Then, "I need big money ... I have a problem". Problem is, as soon as you've sent the money, you never hear from her again, and worse, the email address bounces back.

Yup, fake.

Abusing loneliness is also a big money maker for scammers.

A few facts and things to recall when online

• You have no way of making sure your correspondent is who they claim to be
That might seem obvious, but anyone could pretend to be rich, famous, beautiful, your banker or a friend. You have no real way of checking it online. The only option is to check offline, especially if you smell a rat.

• URLs can be disguised in an email, and a website's appearance can be copied
Here is a quick example: www.fbi.gov. Did you notice how this link, which seems to lead to the FBI website, actually points to the CIA website? This could be worse: I could have copied Hotmail's login page and pointed you to it. You would have tried to log in ... guess what would have happened to your username and password?

If someone prompts you to click on a link in an email, ask yourself why you would want to click on it. Also, if it comes from your bank, type the address you know; do not click on the one in the email.
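For the curious, the disguise relies on the visible text of a link being independent from its actual target. The Python sketch below, using only the standard library, flags links whose visible text looks like a host name that differs from the real one. The class and function names are hypothetical, and this is an illustration of the trick, not a phishing filter.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def hostname_of(text):
    """Best-effort host name of a URL or of a bare 'www.example.org'."""
    candidate = text if "//" in text else "//" + text
    try:
        return (urlparse(candidate).hostname or "").lower()
    except ValueError:
        return ""


def suspicious_links(html):
    """Return (text, href) pairs where the shown host is not the real one."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    flagged = []
    for href, text in parser.links:
        shown = hostname_of(text)
        if "." in shown and shown != hostname_of(href):
            flagged.append((text, href))
    return flagged
```

A link that shows www.fbi.gov but targets another host is reported, while a link whose text is plain words such as "click here" is ignored by this check.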

• A bank never asks for private information in an email
Does your banker suffer from amnesia, that he has to ask you for your account number? No, he is not your banker. Most banks will be more than happy to explain to you what they may and may not ask for in an email or over the phone.

• You can't win a lottery you never played
Regardless of how much you want to, you can't win a lottery if you haven't played. And worse: in certain countries, it's illegal to participate in a foreign lottery.

• Keep your applications and antivirus up-to-date
That one is easier said than done: we all have 30 to 40 different applications on a computer, including the operating system. Keeping all of it up-to-date is a daunting task.

The safest option is to have an antivirus that also acts as an endpoint security application. Or to use a secure OS, such as Linux.

That's all folks. Have a safe trip online.

## Wednesday, November 17, 2010

### 100 Naked Citizens on the Internet: 100 leaked body scans

An interesting piece of news from Wired: 100 Naked Citizens: 100 Leaked Body Scans. In short, it seems that some body scan images were leaked out of the systems and made available on the Internet, which is something the TSA had promised wouldn't happen.

This raises a few questions, first in terms of privacy -- people usually don't like to be seen naked on the Internet without their consent -- but also about how this happened, whether there were any policy violations and, if so, what will be done to prevent it from happening again.

Currently, the only alternative for people who want to travel is a complete pat-down search, which includes a search around the breasts and in the genital area. More intrusive, but you don't run the risk of having your pictures spread on the Internet.

## Wednesday, November 10, 2010

### Fedora 14

After a few frustrations with Ubuntu 10.10, I decided to switch to Fedora 14. The result is ... amazing. I had a little struggle with the "nouveau" driver, which kept loading automatically even though it's in the blacklist. Actually, it's loaded before the blacklist file is read, so I had to change the boot line in grub by adding "rdblacklist=nouveau". After that, I was able to compile and install the nVidia driver. glxgears went from 600 FPS to 12000 FPS! Quite impressive.

So far, I've had no problems: no missed refresh, no strange behavior, no missing buttons/applets in the top bar. I plan to give it a more thorough look this week-end. But so far, so good!

## Sunday, October 31, 2010

### "You have money waiting ... send your info to westernu.co.uk" and back.

Recently, I got an email from someone claiming to work for Western Union and asking me to send the usual information (first name, last name, date of birth and so on).

The trained eye will immediately recognize an attempt to grab some of my personal information. So, the address must be shut down.

"dig MX westernu.co.uk" tells me that the mail is hosted by Hotmail. So, I sent an email informing them that there is something phishy -- sorry fishy -- there.

Thank you for reporting spam to the MSN Hotmail Support Team. This is an auto-generated response to inform you that we have received your submission. Please note that you will not receive a reply if you respond directly to this message.

Unfortunately, in order to process your request, Hotmail Support needs a valid MSN/Hotmail hosted account.

We can help you best when you forward the spam/abusive mail as an attachment to us. The attachment should have full headers / message routing information displayed. This means that the complete “From” address of the offending message should be displayed. If you need help to do this, please visit the following website:

http://safety.msn.com/articles/junkmail.armx

If you have deleted or no longer have the message, you can still resubmit your report by sending the name of the violating MSN/Hotmail hosted account and a description of your concerns. If your submission does not involve a third party, please include your own account name in the body of your message along with the description of your concerns so we can process your report.

For further instructions on how to submit spam and abusive emails to Hotmail, please visit:

http://postmaster.msn.com/cgi-bin/dasp/postmaster.asp?ContextNav=Guidelines

http://postmaster.msn.com/cgi-bin/dasp/postmaster.asp?ContextNav=FightJunkEmail

In summary, this is not a Hotmail account, so nothing will be done.

I deleted the offending email, so I have no way of checking whether the account was disabled.

## Tuesday, October 26, 2010

### A clustered Samba Fileserver, Issue with packages on Fedora 13

In order to redo all the steps, I deleted and re-created my Fedora 13 virtual machines. However, when I got to installing the packages for clustering, the system couldn't complete the request, as it looked for older packages and failed to recognize the newer packages as valid.

I have opened a ticket in Fedora's Bugzilla: 646697

[20101026 - 8:30EST]

The cman maintainer, Fabio, and I narrowed down the issue to a nss-tools install problem: I run the x86_64 flavor of Fedora and when doing the yum install of the package, it tries to install i686 dependencies that are already installed in their x86_64 incarnation. For obvious reasons, this fails and the nss-tools package is not installed.

I had a quick look at the bug reports in Bugzilla but wasn't able to find anything related. I opened another bug report, which I'll work on as soon as I have the time. The new ticket is: 646807

[20101028 - 9:45EST]

"yum install nss-tools" now works correctly. Many thanks to the Fedora team!

J.

## Sunday, October 24, 2010

### Ubuntu 10.10 -- here I come!

As most of you know already, Ubuntu 10.10 "Maverick Meerkat" came out earlier this month. It was time for me to upgrade from my 10.04.

My laptop upgrade went fine, really fine. No issues.

On my workstation, the install went wrong as my /boot partition was full: the installer failed to fully install the kernel and dropped me out of the process with a "fix the issue and re-run dpkg --configure -a". After issuing the needed "apt-get remove", I ran the command. Which completed without doing a thing!

Ok, I have a copy of all my important stuff, so I decided to go for the reboot. Apart from two "FATAL" messages about dependency files that couldn't be found, the system came back online. All that remained was the traditional install of my nVidia drivers. And voila! Everything is working.

First impression: it's great, although it seems - and don't quote me on this, I don't have any scientific measurements - a bit slower than the 10.04.

My mailer, Evolution, also reported an error "Unable to find mem.Inbox" the first time I started it, but not the following time, so that's ok.

Another small problem: when I exit dosbox in fullscreen mode, the screen is not correctly refreshed. More an annoyance than a real issue.

Concerning the applications, nothing really changed, as this was an upgrade. I'm right now in the process of importing all my pictures into Shotwell from F-Spot.

That aside, it seems to be a nice release, and I'm looking forward to exploring it more.

## Monday, October 11, 2010

### Now reading "The design of design"

A clear and complete treatise on the processes and functions behind the design process itself. Written by an expert on design who worked, among other things, on the S/360 architecture, this book covers many aspects of design, such as methodologies, requirements and thoughts.

This is an interesting book for all involved in design, such as architects, system and software engineers and all whose job requires creating new systems.

The only downside: the version I have has numerous typos and misspellings. Even if they don't make the text ambiguous or unreadable, they certainly don't add to the pleasure of reading the book.

## Sunday, October 10, 2010

### Google is testing a car that drives itself!

That's sweet: a car that drives itself in the streets.

Article on the NY Times

### A clustered Samba Fileserver, Part II - Replicated block device using DRBD

In this part, I will present the configuration of DRBD.

DRBD works by creating resources that match a local device to a device on a remote computer. Logically, the resource definition must be identical on both computers, which doesn't mean that the actual physical device must be the same.

Here is the resource created for this test:

resource r0 {
  on linuxnas01 {
    device    /dev/drbd1;
    disk      /dev/sdb;
    meta-disk internal;
  }
  on linuxnas02 {
    device    /dev/drbd1;
    disk      /dev/sdb;
    meta-disk internal;
  }
}

It creates a resource named r0, that exists on machines linuxnas01 and linuxnas02. On linuxnas01, the physical device associated with the resource is /dev/sdb. In this scenario, it's the same on linuxnas02. It makes the configuration easier to understand, but this is in no way mandatory.

First comment: this is for a resource in "primary-secondary" mode, which means that one node is considered to have read-write access, while the other is read-only.

The line "meta-disk internal;" specifies that the drbd metadata (see [1] for a reference on these) are to be written on the physical device specified by the resource. This will matter when we create the filesystem.

There is also a performance aspect to consider: each write operation will result in (at least) two actual accesses: one to write the sectors concerned, and a second to update the metadata. If the goal is to put in place a filesystem that needs high performance, a better solution is to store the metadata on another physical device, for instance a high-speed disk such as a flash drive or a 15K SAS disk.

On Fedora, don't forget to open the firewall. By default, only port 22/tcp is allowed, which prevents the DRBD connection from being established. As a result, one would see the nodes staying in the "Unknown" state. Also, you have to load the modules, either manually with "/etc/init.d/drbd start" or by adding the service to the required rcX.d, with chkconfig or update-rc.d, depending again on your flavor.

Once the resource is configured, attached and up'd (see [2] for the chapter dealing with the configuration), it appears and starts to sync between the nodes.

[root@linuxnas01 ~]# drbd-overview
1:r0  SyncSource Secondary/Secondary UpToDate/Inconsistent C r----
[==>.................] sync'ed: 19.2% (1858588/2293468)K

udevadm can also be used to check that the device exists.

[root@linuxnas01 ~]# udevadm info --query=all --path=/devices/virtual/block/drbd1
P: /devices/virtual/block/drbd1
N: drbd1
W: 61
S: drbd/by-res/r0
S: drbd/by-disk/sdb
E: UDEV_LOG=3
E: DEVPATH=/devices/virtual/block/drbd1
E: MAJOR=147
E: MINOR=1
E: DEVNAME=/dev/drbd1
E: DEVTYPE=disk
E: SUBSYSTEM=block
E: RESOURCE=r0
E: DEVICE=drbd1
E: DISK=sdb

My excerpt shows Secondary/Secondary. To force a node to be "primary", let's use "drbdadm primary <resource>". Issued on the first node, this forces drbd to recognize that linuxnas01 is indeed the primary node.

At this point, I have a working resource on both machines. The next step is to create the filesystem on the primary node.

Bibliography

[1] DRBD web site, chapter 18, http://www.drbd.org/users-guide/ch-internals.html
[2] DRBD web site, chapter 5, http://www.drbd.org/users-guide/ch-configure.html

## Tuesday, October 5, 2010

### Congratulations to Andre Geim and Konstantin Novoselov

Dr. Geim and Dr. Novoselov were awarded the 2010 Nobel Prize in Physics for their work on graphene.

http://nobelprize.org/nobel_prizes/physics/laureates/2010/

## Monday, September 27, 2010

### A clustered Samba Fileserver, Part I - Presentation

A few weeks ago, I got a request to create a Samba (CIFS) fileserver that would be used to serve a variety of files to hundreds of clients. There are a few requirements:
• It should be high performance;
• It should have some mechanisms to replicate to another storage;
• It should be scalable both in terms of performance and available space.

I came up quite quickly with a solution: a cluster of fileservers, replicating block devices and presenting to Samba a clustered filesystem. That way, I can have two nodes writing to their respective devices at the same time, and have both changes immediately available to the other nodes.

On Linux systems, DRBD is the de facto block device replication solution, and GFS2 is a well supported clustered filesystem, OCFS being another one.

### Finally upgraded my MacBook Pro to Snow Leopard 10.6.3

Ok, not really a new one, but it took me time.

Installation

Small issue. A long time ago, I played with Linux on the Mac and I created a few new partitions. When I was done, I tried deleting them, with no luck. I then converted them to UFS and simply mounted them.

However, when I tried to upgrade to 10.6.3 from Leopard, the installer wouldn't let me, claiming that my startup disk was not a startup disk. Rebooting and trying the upgrade directly from the installation media didn't help either.

An attempt to kill the partition failed, with MediaKit unable to delete it. It even claimed there was no such partition.

At that point, my only choice was to replace the partition scheme and do a fresh install. So ...

Using an external USB drive, I used Time Machine to take a backup of everything: applications, parameters, files and so on. I then rebooted, selected the 1-partition scheme and reinstalled the whole OS.

First login: miracle! The installer asked me whether I wanted to restore a Time Machine backup! Of course I did. It did so and reloaded all my stuff. So cool! Later, I had to make a couple of adjustments: reinstall some applications that were 32-bit and not 64-bit, install the new PGPMail, XCode and MacPorts, and recompile all my installed ports. As of now, everything works fine.

After reinstalling everything, I ran not one, but two rounds of updates. The combined total was around 1GB downloaded, where my install was around 5GB. It's kind of frustrating to see that in about a year, the whole OS will be completely replaced by patches and updates.

Tests

Perceptibly, Snow Leopard is snappier and the dock reacts faster than it did in 10.5. For computation intensive applications, it behaves the same and no CPU overhead is seen.

The only change I saw was that my stacked applications now have a scroll bar, where the previous version scaled the icons instead. I don't have enough items in my Documents folder to tell if it behaves the same.

Conclusions

The Snow Leopard installation is awesome and, while it took the largest part of my Sunday morning, partly because of the Time Machine backup, it went smoothly and without any hitch or bump. The possibility to restore a Time Machine backup after the first boot is a really cool feature and helped me migrate my stuff. Should I have to do it again, I would use a FireWire disk rather than a USB one, as my dataset was around 43GB.

The OS doesn't feel different, it doesn't look different and it doesn't behave differently. That is hugely appreciated when dealing with some people whose learning curve is almost horizontal ( :) ), and everything stayed in its place.

I haven't had the time yet to run all the tests I wanted, but I think I won't be disappointed.

Bibliography

Xcode: http://developer.apple.com/technologies/xcode.html (Developer account needed)
MacPorts: http://www.macports.org/

## Tuesday, September 21, 2010

### It's a matter of PCI DSS ...

Recently, during my trip to the Grand Canyon, I booked a helicopter trip, and I had to rent a tripod for my camera. OK, there were a lot more expenses than that, but these were actually the most surprising.

Why? Because for both, the clerk took my credit card and wrote down all the details, including the 3-digit verification code! I was so shocked that I couldn't even speak: I had just granted two parties the right to print money. From my account. With virtually no possibility to dispute.

The PCI DSS mandates that:

3.2.2 Do not store the card-verification code or value (three-digit or four-digit number printed on the front or back of a payment card) used to verify card-not-present transactions.

So basically, a small shop in Page, AZ and a small hotel in Tusayan, AZ just gave themselves the right to ignore the very basic security measures to protect my information. If this was done inside their computer systems, they could be prohibited from issuing any payment request.

At this time, I'm monitoring my bank account, as I don't really know what happened to the piece of paper. But the lesson is learned: next time someone starts to write the CVV, I'll just cancel the transaction and ask for the note. Remember: even when on holidays, bad things can happen ...

## Monday, September 20, 2010

### Feature request submitted to Fedora

During my tests of nginx on Fedora 13 (which is great!), I found it would be way easier to move the whole server{} section into its own file, rather than have an nginx.conf that's a mile long. Another driver is that the default section can actually take precedence over yours, and you'll scratch your head - like I did - to find out why what should just pass a request from nginx to a back-end Apache is serving the "This is the default nginx page."
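As a rough sketch of the layout I have in mind (the file name frontend.conf, the host name and the back-end address are made up for the illustration, and the events{} section and other global settings are left out), the main file would only pull in per-site files through the include directive:

```nginx
# /etc/nginx/nginx.conf (abridged)
http {
    include  /etc/nginx/mime.types;

    # every server{} block lives in its own file under conf.d
    include  /etc/nginx/conf.d/*.conf;
}

# /etc/nginx/conf.d/frontend.conf (hypothetical file name)
server {
    listen       80;
    server_name  www.example.com;      # placeholder host name

    location / {
        # placeholder back-end Apache address
        proxy_pass  http://127.0.0.1:8080;
    }
}
```

With that split, dropping or replacing the default site is a matter of removing one file instead of editing the main configuration.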

Let's hope the maintainers will agree :)

## Saturday, September 18, 2010

Not long ago, I was asked to install a front end for some web applications. For the sake of high availability, it was decided to install two web servers, each accessing the same back-end database.

The choice was between a few commercial products, Foundry being the best known, and several open source alternatives. I thus decided to play with one of them, Nginx.

I have already installed Apache and Squid as reverse proxies and front-end load balancers, but Nginx was still new for me. So why hesitate?

A few articles I read on the Internet ([1], [2] and [3]) show that Nginx outperforms Apache in several scenarios, one of them being a large number of concurrent connections. In that regard, it makes sense to think of:

• Nginx as a front-end, possibly with a twin and using VRRP or clusterd to provide front-end high availability, and possibly serving the static parts;
• Several Apache back-ends with your favorite language: php, perl, ... or any other Tomcat, JBoss, Zope ...;
• A way to centralize the database (More on this later).

The front-end can have multiple roles, from just acting as a reverse proxy between clients and back-end servers to also encrypting the traffic on the fly, compressing it, taking decisions based on geoIP and so on. The sky is the limit!

My first test was an easy setup: a Nginx front-end and two back-end Apache servers. This was easily accomplished, with only a few directives:

In the http{} section, I declared my server farm:

server 192.168.1.71:80;
server 192.168.1.72:80;
}
And in the server{} section, I declared that everything has to be sent to these upstream servers:

location / {
}
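The two fragments above are missing the line that opens the farm declaration and the directive that forwards the requests. A minimal sketch of how the pieces fit together might look like this, assuming a farm named backend_farm (a name I picked for the illustration, any identifier works) and a plain HTTP listener on port 80:

```nginx
http {
    # "backend_farm" is a hypothetical name for the server farm
    upstream backend_farm {
        server 192.168.1.71:80;
        server 192.168.1.72:80;
    }

    server {
        listen 80;

        location / {
            # send every request to the farm declared above
            proxy_pass http://backend_farm;
        }
    }
}
```

The upstream block has to sit in the http{} section, and proxy_pass refers to it by name.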

This is the most basic setup and performs a round-robin selection between the two servers: each new connection is redirected to the next server. From here, let's try two things:

1. Shut the Apache process down on one of the back-end servers
2. Shut the complete server down

Scenario 1 is detected immediately, and Nginx forwards all requests to the second back-end. Mission accomplished!

Scenario 2, OK as well: Nginx tried for a bit, then switched to the second machine. Again, apart from a 30-second wait, no error was returned. This can be tuned at will, see [4]. In the same document, you will find the options to control the load balancing, to make sessions sticky and so on.
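As an illustration of that tuning, here is a sketch using directives from the upstream and proxy modules. The farm name backend_farm and all the numeric values are assumptions chosen for the example, not the settings I tested:

```nginx
upstream backend_farm {
    # ip_hash;   # uncomment to keep each client address on the same back-end

    # after 2 failed attempts, consider the server down for 10 seconds
    server 192.168.1.71:80 max_fails=2 fail_timeout=10s;
    server 192.168.1.72:80 max_fails=2 fail_timeout=10s;
}

server {
    location / {
        proxy_pass             http://backend_farm;

        # give up on an unreachable back-end sooner than the default
        proxy_connect_timeout  5s;

        # on a connection error or a timeout, retry on the next server
        proxy_next_upstream    error timeout;
    }
}
```

A shorter connect timeout mostly helps in the second scenario, where the machine is gone and nothing refuses the connection.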

My second test was a wee bit more "complex": why waste precious CPU cycles on the application servers when the front-end can compress, encrypt and serve static content such as pictures? This leaves plenty of CPU resources to execute all the scripts on the back-end servers.

So, objectives:

1. The front-end compresses and encrypts the connection with the client;
2. The connection between the back-ends and the front-end is in clear text;
3. The front-end serves static content.
That's an easy job.

First, let's create a self-signed certificate:

openssl genrsa -out nginx-key.pem
openssl req -new -key nginx-key.pem -out nginx-req.pem
<Bunch of questions suppressed>
openssl x509 -req -signkey nginx-key.pem -in nginx-req.pem -out nginx-cert.pem
Next, let's configure Nginx for SSL. Most distributions have a default "ssl.conf" file in /etc/nginx/conf.d. In there, you can find most of the needed declarations.

#
# HTTPS server configuration
#

server {
listen       443;
server_name  _;

ssl                  on;
ssl_certificate      /etc/nginx/ssl/nginx-cert.pem;
ssl_certificate_key  /etc/nginx/ssl/nginx-key.pem;

ssl_session_timeout  5m;

ssl_protocols  SSLv3 TLSv1;
ssl_prefer_server_ciphers   on;

location / {
}
}
No big mysteries there if you are a bit familiar with the Apache configuration. The ssl_protocols and ssl_ciphers declarations are in the openssl-like format. Again, I would strongly advise disabling SSLv2 as it has some weaknesses, and leaving only the "HIGH" encryption.
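The configuration shown above has no ssl_ciphers line. A hedged sketch of what restricting the front-end to the "HIGH" encryption could look like, to be placed in the same server{} section (the exact cipher string is my choice for the example):

```nginx
# keep only the strong ciphers, drop anonymous and MD5-based suites
ssl_ciphers  HIGH:!aNULL:!MD5;
```

The string uses the same syntax as the `openssl ciphers` command, so it can be checked there before it goes into the configuration.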

This alone gives me the encryption by the front-end. To compress, simply add

gzip  on;

within the server{} section.
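By default, nginx only compresses responses of type text/html. A small sketch of a slightly broader setup, where the list of types and the thresholds are assumptions to adapt to the application:

```nginx
gzip             on;

# text/html is always included; add the other text-based types
gzip_types       text/plain text/css application/x-javascript;

# skip very small responses, where compression gains nothing
gzip_min_length  1024;

# 1 is fastest, 9 compresses most; a middle value is a common trade-off
gzip_comp_level  5;
```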

The next and last part is to serve the static content from nginx itself. To make things easy, I isolated the images in /images. To serve them directly from nginx rather than from the back-end server, I'll declare that all URLs that start with a '/images' shall be served from the local system rather than being passed to the upstream servers:

location /images {
root /usr/share/nginx;
}
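With root, the full request path is appended to the directory, so the declaration above maps /images/logo.png to /usr/share/nginx/images/logo.png. The sketch below spells this out and shows alias as an alternative for files stored under a different directory name (the /srv/static/pictures path and the logo.png file are hypothetical; use one form or the other, not both):

```nginx
# GET /images/logo.png  ->  /usr/share/nginx/images/logo.png
location /images {
    root /usr/share/nginx;
}

# Alternative: GET /images/logo.png  ->  /srv/static/pictures/logo.png
location /images/ {
    alias /srv/static/pictures/;
}
```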

And that's it. From here, my front-end encrypts, compresses and serves the image from its local drive.

Bibliography

[1] http://joeandmotorboat.com/2008/02/28/apache-vs-nginx-web-server-performance-deathmatch/

[2] http://www.wikivs.com/wiki/Apache_vs_nginx

[3] http://blog.webfaction.com/a-little-holiday-present

[4] http://wiki.nginx.org/NginxHttpUpstreamModule

### NVidia GLX extensions

Recently, I started having issues with my video rendering: some screensavers were not working anymore and some games were just ... slow.

Basic troubleshooting gave its verdict: the GLX extensions were not being picked up by the NVIDIA driver (I have a GeForce 9000). Why? Because the loaded extension was the one from the X.Org Foundation rather than the one supplied by NVidia. The relevant Xorg.X.log shows this in two places: the vendor attribute associated with the glx module is "X.Org Foundation", and an entry prefixed with "NVIDIA(0)" states that the glx module isn't from NVidia.

The file in question, "/usr/lib64/xorg/modules/extensions/libglx.so", is provided by the package "xserver-xorg-core", whereas the NVidia glx module is called libglx.so.XXX.YY, in my case libglx.so.256.44.

To solve this, as root:

service gdm stop
cd /usr/lib64/xorg/modules/extensions
mv libglx.so libglx.so.old
ln -s libglx.so.256.44 libglx.so
service gdm start

And voilà, upon restart, the vendor now shows as "NVIDIA Corporation", and all OpenGL-aware applications are happy again!

## Friday, September 17, 2010

### Back from Arizona

*Snirf* That's the end of my vacation. But the good point is back to testing!

## Sunday, September 12, 2010

### Last week of vacations!

Alas! Every good thing has to come to an end. This is my last week of vacations, and, after that, I have to resume my work life.

Of course, there are positive aspects: I will also resume my tests of nginx and a few other things I have been working on.

Also, don't forget to check the album as it will be updated soon with all my pictures from my tour in Arizona.

## Thursday, September 9, 2010

### Grand Canyon

I'm enjoying some time in the Grand Canyon. That's truly amazing, and the views are exceptional. I'll post my pictures as soon as I'm back in New York.

## Friday, September 3, 2010

### Smile, it's Freeday!

Woohoo! Today is my last day at work before two weeks of freedom. Time to dust my books and prepare my hiking shoes.

## Thursday, September 2, 2010

### A few new pictures on our album

New pictures in the album! Expect some more after our trip to the Grand Canyon.

### nginx

Started playing with nginx, that's heavy stuff! More news when I've done more tests.