
Chris Read
He's just this guy, you know...

The story of Ecks

Tue, 09/13/2011 - 10:07

I’ve just released Ecks into the wild: a Python library for accessing SNMP data from a server without having to deal with the pain of knowing what a MIB or OID is. SNMP stands for Simple Network Management Protocol, but for most people it is anything but simple. It’s pretty straightforward once you understand what’s going on, but most people are daunted by the learning curve.

The result of this resistance is that when your average developer decides to monitor CPU usage or disk space on a machine, he or she ends up doing it in the most obtrusive way possible – SSH. While I’m a big fan of small shell scripts, this is one place they do not belong. Let me give you an example:

I set up a new server here in London for one of our Chicago teams. Being a conscientious team, the first thing they did was wire in some monitoring that they wrote for their servers. It checks things like disk space, memory usage, CPU load and the state of various processes that they care about. They need pretty fine-grained checking intervals, so they check these every minute. The easiest way they know how to do this, though, is to SSH in to their machines, run df, free, netstat, etc. and scrape the output. Every minute. Which on this nice shiny server consumed almost 20% of the CPU right off the bat. Educating them on the use of SSH ControlMaster helped, but it’s still doing a lot of work on the machine.

This was the last straw that led to the creation of Ecks. People will always follow the path of least resistance, so if you want people to do the right thing, you need to make it the easiest thing to do. SNMP has all this information available, and modern snmpd implementations are stable, have a tiny footprint and are more secure than providing SSH access to your machine.
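
To make the contrast with SSH scraping concrete, here is a minimal sketch of the underlying idea: hide the OIDs behind friendly names and ask snmpd for one small value at a time. This is not Ecks’s actual API; the FRIENDLY_OIDS table and the function names are invented for illustration. It assumes the Net-SNMP snmpget command-line tool is installed, that the target’s snmpd exposes UCD-SNMP-MIB, and that “public” is a valid read-only community. The host name is a placeholder.

```python
import shlex
import subprocess

# Hypothetical friendly-name table (not part of Ecks). The OIDs themselves
# are standard: sysUpTime from SNMPv2-MIB, the rest from UCD-SNMP-MIB.
FRIENDLY_OIDS = {
    "uptime": "1.3.6.1.2.1.1.3.0",             # sysUpTime.0
    "load_1min": "1.3.6.1.4.1.2021.10.1.3.1",  # laLoad.1
    "mem_total_kb": "1.3.6.1.4.1.2021.4.5.0",  # memTotalReal.0
    "mem_avail_kb": "1.3.6.1.4.1.2021.4.6.0",  # memAvailReal.0
}


def snmpget_command(host, name, community="public"):
    """Build the snmpget argv for one friendly name."""
    oid = FRIENDLY_OIDS[name]
    # -v2c: SNMP version 2c, -c: community string, -Oqv: print only the value
    return ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid]


def parse_value(raw):
    """Strip whitespace and any surrounding quotes from snmpget output."""
    return raw.strip().strip('"')


def query(host, name, community="public"):
    """Run snmpget and return the value as a string (needs network access)."""
    result = subprocess.run(
        snmpget_command(host, name, community),
        capture_output=True, text=True, check=True, timeout=5,
    )
    return parse_value(result.stdout)


# Only show the command that would be run; nothing is sent over the network.
print(shlex.join(snmpget_command("server.example.com", "load_1min")))
```

Each call to query() is a single small UDP request answered by snmpd, rather than an SSH login plus a handful of forked commands on the monitored machine.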

The hardest part of all, though, was deciding what to name this little library. When discussing the problem with Julian Simpson (the @builddoctor), he pointed out that MIB always reminded him of the Men in Black. The Wikipedia article on the original comic book series had some interesting snippets:

The Men in Black are a secret organization that monitors and suppresses paranormal activity on Earth…

Replace “Earth” with “a computer” and you’re starting to get somewhere. Then I noticed this gem:

 An agent named Ecks went rogue after learning the truth behind the MiB: they seek to manipulate and reshape the world in their own image by keeping the supernatural hidden.

Many people think that the complexity of the MIB keeps SNMP data hidden. And so the name was chosen…


Categories: Blogs

£106.50 per Terabyte Storage Server

Thu, 06/02/2011 - 23:43

With the price of storage dropping all the time, there is a constant perception among people who don’t deal with it every day, especially developers, that “disk space is cheap”. The problem is that so-called “Enterprise” storage costs are still astronomical compared to what people are used to paying for home storage – even when using SATA disks.

A lot of this extra cost comes from a perceived requirement for the highest available capacity, availability and performance. Achieving all three characteristics is expensive, but if you’re willing to sacrifice one of them then costs start to fall considerably. Lowering requirements on two of the three drops it even more.

One of the teams I work with has a requirement primarily on capacity. Performance and availability are nice, but capacity is the key. We generate gigabytes worth of log files every day, but didn’t have one place to store it all for easy analysis. Just before I joined the team they’d purchased the cheapest “Enterprise” storage system the IT team at the time would allow – it ended up costing in the region of £12k for 12TB of raw storage. That’s £1000 per TB!

In addition to the price, the other problems were accessing and managing the data, and managing growth. This inspired a hunt for something that would provide a cheaper and more flexible solution.

Our requirements were:

  • *nix based system. The current storage solution was based on Windows Storage Server, but all our systems and tools for this team are Linux based. Yes, Windows does technically provide things like an NFS server, but fighting with the file system permissions and overall performance are two things that impacted us.
  • Cheap to expand. We need to have a clear path to grow the storage in the server easily by simply adding more disks.
  • Large filesystems. There’s nothing more wasteful from a storage point of view than having lots of small filesystems. Besides the management overhead, there are also many wasted blocks lying around unused.
  • Cheap to build. This inevitably means commodity hardware.
  • Reasonable availability. We don’t need 99.999% uptime, but would be happy with somewhere in the region of 90%+
  • Reasonable performance. Primary access to the data on this machine is via gigabit Ethernet. As long as it can keep up with the network card we’re happy…

Hardware Options

The first thing we looked at was the Backblaze concept. Using a custom designed case (that’s actually quite easy to purchase) they manage to cram 45 SATA drives into a 4U chassis. That’s pretty impressive. Using cheap SATA hot swap port multiplier/backplanes and controllers allows for quite a low cost, but the configuration they publish has a number of problems:

  • A mix of PCIe and PCI-X SATA controllers means that you need a LOT of slots on your motherboard. Boards with all those slots are not cheap.
  • Each of the 45 drives is presented to the operating system (Linux in this case, which is good). They then use software RAID for the volumes. Care needs to be taken when building a system like this though to ensure that you spread the drives for each array evenly across controllers so that a single controller or expander backplane failure does not take out an entire volume, which just adds to the management overhead.
  • The port multiplier chipset they use is only supported on Linux. This was a problem as we had initially planned to use OpenSolaris for this device, as ZFS would make the storage management and expansion a no-brainer. The Oracle takeover of Sun and their subsequent clampdown on OpenSolaris derailed this plan. I did do some performance testing with FreeBSD and their implementation of ZFS, but the performance was terrible.
  • The cost is also not that cheap. Single orders of the custom case and the power supply that goes with it are actually pretty pricey.
  • Performance. I did a lot of testing, mostly with iozone. What I found though was that when running through the port multiplier, even when using NCQ, performance was pretty bad.

The system I was using for my testing though had an onboard SAS controller and port expander. The SAS specification includes SATA support. Out of curiosity I decided to compare performance on that controller to what I got from the SATA chain. It was actually a lot better, which got me thinking about using SAS controllers with hardware RAID to simplify administration.

Initially I tried a pair of 8-port LSI controllers (LSI 8208ELP), but the cheaper end of the range has firmware issues and the system would not even boot with more than one controller present, so I replaced them with a single Adaptec 51245 controller. This made management and expansion a lot easier, as it has enough internal ports for the initial drives we planned on, and an external port that allows easy expansion.

Current architecture

The current system we have running in production is assembled from 100% off the shelf components. They are:

Description      Part                    Count  Each      Total
SAS Cables       -                       4      £9.40     £37.60
Drive Cage       NetStor NS170S Black    2      £90.10    £180.20
RAID Controller  Adaptec 51245           1      £512.89   £512.89
Case & 650W PSU  Antec 4U22EPS650        1      £192.36   £192.36
Motherboard      Intel DP55WG            1      £96.84    £96.84
CPU              Intel i7 860            1      £178.71   £178.71
RAM              Kingston 4G DDR3 1600   2      £63.96    £127.92
VGA              PNY 8400GS              1      £37.85    £37.85
Data Disks       WD20EARS                10     £76.58    £765.80
NIC              Intel PRO/1000          2      £0.00     £0.00
OS Disks         ST3160318AS             2      £0.00     £0.00

Cost (September 2010): £2130.17 (Excl VAT)

Total Raw Space: 20TB

Unit Cost: £106.50 per TB

Power Draw: 0.8A with sustained disk access, 1.16A peak on boot up.

The operating system we’ve selected is Ubuntu 10.04 LTS. The OS runs on the two smaller disks we had spare, in RAID1. The ten WD data drives are in RAID6.
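
As a rough illustration of what RAID6 means for capacity: RAID6 gives up two drives’ worth of space to parity, so the usable size of the data array is smaller than the raw figure quoted above. The sketch below assumes no hot spares and ignores filesystem overhead and the difference between decimal and binary terabytes; the function name is just for illustration.

```python
def raid_usable_tb(drive_count, drive_tb, parity_drives):
    """Usable capacity of a parity RAID set, in the same units as drive_tb."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_tb


DRIVES = 10           # WD20EARS data disks from the parts list
DRIVE_TB = 2          # each WD20EARS is a 2TB drive
BUILD_COST = 2130.17  # September 2010 total, excluding VAT

raw_tb = DRIVES * DRIVE_TB
usable_tb = raid_usable_tb(DRIVES, DRIVE_TB, parity_drives=2)  # RAID6

print(f"Raw: {raw_tb} TB, usable in RAID6: {usable_tb} TB")
print(f"Cost per usable TB: £{BUILD_COST / usable_tb:.2f}")
```

The price we pay for surviving two simultaneous drive failures is therefore a fifth of the raw space in this ten-drive configuration.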

Hardware Build

Here are some photos we took as we built the system:

[Photos: Boxed Internal Drive Cage; Rear of Drive Cage; Front View - Open; Motherboard; Data Drive; Ready to rack]

Operating System Configuration

Using the hardware RAID controller allows us to simplify the Operating System configuration. Instead of having a block device for each physical disk, the controller presents a single block device for each array.

The filesystem we’re using is EXT4, created with the following options:

-m 1 -O dir_index,has_journal,extent,sparse_super

To try and speed things up we also set the noatime flag when mounting the filesystem.
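
For anyone wanting to reproduce this, the options above translate into an mkfs.ext4 invocation and a mount entry along the following lines. This is only a sketch: the device name /dev/sdb and the mount point /srv/logs are placeholders rather than what we actually use, and the code only builds the strings without running anything.

```python
import shlex

# Options quoted in the post: 1% reserved blocks plus the listed features.
EXT4_OPTIONS = ["-m", "1", "-O", "dir_index,has_journal,extent,sparse_super"]


def mkfs_ext4_command(device):
    """Return the argv that would create the filesystem on `device`."""
    return ["mkfs.ext4", *EXT4_OPTIONS, device]


def fstab_line(device, mount_point):
    """Return an /etc/fstab entry that mounts the filesystem with noatime."""
    # Fields: device, mount point, type, options, dump, fsck pass
    return f"{device} {mount_point} ext4 defaults,noatime 0 2"


# Placeholder device and mount point, for illustration only.
print(shlex.join(mkfs_ext4_command("/dev/sdb")))
print(fstab_line("/dev/sdb", "/srv/logs"))
```

The -m 1 setting matters on a large array: the default reserve of 5% for root would set aside a sizeable chunk of a multi-terabyte volume.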

The Adaptec Storage Manager application is used to monitor the health of the disks and will email us in the case of a drive failure.

Performance

With the current configuration, the performance is in no way spectacular, but it’s more than good enough for our needs. Write speeds are quite slow, peaking at around 90MB/s. Reads are good, peaking at around 750MB/s.

Client Access and Usage

We’re currently running a mixed workload on the system and so far have had no complaints or problems. Every night we rsync all of our log files from our production machines, which is currently about 300GB per day.
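
To put those numbers in context, here is a back-of-the-envelope calculation of how long the nightly sync would take if it were limited only by the peak write speed or by a single gigabit link. It assumes decimal units (1GB = 1000MB), ignores protocol overhead and rsync’s own compression and delta transfer, and treats the peak figures as sustained rates, so treat the results as rough lower bounds.

```python
# 1 Gbit/s expressed in MB/s: divide by 8 bits per byte, then by 10^6.
GIGABIT_LINE_RATE_MB_S = 1_000_000_000 / 8 / 1_000_000


def transfer_minutes(size_gb, rate_mb_s):
    """Minutes needed to move size_gb gigabytes at rate_mb_s megabytes/second."""
    return size_gb * 1000 / rate_mb_s / 60


NIGHTLY_GB = 300  # approximate daily log volume

for label, rate in [
    ("array peak write speed", 90),
    ("one gigabit link", GIGABIT_LINE_RATE_MB_S),
]:
    minutes = transfer_minutes(NIGHTLY_GB, rate)
    print(f"{label}: {rate:.0f} MB/s -> {minutes:.0f} minutes")
```

Either way the nightly job fits comfortably into an hour or so, which is why the slow writes have not been a problem in practice.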

Some of the people who are interested in this data access it via HTTP. We run an Apache server on the box (with compression enabled of course) to allow people easy access to browse and download what they need. Developers also access the logs read-only via NFS, and have a directory mounted via NFS called “playpen” that they use as a scratch pad for any of their research that requires more disk space than the 300GB or so they have available in their development machines.

Going Forward

In the 6 months or so we’ve had the machine we’ve almost used all the disk space. On a clear disk you can seek forever. We’ve placed an order for 2 more disk trays (NetStor NS330S-8026). These trays have built-in SAS port expanders and allow us to daisy chain up to 7 trays off our external port. We’ve also ordered different hard drives. Instead of using the green WD drives we’ve decided to go for a bit more speed and get 7200RPM Hitachi Ultrastar 7K3000 SATA drives. This does add a lot more to the cost, but for us the expected higher write speeds are worth it. Once they’re up and running I’ll publish some comparative benchmarks.

Costs for the upgrade (Feb 2011): £9699.00

Total Raw Space: 64TB

Unit Cost: £151.50 per TB
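
The unit costs quoted throughout this post are simply total cost divided by raw capacity. The short sketch below redoes that arithmetic for the three systems mentioned, using the figures given above (the enterprise price is the approximate “in the region of £12k” figure). The exact quotients differ from the rounded figures in the text by a few pence.

```python
def cost_per_tb(total_cost_gbp, raw_tb):
    """Cost in pounds per terabyte of raw (pre-RAID) capacity."""
    return total_cost_gbp / raw_tb


# (description, total cost in GBP excl. VAT, raw capacity in TB)
SYSTEMS = [
    ("Original enterprise system", 12000.00, 12),
    ("Commodity build (Sept 2010)", 2130.17, 20),
    ("Expansion order (Feb 2011)", 9699.00, 64),
]

for name, cost, raw_tb in SYSTEMS:
    print(f"{name}: £{cost_per_tb(cost, raw_tb):.2f} per raw TB")
```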

Update (23 June 2011)

Right now there is some kind of major compatibility problem between the Adaptec card and the NetStor JBOD. I’ve been wrestling with vendor support for over a week now and am still not getting anywhere.


Categories: Blogs