YOU try building the biggest computer in the world. I've been so busy my brain is about to fall out of my head.
Anyways.
I just wanted to take a moment to say I have found one of the coolest programs in the world.
I was needing a way to load-balance our multiple login nodes, and searched for quite a while for a decent solution. I installed Linux-HA, and played with various other bloatware, but they all had too many levers and buttons to push. I just wanted a nice, simple piece of software that did what I wanted. And I found it!
You should check it out: Crossroads Load Balancer.
This guy is my new hero. It compiles quickly and successfully (why should that ever be an issue? But it is! Getting Linux-HA installed required something like 28 perl dependencies, along with a lot of other crap that necessitated upgrading almost the entire OS. AND THEN you still have trouble getting it to work the way you want it to), and once installed, it is very easy to configure.
Here is my configuration file, with important info changed to protect the innocent:
(to come later - the install I got going was on a test machine that someone turned off to make room for a grid cluster - I'm going to cannibalize that machine and use its organs for more important business.)
If you are looking for something that just works, use it.
Friday, October 26, 2007
Monday, September 17, 2007
cluster fork
You may or may not know, but cluster-fork is actually a command in the Rocks universe. It allows you to run a command or set of commands on every machine in your cluster.
It's also a very apt way to describe why I haven't been updating this blog lately. I've been cluster-forked!
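For the curious, the general idea of "run this on every machine" is easy to sketch. This is not how cluster-fork itself is implemented, just a minimal Python illustration that assumes passwordless ssh to each node. The host names are made up, and the default is a dry run that only shows what would be executed.

```python
import shlex
import subprocess


def build_ssh_commands(hosts, command):
    """Return one ssh argument list per host; nothing is executed here."""
    return [["ssh", "-o", "BatchMode=yes", host, command] for host in hosts]


def run_everywhere(hosts, command, dry_run=True):
    """Run `command` on each host in turn.

    Returns a list of (host, returncode, text) tuples. In dry-run mode the
    return code is None and the text is the command line that would be run.
    """
    results = []
    for host, argv in zip(hosts, build_ssh_commands(hosts, command)):
        if dry_run:
            results.append((host, None, shlex.join(argv)))
            continue
        done = subprocess.run(argv, capture_output=True, text=True)
        results.append((host, done.returncode, done.stdout))
    return results


if __name__ == "__main__":
    # Hypothetical node names, used only for illustration.
    for host, _, text in run_everywhere(["compute-0-0", "compute-0-1"], "uptime"):
        print(host, "->", text)
```

A serial loop like this is fine for a handful of nodes. For thousands you would want to run the connections in parallel and deal with nodes that are down.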
It's gotten to the point where I either do something useful, or update this blog. With the new machine, family and life, I've been afforded little time for such a thing as blogging. When I do get a chance, I will update as much as possible.
So, while I'm sitting here, I'll post a brief synopsis.
Uh, let's see.
We were testing and retesting the hardware, figuring out the correct bios settings to get the best performance out of the system, along with figuring out how to remotely manage and monitor thousands of machines in an easy and convenient manner. The blades are designed really well, and we are getting very nice performance percentages out of the new chips (wink wink). I know some of you are dying to know actual numbers, but I haven't had time to ask if I can write about it.
Additionally, we had to get the remote operating system install up and running smoothly, so that the blades come up uniformly and with only the necessary software and daemons to get the jobs run, when the hardware actually starts to arrive.
We installed all the disk servers with their necessary little bits and pieces, all the while learning the ins and outs of the unique and extremely engineered hardware from Sun.
One might wonder how to go about controlling a massive cluster bigger than my house. Well, we're using Sun's neat embedded service management voodoo hardware to monitor and remotely connect to the machines. Basically, it's a little computer that's embedded in the back of the machine you are running. It has its own IP address and its own processor, and it can power on, turn off (gracefully or immediately) and monitor the server's health, all via ssh or https(!).
There are 2 ways to do this, through ipmitool (a command-line interface to access, query and control the machines), or with the sun-produced browser-based java gui hoohadilly. I threw that last word in there for people who don't know what I'm talking about (hi mom!).
I use both. I am a strong advocate of command-line, script-based control of machines, and it will never be replaceable. However, I have taken a great liking to the java juju, I must admit. It's nice to pull up a browser and watch a computer boot in another place like I'm standing in the machine room with it. I can control it with both keyboard and mouse. It's a giant network-based kvm. I can access over 4000 machines from my one computer, and it's pretty cool.
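The command-line side of that is easy to script. Here is a minimal sketch, not our actual tooling: it builds ipmitool command lines for a list of service processors. The `root` account, the `lanplus` interface, and the host names are assumptions for illustration. The `-E` flag tells ipmitool to read the password from the `IPMI_PASSWORD` environment variable, so it never shows up on the command line.

```python
import subprocess

# "chassis power" subcommands: status, on, off (immediate), soft (graceful).
POWER_ACTIONS = ("status", "on", "off", "soft")


def ipmi_power_argv(sp_host, action, user="root"):
    """Build the ipmitool argument list for one service processor."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over the network (assumed to be enabled)
        "-H", sp_host,     # address of the service processor, not the server
        "-U", user,        # assumed account name
        "-E",              # take the password from $IPMI_PASSWORD
        "chassis", "power", action,
    ]


def power_all(sp_hosts, action, dry_run=True):
    """Apply one power action to every service processor in the list."""
    outputs = {}
    for sp_host in sp_hosts:
        argv = ipmi_power_argv(sp_host, action)
        if dry_run:
            # Only record the command line that would be run.
            outputs[sp_host] = " ".join(argv)
        else:
            done = subprocess.run(argv, capture_output=True, text=True)
            outputs[sp_host] = done.stdout.strip()
    return outputs
```

Calling `power_all(["sp-rack1-01", "sp-rack1-02"], "status")` with the default dry run just returns the command lines, which is a cheap way to check the list before pointing it at real hardware.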
Had to run the ethernet networking fabric for all of that: twelve 48-port switches, all connected to two 24-port leaf switches that are uplinked via 10-GigE lines. That's a lot of ethernet, btw. We run the lines, then velcro the bundles to the floor supports underneath our suspended floors.
Lots of 10-GigE cards installed. Lotsa fiber cards in PCI-E slots.
We have started receiving 18-wheeler shipments almost daily of huge amounts of hardware. 16 huge racks come in each shipment, and soon 200-400 blades will start arriving every day as well.
I'm sure that if our department weren't hosting two hpc conferences right now, we'd have actual photo evidence of all the work, but I will try to ask around for pics of what's going on.
This system didn't feel big until we got 45 racks and pushed those mammerjammers into place. That's when it felt huge. You can no longer walk in between rows through the spaces where the racks would go anymore, and the distance to go around to the sides, where the only openings are, is considerable. Now, you have to plan before you go somewhere, lest you have to turn back to get something you forgot. I'm not kidding. I jogged down the aisle and it took far too long to get to the end. It's really impressive.
Lots and lots of other details are being glossed over, because I am tired, and need to get up early to meet the next truck shipment in the morning.
more later...
*update
below, blogger says I posted this at 8:47 pm or something like that. that isn't true. it's 11:47pm. why must blogspot lie to the world and make little children cry?
Wednesday, August 15, 2007
PXE dust or, my boot over ib story
(editor's note - I wrote this during the night at midnight. upon further reflection, this is extremely stupid. but someone asked for an update, so here it is!)
One day there was a little infiniband card, freshly born into the world, wearing nothing but a blanket made of beautiful infiniband fabric.
This little card had no name, and was unable to get a name from his parents, for they could not speak his language. So he approached the gods in his area, and on bended knee, submitted a proposal that the god Mellan Ox strike him with lightning, and flash his memory so that he could speak to his parents, D.H. and C.P., in their native tongue.
So the great god Mellan Ox flashed his memory, and touched his tongue with a burning ember of rom, as well as a heaping pile of PXE dust, and he was thusly able to understand his parents' speech!
He felt very special, because there were not many other cards in the world that speak this language. His first order of business was to ask for a name. For this reason, he broadcast his message to his parents, shouting blindly from his crib in the night, but his parents could not hear him.
They did, however, see that he was waving frantically to them from his crib. They did this by peering into a crystal ball called tcpdump, which told them not only his qpn number, but also the name given to him - his god-id, aka, guid. They were also able to see that he was asking for a name, but realized that the D.H. and C.P. servant was blocking their ears.
They thusly asked their servant to recompile himself, which he did. This was achieved by removing the pounds of symbols near the rune-markings of his source-scrolls, allowing something called 'USE_SOCKETS'.
After the recompile, the servant no longer prevented the little card's parents from hearing the little card's plaintive cries. D.H. and C.P. could now hear him asking for a name! They gladly responded, giving him the name '192.168.0.2'. Not original by any stretch of the imagination, but they heard that this new name was pretty popular, so they went with it.
Now, the little card was able to stand up in his crib, and speak with his parents very clearly. He was so happy that they could understand him! For the first time, he realized he was hungry, and asked for some food.
The first order of business was to put on his special tftp boots, which would allow him the strength needed to actually hold the platters of food that were soon coming.
He liked the taste of the PXE dust the god Mellan Ox had given him earlier, so he asked for some more. They brought it to him on a silver platter, along with heaping helpings of initrd meat and some binary image gravy.
He was able to put the initrd in his mouth and start chewing, but it wouldn't go down!
He was choking on the initrd because it didn't understand his delicate infiniband nature. The only thing that would help him digest the initrd meat would be the 7 tiny dancing ib drivers, who all dance on the head of a pin, and who don't really fit into the initrd meat without a lot of effort.
Which is where our story will continue next time.....
Friday, August 3, 2007
back online
Ok, so I'm back from vacation, and back in swing on the new system.
We are in the midst of getting everything ready for when the large amount of hardware starts arriving. Fun things like filesystem structure, pxe booting over ib, provisioning the OS over ib using torrents, and the necessary setup that will easily allow us to install and keep track of 4000 machines and all their characteristics, much less making them all play nicely together.
The cabling alone will be a massive undertaking, and will need to be completed before the equipment begins to arrive. In fact, the filesystem hardware, all 1.7 Petabytes, will begin arriving in about a week or so. Not much time before the tsunami of hardware arrives.
I'm pretty jazzed about working on this new, cutting edge equipment. The constellation system, and all the new management hardware and software is really making my life much easier. I spend about half as much time in a 57 degree meat locker-like machine room, which means my body can get re-acclimated to a normal temperature again. I am constantly switching between a 50 degree difference in temperature. 57 inside, 107 outside. It's like a hot tub and cold dip, especially with the humidity lately.
Friday, July 27, 2007
Monday, July 23, 2007
Tuesday, July 3, 2007
for geeks only
I have nothing to say but this:
Cpu0 : 0.4% us, 0.4% sy, 0.0% ni, 97.7% id, 1.5% wa, 0.0% hi, 0.0% si
Cpu1 : 0.3% us, 0.1% sy, 0.0% ni, 99.4% id, 0.1% wa, 0.0% hi, 0.0% si
Cpu2 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu3 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu4 : 0.0% us, 0.1% sy, 0.0% ni, 99.8% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu5 : 0.1% us, 0.1% sy, 0.0% ni, 99.7% id, 0.1% wa, 0.0% hi, 0.0% si
Cpu6 : 0.0% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu7 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu8 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu9 : 0.1% us, 0.1% sy, 0.0% ni, 99.5% id, 0.2% wa, 0.0% hi, 0.0% si
Cpu10 : 0.0% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu11 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu12 : 0.0% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu13 : 0.1% us, 0.1% sy, 0.0% ni, 99.7% id, 0.1% wa, 0.0% hi, 0.0% si
Cpu14 : 0.0% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu15 : 0.1% us, 0.0% sy, 0.0% ni, 99.9% id, 0.0% wa, 0.0% hi, 0.0% si
Now, I can't tell you anything about this, but I can say it isn't intel, that it is really cool, and, not many people have been allowed to touch it or see it yet. But, here I am, touching all of it! Ha!
If you figured out what I've gotten a hold of, you can rest assured that no one will be disappointed with it, whatever 'it' is.
Friday, June 29, 2007
...and now for something completely different
First Beta Rack of the new Sun Constellation system
The new beta version of the Sun Constellation series has arrived! This is the back of it as it is being unloaded from the crate. Please keep in mind that no one in the world has one of these. Pretty awesome.
Here is the front of the rack, after being moved into place, with no blades in it.
And here is a blurry shot of the actual blades that will be going in - quad cpu slots that will each hold quad amd barcelona chips. Once again, no one has any of these - they are all beta units.
(editor's note: these blades are not quad-cores, these are just run of the mill dual core systems)
We've turned on and installed two of the blades, and they are really fast! We've also tested the magnum switch, and it passed all the tests with flying colors. Just as fast as they claimed it would be, and only one tiny issue that is software related, and we should be good to go.
Tuesday, June 26, 2007
Biggest Switch in the World for the Biggest Computer in the World
So, today, one of the many heads of Sun is announcing the new Infiniband Switch, called Magnum, which will be the biggest switch of its kind in the world. 3,456 ports, in a fat-tree configuration, this switch will be the first of its kind, although with a different name, I'm sure (who wants to fight a copyright battle with a condom manufacturer?).
This switch will be one of two for the entire system, and will comprise the spine and nervous system for the biggest supercomputer in the world.
As you can see in the above picture, there was a huge, head-sized hole punched in the side of the crate when it arrived here. After unpacking, we determined that it was only superficial. I just know that made some people sweat, seeing as how we are receiving the ONLY copy of this switch in the entire world - wouldn't want it to get screwed up in transit.
Here are the extra parts that come with the main chassis - the line cards, and all the cable management, etc.
Here is the empty box! How exciting! This is right after it was rolled off the pallet.
And now, the back! OOOOOOOh. AAAAAAAAh.
And a little magnum frontal action, too! This is pre-line card/filler panel install. Please start salivating now.
The great thing about this is, no one else has one! You can probably order it after today, but good luck getting it any time this year!
Odds and Ends
It's been 16 days since my last post, and there are many good reasons for that.
We've been chugging along, installing and testing, testing and reinstalling. Lots and lots of work to do. I've been putting in insane hours trying to get everything ready for the main components to arrive. Here are some neat details. 1st, a picture of knoppix booting on one of the X4600s.
Each penguin in that picture represents a CPU. In 3 months, you would see that boot with 16 penguins! Those would each represent 1 of the 4 cores in the new AMD 4-core barcelona chip.
Next, I managed to figure out the way that Thumpers, or X4500's, boot. Grub on an X4500 is a very weird, fickle thing. You have to relax your brain and accept that it will not act the way you think it should.
When you are installing Rocks on a thumper, it can only see the first 24 drives, only one of which is bootable. It is very unfortunate that Sun decided to put the two bootable drives on the same scsi controller, since if you lose the controller, you can't boot. But whatever.
Here's how to get a thumper to boot with grub in linux. Install to /dev/sdy - while installing, the OS, via anaconda, sees the bootable disk as sdy - the 24th disk. At the end, you want to install grub to this disk (a quick synopsis of this is, in grub: device (hd1) /dev/sdy;root (hd1,0); setup (hd1);).
Once you have marked the drive with the grub magic, you now need to set up grub.conf to point to, get this, hd0. Yes, I know you set grub up on what you defined as hd1, but just accept this and move on.
While the machine is booting, the BIOS of the X4500 only presents grub with two disks - disk 0 and disk 1. So in grub.conf, you need to point to 0, like this: root (hd0,0). Then, tell it where you installed root (like /dev/sdy1), so that once grub loads, it can find the kernel, etc. That is where I got hung up - I kept thinking grub would see what I defined as hd1, but it couldn't see it - it could only see hd0.
I know I'm not really explaining it well, but that's all that you need to do.
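To make the two views of the disk concrete, here is a small Python sketch that renders a grub.conf menu entry the way I described it: grub's root is (hd0,0) because that is what the BIOS shows at boot, while the kernel's root= argument still names the disk as linux sees it (/dev/sdy1). The function name, the title, and the kernel and initrd file names are placeholders I made up for illustration, and the leading "/" on the paths assumes a separate /boot partition.

```python
def grub_conf_stanza(title, kernel, initrd, root_dev, bios_disk=0, partition=0):
    """Render one GRUB (legacy) menu entry as text.

    bios_disk is the disk number as the BIOS presents it at boot time,
    which is not necessarily the (hdN) name used when running `setup`.
    root_dev is the root filesystem as the running kernel will name it.
    """
    return "\n".join([
        f"title {title}",
        f"\troot (hd{bios_disk},{partition})",
        f"\tkernel /{kernel} ro root={root_dev}",
        f"\tinitrd /{initrd}",
    ])


if __name__ == "__main__":
    # Placeholder file names; substitute whatever your install produced.
    print(grub_conf_stanza(
        title="CentOS (example)",
        kernel="vmlinuz-EXAMPLE",
        initrd="initrd-EXAMPLE.img",
        root_dev="/dev/sdy1",
    ))
```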
Sunday, June 10, 2007
installing like a madman
OK, so where was I?
Last Friday, we started to receive the first $X dollars worth of equipment (known as the 'starter system'). That equates to 8 X4600s, 16 X4100s, a 4G fiber-channel disk array and 4 thumpers (X4500s). Doesn't sound like much, until you start to work with them, and see just how nicely they are built. Here is a pic of the inside of an X4600 - and yes, that is 8 slots for cpus, which can all hold either dual or quad core amd chips in there!
Here is a fuzzy shot of the disk controller, the disks that will hold our Lustre metadata, and the thumpers on the right.
Here is the view I have to my right while I am working on this. This data center feels like a set from Star Wars. By the way - it is pretty bad for your health to stay in a brand new machine room for extended periods of time. Besides the fact that all the air is blowing everywhere at 65 degrees, there is a bunch of dust and particulate matter flying around that I can feel in my lungs. Not a good sign for my health!
Here are the X4600's sitting nicely on the ground, waiting to be installed:
bad pbr sig
I had posted about this error earlier, then I deleted the post because I thought it was stupid. Then, I noticed a lot of people searching Google for this post after I removed it. So I am posting my solution to help out.
I have come across this problem only on Sun hardware, but it may occur on others. I encountered it running CentOS linux, using grub.
This error, Bad PBR Sig, means that your Partition Boot Record has become borked. This normally happens when you are installing an OS on a machine that the OS is not familiar with, and it writes the record to the wrong place.
In my case, I was installing a thumper (Sun X4500). My version of linux doesn't see all 48 hard drives in it, and so the install process is pretty lengthy. First you install an OS to /dev/sda, then you have to copy the OS, once it's installed, to /dev/sdy, which is the first bootable disk on the system. In doing so, if grub writes its boot record to /dev/sda, then you won't be able to boot, as the BIOS doesn't see /dev/sda as bootable, just sdy and sdac. I think this is due to the layout of the disks and their closeness in position to the scsi channel.
Anyways, the way I fixed it was to use grub to write out the correct record to the correct drive.
You need to write to /boot/grub/device.map, and tell grub which hard drive is which. Mine looked like this:
(fd0) /dev/fd0
(hd0) /dev/sda
(hd1) /dev/sdy
(hd2) /dev/sdac
I installed the os first to hd0, moved it over to hd1, and mirrored it to hd2. I will be booting from a mirrored boot partition.
In grub, you set everything up like this. Type grub at the shell prompt, then, at the grub> prompt:
grub> device (hd1) /dev/sdy
grub> root (hd1,0)
grub> setup (hd1)
grub> device (hd2) /dev/sdac
grub> root (hd2,0)
grub> setup (hd2)
this will mark the correct disks with boot records.
then, you make sure you boot from the correct drive in grub.conf:
title CentOS-4 x86_64 (2.6.9-42.ELsmp)
root (hd1,0)
kernel /vmlinuz-2.6.9-42.ELsmp ro root=/dev/md1 rhgb quiet
initrd /initrd-2.6.9-42.ELsmp.img
that's it.
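If you have more than a couple of these to do, the grub session above is mechanical enough to generate. A minimal sketch, assuming a device.map in the same format as mine: it parses the map and emits the device/root/setup lines for whichever disks you want marked. The function names are mine, and it only builds the text; you still paste the result into the grub shell yourself.

```python
def parse_device_map(text):
    """Parse device.map lines like '(hd1) /dev/sdy' into {'hd1': '/dev/sdy'}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, device = line.split()
        mapping[name.strip("()")] = device
    return mapping


def grub_setup_commands(mapping, boot_disks, partition=0):
    """Return the grub shell commands that write a boot record to each disk."""
    commands = []
    for name in boot_disks:
        commands.append(f"device ({name}) {mapping[name]}")
        commands.append(f"root ({name},{partition})")
        commands.append(f"setup ({name})")
    return commands


if __name__ == "__main__":
    device_map = """
    (fd0) /dev/fd0
    (hd0) /dev/sda
    (hd1) /dev/sdy
    (hd2) /dev/sdac
    """
    for command in grub_setup_commands(parse_device_map(device_map), ["hd1", "hd2"]):
        print(command)
```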
Monday, June 4, 2007
Gotta get a handle on this
Here's an interesting physical problem we came across. We have received 4 thumpers so far (what's a thumper? check here), and 2 of the 4 have different system controller handles. One allows for the IB cables to be plugged in with plenty of room to spare. The other is too large, and you can barely manage to squeeze the connector in. It locks, but this worries us, as any stress on the IB cables equates to imminent failure in the future. It looks like the one that allows for easiest access was designed after someone tried to actually use the old version, in figure 2, below.
Figure 1: Example X4500 with notched handle which allows for correct access of Port 0 on each of the PCI-X Infiniband HCAs. This is the desired configuration (note that the IB cable is connected).
Figure 2: Example X4500 without notched handle which does not allow easy access to Port 0 on each of the PCI-X Infiniband HCAs (IB cable not connected).
Thursday, May 31, 2007
Just the facts, Man.
OK, I've been given official permission to blog about the new system. So here is the geek porn, live ready and waiting for your perusal:
The system will be called Ranger. Here are the specs:
Compute power
- 529 Teraflops(!) aggregate peak
- 3,936 Sun four-socket, quad-core nodes
- 15,744 AMD Opteron “Barcelona” processors
- Quad-core, four flops/cycle (dual pipelines)
Memory
- 2 GB/core, 32 GB/node, 125 TB total
- 132 GB/s aggregate bandwidth
Disk subsystem
- 72 Sun x4500 “Thumper” I/O servers, 24TB each
- 1.7 Petabyte total raw storage
- Aggregate bandwidth ~32 GB/sec
Infiniband interconnect
- Full non-blocking 7-stage Clos fabric
- Low latency (~2.3 µsec), high-bandwidth (~950 MB/s)
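If you want to check that those numbers hang together, the arithmetic is short. This sketch uses only the figures from the list above. The one thing it computes that is not in the list is the clock rate implied by the peak number, so treat that value as a back-of-the-envelope derivation, not an official spec.

```python
# Figures taken from the spec list.
NODES = 3936
SOCKETS_PER_NODE = 4
CORES_PER_SOCKET = 4
FLOPS_PER_CYCLE = 4
GB_PER_CORE = 2
THUMPERS = 72
TB_PER_THUMPER = 24
PEAK_TFLOPS = 529

processors = NODES * SOCKETS_PER_NODE
cores = processors * CORES_PER_SOCKET
gb_per_node = GB_PER_CORE * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_memory_gb = NODES * gb_per_node
raw_storage_tb = THUMPERS * TB_PER_THUMPER

# Derived, not quoted: peak flops = cores * flops per cycle * clock rate.
implied_clock_ghz = PEAK_TFLOPS * 1e12 / (cores * FLOPS_PER_CYCLE) / 1e9

print("processors:", processors)
print("cores:", cores)
print("memory per node (GB):", gb_per_node)
print("total memory (GB):", total_memory_gb)
print("raw storage (TB):", raw_storage_tb)
print("implied clock (GHz):", round(implied_clock_ghz, 2))
```

The processor count, the 32 GB per node, and the 1.7 PB of raw disk all fall straight out of the node and server counts, and the memory total lands right around the quoted 125 TB.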
The overall look and feel of Ranger from the user perspective will be very similar to our current Linux cluster (Lonestar).
Full Linux OS w/ hardware counter patches on login and compute nodes (2.6.12.6 is starting working kernel)
Lustre File System
$HOME, and multiple $WORKS will be available
Largest $WORK will be ~1PB total
Standard 3rd party packages
Infiniband using next generation of Open Fabrics
MVAPICH and OpenMPI (MPI1 and MPI2)
Now, having come from my humble beginnings as an English Major, this is somewhat impressive to me. I have come pretty far from my first personal purchase of a computer 13 years ago - I used my wife's student loan money to buy it.
So, slobber away, punks! Biggest computer in the world, coming soon!
Thursday, May 17, 2007
been a while?
yes.
the contract was finally signed, and we have sent out a purchase order for the first $X worth of 'starter equipment', which means they will start shipping the machines that actually exist at this moment.
I haven't been blogging about this yet, because nothing interesting has happened. Just boring preparation work for the upcoming deluge of work. Like figuring out the Server Management software, as well as getting pxe-over-ib working, and doing filesystem tests and whatnot with lustre/different raid configurations. no picture-inducing stuff.
Friday, May 4, 2007
the power! THE POWER!!!!
So. We now have access to the power and water chilling buildings. The following pictures are of the chillers. The large pipes you see labeled with green signs are the in and out takes for the entire system. Those suckers are about 2 feet in diameter.
And here, we have the power distribution units. These massive circuits each carry 4000 amps to the machine room, for a total of 3 megawatts. Insane!
This set runs the in-row coolers and the air handlers:
and this set runs the rest of the cluster. yowza!
Wednesday, May 2, 2007
new machine room is done!
I was finally allowed to go into this room without a safety helmet! I hated wearing those things. The construction crew finally finished hooking up the 116 in-row coolers, and they are planning on doing the final cleanup next week.
Very soon, we will start getting the first pieces of equipment in, and we can start our mad dash towards getting everything running the way it should.
First up will be the filesystem, made up of a lot of servers which each have a large number of disks in them, which will all be connected to two massive infiniband networks, and incorporated into 2 giant, very fast, very low latency and high i/o filesystems. We will be using the Lustre distributed filesystem.
In the above photo, you can see the view from our loading dock - which actually has massive doors that you can fit massive things through! What a neat thought - a building that was designed from the ground up to be a machine room, not the other way around. It really got old rolling racks up the only handicap ramp in the building in our old machine room.
Here, you can see the actual loading dock, which a real machine room should have. A truck should only have to back up, then roll the racks/equipment straight onto the machine room floor. Hell Yes.
more of what I do
So, I am rewiring 4 of the 8 InfiniBand switches on our "old" cluster this week:
This is just half of the entire network, which connects over 4000 nodes to each other on 8 different paths, so that at any time, every node has 8 places to go for communication.
Above is a finished rack with two switches in it.
And in the shot above is the rack I am currently working on. We had to re-wire them because the subnet manager couldn't handle the way the cables were placed, and performance (latency) was really suffering. If you order them *exactly* the opposite of what we did, the entire system runs almost 2x as fast.
Tuesday, April 24, 2007
new geek stuff
As a super geek, I am WAY into gadgets. Unfortunately I lack the income necessary to purchase all the things I want. I did, however, manage to snag a new phone: It has a 1.3 megapixel camera, and a really cool microSD slot, which I promptly filled with a 1GB card! For only $14! Here is a nice little pic of the card in my hand.
How the hell do they fit 1GB in there? I think they even sell 4GB and 8GB versions as well, but they aren't $14.
On another geek front, I found this nice 37 inch monitor for $540 on newegg.com! I should be getting it in about 2 or 3 days. NICE:
YESSS. (disclaimer - this 'TV' is 'HDTV ready' - meaning it does not have a tv tuner in it. We bought this screen to watch movies on - I still hate TV)
arrgh! an update!
I've meant to post here more often, but I've been too darn busy. Anywho (I hate that word), the workers have nearly finished their magic on the machine room. All the coolers have been attached to their water lines, and the PDUs have all been connected and are turned on (little red lights aglowin'). They also have the air handlers going, testing things out and making sure they all work correctly. Exciting!
(here, you can see my reflection in the window, with my hand holding the phone to take a picture, hobo-gloves and all. I wear fingerless hobo gloves because my office gets too cold and then I can't type anymore)
It sure is gonna be nice when we start actually receiving the equipment.
We would have started messing with the new equipment sooner, but the contract between us, the NSF and the unnamed company has not been signed yet. I know, it's *only* been 6 months! Who could ever get a contract signed in a measly 6 months (sarcasm)? Everyone is dragging their feet on signing off on the contract, because it is a huge amount of money, and the bureaucracy at all three places is preventing everyone from getting anything done.
So, it's been knots-in-the-stomach the whole way, and everyone I work with is starting to get nervous.
But I am confident that everything will turn out ok in the end - this is just one of MANY growing pains that will be experienced while we continue to build the BIGGEST COMPUTER IN THE WORLD. BWA-HAHAHAHAHAH.
Monday, April 9, 2007
tv is a nightmare
So, about 10 years ago, I got rid of my television. Here is a comment I made on a blog I came across where the author wrote about how to become more productive:
I haven't had a tv for 10 years. Bravo!
There is not only the money, there is the TIME wasted on tv. The average person spends around 20 hours a week in front of the tv. Most people claim to watch far less, but they are lying to themselves.
Do the math - that's around 43 SOLID *days* a year of watching tv if you were to do so continuously (20 hours a week times 52 weeks in the year is 1,040 hours). Plenty of people watch way more - I think the average for most kids is around 40 hours a week.
Anyways, if you only watch 20 hours a week, then for roughly every 8 years you are alive, ONE ENTIRE YEAR of that is spent in front of a tv, if watched continuously. So by the time you are twenty years old, you have wasted almost TWO AND A HALF YEARS of your life! Over three and a half years wasted at 30, and so on...
Once I did the math, I threw my tv out the next day.
It's addicting, by the way - you will have tv withdrawal symptoms for the next month or 2, before you are completely weaned.
Enjoy that thought!
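For anyone who wants to check the 20-hours-a-week arithmetic from that comment, here is a minimal Python sketch. The function name and constants are mine, purely for illustration; it only converts a weekly viewing habit into round-the-clock days per year.

```python
HOURS_PER_DAY = 24
WEEKS_PER_YEAR = 52


def solid_days_per_year(hours_per_week: float) -> float:
    """Convert weekly viewing hours into continuous 24-hour days per year."""
    return hours_per_week * WEEKS_PER_YEAR / HOURS_PER_DAY


# 20 hours/week is the "average" figure from the comment;
# 40 hours/week is the guess for kids.
for habit in (20, 40):
    days = solid_days_per_year(habit)
    print(f"{habit} hours/week = {days:.1f} solid days per year")
```

Swap in your own honest weekly number to see where you land.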
Sunday, April 8, 2007
irrational exuberance
I know this isn't related to supercomputers or clusters, but supergeek recently bought a house. I paid almost twice the amount I swore I would never go over in buying a home. It's a great house, and I really don't deserve to live here. Thing is, I was looking at it as an investment, and sank a large amount of money into it. This graph tells me I am about to lose all that money. I'll still have the house, but I can tell that the perceived value of my house, and every other house in America, is about to plummet.
Friday, April 6, 2007
unrelated side note
This post is about something my department is working on for an unnamed company in exchange for a donation of said product. We developed a simple, inexpensive solution for this company to create a video wall of nine 42" monitors all working in unison to display one picture. Such a solution is normally prohibitively expensive, but our open source solution brings it within reach of the everyday joe (or ordinary joe organization with a good number of relatively affordable bucks).
I would like to point out that this is about 5 feet tall and 8 feet wide:
by the way - that's an image of blood flow in an artery.
*updated, from comments:
I probably should have said 'affordable' in scare-quotes. This thing costs around $100,000 the way it was implemented. We could probably lower the price by using less memory per node, and lower-end nodes as well, which might get it down to around an 'affordable' $60,000!
Can you give us some linpack benchmarks on 'it'? I'm very curious.
July 24, 2007 8:23 AM
Ah yes, dear reader. I wish I could give you specifics, but I am under a very strict NDA, and if I were to put any results up here, I could strongly affect stock prices, either up or down, for this particular company.
So no, while I would LOVE to tell the world what I know, I just can't.
July 27, 2007 9:08 AM