Entrepreneurial Geekiness
Building a high performance cluster with Ubuntu 9.10 and Eucalyptus
I’ve spent the last day (almost) installing Eucalyptus on Ubuntu 9.10 to create a mini ‘high performance computing’ environment. We’re testing the concept and could build 100+ machines if the prototype works as expected.
This is a running log of my notes; for this post I only have a partial setup.
Note – I have a Eucalyptus follow-up which gets further than this post but ultimately fails.
To start you need Ubuntu 9.10 Server edition, which includes the open-source Eucalyptus software. Eucalyptus is cluster-computing software whose API is compatible with Amazon’s EC2. This means you can build an in-house network for testing and private computation and later switch to EC2 if you want to scale up. This is great if some clients need privacy and some want true utility computing.
Note that installing Eucalyptus requires at least one CD download, or two if you need the 64-bit edition for the Node and the 32-bit edition for a Cluster Controller machine that is too old to run 64-bit. The hardware requirements are a bit steep (Node machines need 1GB+ RAM, 40GB+ HD etc). Once installed you’ll also have to download at least one instance image to run on the Nodes; these are about 180MB. This is a lot to download if you have a tiny VPN pipe to the outside world.
Two good background papers from open.eucalyptus.com are:
- The Eucalyptus Open-source Cloud-computing System (pdf)
- Eucalyptus: A Technical Report on an Elastic Utility Computing Architecture Linking Your Programs to Useful Systems (pdf)
Installing UEC via the CD (and UEC main page) is fairly easy. I actually followed these notes (first of three parts) before finding the official docs.
Installing the server took about 30 minutes, most of that was spent reading from the CD. The questions were pretty easy. Some notes:
- For the hard disk setup I used a fresh 40GB disk and chose ‘Guided – Use entire disk’ (not the LVM option)
- I chose no email configuration (I don’t know the SMTP local details here in the client’s office)
- For apt-get I had to configure the proxy so it could see outside of the corporate firewall
To install the Node (1 client) I needed to dual-boot an existing Windows XP machine. For this I had to use PartitionMagic to resize the 500GB Windows partition down to 100GB. This didn’t work at first – we kept getting ‘error 983 while executing batch’ and the resize would abort. The solution (as noted many times on the web) is to run ‘chkdsk /f’ at the command prompt. It reboots and does the check; in our case it didn’t report any changes, but after that PartitionMagic worked.
The candidate Node machine recognises that the Cluster Controller is running on another machine so it nominates itself as a Node. Only a few questions are asked (e.g. the keyboard) and then everything is installed. For the HD installation I chose ‘Use the largest contiguous free space’ having blanked 360GB via PartitionMagic earlier.
For reasons that aren’t clear, after installation the Node had trouble finding the network. I had to run ‘sudo /etc/init.d/networking restart’ before it could ‘ping slashdot.org’. It still won’t do a full ‘sudo apt-get update’ (which completes just fine on the Cluster Controller) but I’ll assume that this isn’t a problem.
Now that the network is good, if I run ‘sudo euca_conf --no-rsync --discover-nodes’ on my Cloud Controller then it reports finding 1 Node. I can accept the Node but after that I get some sort of authentication failure. This might be due to the corporate network firewall.
If I jump a step forwards then I can run ‘sudo euca_conf --get-credentials mycreds.zip’, ‘unzip mycreds.zip’ and ‘. ./eucarc’, but then when I run ‘euca-describe-availability-zones verbose’ I get an XML parse error much like this bug.
There are enough network errors here to suggest that the corporate firewall isn’t playing ball (it won’t be the first time). I’ll restart installation on my two test machines when we have a public internet connection established that avoids the corporate firewall. I’ll post another entry when I run the second experiment (December, all going well).
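To narrow down whether a firewall is the culprit, a quick TCP reachability check from one machine to the other can help. The following is only a minimal sketch: the port numbers (8773 for the cloud API, 8774 for the Cluster Controller, 8775 for the Node Controller) are what I believe the Eucalyptus defaults to be, not something confirmed in my notes above, and the helper names are made up for illustration.

```python
import socket

# Assumed Eucalyptus default ports; verify against your own install.
ASSUMED_CLOUD_API_PORT = 8773
ASSUMED_CLUSTER_CONTROLLER_PORT = 8774
ASSUMED_NODE_CONTROLLER_PORT = 8775


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(host, port): port_is_open(host, port, timeout) for host, port in endpoints}
```

For example, calling `check_endpoints([("192.168.1.10", ASSUMED_CLOUD_API_PORT), ("192.168.1.11", ASSUMED_NODE_CONTROLLER_PORT)])` with your own controller and Node addresses (the ones here are placeholders) shows which services answer at all. A port that is open locally but closed from the other machine points at the network rather than at Eucalyptus.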
Update: I followed the NodeInstallation notes to set the Cloud Controller’s eucalyptus user’s public key into the Node Controller’s eucalyptus user’s authorized_keys file. That hasn’t fixed the above two errors.
Books:
The following books will help you move forwards: the Eucalyptus one will make the above configuration easier and the second, on EC2, will help you see how Eucalyptus and EC2 compare.
Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.
£5 App Christmas Special – Weds 2nd December
John and I are very pleased to announce our upcoming music-themed £5 App Christmas Special on Wednesday 2nd December, 8-11pm at Hector’s House in collaboration with the lovely Playgroup guys. Please do the usual – sign-up on Upcoming so we know how much beer to brew for you all. If you don’t know what this is then see last year’s Xmas Special write-up and details of all the previous events (with videos).
We want 40-60 of you along this year so please spread the word – Tweets and blog posts would be hugely appreciated!
Outline:
- Seb Lee-Delisle – “My life as a wannabe rock star at the birth of the internet music boom” – full description below
- Toby Cole – “Zero to Theremin in 20 days” – How BuildBrighton built a feature rich, ultrasonic, laser etched MIDI controller in under three weeks
- Tom Hume – “You’re all an orchestra, get over it” – Bluetooth devices will interact with the audience to create changing ambient music, created by Future Platforms for a Music Hack Day
- Jim Purbrick – “A short talk on the Mrmr/LiveAPI guitar mounted iPhone Ableton Live interface by the head of Second Life Europe and later a demo with 100Robots”
- lastminute.com labs – Bottle-Rock-It, a music game for n iPhones where (with any luck) n > 3 (Richard, Russ, Sam, Mathias)
- 100Robots – Jim and Max Williams play live and loud for us
Seb has the main talk, his full blurb is:
“Before Seb Lee-Delisle was peddling his digital creations, he had an entirely different life. He spent most of his 20s setting up Solar Records and promoting his band Stargirl (later Laine). Investing over £50,000 of their own money, they released their own CDs, made it onto the radio and TV, played in front of 30,000 people, recorded at George Martin’s Air Studios and had full page spreads in the nationals.
They were at the forefront of the internet music boom of the late 90s. The future was looking rosy for this group of dynamic 20-somethings. So come and find out what it was like, how the hell they got the £50K, and why their plans didn’t quite reach fruition…”
Beer – several of us who are doing well this year will put up some bar-money (Alan of SensibleDevelopment, Paul Silver of Brighton Farm and my ProCasts so far, several more to come, get in contact if you want to share the love).
Food – maybe nibbles.
Next, please sign-up on Upcoming so we know how much beer to provide and tweet/post about the event to help us spread the word. Cheers!
Vicinity-like iPhone app for ‘nearby people I’ve met on-line’?
Another from the crazy ideas dept….
Whilst preparing for the ‘How to build a network‘ workshop last week I got to wondering about conferences and groupings of geeks (and Normals, but they need to catch-up with our tech first).
Why is it that when I’m at a conference or event, I can’t tell whether anyone nearby is someone I know online but haven’t yet met in real life? Surely there’s an iPhone app for that…
Here’s what I want – I start (proposed duff name) ‘WhosHere’ and it tells me, via my location:
- TF people are nearby that are Twitter friends (two way reciprocal relationship or at least I’m following them)
- NF people are nearby that I’ve referred to on Twitter but I don’t follow or vice-versa
- BL people are nearby who have a blog that I’ve commented on (see below for details)
- EM people are nearby who I’ve referred to in email recently (see below for details)
- Same for Facebook, LinkedIn etc…
Probably I can mark off people that I know well so it doesn’t keep showing them to me (or maybe they appear in a separate tab?) – I’m interested in finding out when people I don’t know well are nearby as this will help me to turn weak-ties into stronger-ties.
Tying a location to Twitter friends is probably really easy (assuming they’re posting location info). Presumably searching for people tweeting via a location is also easy (since iPhone apps already do it).
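As a rough illustration of the TF and NF counts, here is a minimal Python sketch. It assumes the locations and the follow/mention lists have already been fetched from somewhere; the function names, the data shapes and the 0.5km radius are all made up for this example and no real Twitter API is involved.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def whos_here(me, people, following, mentioned, radius_km=0.5):
    """Split nearby people into TF (I follow them) and NF (mentioned, not followed).

    me: (lat, lon); people: dict of name -> (lat, lon);
    following and mentioned: sets of names. All hypothetical inputs.
    """
    report = {"TF": [], "NF": []}
    for name, (lat, lon) in sorted(people.items()):
        if distance_km(me[0], me[1], lat, lon) > radius_km:
            continue  # too far away to count as 'nearby'
        if name in following:
            report["TF"].append(name)
        elif name in mentioned:
            report["NF"].append(name)
    return report
```

The radius is the main design choice: a few hundred metres suits a conference venue, whereas a city-wide event would want something much larger.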
For the blog (BL) report the iPhone app would need to talk to a service that can check the Twitter profiles of nearby people, reference their blogs (or use a social graph explorer) and determine if I have left them a comment (since I’d use my domain when commenting) or linked to their blog. I’d love to see this in an app!
For the email (EM) report the app would need to read my email (can it do that?) and look for names or URLs that are mentioned. From these it can do a similar lookup via nearby Twitter people as for the blog report above. Knowing that a company or individual is nearby that I’ve referred to in an email with a friend could be really interesting.
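The matching step for the email report could be sketched like this in Python. It assumes the email bodies and the nearby people’s profiles have already been collected (which is the hard part, and the bit I’m unsure an app can do); the function name and the profile fields are invented for illustration.

```python
import re


def mentioned_nearby(email_bodies, nearby_profiles):
    """Return handles of nearby people whose name or blog domain appears in my email.

    email_bodies: list of plain-text strings;
    nearby_profiles: dict of handle -> {"name": ..., "blog": ...} (hypothetical shape).
    """
    text = "\n".join(email_bodies).lower()
    hits = []
    for handle, profile in nearby_profiles.items():
        terms = [profile.get("name", ""), profile.get("blog", "")]
        for term in terms:
            # Whole-word, case-insensitive match on the name or the blog domain.
            if term and re.search(r"\b" + re.escape(term.lower()) + r"\b", text):
                hits.append(handle)
                break
    return sorted(hits)
```

Exact matching like this will miss nicknames and misspellings, so a real version would probably need fuzzier comparison.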
Am I barking up a crazy tree or does this idea make some sense?
“How to Build a Network” workshop for WiredSussex interns
Yesterday I ran a workshop on ‘How to build a network’ for 35 WiredSussex interns. The presentation is clear, the links below will help.
During the talk I linked to some OpenStreetMap early progress videos for London 2006- and worldwide-2008 edits. These demonstrate a nice graphical result of building personal networks around a project.
Early on I ran an exercise based on ‘free schools‘: I asked everyone to name something they could teach and something they wanted to learn. People put up their hand if they could teach something that another wanted to learn – of the 35 in the room we had 33 hits for skills that could be taught, and sometimes over half the room could teach that skill. The goal was to show everyone that they had many as-yet-unknown links with everyone in the room which could help build their network.
Everyone wrote their skills/needs on a post-it and WiredSussex has listed them all here, here’s an example:
Roughly, here’s what I covered:
- Be yourself, be human, don’t be a shiny-suited-salesman-with-secret-handshakes
- Everyone spoke to someone they didn’t know and then introduced their name, their company and how they either met a friend or started a conversation – this was an ice-breaker aimed at getting everyone to meet one new person
- Laying the foundation of a FreeSchool (ignore the anarchistic overtones! just take the general idea of non-formal education and skill sharing) using Post-Its to show ‘something you can teach’ and ‘something you want to learn’, then each person read out what they want to learn and others put up their hands if they had a relevant skill. 33 of the 35 interns could learn a desired skill from others in the room, only 2 misses is not bad at all.
- 10 years of my experiences learning to network, working for others and building my own businesses and projects all in 15 minutes
- Getting 3 people to stand-up and explain ‘here’s what I’m good at’ (for reference later)
- 10 minute break for the interns to meet someone new – most of them succeeded (which was rather lovely)
- Ranking business cards using Blu-Tack on the wall – ordering cards from ‘most communicative’ to ‘least communicative’ and discussing what makes for a good or bad card
- Getting a Moo card – super easy card creation for personal cards and projects
- Who remembers people that were introduced earlier – emphasising that if you meet someone for a personal chat or stand-up you’re more likely to be remembered – so always take the opportunity to be memorable
- Online networking – who uses blogs, Twitter, Facebook etc
- Homework – interns to mail me a write-up on their blog, tweet, facebook posting or whatever that links them to the event – I’ll then update this post when they mail me the link
- Discussion of local events (listed below)
- Places and people the interns might come across – The Skiff, The Werks, SInC, Cafe Delice, Jon Markwell, Paul Silver, Sarah Bird, Seb Lee-Delisle, Emily Toop, Matt Weston
Some local events: Likemind, OpenCoffeeSussex, £5 App, BrightonFarm, FlashBrighton, BrightArray, BuildBrighton, BrightonRobotics, Slackspace, Brighton Business on LinkedIn, WriteClub, BANG, Brighton Illustrators, Girl Geeks, UXBri, CultureGeeks, GeekWineThing.
Someone (say if it was you!) asked me in the pub about the state of Artificial Intelligence (that’s another subject of mine). I came across this article on the End of the AI Winter which you might want to read.
My projects include working for MASA, building IMOzsvaldSystems, building Mor Consulting Ltd, co-building ShowMeDo with Kyran Dale, co-creating £5 App with John Montgomery, building ProCasts, writing The Screencasting Handbook.
My pages on LinkedIn, Twitter, my blog – feel free to follow me or link to me.
Thanks Hon Mond Ng for the tweet. Thanks to Maria Welby and Gearoid Conlon for Linking In, Alexandra Gaiger for Linking In and blogging, David Howard for Linking In and welcome Stefan Daniels to LinkedIn. Hi Katie, Oli.
Internship ‘How to build a network’ workshop
Tomorrow I run a workshop for WiredSussex to teach 35 interns ‘how to build a network’. I’ll write an update that links to my presentation after the workshop; here I want to link to a few examples I’ll probably use.
Update – I now have a full write-up which makes this post obsolete.
OpenStreetMap early progress videos for London 2006- and worldwide-2008 edits.
My past projects include working for MASA, building IMOzsvaldSystems, building Mor Consulting Ltd, co-building ShowMeDo with Kyran Dale, co-creating £5 App with John Montgomery, building ProCasts.
My pages on LinkedIn, Twitter, my blog.
Some events: Likemind, OpenCoffeeSussex, £5 App, BrightonFarm, FlashBrighton, BrightArray, BuildBrighton, BrightonRobotics, Slackspace, Brighton Business on LinkedIn, WriteClub, BANG, Brighton Illustrators, Girl Geeks, UXBri, CultureGeeks, GeekWineThing
The main discussion list is the BNM.