I’ve been eyeing Crowbar recently; it looks useful and interesting for deploying servers and applications. I haven’t seen much, if any, documentation suggesting that people in the digital preservation and archiving fields are implementing systems at scale; my impression is that most systems/sites are built up one piece at a time without much automation.
It seems to use Chef on the back end for all the automation.
There’s a new release of Ceph out; I hope they put out a stable release soon so we can do further evaluations of the Ceph storage system. A few of my work colleagues are going to the Ceph workshop next week.
I’m wondering if anyone has taken the CRUSH algorithm and used it in other domains.
What do you do when you need a crash course on RoR, Hydra and frameworks for digital preservation and archiving? You go to Hydracamp!
The syllabus was:

Day 1 - Rails, CRUD, TDD and Git
Day 2 - Collaborative development with Stories, Tickets, TDD and Git
Day 3 - Hydra, Fedora, XML and RDF (ActiveFedora and OM)
Day 4 - SOLR and Blacklight
Day 5 - Hydra-head, Hydra Access Controls

Most of the training sessions were hands-on from day 1, which was refreshing; because it was hands-on I got the most out of the training.
I shall be going to SC2012 next month; I plan on hitting a few of the storage vendors about possible collaborations and flagging to them that we’re on the lookout for storage systems. One of the first observations the reader will make is “where is the link between HPC and digital preservation and archiving?”. It’s probably not obvious to most people, but one of the big problems in preservation and archiving is the sheer amount of data involved and the varied types of data.
A co-worker of mine (Paddy Doyle) had originally hacked together a Perl script for reporting balances from SLURM’s accounting system a year or two ago, and he had figured out that with some minimal configuration and scripting it might be possible to get a system that’s very basic but functional.
Funding agencies wanted us to justify how the system was being used, and GOLD was clunky, obtrusive and complicated for what we wanted.
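To give a flavour of the kind of minimalistic reporting involved: a rough sketch using `sacct`, SLURM’s accounting query tool. The start date and field choices here are just examples, not what Paddy’s script actually did.

```shell
# Sketch: dump raw CPU-time per account from SLURM's accounting database
# and sum it up per account. Dates and fields are placeholder examples.
sacct --allusers --starttime=2012-01-01 --noheader --parsable2 \
      --format=Account,CPUTimeRAW |
  awk -F'|' '{usage[$1] += $2}
             END {for (a in usage) printf "%s %d\n", a, usage[a]}' |
  sort
```

A handful of pipelines like that, plus a bit of glue, gets you surprisingly far compared to running a full allocation manager like GOLD.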
The latest development branch of Ceph is out with some rather nice-looking features; probably the most useful are the RPM builds for those of us running RHEL6-like systems.
Still no real sight of backported kernel modules :P Also, some of the guys at work just deployed a ~200TB Ceph installation, on which I have access to a 10TB RBD for doing backups.
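Getting an RBD like that into service is pleasantly simple; here’s a sketch of the sort of commands involved (the pool name, image name and mount point are made up, and the kernel rbd driver is assumed to be available):

```shell
# Create a 10TB image in a hypothetical "backups" pool (--size is in MB)
rbd create backups/tcd-backup --size 10485760
# Map it via the kernel rbd driver, then format and mount it like any disk
rbd map backups/tcd-backup
mkfs.ext4 /dev/rbd/backups/tcd-backup
mount /dev/rbd/backups/tcd-backup /mnt/backup
```

Of course the mapping step is exactly where the lack of backported kernel modules on RHEL6 bites.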
Given that I have a number of old 64-bit-capable desktop machines and a collection of hard drives at home, I could run Tahoe-LAFS like I do at work for backup purposes. In fact Tahoe works quite well for the technically capable user.
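For the curious, once a Tahoe client is configured and pointed at a grid, backups are roughly this simple (the alias and paths below are examples):

```shell
# Create an alias pointing at a fresh directory on the grid,
# then run incremental backups of the photo collection against it
tahoe create-alias backups
tahoe backup ~/photos backups:photos
```

Subsequent runs of `tahoe backup` only upload what has changed, which is what makes it workable over a home connection.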
Recently I’ve decided that I need a more central location at home to store my photo collection (I love to take photos with my Canon DSLR and Panasonic LX5).
I’ve used Talend ETL a few times; however, I came across this application, http://datacleaner.org/. I need to take a look at it at some point to see whether it’s an alternative to Talend, and whether it works on a Mac or not!
There’s a new stable release of Ceph, Argonaut, though I seem to be having better luck playing with the development releases of Ceph.
Oh how I wish there was a backport of the kernel ceph and rbd drivers for RHEL6. I have a dodgy repo and some reverted commits that one of the guys at work told me about; it seems to run but it isn’t great. It can be found at https://github.
Having learnt how to remove and add monitors, metadata servers and object storage daemons (mons, mdses and osds) on my small two-node Ceph cluster, I want to say that it wasn’t too hard to do; the Ceph website does have documentation for this.
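As an example of how straightforward it is, here’s the OSD-removal procedure roughly as the Ceph docs describe it (osd.2 is just an example id; the init-script invocation assumes the sysvinit setup of this era):

```shell
# Removing osd.2 from the cluster, step by step:
ceph osd out 2                 # stop new data being placed on it, let it drain
service ceph stop osd.2        # stop the daemon on its host
ceph osd crush remove osd.2    # take it out of the CRUSH map
ceph auth del osd.2            # remove its authentication key
ceph osd rm 2                  # finally remove it from the osd map
```

Adding one back is much the same dance in reverse, and the cluster rebalances itself around each change.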
As the default CRUSH map replicates across OSDs, I wanted to try replicating data across hosts just to see what would happen. In a real-world scenario I would probably treat individual hosts in a rack as a failure unit, and if I had more than one rack of storage I would want to treat each rack as the minimum failure unit.
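The workflow for that is to pull the CRUSH map out, edit it as text, and push it back; a sketch (file names are arbitrary):

```shell
# Dump the current CRUSH map, decompile it to editable text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# In crushmap.txt, the relevant change in each replication rule is:
#     step chooseleaf firstn 0 type osd
# becomes
#     step chooseleaf firstn 0 type host
# Then recompile and inject the new map
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```

Swapping `host` for `rack` in that rule is how you’d get the rack-as-failure-unit behaviour, assuming the map’s hierarchy actually declares racks.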