Introduction
We are developing a cluster for local ATLAS computing using the TACC Rodeo system to boot virtual machines. If you just want to use the system, see the next section and ignore the rest (which describes the virtual machine setup and is a bit out of date as of Sep 2015).
Getting started with Bosco
The Tier-3 uses utatlas.its.utexas.edu as its submission host; this is where the Condor scheduler lives. However, to submit jobs from our workstations to the Tier-3 you need Bosco, a job submission manager designed to manage job submissions across different resources.
Make sure you have an account on our local machine utatlas.its.utexas.edu, and that you have passwordless ssh set up to it from the tau* machines.
To do this, create an RSA key pair and copy your .ssh folder onto the tau machine using scp.
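One possible way to set this up is sketched below; it assumes your username is the same on both machines and that you run it from a tau* workstation (ssh-copy-id is used here for convenience and may not be the exact method originally intended):

ssh-keygen -t rsa                       # generate a key pair; press ENTER to accept the defaults
ssh-copy-id utatlas.its.utexas.edu      # install the public key in authorized_keys on utatlas
ssh utatlas.its.utexas.edu hostname     # test: this should run without a password prompt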
Then carry out the following instructions on any of the tau* workstations:
cd ~
curl -o bosco_quickstart.tar.gz ftp://ftp.cs.wisc.edu/condor/bosco/1.2/bosco_quickstart.tar.gz
tar xvzf ./bosco_quickstart.tar.gz
./bosco_quickstart
This will ask whether you would like to install. Select y and continue.
Bosco Quickstart
Detailed logging of this run is in ./bosco_quickstart.log
************** Starting Bosco: ***********
BOSCO Started
************** Connect one cluster (resource) to BOSCO: ***********
At any time hit [CTRL+C] to interrupt.
Type the submit host name for the BOSCO resource and press [ENTER]:
No default, please type the name and press [ENTER]: utatlas.its.utexas.edu
Type your username on utatlas.its.utexas.edu (default USERNAME) and press [ENTER]:
Type the queue manager for utatlas.its.utexas.edu (pbs, condor, lsf, sge, slurm) and press [ENTER]: condor
Connecting utatlas.its.utexas.edu, user: USERNAME, queue manager: condor
Configuring and testing may take some time. When it finishes, run:
source ~/bosco/bosco_setenv
Then you will be able to submit jobs as if you were running Condor!
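As a rough illustration, a grid-universe submit file routed through the Bosco resource added above might look like the sketch below; USERNAME, myjob.sh, and the output file names are placeholders to replace with your own:

# example.sub -- minimal sketch of a submit file for the Bosco resource (placeholders throughout)
universe      = grid
grid_resource = batch condor USERNAME@utatlas.its.utexas.edu
executable    = myjob.sh
output        = myjob.out
error         = myjob.err
log           = myjob.log
queue 1

You would then submit it with condor_submit example.sub and monitor it with condor_q.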
To submit more than ten jobs at once (through ATLAS Connect you have access to hundreds of job slots at other institutions), edit the file ~/bosco/local.bosco/condor_config.local, changing the last line to the maximum number of simultaneous jobs you want to submit:
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE=200
Lastly, here is a more detailed guide to Bosco.
VM configuration
Our virtual machines are CentOS 6 instances configured with CVMFS for importing the ATLAS software stack from CERN. They also boot individual instances of the Condor job scheduling system. They access the same Squid HTTP caching server that our local workstations use (on utatlas.its.utexas.edu), which helps reduce the network traffic required for CVMFS and for database access via the Frontier system.
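For reference, a CVMFS client pointed at that Squid cache is typically configured through /etc/cvmfs/default.local along the lines of the sketch below; the proxy port (3128 is the Squid default), the quota, and the repository list are assumptions for illustration, not a dump of our actual VM configuration:

# /etc/cvmfs/default.local -- illustrative sketch only
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch
CVMFS_HTTP_PROXY="http://utatlas.its.utexas.edu:3128"
CVMFS_QUOTA_LIMIT=20000

Running cvmfs_config probe afterwards checks that the repositories mount correctly.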
Booting a VM on Nimbus with scratch disk
We need to use tools other than the standard cloud-client.sh provided by Nimbus. We use a slightly modified vm-helpers. Untar the file vm-helpers.tgz in your nimbus-cloud-client directory (it will dump four files into bin/). Now you can run e.g.
cd nimbus-cloud-client-21
grid-proxy-init -cert conf/usercert.pem -key conf/userkey.pem
bin/vm-run --cloud master1.futuregrid.tacc.utexas.edu:8443 --image "cumulus://master1.futuregrid.tacc.utexas.edu:8888/Repo/VMS/71a4ea3e-07e4-11e2-a3b7-02215ecdcdaf/centos-5.7-x64-clusterbase.kvm" --blank-space 500000
which will give you 500 GB of scratch space. Our images will mount this under /scratch.
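Once the VM is up, a quick way to confirm the scratch disk was attached and mounted (assuming our image's /scratch mount point) is:

df -h /scratch    # should report roughly 500 GB of space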
Building Image Using Boxgrinder
Ensure that Boxgrinder is present on the VM you are building from. If this is a temporary image, you will likely need to copy over (or pull via git) the configuration files and the appliance definition (.appl); an example appliance definition is sketched after the option list below. Boxgrinder is then run with:
boxgrinder-build definition.appl -d local
Boxgrinder options include:
-f                          # Remove previous build for this image
--os-config format:qcow2    # Build image with a qcow2 disk
-p                          # Specify platform (VMware, KVM, Player, etc.)
-d local                    # Deliver to local directory
--debug                     # Print debug info while building
--trace                     # Print trace info while building
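For context, a Boxgrinder appliance definition is a small YAML file. The sketch below is illustrative only; the appliance name, hardware sizes, and package list are made up and not our actual definition:

# example-atlas-node.appl -- hypothetical appliance definition
name: example-atlas-node
summary: Example CentOS appliance definition
os:
  name: centos
  version: 6
hardware:
  cpus: 2
  memory: 4096          # MB
  partitions:
    "/":
      size: 10          # GB
packages:
  - openssh-server
  - wget

It would then be built with boxgrinder-build example-atlas-node.appl -d local, as above.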
Creating OpenStack Nodes
cp ......./openstack
cp ......./openstack-share
source openstack/ec2rc.sh
cd openstack-share
./boot-instances
Accessing OpenStack Nodes
ssh username@alamo.futuregrid.org
Then check the list of instances to see which nodes are running, and simply
ssh root@10.XXX.X.XX
and you are now accessing a node!
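As an alternative to browsing the instance list, if the euca2ools client happens to be installed on the head node (an assumption), the EC2-style credentials sourced earlier can be used to list running instances from the shell:

source openstack/ec2rc.sh     # load the credentials as above
euca-describe-instances       # lists instance IDs, states, and IP addresses
ssh root@10.XXX.X.XX          # substitute an IP taken from the listing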