<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>gehrcke.de &#187; GSoC 2009</title> <atom:link href="http://gehrcke.de/category/technical-stuff/google-summer-of-code/feed/" rel="self" type="application/rss+xml" /><link>http://gehrcke.de</link> <description>Jan-Philip Gehrcke&#039;s website</description> <lastBuildDate>Sun, 13 May 2012 17:17:35 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.2</generator> <item><title>Google Summer of Code end: code upload and acknowledgement</title><link>http://gehrcke.de/2009/08/google-summer-of-code-end-code-upload-and-acknowledgement/</link> <comments>http://gehrcke.de/2009/08/google-summer-of-code-end-code-upload-and-acknowledgement/#comments</comments> <pubDate>Mon, 24 Aug 2009 19:17:01 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[CernVM]]></category> <category><![CDATA[Clobi]]></category> <category><![CDATA[General]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Nimbus]]></category> <category><![CDATA[Personal Stuff]]></category> <category><![CDATA[Python]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=929</guid> <description><![CDATA[<p>The Google Summer of Code 2009 final evaluation deadline is today; 19 UTC. I don&#8217;t have time to summarize my summer here now, but there are two things I want to say to the world. First, I want to thank many people for enriching my summer. Second, I would like to announce the Clobi project [...]]]></description> <content:encoded><![CDATA[<p>The <em>Google Summer of Code 2009</em> final evaluation deadline is today; 19 UTC. I don&#8217;t have time to summarize my summer here now, but there are two things I want to say to the world. First, I want to thank many people for enriching my summer. Second, I would like to announce the <em>Clobi</em> project on <em>Google Code</em>.<span
id="more-929"></span></p><h4>Acknowledgement</h4><ul><li><a
href="http://www.mcs.anl.gov/~keahey/">Kate Keahey</a> (<a
href="http://www.anl.gov/">Argonne National Laboratory</a>/<a
href="http://globus.org/">The Globus Alliance</a>/<a
href="http://workspace.globus.org/">Nimbus team</a>).<br
/> She offered me fabulous mentorship. I can measure its true worth by looking at how much I&#8217;ve learned throughout the summer just from our great conversations <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> . Thank you for all your support and dedication and for pushing me and the project forwards!</li></ul><ul><li><a
href="http://www.mcs.anl.gov/~tfreeman/">Tim Freeman</a> and <a
href="http://www.linkedin.com/pub/david-labissoniere/b/888/663">David LaBissoniere</a> (<a
href="http://www.anl.gov/">Argonne National Laboratory</a>/<a
href="http://globus.org/">The Globus Alliance</a>/<a
href="http://workspace.globus.org/">Nimbus team</a>).<br
/> Wonderful, extensive and patient extra-premium-support-with-special-treatment regarding <em>Nimbus/Workspace</em> in the MUD. Thank you so much! You smoothed my technical way through the project.</li></ul><ul><li><a
href="https://excess.org/">Ian Ward</a> (creator of <a
href="http://excess.org/urwid">urwid</a>, a console user interface library for <em>Python</em>).<br
/> He gave me great support while I was implementing the user interface for <em>Clobi&#8217;s Resource Manager</em> using <em>urwid</em>&#8217;s new <em>SelectEventLoop</em> technology. Talking to him saved me a lot of valuable time. Thank you!</li></ul><ul><li><a
href="http://www.elastician.com/">Mitchell Garnaat</a> (the creator of <a
href="http://code.google.com/p/boto/">boto</a>, a Python interface to Amazon Web Services) and the <a
href="http://groups.google.com/group/boto-users">boto users mailing list</a>.<br
/> Great and essential support for almost a year now. Thank you very much for answering many questions!</li></ul><ul><li><a
href="https://savannah.cern.ch/users/pbuncic">Predrag Buncic</a> (<a
href="http://cernvm.cern.ch/cernvm/">CernVM</a>) and <a
href="http://www.linkedin.com/in/artemharutyunyan">Artem Harutyunyan</a> (<a
href="http://aliceinfo.cern.ch/Collaboration/index.html">ALICE@LHC</a>).<br
/> Thank you <a
href="http://gehrcke.de/2009/06/cernvm-on-nimbusec2-public-key-injection-problem/">for your support</a> regarding <em>CernVM</em> on <em>Nimbus</em>!</li></ul><ul><li><a
href="http://www.linkedin.com/pub/jakub-moscicki/0/629/b79">Jakub Moscicki</a>, <a
href="http://www3.imperial.ac.uk/people/u.egede">Ulrik Egede</a>, <a
href="http://homepages.physik.uni-muenchen.de/~Johannes.Elmsheuser/contact.html">Johannes Elmsheuser</a> and the rest of the <a
href="http://ganga.web.cern.ch/ganga/">Ganga</a> crew.<br
/> Thanks a lot for very important, effective and efficient support concerning the system behind <em>Clobi</em> and <em>Clobi&#8217;s Ganga backend</em>. You are great!</li></ul><ul><li><a
href="http://www.linkedin.com/pub/stefan-kluth/4/72a/971">Stefan Kluth</a> and <a
href="http://www.linkedin.com/profile?viewProfile=&#038;key=15670615">Stefan Stonjek</a> (<a
href="http://atlas.ch/">ATLAS@LHC</a>, <a
href="http://www.mpp.mpg.de/">Max-Planck-Institut für Physik, Munich</a>).<br
/> Thank you for many helpful discussions, support and beautiful times in Munich!</li></ul><ul><li><a
href="http://people.cs.uchicago.edu/~borja/">Borja Sotomayor</a> (<a
href="http://www.uchicago.edu/">University of Chicago</a>, <a
href="http://globus.org">Globus Alliance</a>).<br
/> He did a great job as GSoC mentoring organization administrator. Thank you for this and for introducing the MUD <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</li></ul><ul><li>Paul D Marshall.<br
/> Thank you for discussions regarding dynamic deployment of computing resources.</li></ul><ul><li><a
href="http://www.linkedin.com/pub/xiaoming-gao/9/a90/348">Xiaoming Gao</a>.<br
/> Thank you for the exciting cooperation regarding <em>Virtual Block Stores</em> for Nimbus.</li></ul><ul><li><a
href="http://en.wikipedia.org/wiki/Alex_Martelli">Alex Martelli</a>.<br
/> He answered <a
href="http://stackoverflow.com/questions/1185660/python-is-os-read-os-write-on-an-os-pipe-threadsafe">an important Python question</a> that most people could not answer. This smoothed my way to implementing inter-thread communication in <em>Clobi&#8217;s Resource Manager</em>.</li></ul><h4>Clobi @ Google Code</h4><p>In <a
href="http://gehrcke.de/2009/08/distribute-high-performance-computing-jobs-among-multiple-computing-clouds/">this blog post</a> I introduced <strong>Clobi</strong>, the result of this <em>Google Summer of Code</em> project. Today, I created a new project page for <em>Clobi</em> on <em>Google Code</em>: <a
href="http://code.google.com/p/clobi/">http://code.google.com/p/clobi/</a></p><p>As a first action, I pushed my local <a
href="http://en.wikipedia.org/wiki/Mercurial_%28software%29">Mercurial</a> code repository into the online repository. You can browse the code <a
href="http://code.google.com/p/clobi/source/browse/#hg">here</a> and you can look through the development history (the commits I&#8217;ve made) <a
href="http://code.google.com/p/clobi/source/list">here</a>.</p><p>Then I prepared a test release of all <em>Clobi</em> components. You can get it <a
href="http://code.google.com/p/clobi/downloads/list">in the download section</a>.</p><p>That&#8217;s all about <em>Clobi</em> for the time being. Starting tomorrow, I will continue with my master&#8217;s thesis project in Physics (about <em>Magnetic Particle Imaging</em>). Next week, the <a
href="http://www.icmrm10.montana.edu/">ICMRM</a> conference in Montana starts and I have to make a poster for my contribution (abstract <a
href="gehrcke_ICMRM09_abstr_MPI_MMF2vsSPM_090713.pdf">here</a>).</p><p><strong>I will continue working on <em>Clobi</em></strong> when I have more free time <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/08/google-summer-of-code-end-code-upload-and-acknowledgement/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>new system successfully tested: &#8220;Distribution of High Performance Computing Jobs among Multiple Computing Clouds&#8221;</title><link>http://gehrcke.de/2009/08/distribute-high-performance-computing-jobs-among-multiple-computing-clouds/</link> <comments>http://gehrcke.de/2009/08/distribute-high-performance-computing-jobs-among-multiple-computing-clouds/#comments</comments> <pubDate>Tue, 18 Aug 2009 12:14:26 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[CernVM]]></category> <category><![CDATA[Clobi]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Nimbus]]></category> <category><![CDATA[Python]]></category> <category><![CDATA[Science]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=794</guid> <description><![CDATA[<p>Hello you out there!</p><p>I just started running the first serious test of the system I&#8217;ve developed during this year&#8217;s Google Summer of Code. If I wanted to put it in sensational words, the test could be called &#8220;Distribution of Particle Physics High Performance Computing Jobs among Multiple Computing Clouds&#8221;; just to get some readers [...]]]></description> <content:encoded><![CDATA[<p>Hello you out there!</p><p>I just started running the first serious test of the system I&#8217;ve developed during this year&#8217;s <a
href="http://code.google.com/soc/">Google Summer of Code</a>. If I wanted to put it in sensational words, the test could be called <strong>&#8220;Distribution of Particle Physics High Performance Computing Jobs among Multiple Computing Clouds&#8221;</strong>; just to get some readers <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> . During the test, there will be some time when I just sit around and watch my monitor, so I decided to share my experience with the new system and keep a record of the test progress in this blog post.<span
id="more-794"></span></p><p><strong>Content</strong> (don&#8217;t worry, the sections are short):</p><ul><li><a
href="#intro">0 Introduction</a></li><li><a
href="#prep">1 Preparation</a></li><li><a
href="#start">2 VM startup</a></li><li><a
href="#sessmon">3 Session monitoring (number of Job Agents)</a></li><li><a
href="#submit">4 Job submission</a></li><li><a
href="#sessmon2">5 Session monitoring (number of jobs)</a></li><li><a
href="#jobmon">6 Job monitoring</a></li><li><a
href="#jobcomplete">7 Job completion, output receipt</a></li><li><a
href="#output">8 Examine output</a></li><li><a
href="#shutdown">9 VM shutdown</a></li><li><a
href="#appendix">10 Appendix</a></li></ul><h4><a
name="intro">0 Introduction</a></h4><p>First of all, I have to introduce the system. This is the longest section.</p><p>It&#8217;s a job scheduling system supporting Virtual Machines (VMs) in multiple <a
href="http://en.wikipedia.org/wiki/Infrastructure_as_a_service">Infrastructure-as-a-Service computing clouds</a>. Because of this, I came up with the name <strong>&#8220;Clobi&#8221;</strong>, which loosely derives from &#8220;cloud&#8221; and &#8220;combination&#8221;. If you know a better name, then please let me know <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> . Currently, the system supports <a
href="http://workspace.globus.org/">Nimbus</a> (and it&#8217;s prepared for Cumulus, Nimbus&#8217; storage service) and <a
href="http://aws.amazon.com/">Amazon Web Services</a> (more precisely <a
href="http://aws.amazon.com/ec2/">EC2</a>, <a
href="http://aws.amazon.com/sqs/">SQS</a>, <a
href="http://aws.amazon.com/simpledb/">SimpleDB</a>, <a
href="http://aws.amazon.com/s3/">S3</a>).</p><p>It&#8217;s an &#8220;elastic&#8221; and &#8220;scalable&#8221; job system that can set up a huge computing resource pool almost instantly. VMs are added to or removed from the pool dynamically, based on need and demand. Jobs are submitted and processed using an asynchronous queueing system. An arbitrary number of clients can submit jobs to an existing resource pool. The basic reliability is inherited from the reliability of the core messaging components Amazon SQS (for queueing) and Amazon SimpleDB (for bookkeeping): to prevent data from being lost or becoming unavailable, these services store it redundantly and geographically dispersed across multiple datacenters.</p><p>Furthermore, the job system&#8217;s components are highly decoupled, which allows single components to fail or to get re-initialized without affecting the others.</p><p>The motivating application for this system is <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasComputing">ATLAS Computing</a> (for the <a
href="http://atlas.ch/">ATLAS experiment</a> at <a
href="http://lhc.web.cern.ch/lhc/">LHC</a>, <a
href="http://public.web.cern.ch/public/">CERN</a> (Geneva)): a common ATLAS Computing application (the so-called &#8220;full chain&#8221;) will be run during this test.</p><p>But the basic system is totally generic and can be used whenever it&#8217;s convenient to distribute jobs among different clouds. This is the case, for example, when one covers the baseline demand for computing power at low cost by operating one&#8217;s own Nimbus cloud, but wants to be able to instantly balance out peaks in demand by simply adding Amazon’s EC2 to the resource pool for a certain amount of time. By using <strong>Clobi</strong>, combining different clouds into one big resource pool becomes very easy.</p><p>These are the main components used during this test:</p><ul><li>a <strong>special ATLAS Software Virtual Machine image</strong> based on <a
href="http://cernvm.cern.ch/cernvm/">CernVM</a>. I placed it on <a
href="http://workspace.globus.org/clouds/nimbus.html">Nimbus Teraport Cloud</a> and (as Amazon Machine Image) on S3 for EC2</li></ul><ul><li>the <strong>Clobi Resource Manager</strong> (observing job queues, starting/killing VMs, &#8230;)</li></ul><ul><li>the <strong>Clobi Job Agent</strong> (running on VMs, polling &#038; running jobs, bookkeeping, &#8230;)</li></ul><ul><li>the <strong>Clobi Job Management Interface</strong> (providing methods to submit / remove / kill / monitor / &#8230; jobs)</li></ul><ul><li>a <a
href="http://ganga.web.cern.ch/ganga/">Ganga</a> <strong>Clobi Backend</strong> (integrates <strong>Clobi</strong> into <a
href="http://ganga.web.cern.ch/ganga/">Ganga</a>, which is &#8220;an easy-to-use frontend for job definition and management&#8221;)</li></ul><p>The role of these components will become clearer in the following sections.</p><p>Let me show step by step (but only very roughly) how I use <strong>Clobi</strong> within the first serious test. Many details are left out, but you will get the idea. After reading this blog post, you will have an overview of what the system does and what I&#8217;ve actually done during the summer.</p><h4><a
name="prep">1 Preparation</a></h4><p>I&#8217;ve prepared session/cloud configuration files. Using them, I started a new <strong>Clobi session</strong> (a resource pool) with the <strong>Clobi Resource Manager</strong> (it&#8217;s a <a
href="http://www.python.org/">Python</a> application that I run locally, here on my desktop machine). First, it performs a lot of configuration and initialization work, including setting up SQS queues and SimpleDB domains. Interaction with Amazon Web Services is done via the <a
href="http://code.google.com/p/boto/">boto</a> module for Python. After initialization, the Resource Manager offers an interactive mode (it&#8217;s a multi-threaded console application with a user interface, built using the <a
href="http://excess.org/urwid">urwid</a> module for Python).</p><h4><a
name="start">2 VM startup</a></h4><p>Using the <strong>Resource Manager</strong>, I started one VM on EC2 and one on Nimbus Teraport Cloud, both based on &#8220;the special ATLAS Software VM&#8221;, containing <a
href="http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/">ATLAS Software 15.2.0</a> and the <strong>Clobi Job Agent</strong>. Starting VMs manually is done with a very simple command. The driving forces in the background are boto in the case of EC2 and the <a
href="http://workspace.globus.org/vm/TP2.2/dev/reference.html">Nimbus cloud reference client</a>, which I&#8217;ve wrapped and control via Python&#8217;s subprocess module. The main loop thread of the <strong>Resource Manager</strong> periodically polls the states of newly started EC2 instances and the states of Nimbus client subprocesses to figure out whether the requested actions succeeded.</p><p>The following screenshot shows the <strong>Resource Manager</strong> in action (you basically see a terminal window). Follow a bit of the log. As you can see, it&#8217;s very easy to run VMs and, after a certain amount of time, the <strong>Resource Manager</strong> detects that both VMs have successfully started booting:</p><div
id="attachment_796" class="wp-caption aligncenter" style="width: 222px"><a
href="http://gehrcke.de/wp/wp-content/uploads/001_RM_VMs_started.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/001_RM_VMs_started-212x300.png" alt="Clobi Resource Manager showing two started Virtual Machines" title="001_RM_VMs_started" width="212" height="300" class="size-medium wp-image-796" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> showing two started Virtual Machines</p></div><p>The EC2 VM needed around 10 minutes to start up, while the Nimbus VM needed 20 minutes. Reason: the AMI is ~10 GB in size; the image on Nimbus is ~20 GB (wasted space &#038; time, but it&#8217;s just a test..).</p><p>You might say: &#8220;Only two VMs? Boring..&#8221;. I say: I could have taken several hundred. The point is: it would not make any difference, except in cost and in the amount of space used for log files. <strong>Clobi</strong> is built on technology that is designed to work reliably across different orders of magnitude. This is often called &#8220;scalable&#8221; or &#8220;elastic&#8221;. Basically, this property is inherited from Amazon&#8217;s SQS, SimpleDB and S3, which <strong>Clobi</strong> uses to manage and control the system.</p><h4><a
name="sessmon">3 Session monitoring (number of Job Agents)</a></h4><p>I am skipping almost all the details of how the components exchange information. But I have to tell you the following so as not to confuse you completely:</p><ul><li>the <strong>Resource Manager</strong> gave some bootstrap information to the VMs.</li></ul><ul><li>the <strong>Job Agent</strong> is automatically invoked on VM operating system startup.</li></ul><p>Using the bootstrap information, each VM&#8217;s <strong>Job Agent</strong> &#8220;registers with SimpleDB&#8221;. The <strong>Resource Manager</strong> has a monitoring function that checks SimpleDB for running Job Agents:</p><div
id="attachment_803" class="wp-caption aligncenter" style="width: 310px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/002_RM_JAs_running.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/002_RM_JAs_running-300x169.png" alt="Clobi Resource Manager showing two started Job Agents" title="002_RM_JAs_running" width="300" height="169" class="size-medium wp-image-803" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> showing two started <strong>Clobi Job Agents</strong></p></div><p>Voilà, now it&#8217;s definitely known that both VMs successfully started their <strong>Job Agents</strong>. These start in a &#8220;watching/lurking&#8221; state, periodically polling SQS for jobs.</p><h4><a
name="submit">4 Job submission</a></h4><p>The SimpleDB / SQS / S3 data structure together with <strong>Clobi&#8217;s Job Management Interface</strong> makes it possible to submit jobs with different priorities, to remove jobs, to kill running jobs and to monitor jobs. Furthermore, transmission and receipt of an input/output sandbox archive is possible. This is needed to deliver executables and small input data and to receive small output data as well as stdout/err and other logs.</p><p>I downloaded Ganga and installed it on my local machine. Then I started developing a new &#8220;<strong>Clobi backend</strong>&#8221; to integrate <strong>Clobi&#8217;s Job Management Interface</strong> into Ganga. Using this new backend, it&#8217;s possible to submit/kill/monitor/.. jobs directly from the Ganga interface, using Ganga&#8217;s common job description and management commands.</p><p>To test the system, I&#8217;ve prepared some shell scripts that run <a
href="http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/">&#8220;The Full Chain&#8221;</a> on the worker nodes. This is a very good test to validate the whole system: it needs some very small input files, only works if the ATLAS Software was set up properly (uoooh.. not trivial!), stresses the VM (the simulation step consumes a lot of CPU power) and leaves some small output files for the output sandbox.</p><p>At Ganga startup, I provided a configuration file containing a small amount of essential information about the <strong>Clobi session</strong> that I had set up earlier via the <strong>Resource Manager</strong>. From this configuration file, Ganga&#8217;s <strong>Clobi backend</strong> knows, for example, which SimpleDB domain to query and to which SQS queues the job messages must be submitted. Using this bootstrap information, <strong>an arbitrary number of Gangas could be used from anywhere to submit and manage jobs</strong> within this particular <strong>Clobi session</strong>.</p><p>I will now submit the same job (the &#8220;full chain&#8221; thing) three times: the Nimbus VM has two virtual cores and its <strong>Job Agent</strong> will try to receive and run two jobs at the same time. The EC2 VM (m1.small) only has one virtual core. Hence, three jobs are needed to use the VMs to full capacity <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> . This is a screenshot from the Ganga terminal session where I submitted the jobs:</p><div
id="attachment_805" class="wp-caption aligncenter" style="width: 227px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/003_Ganga_submitted.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/003_Ganga_submitted-217x300.png" alt="Ganga: job submission via Clobi backend" title="003_Ganga_submitted" width="217" height="300" class="size-medium wp-image-805" /></a><p
class="wp-caption-text">Ganga: job submission via <strong>Clobi backend</strong></p></div><p>The <strong>Clobi</strong> backend successfully did its job: it created three <strong>Clobi</strong> job IDs, submitted three SQS messages and uploaded three input sandbox archives.</p><h4><a
name="sessmon2">5 Session monitoring (number of jobs)</a></h4><p>The <strong>Resource Manager</strong> is able to observe the queues and to determine the number of jobs submitted to them. It recognizes two jobs in the queue for priority 2:<br
/><div
id="attachment_809" class="wp-caption aligncenter" style="width: 310px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/004_RM_2jobs.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/004_RM_2jobs-300x170.png" alt="Clobi Resource Manager detected two jobs in the queues" title="004_RM_2jobs" width="300" height="170" class="size-medium wp-image-809" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> detected two jobs in the queues</p></div></p><p>Only two? Maybe one <strong>Job Agent</strong> polled a job right away after submission, or maybe the SQS measurement was not exact (this is possible, too). Anyway, a short time later there is only one job left in the queues and then they are empty. This means that the <strong>Job Agent</strong> on the Nimbus VM successfully grabbed two jobs and the one on the EC2 VM grabbed one:<br
/><div
id="attachment_812" class="wp-caption aligncenter" style="width: 310px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/005_RM_0jobs.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/005_RM_0jobs-300x169.png" alt="Clobi Resource Manager detects zero jobs in the queues" title="005_RM_0jobs" width="300" height="169" class="size-medium wp-image-812" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> detects zero jobs in the queues</p></div></p><p>While Ganga and I are waiting for the jobs to finish (this takes some time and Ganga periodically polls the state of the jobs via <strong>Clobi backend / Clobi Job Management Interface</strong>), I&#8217;ll use the time to point out an important fact: the goal is to automate the observe-queues-and-start/kill-VMs-as-required process in the future. The current <strong>Resource Manager</strong> is well prepared for this. Let me show you the monitoring loop:<br
/><div
id="attachment_814" class="wp-caption aligncenter" style="width: 301px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/006_RM_monitoring_loop.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/006_RM_monitoring_loop-291x300.png" alt="Clobi Resource Manager showing its monitoring loop" title="006_RM_monitoring_loop" width="291" height="300" class="size-medium wp-image-814" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> showing its monitoring loop</p></div></p><p>It periodically observes the number of jobs in the queues and the number of running <strong>Job Agents</strong>. Based on this information, the <strong>Resource Manager</strong> could easily start / kill virtual machines (I&#8217;ve already demonstrated how easy starting is; killing is described later). I have not implemented this algorithm yet, because a) I had no time and b) although I could have done it quick and dirty, I really did not need this feature to develop and test the rest of the system. But this feature will come, because if it&#8217;s implemented properly with intelligent policies, it&#8217;s just great.</p><h4><a
name="jobmon">6 Job monitoring</a></h4><p>As I&#8217;ve already mentioned, Ganga periodically checks the jobs&#8217; states. For this purpose, the <strong>Ganga Clobi backend</strong> provides a special method that Ganga calls from time to time from one of its monitoring threads. Normally this happens quietly, but I&#8217;ve put some debug output into this method. Let&#8217;s check it out:<br
/><div
id="attachment_815" class="wp-caption aligncenter" style="width: 310px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/007_Ganga_monitoring_jobs.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/007_Ganga_monitoring_jobs-300x97.png" alt="Ganga receives monitoring information via the Clobi backend" title="007_Ganga_monitoring_jobs" width="300" height="97" class="size-medium wp-image-815" /></a><p
class="wp-caption-text">Ganga receives monitoring information via the <strong>Clobi backend</strong></p></div></p><h4><a
name="jobcomplete">7 Job completion, output receipt</a></h4><p>After some more time, Ganga discovered that one of the three jobs finished successfully. This means that the <strong>Job Agent</strong> detected a return code of 0 from the job shell script and successfully stored the output sandbox archive on S3. At this point, the <strong>Clobi backend</strong> triggers the download and extraction of the output sandbox archive. This looks like this:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">Clobi                              : INFO     status for job-090818044445-3585-3088: completed_success
Ganga.GPIDev.Lib.Job               : INFO     job 60 status changed to &quot;completed&quot;
Clobi                              : INFO     download atlassessions/0907210728-testsess-0c7e/jobs/out_sndbx_job-090818044445-3585-3088.tar.bz2 from S3
Clobi                              : INFO     store key 0907210728-testsess-0c7e/jobs/out_sndbx_job-090818044445-3585-3088.tar.bz2 as file /home/gurke/gangadir/workspace/gurke/LocalAMGA/60/output/out_sndbx_job-090818044445-3585-3088.tar.bz2 from bucket atlassessions
Clobi                              : INFO     Download of output sandbox archive successfull.</pre></div></div></div></div></div></div></div><p>Did you doubt that this is my first serious test, given that everything has worked so far? Some parts of the system have already been tested extensively, of course. But the <strong>Clobi backend</strong> for Ganga made the transition from vitally-important-features-missing to just-scratch-along-usability only a few hours ago. I&#8217;m really very happy that everything has worked so far, but the output sandbox archive extraction could be improved:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">Clobi                              : CRITICAL Error while extracting output sandbox
Clobi                              : CRITICAL Traceback:
Traceback (most recent call last):
  File &quot;/mnt/hgfs/E/gsoc_code_repo/ganga_clobi_backend/Clobi/Clobi.py&quot;, line 243, in clobi_dl_extrct_outsandbox_arc
    sp = subprocess.Popen(
NameError: global name 'subprocess' is not defined</pre></div></div></div></div></div></div></div><p>Yoooah, I (want to) use Python&#8217;s subprocess module to extract the <code>tar.bz2</code> archive with the system&#8217;s <code>tar</code>, but I forgot to <code>import subprocess</code> <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> . This is easily forgotten and easily fixed <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> . Btw: of course I got three downloaded output archives and three extraction errors <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p><h4><a
name="output">8 Examine output</a></h4><p>The output sandbox archive files were stored on my local machine by Ganga&#8217;s <strong>Clobi backend</strong>. I take a look into one of them by extracting it manually:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ ls
out_sndbx_job-090818044451-3585-01c2.tar.bz2
$ tar xjf out_sndbx_job-090818044451-3585-01c2.tar.bz2
$ ls
AOD_007410_00001.pool.root  evgen.log  joblog_job-090818044451-3585-01c2  recoAOD.log
EVGEN_007410_00001.pool.root  jobagent_log  out_sndbx_job-090818044451-3585-01c2.tar.bz2</pre></div></div></div></div></div></div></div><p><strong>Great! The AOD file is there</strong>. This means that 1) <strong>Clobi</strong> did perfect work to control and manage the job and 2) the ATLAS Computing part (&#8220;The Full Chain&#8221;) worked perfectly:</p><ul><li>the interaction between a certain particle (which was defined within the input sandbox) and the ATLAS detector was successfully simulated.</li></ul><ul><li>the ATLAS detector output (basically times and voltages) was calculated successfully.</li></ul><ul><li>particle tracks and energy deposits were successfully reconstructed from times and voltages.</li></ul><ul><li>an event summary was successfully built from tracks and energy deposits.</li></ul><p>&#8220;The summary&#8221; is saved within the AOD file, which successfully returned to my local machine. Cool. Every single part of the system worked as it should (psss, don&#8217;t think of the extraction&#8230;).</p><h4><a
name="shutdown">9 VM shutdown</a></h4><p>You may have asked yourself how VMs can be killed dynamically. Besides the &#8220;hard kill&#8221; (invoking Nimbus/EC2 API calls to shut down a VM), I&#8217;ve implemented a mechanism that I call &#8220;soft kill&#8221;: the <strong>Resource Manager</strong> sets the &#8220;soft kill flag&#8221; for a specific VM (in SimpleDB) and the corresponding <strong>Job Agent</strong> checks it from time to time. Once the flag is set, the <strong>Job Agent</strong> waits until all currently running jobs are done and then shuts down the VM. Let&#8217;s watch it in action (I had to look up the command of my own application, too little sleep recently!):<br
/><div
id="attachment_820" class="wp-caption aligncenter" style="width: 288px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/008_RM_softkill.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/008_RM_softkill-278x300.png" alt="Clobi Resource Manager setting up the softkill flag for both VMs" title="008_RM_softkill" width="278" height="300" class="size-medium wp-image-820" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> setting up the softkill flag for both VMs</p></div></p><p>After waiting some time we see the number of running <strong>Job Agents</strong> decrease to zero..<br
/><div
id="attachment_823" class="wp-caption aligncenter" style="width: 288px"><a
href="http://gehrcke.de/wp/wp-content/uploads/2009/08/009_RM_softkill_0jobagents.png" rel="lightbox[794]"><img
src="http://gehrcke.de/wp/wp-content/uploads/2009/08/009_RM_softkill_0jobagents-278x300.png" alt="Clobi Resource Manager detected that both Job Agents / VMs have shut down" title="009_RM_softkill_0jobagents" width="278" height="300" class="size-medium wp-image-823" /></a><p
class="wp-caption-text"><strong>Clobi Resource Manager</strong> detected that both Job Agents / VMs have shut down</p></div></p><h4><a
name="appendix">10 Appendix</a></h4><p>If you are very interested, you can find some additional material:</p><ul><li>My earlier work on this topic (from last year), &#8220;Amazon Web Services for ATLAS Computing&#8221; (AWSAC) can be found here: <a
href="http://gehrcke.de/awsac">http://gehrcke.de/awsac</a>.</li></ul><ul><li>The original GSoC project description <a
href="http://gehrcke.de/projects/google-summer-of-code/">can be found here</a>.</li></ul><ul><li>I&#8217;ve already written <a
href="http://gehrcke.de/category/technical-stuff/google-summer-of-code/">some blog posts about this project during Google Summer of Code</a>.</li></ul><ul><li>The last visualizations of the system (from the planning period) are these two schemes: <a
href="http://gehrcke.de/gsoc/jobsystem_scheme.jpeg" rel="lightbox[794]">one</a>, <a
href="http://gehrcke.de/gsoc/jobsystem_scheme_two_sessions.jpeg" rel="lightbox[794]">two</a>.</li></ul><p>A detailed, up-to-date and exact description of the system (&#8220;<strong>Clobi</strong>&#8221;) is planned for the future.</p><p>I will need the last days of GSoC to implement some important missing features, to find and fix bugs and to clean everything up to make it presentable. I will definitely keep working on this project after GSoC (as time allows, of course). I am currently thinking about pushing the project to either <a
href="http://bitbucket.org/">Bitbucket</a> or <a
href="http://code.google.com/">Google Code</a>. Both support <a
href="http://en.wikipedia.org/wiki/Mercurial_%28software%29">Mercurial</a> repositories, which is what I have used for my code so far (locally).</p><p>If you like this, spread it! Every question and/or comment is much appreciated!</p><p>Thanks for listening <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /></p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/08/distribute-high-performance-computing-jobs-among-multiple-computing-clouds/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>&#8220;What&#8217;s faster&#8211;a supercomputer or EC2?&#8221; A reconsideration&#8230;</title><link>http://gehrcke.de/2009/08/whats-faster-a-supercomputer-or-ec2-a-reconsideration/</link> <comments>http://gehrcke.de/2009/08/whats-faster-a-supercomputer-or-ec2-a-reconsideration/#comments</comments> <pubDate>Thu, 06 Aug 2009 19:55:02 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=760</guid> <description><![CDATA[<p>In his blog, Ian Foster recently presented What&#8217;s faster &#8212; a supercomputer or EC2?. It&#8217;s not about the computing power itself, but about</p><p>elapsed time from submission to the completion of execution. (In other words, the time that we must wait before execution starts becomes significant.)</p><p>Via considering 300 seconds startup-time and 100 seconds [...]]]></description> <content:encoded><![CDATA[<p>In his blog, <a
href="http://www.mcs.anl.gov/about/people_detail.php?id=285">Ian Foster</a> recently presented <a
href="http://ianfoster.typepad.com/blog/2009/08/whats-fastera-supercomputer-or-ec2.html">What&#8217;s faster &#8212; a supercomputer or EC2?</a>. <span
id="more-760"></span>It&#8217;s not about the computing power itself, but about</p><blockquote><p>elapsed time from submission to the completion of execution. (In other words, the time that we must wait before execution starts becomes significant.)</p></blockquote><p>Comparing a 300-second startup time plus a 100-second execution time of a benchmark on <a
href="http://aws.amazon.com/ec2/">EC2</a> (32 VMs) with a probability of only about 1/3 that the supercomputer finishes the task within 400 s <strong>after job submission</strong>, Ian concludes that EC2 is more convenient for such tasks. This is, of course, true, but only under these very special conditions.</p><p>We have to consider that EC2 is billed per hour, which means that it&#8217;s uneconomical to run EC2 instances for less than an hour. Hence, the comparison has little meaning for real use cases. Instead of running 32 nodes for 100 seconds each, one should have started <strong>one</strong> VM, which would have finished within the first hour (32 &#215; 100 s = 3200 s, assuming the work parallelizes perfectly) for 1/32 of the total price. EC2 is simply not adequate for tasks that need less than an hour. Considering tasks that last at least one hour changes the whole picture dramatically, since then &#8220;<em>the time that we must wait before execution starts becomes</em>&#8221; less significant compared to the execution time: the supercomputer will take the lead again because of its computing power.</p><p>Finally, while answering the question <em>What&#8217;s faster &#8212; a supercomputer or EC2?</em>, one has to say that most people are not lucky enough to have access to a supercomputer. But they have access to EC2, <em>instantly</em>! I think <strong>that</strong> is the real advantage of EC2 <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/08/whats-faster-a-supercomputer-or-ec2-a-reconsideration/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Counting stuff in Python and assembling a histogram: analyze thread communication via os.pipe()</title><link>http://gehrcke.de/2009/08/counting-stuff-in-python-histogram-thread-pipe-communication/</link> <comments>http://gehrcke.de/2009/08/counting-stuff-in-python-histogram-thread-pipe-communication/#comments</comments> <pubDate>Fri, 31 Jul 2009 22:41:38 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Python]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=718</guid> <description><![CDATA[<p>These days I built up an inter-thread communication via os.pipe(). While one thread is only writing to the &#8220;write end&#8221; of the pipe, the other thread is only reading from the &#8220;read end&#8221;. The latter is realized with select.select(), which observes the read end and raises an os.read() event whenever it recognizes data in the [...]]]></description> <content:encoded><![CDATA[<p>These days I built up an inter-thread communication via <code>os.pipe()</code>. While one thread is only writing to the &#8220;write end&#8221; of the pipe, the other thread is only reading from the &#8220;read end&#8221;. <span
id="more-718"></span>The latter is realized with <code>select.select()</code>, which observes the read end and triggers an <code>os.read()</code> call whenever it detects data in the pipe. Hence, whenever something is written to the pipe on one side, it is immediately read on the other side. This means one string written by one call to <code>os.write()</code> is read entirely by one call to <code>os.read()</code>.</p><p>Really? What happens when the writing thread pushes a huge amount of data into the pipe at high frequency? I wanted to test this.</p><h5>Analysis</h5><p>From the writing thread, I pushed a string of fixed length <strong>L</strong> (terminated by a newline) through the pipe <strong>N</strong> times, one after the other, within a loop. In the other thread, I called <code>os.read()</code> on the pipe, based on <code>select.select()</code> events, and checked the length of each line returned by one <code>os.read()</code>. To evaluate the data, I needed a basic histogram. How can fast counting be done in Python? Here is a neat way to count events:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1"><span class="kw1">import</span> <span class="kw3">collections</span>
histogram <span class="sy0">=</span> <span class="kw3">collections</span>.<span class="me1">defaultdict</span><span class="br0">&#40;</span><span class="kw2">int</span><span class="br0">&#41;</span>
<span class="kw1">for</span> event <span class="kw1">in</span> events:
    histogram<span class="br0">&#91;</span>event<span class="br0">&#93;</span> +<span class="sy0">=</span> <span class="nu0">1</span></pre></div></div></div></div></div></div></div><p>If you access a non-existent key in a <code>defaultdict(callable)</code>, it is initialized automatically by calling <code>callable()</code> without arguments. For the histogram this can be used by creating a <code>defaultdict</code> of type <code>int</code>. Accessing <code>histogram[non-existing-key]</code> first initializes the value with <code>int()</code>, which gives zero. So <code>histogram[event] += 1</code> is safe and simply counts your events without the need to pre-define them <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> This is the way I used it:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1">lines <span class="sy0">=</span> <span class="kw3">os</span>.<span class="me1">read</span><span class="br0">&#40;</span>pipe_read<span class="sy0">,</span><span class="nu0">9999999</span><span class="br0">&#41;</span>.<span class="me1">splitlines</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
<span class="kw1">for</span> line <span class="kw1">in</span> lines:
    histogram<span class="br0">&#91;</span><span class="kw2">len</span><span class="br0">&#40;</span>line<span class="br0">&#41;</span><span class="br0">&#93;</span> +<span class="sy0">=</span> <span class="nu0">1</span></pre></div></div></div></div></div></div></div><p>I don&#8217;t need a big visualization. A sorted listing in histogram style is enough for me:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1"><span class="kw1">for</span> key <span class="kw1">in</span> <span class="kw2">sorted</span><span class="br0">&#40;</span>histogram<span class="sy0">,</span> key<span class="sy0">=</span>histogram.<span class="me1">get</span><span class="sy0">,</span> reverse<span class="sy0">=</span><span class="nu0">1</span><span class="br0">&#41;</span>:
    <span class="kw1">print</span> <span class="st0">'%7d %s'</span> % <span class="br0">&#40;</span>histogram<span class="br0">&#91;</span>key<span class="br0">&#93;</span><span class="sy0">,</span> key<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div><p>This sorts the dictionary&#8217;s keys by their counts in descending order (largest first) and prints the count/key pairs with a bit of formatting.</p><p>The optimal result would be only one data point in the histogram: <strong>N</strong> times <strong>L</strong>. This is what I got for short strings. But with increasing <strong>L</strong>, I got a probability of about 10^-5 for chopped lines. For <strong>L = 150</strong>, for example, I got ~100000 times 150 chars, and 1-3 times 1-149 chars.</p><p>This means that in some cases strings returned by <code>os.read()</code> looked like <code>&quot;thisisaline\nthisisaline\nthisis&quot;</code>: <strong>The last few characters of the last line are chopped and will be returned with the next <code>os.read()</code></strong>: <code>&quot;aline\nthisisaline\nthisisaline\n&quot;</code>, and so on. The good thing was that the sum of all received characters <strong>always</strong> matched the number of chars written to the pipe (as expected, but it felt good to measure it).</p><h5>Result and solution</h5><p>In extreme situations (long strings, high writing frequency), relying on the <em>&#8220;one os.write(string), one os.read() returns entire string&#8221;</em> principle is a bad idea, because in rare cases lines will not be read entirely and, hence, get returned chopped. The solution is to do some post-processing to reassemble chopped lines:</p><div
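class="added-example"><p>A minimal, self-contained sketch of the whole experiment, assuming Python 3 on a POSIX system (<code>select.select()</code> does not work on pipes under Windows). The function names <code>count_line_lengths</code> and <code>run_experiment</code> and the 64 KiB read size are made up for this illustration and are not taken from the original code:</p>

```python
import collections
import os
import select
import threading


def count_line_lengths(pipe_read, bufsize=65536):
    """Read newline-terminated lines from a pipe and histogram their lengths."""
    histogram = collections.defaultdict(int)
    prefix = b""  # chopped tail of the previous read, empty at the start
    while True:
        select.select([pipe_read], [], [])  # block until data or EOF
        chunk = os.read(pipe_read, bufsize)
        if not chunk:  # an empty result means the write end was closed
            break
        lines = (prefix + chunk).splitlines(True)  # True keeps the newlines
        prefix = b""
        if not lines[-1].endswith(b"\n"):
            prefix = lines.pop()  # incomplete line: keep it for the next read
        for line in lines:
            histogram[len(line.rstrip(b"\n"))] += 1
    return dict(histogram)


def run_experiment(n_lines=20000, length=150):
    """Push n_lines lines of the given length through a pipe between threads."""
    pipe_read, pipe_write = os.pipe()
    payload = b"x" * length + b"\n"

    def writer():
        for _ in range(n_lines):
            os.write(pipe_write, payload)
        os.close(pipe_write)  # signals EOF to the reader

    thread = threading.Thread(target=writer)
    thread.start()
    histogram = count_line_lengths(pipe_read)
    thread.join()
    os.close(pipe_read)
    return histogram
```

<p>With the reassembly in place, <code>run_experiment(20000, 150)</code> should return the single data point <code>{150: 20000}</code>. The original fragment from the reader thread follows; the line numbers mentioned afterwards refer to that listing.</p></div><div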
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><ol><li
class="li1"><pre class="de1">new_data <span class="sy0">=</span> prefix + <span class="kw3">os</span>.<span class="me1">read</span><span class="br0">&#40;</span>pipe_read<span class="sy0">,</span><span class="nu0">9999999</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">lines <span class="sy0">=</span> new_data.<span class="me1">splitlines</span><span class="br0">&#40;</span><span class="kw2">True</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">prefix <span class="sy0">=</span> <span class="st0">''</span></pre></li><li
class="li1"><pre class="de1"><span class="kw1">if</span> <span class="kw1">not</span> lines<span class="br0">&#91;</span>-<span class="nu0">1</span><span class="br0">&#93;</span>.<span class="me1">endswith</span><span class="br0">&#40;</span><span class="st0">&quot;<span class="es0">\n</span>&quot;</span><span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">    prefix <span class="sy0">=</span> lines<span class="br0">&#91;</span>-<span class="nu0">1</span><span class="br0">&#93;</span></pre></li><li
class="li1"><pre class="de1">    <span class="kw1">del</span> lines<span class="br0">&#91;</span>-<span class="nu0">1</span><span class="br0">&#93;</span></pre></li><li
class="li1"><pre class="de1"><span class="kw1">for</span> line <span class="kw1">in</span> lines:</pre></li><li
class="li1"><pre class="de1">    histogram<span class="br0">&#91;</span><span class="kw2">len</span><span class="br0">&#40;</span>line.<span class="me1">rstrip</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#93;</span> +<span class="sy0">=</span> <span class="nu0">1</span></pre></li></ol></div></div></div></div></div></div></div><p>In <code>line 2</code>, I keep the trailing newline characters in the split parts. In <code>lines 4-6</code>, I check whether the last part has a trailing newline. If not, it is a chopped line! It is deleted from the list of lines and prepended to the next string returned by <code>os.read()</code> (see <code>line 1</code>; <code>prefix</code> must be initialized with an empty string before the first read). Before the resulting lines are put into the histogram, trailing newlines are removed with <code>.rstrip()</code>. Can you imagine what the histogram says now? <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /></p><p>E.g. <code>&quot;6000000 150&quot;</code>, which means that a string of length 150 was received 6000000 times. That&#8217;s the <strong>N</strong> times <strong>L</strong> data point I wanted!</p><h5>Conclusion</h5><p>This shows that one should not rely on simple <code>os.write()</code> and <code>os.read()</code> calls on an <code>os.pipe()</code> to establish a communication channel between two threads. A &#8220;communication protocol&#8221; has to be applied. Then, <strong>with a small piece of post-processing and reassembling code, the communication between the two threads is reliable and predictable</strong>: every newline-terminated line written to the pipe by <em>Thread 1</em> is read and evaluated entirely by <em>Thread 2</em>. Great <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/08/counting-stuff-in-python-histogram-thread-pipe-communication/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>CernVM: local ATLAS Software &#8212; the clean solution</title><link>http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution/</link> <comments>http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution/#comments</comments> <pubDate>Tue, 30 Jun 2009 14:54:01 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[CernVM]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Nimbus]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=556</guid> <description><![CDATA[<p>In my blog post CernVM: how to set up a local ATLAS Software Release, I presented a brutal approach how to override CVMFS (CernVM&#8216;s filesystem with HTTP backend) to install a local ATLAS Software release. Now I worked out a very clean and smooth solution. This approach allows:</p> to use the local ATLAS Software without [...]]]></description> <content:encoded><![CDATA[<p>In my blog post <a
href="http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/"><em>CernVM: how to set up a local ATLAS Software Release</em></a>, I presented a brute-force approach to overriding <a
href="https://cernvm.cern.ch/project/trac/cernvm/wiki/SyncWebFS">CVMFS</a> (<a
href="http://cernvm.cern.ch/cernvm/">CernVM</a>&#8216;s filesystem with HTTP backend) to install a local <a
href="http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/">ATLAS Software release</a>. Now I have worked out a very clean and smooth solution. This approach allows you:</p><ul><li>to use the local ATLAS Software without &#8220;hacking&#8221; anything</li><li>to use local software and software provided by CVMFS at the same time.</li></ul><p><span
id="more-556"></span></p><p>The introduction and motivation from <a
href="http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/"><em>CernVM: how to set up a local ATLAS Software Release</em></a> is still valid. The rest can be considered deprecated.</p><p>The objective is to install ATLAS Software 15.2.0 to the local filesystem of CernVM 1.2.0 using <a
href="http://atlas.bu.edu/~youssef/pacman/">pacman</a>. For this we need some room in the filesystem (around 10 GB). In this regard, the blog post <a
href="http://gehrcke.de/2009/06/resize-ext3-file-system-in-loopback-file/"><em>Resize ext3 file system in loopback file</em></a> might be interesting for you. This step is necessary on, e.g., <a
href="http://workspace.globus.org/">Nimbus</a>. If you work on Amazon&#8217;s <a
href="http://aws.amazon.com/ec2/">EC2</a>, you may want to attach an <a
href="http://aws.amazon.com/ebs/">EBS volume</a> and install the software there. Useful hints for this are given in this blog post: <a
href="http://gehrcke.de/2009/06/ec2-install-atlas-software-to-an-ebs-volume/"><em>EC2: Install ATLAS Software to an EBS volume</em></a></p><h4>Prepare installation: start from a fresh CernVM</h4><p>Assume a fresh CernVM 1.2.0 (maybe with a bigger file system than the default) that has just booted up. Bootstrap it to the following configuration (via the web interface):</p><ul><li>create the new Linux user <code>atlasuser</code></li><li>under <em>Virtual Organization Configuration</em> choose ATLAS and keep everything else as <em>No</em></li></ul><p>Log in via <code>ssh</code>, using the new <code>atlasuser</code> account. First, the necessary system components have to be installed using <a
href="http://wiki.rpath.com/wiki/Conary">Conary</a>.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla ~]$ sudo conary migrate group-atlas --interactive</pre></div></div></div></div></div></div></div><p>You will need the &#8220;admin password&#8221; you set at bootstrap. If this migration step throws errors for you, consider this blog post: <a
href="http://gehrcke.de/2009/06/cernvm-on-nimbus-kernel-problems/"><em>CernVM on Nimbus: kernel problems</em></a></p><p>Now switch to root via <code>sudo su</code>. Create the directory <code>/opt/atlas-local</code>, which will be the place to put all the local ATLAS related software like CMT and pacman and the ATLAS Software itself. When using an EBS volume, <code>/opt/atlas-local</code> would be a convenient mount point. Give all rights to this directory.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla opt]# mkdir /opt/atlas-local
[root@bla opt]# chmod 777 /opt/atlas-local</pre></div></div></div></div></div></div></div><p>Now we can already start installing an ATLAS Software release using pacman. I do this so often that I&#8217;ve created a little script for it. It checks whether pacman is already there. If not, it downloads &#038; extracts it. pacman is then set up and the ATLAS Software installation command gets invoked. To use this script, create a file <code>/opt/atlas-local/install_atlas_15-2-0.sh</code> and fill it with this content:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="re2">PACMANDIR</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas-local<span class="sy0">/</span>pacman
<span class="re2">ATLASINSTALLDIR</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas-local<span class="sy0">/</span>15.2.0
<span class="kw1">if</span> <span class="br0">&#91;</span> <span class="re5">-e</span> <span class="co1">${PACMANDIR}</span> <span class="br0">&#93;</span>; <span class="kw1">then</span>
    <span class="kw3">echo</span> --info: <span class="co1">${PACMANDIR}</span> exists
    <span class="kw3">echo</span> --info: <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span><span class="sy0">/</span>pacman-<span class="sy0">*</span>
    <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span><span class="sy0">/</span>pacman-<span class="sy0">*</span>
    <span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
    <span class="kw3">echo</span> --info: pacman setup
    <span class="kw3">source</span> setup.sh
<span class="kw1">else</span>
    <span class="kw3">echo</span> --info: <span class="kw2">mkdir</span> <span class="re5">-p</span> and <span class="kw3">cd</span> to <span class="co1">${PACMANDIR}</span>
    <span class="kw2">mkdir</span> <span class="re5">-p</span> <span class="co1">${PACMANDIR}</span>
    <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span>
    <span class="kw3">echo</span> --info: download latest pacman
    <span class="kw2">wget</span> http:<span class="sy0">//</span>atlas.bu.edu<span class="sy0">/</span>~youssef<span class="sy0">/</span>pacman<span class="sy0">/</span>sample_cache<span class="sy0">/</span>tarballs<span class="sy0">/</span>pacman-latest.tar.gz
    <span class="kw3">echo</span> --info: extract..
    <span class="kw2">tar</span> xzf pacman-latest.tar.gz
    <span class="kw3">echo</span> --info: <span class="kw3">cd</span> pacman-<span class="sy0">*</span>
    <span class="kw3">cd</span> pacman-<span class="sy0">*</span>
    <span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
    <span class="kw3">echo</span> --info: setup pacman..
    <span class="kw3">source</span> setup.sh
<span class="kw1">fi</span>
&nbsp;
<span class="kw3">echo</span> --info: <span class="kw2">mkdir</span> -p, <span class="kw3">cd</span> to <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw2">mkdir</span> <span class="re5">-p</span> <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw3">cd</span> <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
&nbsp;
<span class="kw3">echo</span> --info: start installing ATLAS:
<span class="kw3">echo</span> --info: invoke: pacman <span class="re5">-pretend-platform</span> SLC-<span class="nu0">4</span> <span class="re5">-allow</span> trust-all-caches <span class="re5">-get</span> am-IU:15.2.0
pacman <span class="re5">-pretend-platform</span> SLC-<span class="nu0">4</span> <span class="re5">-allow</span> trust-all-caches <span class="re5">-get</span> am-IU:15.2.0</pre></div></div></div></div></div></div></div><p>With this script, the <em>ATLAS Software 15.2.0</em> will get installed to <code>/opt/atlas-local/15.2.0</code> while <em>pacman</em> resides at <code>/opt/atlas-local/pacman</code>. I use the Indiana University mirror (am-IU). You may want to change this. Additionally, one could use <code>:15.2.0+KV</code> to automatically perform the Kit Validation after installation. This is a good idea for a new platform. But for CernVM 1.2.0 + group-atlas this KV always succeeds.</p><p>If you don&#8217;t like using this script at all, then just pick out the important commands. But please think twice about the path where you install the ATLAS Software. Keep in mind that an ATLAS Software release, once installed to a specific path, should <strong>always</strong> be used under this path. By moving it around, you <strong>will</strong> get problems, even with new setup scripts. Furthermore, keeping everything necessary <strong>under</strong> <code>/opt/atlas-local</code> allows you to, e.g., move an EBS volume from one EC2 instance to another without losing important components and functionality.</p><h4>Start ATLAS Software installation using pacman</h4><p>Be sure to work as <code>atlasuser</code>. It&#8217;s simply cleaner not to install the software as <code>root</code>. Then, source the <code>/opt/atlas-local/install_atlas_15-2-0.sh</code> script from above.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlas-local]$ source install_atlas_15-2-0.sh
--info: mkdir -p and cd to /opt/atlas-local/pacman
--info: download latest pacman
--07:58:25--  http://atlas.bu.edu/~youssef/pacman/sample_cache/tarballs/pacman-latest.tar.gz
[...]
pacman-latest.tar.gz' saved [856237/856237]
--info: extract..
--info: cd pacman-3.28 pacman-latest.tar.gz
--info: pwd: /opt/atlas-local/pacman/pacman-3.28
--info: setup pacman..
--info: mkdir -p, cd to /opt/atlas-local/15.2.0
--info: pwd: /opt/atlas-local/15.2.0
--info: start installing ATLAS:
--info: invoke: pacman -pretend-platform SLC-4 -allow trust-all-caches -get am-IU:15.2.0</pre></div></div></div></div></div></div></div><p>This will take a while (around an hour when installing from am-IU to EC2 in the US). In the meantime, you can log in with a second shell and proceed with the next important steps.</p><h4>Make useful CVMFS content local</h4><p>While working on the dirty solution, some problems came up. In particular, I had to hack around the <code>CMTCONFIG</code> problem. The CernVM collaborators have spent a considerable amount of time sorting out these problems. So let&#8217;s profit from their work. I figured out that &#8220;taking over&#8221; their version of the <a
href="http://www.cmtsite.org/">Configuration Management Tool CMT</a> and their <code>cmthome</code> directory is enough to get everything working properly.</p><h5>Copy CernVM&#8217;s CMT from CVMFS to the local filesystem</h5><p>Work as <code>root</code>. First, copy the whole CMT directory from CVMFS (which has <code>/opt/atlas/</code> as mountpoint) to <code>/opt/atlas-local</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla ~]# cp -R /opt/atlas/software/sw/CMT /opt/atlas-local</pre></div></div></div></div></div></div></div><p>Now CMT&#8217;s path has changed, which means that it has to be re-installed. By doing this, <code>/opt/atlas-local/CMT/v1r20p20090520/mgr/setup.sh</code> gets refreshed. This is important for setting up CMT&#8217;s environment later on.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla atlas-local]# cd /opt/atlas-local/CMT/v1r20p20090520/mgr/
[root@bla mgr]# ./INSTALL
============================================
       CMT installation terminated.
           --------------------
 cmt.exe is available on this site for:
    Darwin-PowerMacintosh
    Darwin-i386
    Linux-i686
    Linux-x86_64
    VisualC
============================================</pre></div></div></div></div></div></div></div><p>CMT has successfully found its new location. Btw, let&#8217;s look at the currently existing directory structure of <code>/opt/atlas-local</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla ~]# cd /opt/atlas-local
[root@bla atlas-local]# ls
15.2.0  CMT  install_atlas_15-2-0.sh  pacman</pre></div></div></div></div></div></div></div><p>It should look like that <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /></p><h5>Create <code>cmthome</code> and modify the <code>requirements</code> file</h5><p>The essential part of this step is to grab&#038;modify CernVM&#8217;s <code>requirements</code> file for ATLAS Software releases. Work as <code>atlasuser</code> and copy the <code>cmthome</code> directory to your local ATLAS Software:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlas-local]$ cp -R /opt/atlas/software/15.2.0/15.2.0/cmthome /opt/atlas-local/15.2.0/</pre></div></div></div></div></div></div></div><p>Modify the <code>requirements</code> file within that new <code>cmthome</code> directory so that <code>SITEROOT</code> points to the new local ATLAS Software:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlas-local]$ cd /opt/atlas-local/15.2.0/cmthome
[atlasuser@bla cmthome]$ ls
Makefile  cleanup.csh  cleanup.sh  requirements  setup.csh  setup.sh
[atlasuser@bla cmthome]$ vi requirements
[... modify SITEROOT path ...]
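# Non-interactive alternative to the vi edit (a sketch, not from the original
# session; assumes GNU sed and a single line starting with "set SITEROOT"):
#   sed -i 's|^set SITEROOT .*|set SITEROOT /opt/atlas-local/15.2.0|' requirements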
[atlasuser@bla cmthome]$ cat requirements
#---------------------------------------------------------------------
set CMTSITE STANDALONE
set SITEROOT /opt/atlas-local/15.2.0
macro ATLAS_TEST_AREA ${HOME}/testarea
macro ATLAS_DIST_AREA ${SITEROOT}
apply_tag projectArea
macro SITE_PROJECT_AREA ${SITEROOT}
macro EXTERNAL_PROJECT_AREA ${SITEROOT}
apply_tag opt
apply_tag setup
apply_tag simpleTest
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
set CMTCONFIG i686-slc4-gcc34-opt</pre></div></div></div></div></div></div></div><p>In the next step, this new <code>requirements</code> file will be parsed by CMT to create a setup script, which is needed to set up the correct runtime environment for ATLAS Software.</p><h4>Set up CMT environment, create the ATLAS Software setup script</h4><p>Work as <code>atlasuser</code>. To set up the CMT environment, source <code>/opt/atlas-local/CMT/v1r20p20090520/mgr/setup.sh</code>, which was created while installing CMT.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla ~]$ source /opt/atlas-local/CMT/v1r20p20090520/mgr/setup.sh</pre></div></div></div></div></div></div></div><p>Now, <code>cmt</code> and <code>cmt.exe</code> should be in the path. You can check it by typing <code>cm *TAB*</code>.</p><p>Switch to the <code>cmthome</code> directory and use <code>cmt config</code> to create the setup script for the ATLAS Software runtime environment:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla ~]$ cd /opt/atlas-local/15.2.0/cmthome
[atlasuser@bla cmthome]$ cmt config
sh: manpath: command not found
------------------------------------------
Configuring environment for standalone package.
CMT version v1r20p20090520.
System is Linux-i686
------------------------------------------
Creating setup scripts.
Creating cleanup scripts.</pre></div></div></div></div></div></div></div><h4>Interim report: installation is done!</h4><p>Now we&#8217;re done. This new ATLAS Software installation is ready for use. Each of the previous steps has to be done only once per installation and does not have to be repeated. To use this ATLAS Software, only <code>/opt/atlas-local/15.2.0/cmthome/setup.sh</code> has to be sourced.</p><p>If <code>/opt/atlas-local</code> is the mountpoint of an EBS volume, this volume now contains a fully working ATLAS Software release. It can be attached to any VM running an operating system that supports ATLAS Software (like CernVM + group-atlas, Scientific Linux 3/4, &#8230;) to start production runs with this specific installation.</p><h4>Test the new local ATLAS Software: Run the Full Chain</h4><p>To test &#038; validate the new ATLAS Software installation, let&#8217;s run <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookFullChain">the Full Chain</a>. An easy explanation of what this is and how to realize it with a shell script is given in this blog post: <a
href="http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/"><em>ATLAS Software: How to run The Full Chain</em></a></p><h5>Create working directory, setup runtime environment</h5><p>Work as <code>atlasuser</code>. Create a working directory. Many files will be produced while running the full chain.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla ~]$ mkdir atlasrun_15-2-0___1/
[atlasuser@bla ~]$ cd atlasrun_15-2-0___1</pre></div></div></div></div></div></div></div><p>When the pacman installation from above has finished, you can set up the runtime environment for the new ATLAS Software installation.</p><p><strong>This step always has to be executed after a new login, before using the software.</strong></p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlasrun_15-2-0___1]$ source /opt/atlas-local/15.2.0/cmthome/setup.sh -tag=15.2.0
sh: manpath: command not found
sh: manpath: command not found
sh: manpath: command not found
sh: manpath: command not found
AtlasLogin: WARNING - test directory [/home/atlasuser/testarea/15.2.0] doesn't exist - the runtime environment won't reflect it</pre></div></div></div></div></div></div></div><p><strong>Don&#8217;t forget</strong> the <code>-tag</code> parameter!<br
/> The <code>manpath</code> problem isn&#8217;t an actual problem. In <a
href="http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/">this blog post</a> you can read how to make <code>manpath</code> available. But it&#8217;s not needed. The warning about the test directory is correct, but for our full chain test we don&#8217;t need the testarea. If you need it for your production run, then create the stated directory.</p><h5>Run the full chain!</h5><p>I prepared a little script to run it. It&#8217;s explained in <a
href="http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/"><em>ATLAS Software: How to run The Full Chain</em></a></p><p>Download the script and run it:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlasrun_15-2-0___1]$ wget gehrcke.de/gsoc/athena_whole_chain_1event.sh
[...]
`athena_whole_chain_1event.sh' saved [1229/1229]
[atlasuser@bla atlasrun_15-2-0___1]$ source athena_whole_chain_1event.sh
[...]
`singlepart_singlepi' saved [961/961]
&nbsp;
read CSC file, generate 1 single pion.. create EVGEN file..-&gt; evgen.log
&nbsp;
real    0m42.774s
user    0m29.074s
sys     0m1.292s
&nbsp;
read EVGEN file.. simulate.. create HITS file..-&gt; simul.log
&nbsp;
real    9m30.879s
user    9m2.227s
sys     0m4.936s
&nbsp;
read HITS file.. digitize.. produce RDO file..-&gt; digi.log
&nbsp;
real    2m35.875s
user    2m12.147s
sys     0m5.116s
&nbsp;
read RDO file.. reconstruct.. produce ESD file..-&gt; recoESD.log
&nbsp;
real    3m54.783s
user    3m17.422s
sys     0m6.427s
&nbsp;
read ESD file.. convert.. produce AOD file..-&gt; recoAOD.log
&nbsp;
real    1m44.373s
user    1m31.754s
sys     0m3.687s</pre></div></div></div></div></div></div></div><h4>Success!</h4><p>The full chain performed successfully with the local ATLAS Software installation. At the same time, we could have used the &#8220;remote ATLAS Software&#8221; via CVMFS. I verified this by</p><ul><li>logging in with an additional shell to the same machine</li><li>creating another working directory</li><li>setting up the runtime environment for the remote ATLAS Software 15.2.0 by sourcing <code>/opt/atlas/software/setupScripts/setupAtlasProduction_15.2.0.sh</code></li><li>running my full chain script</li></ul><p>It resulted in</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">read CSC file, generate 1 single pion.. create EVGEN file..-&gt; evgen.log
&nbsp;
real    6m13.287s
user    0m31.648s
sys     0m1.441s
&nbsp;
read EVGEN file.. simulate.. create HITS file..-&gt; simul.log
&nbsp;
real    25m53.893s
user    9m2.742s
sys     0m5.593s
&nbsp;
read HITS file.. digitize.. produce RDO file..-&gt; digi.log
&nbsp;
real    15m2.761s
user    2m12.836s
sys     0m6.455s
&nbsp;
read RDO file.. reconstruct.. produce ESD file..-&gt; recoESD.log
&nbsp;
real    22m21.185s
user    3m19.384s
sys     0m6.891s
&nbsp;
read ESD file.. convert.. produce AOD file..-&gt; recoAOD.log
&nbsp;
real    3m10.793s
user    1m33.584s
sys     0m4.454s</pre></div></div></div></div></div></div></div><p>Pay attention to the long real timings: the five steps add up to about 73 minutes of wall-clock time, compared to about 18 minutes with the local installation, while the user times are almost identical. The reason is clear: the files had to be transferred via HTTP from a proxy server providing the content of CVMFS.</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>ATLAS Software: How to run The Full Chain</title><link>http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/</link> <comments>http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/#comments</comments> <pubDate>Tue, 30 Jun 2009 14:49:36 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Science]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=625</guid> <description><![CDATA[<p>During development, a system running ATLAS Software has to be tested and validated. There are some standard tests that almost don&#8217;t need any input data, stress the system and &#8212; if they run properly &#8212; are a (very) good indicator that everything is set up correctly. I talk about the so-called JobTransforms. By combining these [...]]]></description> <content:encoded><![CDATA[<p>During development, a system running ATLAS Software has to be tested and validated. There are some standard tests that need almost no input data, stress the system and &#8212; if they run properly &#8212; are a (very) good indicator that everything is set up correctly. I am talking about the so-called <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasPythonTrf">JobTransforms</a>. By combining these <em>JobTransforms</em>, the so-called <strong>Full Chain</strong> can be run &#8212; a convenient test. In this blog post I summarize what&#8217;s behind the <em>Full Chain</em> and provide a shell script to easily run it.<span
id="more-625"></span></p><h4>What is &#8220;The Full Chain&#8221;?</h4><p>The so-called <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasPythonTrf">JobTransforms</a> are python scripts that are used to run production tasks. They take an input file, a set of parameters and &#8220;transform&#8221; the input into one or more output files. Combined in the correct order, they build up <em>The Full Chain</em>. I will summarize the elements of the chain now:</p><ul><li><strong>Step 1)</strong> <strong>Event generation</strong>: a virtual particle gets created with a specific energy and direction.<br
/> <strong>Input</strong>: particle definition file. <strong>Output</strong>: EVGEN file.</li><li><strong>Step 2)</strong> <strong>Simulation</strong>: the interaction between this particle and the detector is simulated.<br
/> <strong>Input</strong>: EVGEN file. <strong>Output</strong>: HITS file.</li><li><strong>Step 3)</strong> <strong>Digitization</strong>: the ATLAS detector output is calculated.<br
/> <strong>Input</strong>: HITS file. <strong>Output</strong>: RDO file.</li><li><strong>Step 4)</strong> <strong>Reconstruction</strong>: times and voltages are reconstructed into tracks and energy deposits.<br
/> <strong>Input</strong>: RDO file. <strong>Output</strong>: ESD file.</li><li><strong>Step 5)</strong> <strong>Conversion</strong>: only keep the most important data from the last step.<br
/> <strong>Input</strong>: ESD file. <strong>Output</strong>: AOD file.</li></ul><p>The following picture from <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookFullChain">here</a> visualizes the chain and &#8212; you might be interested in that &#8212; shows where real data from the <a
href="http://www.cern.ch/lhc">LHC</a> will play a role in reality:<br
/><div
class="wp-caption aligncenter" style="width: 409px"><img
alt="The Full Chain" src="http://gehrcke.de/wp/blog_content/FullChain_small.png" width="399" height="500" /><p
class="wp-caption-text">The Full Chain</p></div></p><h4>How to realize &#8220;The Full Chain&#8221;</h4><p>As you know, the input of step 1 is a small user-given file defining particle parameters. <a
href="http://gehrcke.de/gsoc/singlepart_singlepi">This file</a> on my webserver describes a single pion with specific energy and direction. Its content is basically the following:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1"><span class="co1"># Single pi+ in log(E) between 200 MeV and 2 TeV</span>
<span class="kw1">from</span> AthenaCommon.<span class="me1">AlgSequence</span> <span class="kw1">import</span> AlgSequence
topAlg <span class="sy0">=</span> AlgSequence<span class="br0">&#40;</span><span class="st0">&quot;TopAlg&quot;</span><span class="br0">&#41;</span>
<span class="kw1">from</span> ParticleGenerator.<span class="me1">ParticleGeneratorConf</span> <span class="kw1">import</span> ParticleGenerator
topAlg +<span class="sy0">=</span> ParticleGenerator<span class="br0">&#40;</span><span class="br0">&#41;</span>
ParticleGenerator <span class="sy0">=</span> topAlg.<span class="me1">ParticleGenerator</span>
<span class="co1"># For DEBUG output from ParticleGenerator.</span>
ParticleGenerator.<span class="me1">OutputLevel</span> <span class="sy0">=</span> <span class="nu0">2</span>
ParticleGenerator.<span class="me1">orders</span> <span class="sy0">=</span> <span class="br0">&#91;</span>
  <span class="st0">&quot;PDGcode: constant 211&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;e: log 200. 2000000.&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;eta: flat -5.5 5.5&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;phi: flat -3.14159 3.14159&quot;</span>
  <span class="br0">&#93;</span>
<span class="kw1">from</span> EvgenJobOptions.<span class="me1">SingleEvgenConfig</span> <span class="kw1">import</span> evgenConfig</pre></div></div></div></div></div></div></div><p>At this point it&#8217;s not important to understand each line of this file. It works <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /></p><p>The following shell script downloads this file and runs <em>The Full Chain</em> for exactly one event of this particle. During execution, it times each step.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="kw2">wget</span> http:<span class="sy0">//</span>gehrcke.de<span class="sy0">/</span>gsoc<span class="sy0">/</span>singlepart_singlepi
<span class="kw2">mv</span> singlepart_singlepi CSC.007410.singlepart_singlepi+_logE.py
&nbsp;
<span class="kw3">echo</span> <span class="re5">-e</span> <span class="st0">&quot;<span class="es1">\n</span>read CSC file, generate 1 single pion.. create EVGEN file..-&gt; evgen.log&quot;</span>
<span class="kw1">time</span> csc_evgen_trf.py 007410 <span class="nu0">1</span> <span class="nu0">1</span> <span class="nu0">765432</span> CSC.007410.singlepart_singlepi+_logE.py EVGEN_007410_00001.pool.root <span class="sy0">&gt;</span> evgen.log
&nbsp;
<span class="kw3">echo</span> <span class="re5">-e</span> <span class="st0">&quot;<span class="es1">\n</span>read EVGEN file.. simulate.. create HITS file..-&gt; simul.log&quot;</span>
<span class="kw1">time</span> csc_simul_trf.py EVGEN_007410_00001.pool.root HITS_007410_00001.pool.root NONE <span class="nu0">1</span> <span class="nu0">0</span> <span class="nu0">452368</span> <span class="st0">&quot;ATLAS-CSC-02-00-00&quot;</span> <span class="nu0">0</span> <span class="nu0">0</span> <span class="st0">&quot;QGSP_BERT&quot;</span> CalHits.py <span class="sy0">&gt;</span> simul.log
&nbsp;
<span class="kw3">echo</span> <span class="re5">-e</span> <span class="st0">&quot;<span class="es1">\n</span>read HITS file.. digitize.. produce RDO file..-&gt; digi.log&quot;</span>
<span class="kw1">time</span> csc_digi_trf.py HITS_007410_00001.pool.root RDO_007410_00001.pool.root <span class="nu0">1</span> <span class="nu0">0</span> <span class="st0">&quot;ATLAS-CSC-02-00-00&quot;</span> <span class="nu0">740581234</span> <span class="nu0">29402491</span> <span class="st_h">'NONE'</span> <span class="st_h">'NONE'</span> CalHits.py <span class="st_h">'NONE'</span> <span class="st_h">'AtRndmGenSvc'</span> <span class="st_h">'QGSP_EMV'</span> <span class="st_h">'NONE'</span> <span class="sy0">&gt;</span> digi.log
&nbsp;
<span class="kw3">echo</span> <span class="re5">-e</span> <span class="st0">&quot;<span class="es1">\n</span>read RDO file.. reconstruct.. produce ESD file..-&gt; recoESD.log&quot;</span>
<span class="kw1">time</span> csc_recoESD_trf.py RDO_007410_00001.pool.root ESD_007410_00001.pool.root <span class="st_h">'NONE'</span> <span class="nu0">1</span> <span class="nu0">0</span> <span class="st0">&quot;ATLAS-CSC-02-00-00&quot;</span> <span class="st_h">'NONE'</span> <span class="sy0">&gt;</span> recoESD.log
&nbsp;
<span class="kw3">echo</span> <span class="re5">-e</span> <span class="st0">&quot;<span class="es1">\n</span>read ESD file.. convert.. produce AOD file..-&gt; recoAOD.log&quot;</span>
<span class="kw1">time</span> csc_recoAOD_trf.py ESD_007410_00001.pool.root AOD_007410_00001.pool.root <span class="nu0">1</span> <span class="nu0">0</span> <span class="st0">&quot;ATLAS-CSC-02-00-00&quot;</span> <span class="st_h">'NONE'</span> <span class="sy0">&gt;</span> recoAOD.log</pre></div></div></div></div></div></div></div><p>You can download the script <a
href="http://gehrcke.de/gsoc/athena_whole_chain_1event.sh">here</a>.</p><p>After setting up the runtime environment for your ATLAS Software installation, you can simply download and execute this script. It should work! I tested it with ATLAS Software 15.2.0, as you can see in this blog post: <a
href="http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution"><em>CernVM: local ATLAS Software — the clean solution</em></a></p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/atlas-software-how-to-run-the-full-chain/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>EC2: Install ATLAS Software to an EBS volume</title><link>http://gehrcke.de/2009/06/ec2-install-atlas-software-to-an-ebs-volume/</link> <comments>http://gehrcke.de/2009/06/ec2-install-atlas-software-to-an-ebs-volume/#comments</comments> <pubDate>Tue, 30 Jun 2009 13:58:15 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[CernVM]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=612</guid> <description><![CDATA[<p>In CernVM: local ATLAS Software — the clean solution I proposed to install an ATLAS Software release to an EBS volume. I did it to make it locally available in CernVM running on EC2. The approach allows to move an EBS volume from one EC2 instance to another without losing important components and functionality. Here [...]]]></description> <content:encoded><![CDATA[<p>In <a
href="http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution"><em>CernVM: local ATLAS Software — the clean solution</em></a> I proposed installing an ATLAS Software release on an EBS volume. I did this to make it locally available in <a
href="http://cernvm.cern.ch/cernvm/">CernVM</a> running on <a
href="http://aws.amazon.com/ec2/">EC2</a>. This approach makes it possible to move an EBS volume from one EC2 instance to another without losing important components and functionality. Here are some hints (both CernVM-specific and general) to follow before installing the software via <a
href="http://atlas.bu.edu/~youssef/pacman/">pacman</a>.<span
id="more-612"></span></p><h4>Start off from a fresh CernVM &#8212; migrate to group-atlas</h4><p>Start a fresh CernVM 1.2.0 instance on EC2 (I used ami-a50cebcc, aki-9b00e5f2). Bootstrap it to the following configuration (via web interface):</p><ul><li>create the new Linux user <code>atlasuser</code></li><li>under <em>Virtual Organization Configuration</em> choose ATLAS &#8212; keep everything else as <em>No</em></li></ul><p>Log in via <code>ssh</code>, using the new <code>atlasuser</code> account. First, the necessary system components have to be installed using <a
href="http://wiki.rpath.com/wiki/Conary">Conary</a>.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla ~]$ sudo conary migrate group-atlas --interactive</pre></div></div></div></div></div></div></div><p>You will need the &#8220;admin password&#8221; you set at bootstrap. If this migration step throws errors for you, consider this blog post: <a
href="http://gehrcke.de/2009/06/cernvm-on-nimbus-kernel-problems/"><em>CernVM on Nimbus: kernel problems</em></a>.</p><p>With migration to group-atlas, <code>gcc</code> gets installed. We will need it soon!</p><h4>Create and attach EBS volume, make file system</h4><p>Create a new EBS volume within <strong>the same availability zone</strong> as the CernVM instance you just started. I chose a size of <code>15 GB</code>. This should be enough to hold one ATLAS Software release plus a bit of additional stuff.</p><p>Attach this new EBS volume as e.g. <code>/dev/sdh</code> to the CernVM instance.</p><p>On CernVM, <a
href="http://e2fsprogs.sourceforge.net/">e2fsprogs</a> now has to be installed to create a filesystem on the fresh block device <code>/dev/sdh</code>. It is built from source here, which is why we need the compiler.<br
/> Work as <code>root</code>, download&#038;install:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla ~]# wget http://prdownloads.sourceforge.net/e2fsprogs/e2fsprogs-1.41.6.tar.gz
[...]
`e2fsprogs-1.41.6.tar.gz' saved [4422395/4422395]
[root@bla ~]# tar xzf e2fsprogs-1.41.6.tar.gz
[root@bla ~]# cd e2fsprogs-1.41.6
[root@bla e2fsprogs-1.41.6]# ./configure &gt; cfg.log
[root@bla e2fsprogs-1.41.6]# make install &gt; makeinst.log
make[1]: texi2dvi: Command not found
make[1]: [libext2fs.dvi] Error 127 (ignored)
/usr/bin/install: cannot stat `libext2fs.info*': No such file or directory
make[1]: [install-doc-libs] Error 1 (ignored)
gzip: /usr/share/info/libext2fs.info*: No such file or directory
make[1]: [install-doc-libs] Error 1 (ignored)</pre></div></div></div></div></div></div></div><p>These errors only concern the documentation and are ignored by <code>make</code>, so they are no problem. Now <code>mkfs.ext3</code> can be used:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla e2fsprogs-1.41.6]# /sbin/mkfs.ext3 /dev/sdh
mke2fs 1.41.6 (30-May-2009)
/dev/sdh is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
[...]
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
&nbsp;
This filesystem will be automatically checked every 37 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.</pre></div></div></div></div></div></div></div><h4>Mount it</h4><p>Now the mountpoint has to be chosen. The objective is to install everything needed by the ATLAS Software to this EBS volume, so that the volume can be moved between different instances (which are running operating systems supporting ATLAS Software).</p><p>Keep in mind that an ATLAS Software release &#8212; once installed to a specific path &#8212; should <strong>always</strong> be used under this path. If you move it around, you <strong>will</strong> get problems, even with new setup scripts. Furthermore, keeping everything necessary <strong>under</strong> the mountpoint makes it possible to, e.g., move an EBS volume from one EC2 instance to another without losing important components and functionality.</p><p>I chose the mountpoint <code>/opt/atlas-local</code>. Now and in the future, I will always have to mount the EBS volume to this directory.</p><p>Work as <code>root</code>: create the mountpoint, mount the device and give all rights:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[root@bla opt]# mkdir /opt/atlas-local
[root@bla opt]# mount /dev/sdh /opt/atlas-local
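# Note for later reuse on another instance (a sketch; assumes the volume is
# attached as /dev/sdh again): the file system already exists, so skip mkfs:
#   mkdir -p /opt/atlas-local &amp;&amp; mount /dev/sdh /opt/atlas-local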
[root@bla opt]# chmod 777 /opt/atlas-local</pre></div></div></div></div></div></div></div><p>Now log in as <code>atlasuser</code> and check out the new space:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">[atlasuser@bla atlas-local]$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              9293008   1999624   6821316  23% /
none                    890952         0    890952   0% /dev/shm
/dev/sdh              15481840    169592  14525816   2% /opt/atlas-local</pre></div></div></div></div></div></div></div><h4>Install ATLAS Software to this EBS volume</h4><p>To install a new ATLAS Software Release to a subdirectory of the mountpoint, you can use the following script of mine, which automatically downloads pacman and invokes the install command:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="re2">PACMANDIR</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas-local<span class="sy0">/</span>pacman
<span class="re2">ATLASINSTALLDIR</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas-local<span class="sy0">/</span>15.2.0
<span class="kw1">if</span> <span class="br0">&#91;</span> <span class="re5">-e</span> <span class="co1">${PACMANDIR}</span> <span class="br0">&#93;</span>; <span class="kw1">then</span>
    <span class="kw3">echo</span> --info: <span class="co1">${PACMANDIR}</span> exists
    <span class="kw3">echo</span> --info: <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span><span class="sy0">/</span>pacman-<span class="sy0">*</span>
    <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span><span class="sy0">/</span>pacman-<span class="sy0">*</span>
    <span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
    <span class="kw3">echo</span> --info: pacman setup
    <span class="kw3">source</span> setup.sh
<span class="kw1">else</span>
    <span class="kw3">echo</span> --info: <span class="kw2">mkdir</span> <span class="re5">-p</span> and <span class="kw3">cd</span> to <span class="co1">${PACMANDIR}</span>
    <span class="kw2">mkdir</span> <span class="re5">-p</span> <span class="co1">${PACMANDIR}</span>
    <span class="kw3">cd</span> <span class="co1">${PACMANDIR}</span>
    <span class="kw3">echo</span> --info: download latest pacman
    <span class="kw2">wget</span> http:<span class="sy0">//</span>atlas.bu.edu<span class="sy0">/</span>~youssef<span class="sy0">/</span>pacman<span class="sy0">/</span>sample_cache<span class="sy0">/</span>tarballs<span class="sy0">/</span>pacman-latest.tar.gz
    <span class="kw3">echo</span> --info: extract..
    <span class="kw2">tar</span> xzf pacman-latest.tar.gz
    <span class="kw3">echo</span> --info: <span class="kw3">cd</span> pacman-<span class="sy0">*</span>
    <span class="kw3">cd</span> pacman-<span class="sy0">*</span>
    <span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
    <span class="kw3">echo</span> --info: setup pacman..
    <span class="kw3">source</span> setup.sh
<span class="kw1">fi</span>
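# Added sketch, not part of the original script: report whether sourcing
# setup.sh actually put pacman on the PATH before the installation starts.
type pacman || echo --error: pacman not found in PATH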
&nbsp;
<span class="kw3">echo</span> --info: <span class="kw2">mkdir</span> -p, <span class="kw3">cd</span> to <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw2">mkdir</span> <span class="re5">-p</span> <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw3">cd</span> <span class="co1">${ATLASINSTALLDIR}</span>
<span class="kw3">echo</span> --info: pwd: <span class="sy0">`</span><span class="kw3">pwd</span><span class="sy0">`</span>
&nbsp;
<span class="kw3">echo</span> --info: start installing ATLAS:
<span class="kw3">echo</span> --info: invoke: pacman <span class="re5">-pretend-platform</span> SLC-<span class="nu0">4</span> <span class="re5">-allow</span> trust-all-caches <span class="re5">-get</span> am-IU:15.2.0
pacman <span class="re5">-pretend-platform</span> SLC-<span class="nu0">4</span> <span class="re5">-allow</span> trust-all-caches <span class="re5">-get</span> am-IU:15.2.0</pre></div></div></div></div></div></div></div><p>With this script, the <em>ATLAS Software 15.2.0</em> will get installed to <code>/opt/atlas-local/15.2.0</code> while <em>pacman</em> resides at <code>/opt/atlas-local/pacman</code>. Adjust the version numbers to your needs. I use the Indiana University mirror (am-IU); you may want to change this, too. Additionally, one could use <code>:15.2.0+KV</code> to automatically perform the Kit Validation after installation. This is a good idea for a new platform, but for CernVM 1.2.0 + group-atlas this KV always succeeds.</p><p>Furthermore, in <a
href="http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution"><em>CernVM: local ATLAS Software — the clean solution</em></a> you can read about:</p><ul><li>how to set up CMT</li><li>how to create a <code>cmthome</code> directory and a valid <code>requirements</code> file</li><li>how to set up the runtime environment for an ATLAS Software Release</li></ul><p>Have fun and let me know what you think!</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/ec2-install-atlas-software-to-an-ebs-volume/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>Amazon Web Services: about REST API, Query API, SOAP API and possible inconsistencies</title><link>http://gehrcke.de/2009/06/aws-about-api/</link> <comments>http://gehrcke.de/2009/06/aws-about-api/#comments</comments> <pubDate>Sun, 28 Jun 2009 14:38:01 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Amazon Web Services]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=479</guid> <description><![CDATA[<p>In this blog post I look at the definitions of the SOAP, Query and REST APIs you can use to manage Amazon Web Services, and at the differences between them. Reading a good book cleared up some of my confusion and answered the questions that had come up. It turns out that Amazon does not use API terms consistently. I would [...]]]></description> <content:encoded><![CDATA[<p>In this blog post I look at the definitions of the SOAP, Query and REST APIs you can use to manage <a
href="http://aws.amazon.com/">Amazon Web Services</a>, and at the differences between them. Reading a good book cleared up some of my confusion and answered the questions that had come up. It turns out that Amazon does not use API terms consistently. I would like to share this information.<span
id="more-479"></span></p><p>You perhaps know <a
href="http://aws.amazon.com/">Amazon Web Services</a> (AWS). If not: hurry up: it&#8217;s Infrastructure-as-a-Service-cloud-computing at its finest <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> . AWS consists of real web services: they are all controlled over HTTP with different &#8220;abstraction levels&#8221; &#8212; the different Application Programming Interfaces: Some services can be controlled via the <strong>REST API</strong>, some via the <strong>Query API</strong>, some via the <strong>SOAP API</strong>, and for most services more than one of them is available. Even the term <strong>REST-Query API</strong> appears. The following table gives an overview of the services and the APIs available for each, as stated by AWS (taken from <a
href="http://aws.amazon.com/documentation/">the documentation</a>):</p><p><center><br
/><table><tr><th>Service</th><th
style="width:85px;">REST</th><th
style="width:85px">Query</th><th
style="width:85px">SOAP</th><th
style="width:85px">REST-Query</th></tr><tr><td>S3</td><td>X</td><td></td><td>X</td><td></td></tr><tr><td>EC2 (incl. sub services)</td><td></td><td>X</td><td>X</td><td></td></tr><tr><td>SQS</td><td></td><td>X</td><td>X</td><td></td></tr><tr><td>SimpleDB</td><td></td><td>X</td><td>X</td><td></td></tr><tr><td>FPS</td><td></td><td>X</td><td>X</td><td></td></tr><tr><td>CloudFront</td><td>X</td><td></td><td></td><td></td></tr><tr><td>Elastic MapReduce</td><td></td><td>X</td><td></td><td></td></tr><tr><td>Fulfillment Web Service</td><td></td><td>X</td><td>X</td><td></td></tr><tr><td>Mechanical Turk</td><td
style="background-color: #FF0000">X</td><td></td><td>X</td><td></td></tr><tr><td>DevPay</td><td></td><td></td><td>X</td><td
style="background-color: #FF0000">X</td></tr></table><p></center></p><p>The red fields mark entries that I think are wrong. Read on to learn why.</p><p>At first, I was a bit confused about the different APIs. In particular, it was not clear to me why we have to distinguish between the REST and the Query API. In both cases the requests to AWS are based on <a
href="http://en.wikipedia.org/wiki/HTTP#Request_methods">standard HTTP requests</a> like GET and POST. The REST and Query APIs do not use a protocol on top of HTTP, in contrast to the SOAP API, which relies on XML documents transported via HTTP.</p><p>Additionally, it was not clear to me why there have to be at least three different APIs. Is that necessary? What are the pros and cons?</p><p>If you look around for client libraries providing the AWS API for popular programming languages like Python, Java, Ruby and so on, you mostly find libraries implementing the REST and Query APIs. The SOAP API &#8212; which is available for <strong>the majority</strong> of the services &#8212; is rarely implemented. This confused me.</p><p>Although the helpful people on the boto mailing list <a
href="http://groups.google.com/group/boto-users/t/ca13a4a2a3e9d22">already helped me on this topic</a>, I looked for a book to read about this in more detail. In <a
href="http://my.safaribooksonline.com/9780596515812">Programming Amazon Web Services by James Murty</a>, I found a very good summary about the topic and &#8212; with that &#8212; answers to my questions. I will now quote big parts, because I think that it could be very useful to some people out there:</p><blockquote><h4>Interfaces: REST and Query Versus SOAP</h4><p>AWS infrastructure services are made available through three separate APIs: REST, Query, and SOAP. In this book we will focus only on the REST and Query APIs and will not demonstrate how to use the SOAP APIs. We have a number of reasons for doing this, reasons which will become clearer after a brief explanation of the differences between the interfaces.</p><h5>REST interfaces</h5><p>The REST interfaces offered by AWS use only the standard components of HTTP request messages to represent the API action that is being performed. These components include:</p><ul><li>HTTP method: describes the action the request will perform</li><li><strong>Universal Resource Identifier (URI): path and query elements that indicate the resource on which the action will be performed</strong></li><li>Request Headers: pieces of metadata that provide more information about the request itself or the requester</li><li>Request Body: the data on which the service will perform an action</li></ul><p>Web services that use these components to describe operations are often termed <em>RESTful</em> services, a categorization for services that use the HTTP protocol as it was originally intended.</p><h5>Query interfaces</h5><p>The Query interfaces offered by AWS also use the standard components of the HTTP protocol to represent API actions; however these interfaces use them in a different way. <strong>Query requests rely on parameters, simple name and value pairs, to express both the action the service will perform and the data the action will be performed on. 
When you are using a Query interface, the HTTP envelope serves merely as a way of delivering these parameters to the service</strong>.</p><p>To perform an operation with a Query interface, you can express the parameters in the URI of a GET request, or in the body of a POST request. The method component of the HTTP request merely indicates where in the message the parameters are expressed, while the URI may or may not indicate a resource to act upon.</p><p><strong>These characteristics mean that the Query interfaces cannot be considered properly RESTful because they do not use the HTTP message components to fully describe API operations</strong>. Instead, the Query interfaces can be considered REST-like, because although they do things differently, <strong>they still only use standard HTTP message components to perform operations.</strong></p><h5>SOAP interfaces</h5><p>The SOAP interfaces offered by AWS use XML documents to express the action that will be performed and the data that will be acted upon. <strong>These SOAP XML documents are constructed as another layer on top of the underlying HTTP request, such that all the information about the operation is moved out of the HTTP message and encapsulated in the SOAP message instead.</strong></p><p>[...]</p><p>The approach used in the SOAP interfaces are very different from those used by the REST and Query interfaces. Operations expressed in SOAP messages are completely divorced from the underlying HTTP message used to transmit the request, and the HTTP message components, such as method and URI, reveal nothing about the operation being performed.</p><p><strong>The main reason we eschew the SOAP interface in this book is because we believe that SOAP interfaces in general add unnecessary complexity and overhead, effectively spoiling the simplicity and transparency that can make web services such powerful and flexible tools</strong>. [...] <strong>We are not alone in feeling this way. 
According to Amazon staff members, a vast majority of developers use the REST-based APIs to interact with AWS.</strong></p></blockquote><p>Thank you, James Murty, for providing a deeper understanding. With this knowledge, I looked through the developer guides of the different services to check whether what I had just learned matches the actual API implementations and the terms AWS uses for them. I found some inconsistencies:</p><h5>Amazon&#8217;s DevPay service: two names for the same thing</h5><p>As you can read <a
href="http://docs.amazonwebservices.com/AmazonDevPay/latest/DevPayDeveloperGuide/MakingRESTRequests.html">here</a>, this service is managed via SOAP API or <strong>&#8220;REST-Query&#8221; API</strong>. &#8220;REST-Query API&#8221; characterizes exactly the same method of invoking requests as the &#8220;Query API&#8221; anywhere else within AWS, as stated <a
href="http://docs.amazonwebservices.com/AmazonDevPay/latest/DevPayDeveloperGuide/MakingRESTRequests.html">in the DevPay documentation</a>: <em>&#8220;REST-Query requests are simple HTTPS requests, using the GET or POST method with query parameters in the URL&#8221;</em>.</p><p>An example <em>REST-Query API</em> request for DevPay:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">https://ls.amazonaws.com/?Action=ActivateHostedProduct&amp;ActivationKey=XX&amp;ProductToken=XX&amp;AWSAccessKeyId=XX&amp;Version=XX&amp;Timestamp=XX&amp;Signature=XYZ</pre></div></div></div></div></div></div></div><p>For comparison, an example <em>Query API</em> request for EC2:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">https://ec2.amazonaws.com/?Action=DescribeImages&amp;ImageId.1=XX&amp;Version=XX&amp;Expires=XX&amp;Signature=XX&amp;SignatureVersion=XX&amp;SignatureMethod=XX&amp;AWSAccessKeyId=XX</pre></div></div></div></div></div></div></div><p>Both requests have the same shape: an <code>Action</code> parameter plus further name and value pairs in the query string. Hence, this method should simply be called &#8220;Query API&#8221;, too.</p><p></p><h5>Amazon&#8217;s Mechanical Turk service: it&#8217;s not RESTful</h5><p>This service can be controlled via the SOAP or the REST API &#8212; says Amazon. But when we look at <a
href="http://docs.amazonwebservices.com/AWSMechTurk/latest/AWSMechanicalTurkRequester/MakingRequests_MakingRESTRequestsArticle.html">the description of this &#8220;REST API&#8221;</a>, it&#8217;s doubtful that &#8220;REST API&#8221; is the right term here: <em>&#8220;REST requests are simple HTTP requests, using either the GET method with parameters in the URL, or the POST method with parameters in the POST body&#8221;</em></p><p>Let&#8217;s look at an example request of this Mechanical Turk <em>&#8220;REST API&#8221;</em>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">http://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester&amp;AWSAccessKeyId=XX&amp;Version=XX&amp;Operation=XX&amp;Signature=XX&amp;Timestamp=XX&amp;ResponseGroup.0=XX&amp;ResponseGroup.1=XX</pre></div></div></div></div></div></div></div><p>This is almost identical to the EC2 Query API request example above: it&#8217;s just a GET request with a bunch of parameters. From my perspective, the URI does not represent the resource on which the action will be performed, although <code>?Service=AWSMechanicalTurkRequester</code> may look a bit like that. The operation itself is passed in the <code>Operation</code> parameter, not expressed through the HTTP method and URI, which argues against calling this API RESTful. Hence, this API should have been called Query API, too.</p><p></p><p>As a result, the corrected entries for DevPay and Mechanical Turk in the table above should look like this:<br
/><center><br
/><table><tr><th>Service</th><th
style="width:85px;">REST</th><th
style="width:85px">Query</th><th
style="width:85px">SOAP</th><th
style="width:85px">REST-Query</th></tr><tr><td>Mechanical Turk</td><td
style="text-decoration:line-through;">X</td><td
style="background-color: #00FF00">X</td><td>X</td><td></td></tr><tr><td>DevPay</td><td></td><td
style="background-color: #00FF00">X</td><td>X</td><td
style="text-decoration:line-through;">X</td></tr></table><p></center></p><p>What do you think? I have not read <a
href="http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm">Fielding&#8217;s thesis</a>, so maybe I still know too little about this.</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/aws-about-api/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>CernVM: how to set up a local ATLAS Software Release (dirty version)</title><link>http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/</link> <comments>http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/#comments</comments> <pubDate>Sun, 21 Jun 2009 00:05:05 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[CernVM]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Nimbus]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=424</guid> <description><![CDATA[<p>One of the main features of CernVM is its special filesystem CVMFS with an HTTP backend (based on FUSE). When CernVM is used in the standard way, the software of the different experiments works out of the box and is made accessible over the web via CVMFS. Although this is a great feature, I would like to set up an ATLAS Software release locally [...]]]></description> <content:encoded><![CDATA[<p>One of the main features of <a
href="http://cernvm.cern.ch/cernvm/">CernVM</a> is its special filesystem CVMFS with an HTTP backend (based on <a
href="http://fuse.sourceforge.net/">FUSE</a>). When CernVM is used in the standard way, the software of the different experiments works out of the box and is made accessible over the web via CVMFS. Although this is a great feature, I would like to set up an <a
href="http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/">ATLAS Software release</a> locally &#8212; as a real offline version &#8212; to be independent of the software-providing webservers.<span
id="more-424"></span></p><p>Currently, access to the providing server(s) must come from within the CERN network. I am working with VMs in Chicago (<a
href="http://workspace.globus.org/clouds/nimbus.html">Teraport, Nimbus cloud</a>), don&#8217;t have a CERN account and doubt that the performance of CVMFS between Geneva and Chicago would be good enough for my testing purposes. Hence, I tried to set up an ATLAS Software release locally, which required some special work.</p><p>Everything is based on CernVM 1.2.0 and ATLAS Software 15.1.0. The current solution is dirty and I am still looking for the optimal way. In the end, I would like to have an easy &#8220;switch&#8221; option between</p><ul><li>using CVMFS and being dependent on the software-providing webservers and</li><li>keeping everything local</li></ul><h3>[Update]</h3><p>The introduction/motivation is still valid and the parts below are still true. In the meantime, however, I have found a much cleaner solution. It allows using local and remote ATLAS Software at the same time and does not need any &#8220;hacking&#8221;. You can read about it in this blog post: <a
href="http://gehrcke.de/2009/06/cernvm-local-atlas-software-the-clean-solution"><em>CernVM: local ATLAS Software &#8212; the clean solution</em></a></p><h3>[/Update]</h3><p>To make room for the heavy ATLAS Software, I first had to extend the filesystem within CernVM&#8217;s loopback file (I worked on the Xen-based Nimbus cloud, which did not support additional partitions at that time). This procedure is described in this blog post:<br
/> <a
href="http://gehrcke.de/2009/06/resize-ext3-file-system-in-loopback-file/ "><em>Resize ext3 file system in loopback file</em></a></p><p>Btw: While acting on CernVM&#8217;s image and filesystem locally with the intention to extend everything, I added the <code>/root/.ssh</code> folder to make public key insertion of Nimbus and EC2 possible. The directory does not exist by default. <a
href="https://savannah.cern.ch/bugs/?52057">Predrag revealed</a> that he will add it in the next release (should be 1.3.0).</p><p>Okay, imagine you are logged in as root to a booted fresh CernVM (only with the modifications named above: bigger filesystem, <code>/root/.ssh</code> added). If the Xen hypervisor added a correct kernel to the system, you can start right away by migrating the system to <code>group-atlas</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="co4">$ </span>conary migrate group-atlas <span class="re5">--interactive</span></pre></div></div></div></div></div></div></div><p>If this throws errors for you, consider this blog post: <a
href="http://gehrcke.de/2009/06/cernvm-on-nimbus-kernel-problems/">CernVM on Nimbus: kernel problems</a></p><p>After migration to <code>group-atlas</code>, the system is ready for a pacman installation of ATLAS Software. Set up pacman:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">$ <span class="kw2">wget</span> http:<span class="sy0">//</span>atlas.bu.edu<span class="sy0">/</span>~youssef<span class="sy0">/</span>pacman<span class="sy0">/</span>sample_cache<span class="sy0">/</span>tarballs<span class="sy0">/</span>pacman-latest.tar.gz
$ <span class="kw2">tar</span> xzf pacman-latest.tar.gz
$ <span class="kw3">cd</span> pacman-<span class="nu0">3.28</span><span class="sy0">/</span>
$ <span class="kw3">source</span> setup.sh</pre></div></div></div></div></div></div></div><p>In this case, the ATLAS Software release (+KitValidation) should get installed to <code>/opt/atlas/15.1.0</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">$ <span class="kw2">mkdir</span> <span class="re5">-p</span> <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0
$ <span class="kw3">cd</span> <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0
$ pacman <span class="re5">-pretend-platform</span> SLC-<span class="nu0">4</span> <span class="re5">-allow</span> trust-all-caches <span class="re5">-get</span> am-BU:15.1.0+KV</pre></div></div></div></div></div></div></div><p><code>-pretend-platform SLC-4</code> is necessary, since we&#8217;re not on the standard ATLAS platform. Installing takes quite a while. To speed things up, I chose the Boston University mirror, which is a good choice when downloading from Chicago. In the end you should see</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="co0">##################################################</span>
<span class="co0">##   AtlasProduction 15.1.0 Validation [  OK  ]</span>
<span class="co0">##################################################</span></pre></div></div></div></div></div></div></div><p>Now the exciting part comes: setting up a working environment for the software.</p><p>Set up a <code>cmthome</code> directory and a <code>requirements</code> file:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">$ <span class="kw2">mkdir</span> <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0<span class="sy0">/</span>cmthome
$ <span class="kw2">vi</span> <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0<span class="sy0">/</span>cmthome<span class="sy0">/</span>requirements
$ <span class="kw2">cat</span> <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0<span class="sy0">/</span>cmthome<span class="sy0">/</span>requirements
<span class="kw1">set</span> CMTSITE STANDALONE
<span class="kw1">set</span> SITEROOT <span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0
macro ATLAS_DIST_AREA <span class="co1">${SITEROOT}</span>
apply_tag opt
apply_tag setup
apply_tag noTest
use AtlasLogin AtlasLogin-<span class="sy0">*</span> $<span class="br0">&#40;</span>ATLAS_DIST_AREA<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div><p>This <code>requirements</code> file should be enough to use the release for local production runs (standalone and no testarea).</p><p>Now set up <code>CMT</code>, <code>cd</code> to the folder of the new <code>requirements</code> file and configure <code>CMT</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ source /opt/atlas/15.1.0/CMT/v1r20p20081118/mgr/setup.sh
$ cd /opt/atlas/15.1.0/cmthome/
$ cmt config
sh: manpath: command not found
------------------------------------------
Configuring environment for standalone package.
CMT version v1r20p20081118.
System is Linux-i686
------------------------------------------
Creating setup scripts.
Creating cleanup scripts.</pre></div></div></div></div></div></div></div><p>See the <code>manpath</code> issue? Not a real problem, but if you like to fix it for the future:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ vi /bin/manpath
$ cat /bin/manpath
#!/bin/bash
man --path "$@"
$ chmod u+x /bin/manpath
$ conary update man
[...]
$ manpath
/usr/local/share/man:/usr/share/man:/usr/X11R6/man</pre></div></div></div></div></div></div></div><p>In the last step, <code>cmt config</code> created <code>cmthome/setup.sh</code> based on the <code>requirements</code> file. This script has to be sourced before each use of the ATLAS Software to set up the environment correctly. This next step reveals real problems with the platform:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ source /opt/atlas/15.1.0/cmthome/setup.sh -tag=15.1.0
AtlasLogin: Configuration problem - CMTCONFIG (Unsupported-opt) not available for /opt/atlas/15.1.0/AtlasOffline/15.1.0
AtlasLogin: Error - Unsupported-opt installation non-existent and no fallback found
#CMT---&gt; Warning: The tag Unsupported-opt is not used in any tag expression. Please check spelling</pre></div></div></div></div></div></div></div><p>These messages cannot be ignored: the environment was only partly set up. <strong>The ATLAS Software release is barely usable in this situation</strong>. The errors had to be fixed, so I started debugging.</p><p>Here is what basically happens within <code>cmthome/setup.sh</code>: First, <code>CMT</code> gets called in a special way to produce a temporary shellscript, which is executed directly after creation. At the beginning it sets a lot of environment variables (including <code>CMTCONFIG</code>), and then it executes several other configuration scripts. After execution, the temporary shellscript is deleted.</p><p><strong>And here is the error:</strong><br
/> <code>CMT</code> does not detect CernVM as a proper system for ATLAS Software. Hence, within this named temporary shellscript <code>CMTCONFIG</code> is set to <code>Unsupported-opt</code> or <code>NotSupported</code>. The following configuration scripts partly work and partly throw errors (the errors we saw above).</p><p><strong>This is the solution:</strong><br
/> In principle &#8212; after migration to <code>group-atlas</code> &#8212; CernVM should be perfectly fine as a platform. So I decided to override <code>CMTCONFIG</code> with the value <code>i686-slc4-gcc34-opt</code> in the temporary shellscript with a call to <code>sed</code>. This has to happen within <code>cmthome/setup.sh</code>, after the temporary shellscript is created and before it is executed. The value <code>i686-slc4-gcc34-opt</code> is the right one for a 32-bit CernVM after migration to <code>group-atlas</code>.</p><p>This is the modified <code>cmthome/setup.sh</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="co0"># echo &quot;Setting standalone package&quot;</span>
&nbsp;
<span class="kw1">if</span> <span class="kw3">test</span> <span class="st0">&quot;<span class="es3">${CMTROOT}</span>&quot;</span> = <span class="st0">&quot;&quot;</span>; <span class="kw1">then</span>
  <span class="re2">CMTROOT</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0<span class="sy0">/</span>CMT<span class="sy0">/</span>v1r20p20081118; <span class="kw3">export</span> CMTROOT
<span class="kw1">fi</span>
. <span class="co1">${CMTROOT}</span><span class="sy0">/</span>mgr<span class="sy0">/</span>setup.sh
&nbsp;
<span class="re2">tempfile</span>=<span class="sy0">`</span><span class="co1">${CMTROOT}</span><span class="sy0">/</span>mgr<span class="sy0">/</span>cmt <span class="re5">-quiet</span> build temporary_name<span class="sy0">`</span>
<span class="kw1">if</span> <span class="kw3">test</span> <span class="sy0">!</span> <span class="re4">$?</span> = <span class="nu0">0</span> ; <span class="kw1">then</span> <span class="re2">tempfile</span>=<span class="sy0">/</span>tmp<span class="sy0">/</span>cmt.<span class="re4">$$</span>; <span class="kw1">fi</span>
<span class="co1">${CMTROOT}</span><span class="sy0">/</span>mgr<span class="sy0">/</span>cmt setup <span class="re5">-sh</span> <span class="re5">-pack</span>=cmt_standalone <span class="re5">-path</span>=<span class="sy0">/</span>opt<span class="sy0">/</span>atlas<span class="sy0">/</span>15.1.0<span class="sy0">/</span>cmthome  -no_cleanup <span class="re4">$*</span> <span class="sy0">&gt;</span><span class="co1">${tempfile}</span>
<span class="kw3">echo</span> <span class="st0">&quot;********************* CMTCONFIG hack start **********************&quot;</span>
<span class="kw3">echo</span> <span class="st0">&quot;** appearances of CMTCONFIG in <span class="es3">${tempfile}</span>:&quot;</span>
<span class="kw2">cat</span> <span class="co1">${tempfile}</span> <span class="sy0">|</span> <span class="kw2">grep</span> CMTCONFIG
<span class="kw3">echo</span> <span class="st0">&quot;&quot;</span>
<span class="kw3">echo</span> <span class="st0">&quot;** replacing with 'i686-slc4-gcc34-opt'&quot;</span>
<span class="kw2">sed</span> <span class="re5">-r</span> <span class="st_h">'s/^CMTCONFIG=.+/export CMTCONFIG=i686-slc4-gcc34-opt/g'</span> <span class="re5">-i</span> <span class="co1">${tempfile}</span>
<span class="kw3">echo</span> <span class="st0">&quot;&quot;</span>
<span class="kw3">echo</span> <span class="st0">&quot;** appearances of CMTCONFIG in <span class="es3">${tempfile}</span>:&quot;</span>
<span class="kw2">cat</span> <span class="co1">${tempfile}</span> <span class="sy0">|</span> <span class="kw2">grep</span> CMTCONFIG
<span class="kw3">echo</span> <span class="st0">&quot;********** CMTCONFIG hack end: executing <span class="es3">${tempfile}</span> **********&quot;</span>
. <span class="co1">${tempfile}</span>
<span class="sy0">/</span>bin<span class="sy0">/</span><span class="kw2">rm</span> <span class="re5">-f</span> <span class="co1">${tempfile}</span></pre></div></div></div></div></div></div></div><p>Now there were no more errors while setting up the environment:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ source /opt/atlas/15.1.0/cmthome/setup.sh -tag=15.1.0
********************* CMTCONFIG hack start **********************
** appearances of CMTCONFIG in /tmp/fileMWrr85:
CMTCONFIG=NotSupported; export CMTCONFIG
NEWCMTCONFIG=&quot;i686-unknown00-gcc344&quot;; export NEWCMTCONFIG
CMTCONFIG=&quot;NotSupported&quot;; export CMTCONFIG
&nbsp;
** replacing with 'i686-slc4-gcc34-opt'
&nbsp;
** appearances of CMTCONFIG in /tmp/fileMWrr85:
export CMTCONFIG=i686-slc4-gcc34-opt
NEWCMTCONFIG=&quot;i686-unknown00-gcc344&quot;; export NEWCMTCONFIG
export CMTCONFIG=i686-slc4-gcc34-opt
********** CMTCONFIG hack end: executing /tmp/fileMWrr85 **********</pre></div></div></div></div></div></div></div><p>With this environment I was able to run <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookFullChain">the full chain</a> of <a
href="https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasPythonTrf">JobTransforms</a>.</p><p>The solution is not clean. For instance, the output of <code>cmt show path</code> is not complete. So, I personally don&#8217;t like it very much. For the future I plan to find a solution that combines CernVM&#8217;s <code>configUpdate</code> to ATLAS&#8217; config (which belongs to CernVM&#8217;s bootstrap process) and a local <a
href="http://atlas.bu.edu/~youssef/pacman/">pacman</a> installation. This should be much cleaner.</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/cernvm-how-to-set-up-a-local-atlas-software-release/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Problems at Boston University&#8217;s ATLAS Software mirror</title><link>http://gehrcke.de/2009/06/problems-at-boston-universitys-atlas-software-mirror/</link> <comments>http://gehrcke.de/2009/06/problems-at-boston-universitys-atlas-software-mirror/#comments</comments> <pubDate>Fri, 19 Jun 2009 12:59:28 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[ATLAS Software]]></category> <category><![CDATA[GSoC 2009]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=440</guid> <description><![CDATA[<p>While installing an ATLAS Software release from Boston University&#8217;s mirror, I discovered a broken archive file. It was the fault of a bad network card. The mirror had to be rebuilt from scratch.</p><p>I started a pacman installation of a local ATLAS Software release, using Boston University&#8217;s mirror, because I needed the files in [...]]]></description> <content:encoded><![CDATA[<p>While installing an <a
href="http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/">ATLAS Software release</a> from Boston University&#8217;s mirror, I discovered a broken archive file. It was the fault of a bad network card. The mirror had to be rebuilt from scratch. <span
id="more-440"></span></p><p>I started a <a
href="http://atlas.bu.edu/~youssef/pacman/">pacman</a> installation of a local <a
href="http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/">ATLAS Software release</a>, using Boston University&#8217;s mirror, because I needed the files in Chicago:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">$ pacman -pretend-platform SLC-4 -allow trust-all-caches -get am-BU:15.1.0+KV</pre></div></div></div></div></div></div></div><p>I got this error, over and over again:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">Error executing [tar -z -t -f /opt/atlas/15.1.0/ROOT_5_22_00a_i686_slc4_gcc34_opt.tar.gz] returns status code [&lt;truncated by pacman&gt;
[...]
sw/lcg/app/releases/ROOT/5.22.00a/slc4_ia32_gcc34/root/lib/libPostscript.so
tar: Skipping to next header
gzip: stdin: invalid compressed data--crc error
gzip: stdin: invalid compressed data--length error
tar: Child returned status 1
tar: Error exit delayed from previous errors].</pre></div></div></div></div></div></div></div><p>Using the mirror of Indiana University (am-IU), everything worked. After opening <a
href="https://savannah.cern.ch/bugs/?51504">this Savannah bug ticket</a>, it turned out that a bad network card had been corrupting data arriving at Boston University. This explains why I always got the same corrupted file when receiving data from Boston University. They took am-BU offline and rebuilt it.</p><p><strong>Update</strong>: am-BU is up again!</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2009/06/problems-at-boston-universitys-atlas-software-mirror/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> </channel> </rss>
