<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>gehrcke.de &#187; Technical Stuff</title> <atom:link href="http://gehrcke.de/category/technical-stuff/feed/" rel="self" type="application/rss+xml" /><link>http://gehrcke.de</link> <description>Jan-Philip Gehrcke&#039;s website</description> <lastBuildDate>Fri, 07 Oct 2011 15:57:11 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3</generator> <item><title>Python WebSocket server powered by gevent and ws4py</title><link>http://gehrcke.de/2011/10/python-websocket-server-powered-by-gevent-and-ws4py/</link> <comments>http://gehrcke.de/2011/10/python-websocket-server-powered-by-gevent-and-ws4py/#comments</comments> <pubDate>Fri, 07 Oct 2011 15:56:21 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Python]]></category> <category><![CDATA[Technical Stuff]]></category> <category><![CDATA[WebSockets]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1761</guid> <description><![CDATA[<p>I am following the emerging WebSocket standard with a lot of interest. Today, I would like to update my recommendation of tools presented in the article &#8220;The best and simplest tools to create a basic WebSocket application with Flash fallback and Python on the server side&#8221;. ws4py (WebSocket for Python) by Sylvain Hellegouarch is worth [...]]]></description> <content:encoded><![CDATA[<p>I am following the emerging <a
href="http://en.wikipedia.org/wiki/WebSocket">WebSocket</a> standard with a lot of interest. Today, I would like to update my recommendation of tools presented in the article <a
href="http://gehrcke.de/2011/06/the-best-and-simplest-tools-to-create-a-basic-websocket-application-with-flash-fallback-and-python-on-the-server-side/">&#8220;The best and simplest tools to create a basic WebSocket application with Flash fallback and Python on the server side&#8221;</a>. <a
href="https://github.com/Lawouach/WebSocket-for-Python"><strong>ws4py</strong></a> (WebSocket for Python) by <a
href="http://www.defuze.org/">Sylvain Hellegouarch</a> deserves wider attention.<span
id="more-1761"></span></p><p>ws4py is still under development and may undergo considerable structural changes in the future. But what we can see so far is a cleanly written, generic, pure-Python WebSocket protocol implementation that can be used for servers as well as for clients. I would like to point out that, among others, a <a
href="http://gevent.org/">gevent</a>-driven server is already included. I have not seen or run any benchmarks of it, but in theory this server should perform very well. Overall, ws4py looks very promising.</p><p>In my playground setups, I&#8217;ve replaced <a
href="https://bitbucket.org/Jeffrey/gevent-websocket/">gevent-websocket</a> with ws4py. With this setup, I successfully ran some WebSocket communication tests between the clients built into Firefox 7, Chrome 14, and Chrome 15 and the ws4py/gevent-based server, using the <a
href="http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-10">HyBI WebSocket protocol version 10</a>. Together with a recent version of the perfect <a
href="https://github.com/gimite/web-socket-js">web-socket-js</a>, the communication also works with older browsers like Firefox 3.</p><p>Standards conformance and performance of ws4py are tested via the <a
href="http://www.tavendo.de/autobahn/testsuite.html">Autobahn WebSocket test suite</a>. A test suite like that is extremely important for driving the development of high-quality WebSocket libraries. You should definitely look at it if you are a WebSocket client/server developer yourself.</p><p>Another thing I would like to show you is the table at <a
href="http://en.wikipedia.org/wiki/User:Oberstet/Comparison_of_WebSocket_implementations">http://en.wikipedia.org/wiki/User:Oberstet/Comparison_of_WebSocket_implementations</a>. If maintained properly, this will always be a very good reference for getting an overview of WebSocket implementations on different platforms.</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/10/python-websocket-server-powered-by-gevent-and-ws4py/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>LaTeX: jump/skip to the bottom of a page</title><link>http://gehrcke.de/2011/08/latex-jumpskip-to-the-bottom-of-a-page/</link> <comments>http://gehrcke.de/2011/08/latex-jumpskip-to-the-bottom-of-a-page/#comments</comments> <pubDate>Tue, 16 Aug 2011 12:24:39 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[LaTeX]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1755</guid> <description><![CDATA[<p>I could not remember the command to &#8220;jump&#8221; to the bottom of a page in LaTeX. I searched, but my keywords turned out to be crap. This post is just for the people searching with the same keywords (latex, jump, skip, bottom, page)</p><p>The solution is \vfill. You can also use \vfil and \vfilll, [...]]]></description> <content:encoded><![CDATA[<p>I could not remember the command to &#8220;jump&#8221; to the bottom of a page in LaTeX.<span
id="more-1755"></span> I searched, but my keywords turned out to be crap. This post is just for the people searching with the same keywords (latex, jump, skip, bottom, page) <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p>The solution is <code>\vfill</code>. You can also use <code>\vfil</code> and <code>\vfilll</code>, which have a lower and a higher priority than <code>\vfill</code>, respectively.</p><p>German reference:<br
/> <a
href="http://www.golatex.de/wiki/index.php?title=\vfill" title="http://www.golatex.de/wiki/index.php?title=\vfill">http://www.golatex.de/wiki/index.php?title=\vfill</a></p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/08/latex-jumpskip-to-the-bottom-of-a-page/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>The best and simplest tools to create a basic WebSocket application with Flash fallback and Python on the server side</title><link>http://gehrcke.de/2011/06/the-best-and-simplest-tools-to-create-a-basic-websocket-application-with-flash-fallback-and-python-on-the-server-side/</link> <comments>http://gehrcke.de/2011/06/the-best-and-simplest-tools-to-create-a-basic-websocket-application-with-flash-fallback-and-python-on-the-server-side/#comments</comments> <pubDate>Sun, 26 Jun 2011 15:56:46 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Javascript]]></category> <category><![CDATA[Python]]></category> <category><![CDATA[Technical Stuff]]></category> <category><![CDATA[WebSockets]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1694</guid> <description><![CDATA[<p>Currently, I am playing around with WebSockets. This is due to an application idea I have in my mind which requires a bidirectional connection between browser and server with low latency. The communication will happen in a stream-like fashion at low bandwidth. Real network sockets using TCP/UDP are often the desired optimum for things like [...]]]></description> <content:encoded><![CDATA[<p>Currently, I am playing around with <a
href="http://en.wikipedia.org/wiki/WebSockets">WebSockets</a>. This is due to an application idea I have in my mind which requires a <strong>bidirectional connection</strong> between  browser and server with <strong>low latency</strong>. The communication will happen in a stream-like fashion at low bandwidth. Real network sockets using TCP/UDP are often the desired optimum for things like that, but within a browser they can only be provided by Java or Flash plugins. <a
href="http://www.slideshare.net/rmoriz/pushing-the-web-websockets">The future belongs to WebSockets</a>. Implemented directly in the browser, they provide a much lower-level network connection between the browser and the server than HTTP, without any plugin. WebSockets still work on a layer above TCP, but low latency and efficient data transport in both directions are provided in a TCP-like fashion. Therefore, real-time application developers are very keen to use WebSockets. As this is still a young development, browser support is limited and many client/server libraries are immature. Most importantly, there is a huge lack of documentation on how to use them.</p><p>In this blog post, I present a very simple &#8220;echo application&#8221; where the user sends a message from the browser to the server, which in turn sends this message back to the client. Simple so far, but <strong>the main focus while realizing this is on selecting the right tools for the task</strong>, as there is already a lot of stuff out there: some very good but undocumented and hidden things, some totally overloaded things, and some bad things. I tried to fulfill the following conditions:</p><ul><li>make use of a &#8220;<strong>Flashbridge</strong>&#8221; to realize a fallback when WebSockets are not available in a browser (which is true for Firefox at the moment)</li><li>use <strong>Python</strong> on the server side</li><li>use the <strong>best / most solid tools</strong> available</li><li>at the same time, use the <strong>simplest tools</strong> available that do not bring along loads of stuff that you do not need for simple applications or if you want to design your own communication protocol anyway</li></ul><p><span
id="more-1694"></span></p><h3>Tools for the client side</h3><p>First thing, the client running in the browser has to be implemented in JavaScript, of course. I do not like JavaScript, but this is the language of the browsers <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> (is there a chance to revolutionize this to Python syntax?). So, when you search the web you will at first find <a
href="http://socket.io/">Socket.IO</a> and <a
href="http://jwebsocket.org/">jWebSocket</a>. Socket.IO consists of the client &#8220;Socket.IO&#8221; and the server &#8220;Socket.IO-node&#8221;, designed to be used together. Yes, it is great: it seems mature, it already provides a lot of different protocols on top of WebSockets, and it provides loads of fallback mechanisms if WebSockets are not available on the client side. There is even a Python implementation of the server side. But my feeling was: Nope. I have no experience with WebSockets so far, so I would like to start by implementing a basic core. I don&#8217;t want to use any higher-level communication protocol yet, just send messages as defined by the WebSocket specification. Furthermore, I only want Flash fallback as an <strong>equivalently fast</strong> replacement, no slow backup techniques like xhr-multipart, xhr-polling, and jsonp-polling. Socket.IO overwhelms with a lot of functionality and little documentation, making it difficult for newcomers to grasp the essentials.</p><p>Much of the above also applies to jWebSocket, so I decided against both for largely the same reasons. And as it turned out while browsing through their sources, they have one very special thing in common:</p><h4>web-socket-js</h4><p>Socket.IO and jWebSocket both base their core WebSocket/Flash switch on a very neat and small JavaScript library called <a
href="https://github.com/simb/web-socket-js">web-socket-js</a> written by <a
href="http://gimite.net/en/">Hiroshi Ichikawa</a>. You can check this yourself by looking at <a
href="https://github.com/LearnBoost/Socket.IO/tree/master/lib/vendor/web-socket-js">Socket.IO&#8217;s repository</a> and <a
href="http://code.google.com/p/jwebsocket/source/browse/trunk/jWebSocket/jWebSocketClient/web/res/js/flash-bridge/web_socket.js">jWebSocket&#8217;s repository</a>. If required, web-socket-js replaces the standard JavaScript <code>new WebSocket</code> constructor with its own implementation. This does the trick:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="javascript"><pre class="de1"><span class="br0">&#40;</span><span class="kw2">function</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
&nbsp;
  <span class="kw1">if</span> <span class="br0">&#40;</span>window.<span class="me1">WebSocket</span><span class="br0">&#41;</span> <span class="kw1">return</span><span class="sy0">;</span>
 <span class="br0">&#91;</span>...<span class="br0">&#93;</span>
<span class="br0">&#125;</span><span class="br0">&#41;</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></pre></div></div></div></div></div></div></div><p>If WebSockets are supported natively by the browser, they are used natively and web-socket-js does nothing more than <code>return;</code>. If the browser lacks support, a Flash file (SWF) Hiroshi wrote is loaded, providing <strong>the same API</strong> as the original JavaScript WebSocket. This API is implemented in the <code>[...]</code> of the code snippet above.</p><p>This is the way I want to start learning WebSockets: using the original API and with a lot of browser support. <strong>Decision made: on the client side, I use web-socket-js.</strong></p><h3>Tools for the server side</h3><p>Answer upfront this time: <a
href="http://www.gevent.org/">gevent</a> in combination with <a
href="https://bitbucket.org/Jeffrey/gevent-websocket">gevent-websocket</a>. Gevent is a <a
href="http://nichol.as/benchmark-of-python-web-servers">nicely performing</a> networking library for Python, based on the <a
href="http://www.stackless.com/pipermail/stackless-dev/2004-March/000022.html">brilliant idea</a> of <a
href="http://packages.python.org/greenlet/">greenlets</a>. It allows you to write extremely simple server code. Gevent-websocket is just a very small piece of code on top of this, basically saving you from implementing the WebSocket handshake yourself.</p><h3>The code</h3><p>Let&#8217;s start with <code>websocket_echoserver.py</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1"><span class="kw1">from</span> geventwebsocket.<span class="me1">handler</span> <span class="kw1">import</span> WebSocketHandler
<span class="kw1">from</span> gevent <span class="kw1">import</span> pywsgi
<span class="kw1">import</span> gevent
&nbsp;
&nbsp;
<span class="kw1">def</span> handle_echo_client<span class="br0">&#40;</span>ws<span class="br0">&#41;</span>:
    <span class="kw1">while</span> <span class="kw2">True</span>:
        msg <span class="sy0">=</span> ws.<span class="me1">wait</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
        <span class="kw1">if</span> msg <span class="sy0">==</span> <span class="st0">&quot;quit&quot;</span>:
            ws.<span class="me1">close_connection</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
            <span class="kw1">break</span>
        ws.<span class="me1">send</span><span class="br0">&#40;</span>msg<span class="br0">&#41;</span>
&nbsp;
&nbsp;
<span class="kw1">def</span> app<span class="br0">&#40;</span>environ<span class="sy0">,</span> start_response<span class="br0">&#41;</span>:
    <span class="kw1">if</span> environ<span class="br0">&#91;</span><span class="st0">'PATH_INFO'</span><span class="br0">&#93;</span> <span class="sy0">==</span> <span class="st0">&quot;/echo&quot;</span>:
        handle_echo_client<span class="br0">&#40;</span>environ<span class="br0">&#91;</span><span class="st0">'wsgi.websocket'</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
    <span class="kw1">else</span>:
        <span class="kw1">print</span> <span class="st0">&quot;404, PATH_INFO: %s&quot;</span> %  environ<span class="br0">&#91;</span><span class="st0">&quot;PATH_INFO&quot;</span><span class="br0">&#93;</span>
        start_response<span class="br0">&#40;</span><span class="st0">&quot;404 Not Found&quot;</span><span class="sy0">,</span> <span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
        <span class="kw1">return</span> <span class="br0">&#91;</span><span class="br0">&#93;</span>
&nbsp;
&nbsp;
server <span class="sy0">=</span> pywsgi.<span class="me1">WSGIServer</span><span class="br0">&#40;</span><span class="br0">&#40;</span><span class="st0">''</span><span class="sy0">,</span> <span class="nu0">8087</span><span class="br0">&#41;</span><span class="sy0">,</span> app<span class="sy0">,</span> handler_class<span class="sy0">=</span>WebSocketHandler<span class="br0">&#41;</span>
server.<span class="me1">serve_forever</span><span class="br0">&#40;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div><p>It listens on port 8087 on all network interfaces (that is what the empty host string <code>''</code> means) for incoming connections. WebSocket connections are accepted and handled. If the requested resource is <code>/echo</code>, the function <code>handle_echo_client()</code> deals with the client: it sends every received message back until the client sends <code>quit</code>, at which point the server closes the connection. To reach this server, the client has to connect to <code>ws://yourhost:8087/echo</code>.</p><p>Now the client side. Open this HTML page in your browser (make sure to set the paths to the JavaScript sources and SWF file correctly and keep everything <em>under the same domain</em>):</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="html4strict"><pre class="de1"><span class="sc0">&lt;!DOCTYPE HTML PUBLIC &quot;-//W3C//DTD HTML 4.01 Transitional//EN&quot;</span>
<span class="sc0">&quot;http://www.w3.org/TR/html4/transitional.dtd&quot;&gt;</span>
<span class="sc2">&lt;<span class="kw2">html</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">head</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">title</span>&gt;</span>websocket client: send/receive test<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">title</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">script</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;text/javascript&quot;</span> <span class="kw3">src</span><span class="sy0">=</span><span class="st0">&quot;.........../web-socket-js/swfobject.js&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">script</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">script</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;text/javascript&quot;</span> <span class="kw3">src</span><span class="sy0">=</span><span class="st0">&quot;.........../web-socket-js/web_socket.js&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">script</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">script</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;text/javascript&quot;</span>&gt;</span>
WEB_SOCKET_SWF_LOCATION = &quot;............/web-socket-js/WebSocketMain.swf&quot;;
function log(msg) {
    var d = new Date();
    var hour = d.getHours();
    var min = d.getMinutes();
    var sec = d.getSeconds();
    var msec = d.getMilliseconds();
    var time = hour + &quot;:&quot; + min + &quot;:&quot; + sec + &quot;:&quot; + msec;
    document.getElementById('foo').innerHTML += time + &quot;: &quot; + msg + &quot;<span class="sc2">&lt;<span class="kw2">br</span> <span class="sy0">/</span>&gt;</span>&quot;;
    }
&nbsp;
function ws_init(url) {
    log(&quot;connecting to &quot; + url + &quot;...&quot;);
    ws = new WebSocket(url);
    ws.onopen = function() {
        log(&quot;connection established.&quot;);
        };
    ws.onmessage = function(e) {
        log(&quot;message received: '&quot; + e.data + &quot;'&quot;);
        };
    ws.onclose = function() {
        log(&quot;connection closed.&quot;);
        };
    }
&nbsp;
function ws_send(msg) {
    log(&quot;send: &quot; + msg);
    ws.send(msg);
    }
&nbsp;
function ws_close() {
    log(&quot;closing connection..&quot;);
    ws.close();
    }
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">script</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">head</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">body</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">form</span> <span class="kw3">name</span><span class="sy0">=</span><span class="st0">&quot;input_form&quot;</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">table</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">tr</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;</span>host/resource:<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;&lt;<span class="kw2">input</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;text&quot;</span> <span class="kw3">name</span><span class="sy0">=</span><span class="st0">&quot;host&quot;</span> <span class="kw3">size</span><span class="sy0">=</span><span class="st0">&quot;50&quot;</span> <span class="kw3">value</span><span class="sy0">=</span><span class="st0">&quot;ws://host:port/resource&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;&lt;<span class="kw2">input</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;button&quot;</span> <span class="kw3">value</span><span class="sy0">=</span><span class="st0">&quot;connect&quot;</span> <span class="kw3">onclick</span><span class="sy0">=</span><span class="st0">&quot;ws_init(document.input_form.host.value)&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;&lt;<span class="kw2">input</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;button&quot;</span> <span class="kw3">value</span><span class="sy0">=</span><span class="st0">&quot;disconnect&quot;</span> <span class="kw3">onclick</span><span class="sy0">=</span><span class="st0">&quot;ws_close()&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">tr</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">tr</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;</span>msg:<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;&lt;<span class="kw2">input</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;text&quot;</span> <span class="kw3">name</span><span class="sy0">=</span><span class="st0">&quot;msg&quot;</span> <span class="kw3">size</span><span class="sy0">=</span><span class="st0">&quot;50&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
    <span class="sc2">&lt;<span class="kw2">td</span>&gt;&lt;<span class="kw2">input</span> <span class="kw3">type</span><span class="sy0">=</span><span class="st0">&quot;button&quot;</span> <span class="kw3">value</span><span class="sy0">=</span><span class="st0">&quot;send&quot;</span> <span class="kw3">onclick</span><span class="sy0">=</span><span class="st0">&quot;ws_send(document.input_form.msg.value)&quot;</span>&gt;&lt;<span class="sy0">/</span><span class="kw2">td</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">tr</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">table</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">form</span>&gt;</span>
<span class="sc2">&lt;<span class="kw2">div</span> <span class="kw3">id</span><span class="sy0">=</span><span class="st0">&quot;foo&quot;</span>&gt;</span> <span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">div</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">body</span>&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">html</span>&gt;</span></pre></div></div></div></div></div></div></div><p>What is used here is the standard JavaScript WebSocket API. For me, this HTML renders as:<br
/> <img
src="http://gehrcke.de/wp/blog_content/websocket_test_empty.png" alt="websocket test" /></p><p>Now&#8230; maybe you opened this in Chrome. Then, WebSocket support is natively included and activated. In Firefox, it is disabled. There the Flashbridge would come into play. But Flash does not just send and receive data to and from anywhere. It first asks the host that delivered the SWF file with whom it is allowed to exchange data: it requests a so-called socket policy file from that server on port 843. Yeah, just accept this; it is not a bad idea that the internet communication of Flash plugins is restricted. Anyway, you have to set up a server listening on port 843 to provide this policy. I have written a <a
href="http://gehrcke.de/2011/06/flash-socket-policy-server-in-python-based-on-gevent/">blog post on setting up a Flash socket policy server in Python based on gevent</a>. I just assume that you have set up such a server. Then, our HTML code from above works (at least for me) under Firefox 4+, Chrome 6+ and Internet Explorer 9. Enter your echo server&#8217;s details (<code>ws://yourhost:8087/echo</code>) and play around with it. For me, it looks like this:<br
/> <img
src="http://gehrcke.de/wp/blog_content/websocket_test.png" alt="websocket test" /></p><p>Isn&#8217;t that great? It works. The latencies between send and receive are pretty low (at least for such short messages): they are comparable to the ping latencies I measured against the same server. The Flashbridge transparently does its job in Firefox and Internet Explorer and introduces around 5 ms of additional latency.</p><h3>Conclusion</h3><p>The tools used for this test (gevent, gevent-websocket and web-socket-js) are solid, perform very well and are very simple to use. The code executed in the browser uses the standard WebSocket API. But, thanks to web-socket-js and its nice Flashbridge, it works in all browsers supporting Flash. gevent-websocket and web-socket-js do not provide more functionality than simple message passing over WebSockets, which was desired.</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/the-best-and-simplest-tools-to-create-a-basic-websocket-application-with-flash-fallback-and-python-on-the-server-side/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>Reading files in C++ using ifstream: dealing correctly with badbit, failbit, eofbit, and perror()</title><link>http://gehrcke.de/2011/06/reading-files-in-c-using-ifstream-dealing-correctly-with-badbit-failbit-eofbit-and-perror/</link> <comments>http://gehrcke.de/2011/06/reading-files-in-c-using-ifstream-dealing-correctly-with-badbit-failbit-eofbit-and-perror/#comments</comments> <pubDate>Sat, 25 Jun 2011 20:28:27 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[C/C++]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1639</guid> <description><![CDATA[<p>Because of this POV-Ray issue, I wanted to figure out what is the most secure and stable way to read a file line by line in C++ using a std::ifstream in combination with std::getline(). Furthermore, it was the aim to provide as precise error messages as possible. It turned out that dealing with the error [...]]]></description> <content:encoded><![CDATA[<p>Because of <a
href="http://bugs.povray.org/task/213">this POV-Ray issue</a>, I wanted to figure out the most secure and stable way to read a file line by line in C++ using a <code>std::ifstream</code> in combination with <code>std::getline()</code>. Furthermore, the aim was to provide error messages that are as precise as possible. It turned out that dealing with the error bits <code>eofbit</code>, <code>failbit</code>, and <code>badbit</code> is already challenging, as discussed, for example, <a
href="http://ubuntuforums.org/archive/index.php/t-61766.html">here</a>, <a
href="http://www.parashift.com/c++-faq-lite/input-output.html#faq-15.2">here</a>, and <a
href="http://www.daniweb.com/forums/post155265-18.html">here</a>, and finally at <a
href="http://www.cplusplus.com/doc/tutorial/files/">cplusplus.com</a>. Even in the latter case you are not provided with all details and the optimal solution. Regarding error messages, it gets more complicated. Evaluating <code>errno</code> (or calling <code>perror()</code>) together with the error bits is not trivial, as can already be inferred from discussions like <a
href="http://www.phwinfo.com/forum/comp-lang-cplus/251919-ifstream-errors.html">this</a> and <a
href="http://bytes.com/topic/c/answers/739116-getting-error-text-ifstream">this</a>. But after some testing and gathering all available information, it turns out that there are good recipes to follow.<br
/> <strong><br
/> Update (July 7th, 2011): I revised the whole article due to an important insight provided by Alexandre Duret-Lutz (see the comments).</strong></p><p><span
id="more-1639"></span></p><p>If you just want to use/see the results of the investigation, scroll down to the <a
href="#ideal">ideal solutions</a>. Otherwise, you should briefly familiarize yourself with <a
href="http://www.cplusplus.com/reference/iostream/ios/"><code>eofbit</code>, <code>failbit</code>, <code>badbit</code></a> of the <code>ios</code> class.</p><p>All the code shown in this post can be <a
href="http://gehrcke.de/files/perm/20110706_cpp_readfile_test.tar.gz">downloaded</a>.</p><h2>What we want and have to do: obeying the two rules of ifstream iteration (ideal solution snippets)</h2><p>We want to iteratively process the lines read from a file by means of an <code>ifstream</code> (reasons <a
href="http://www.parashift.com/c++-faq-lite/input-output.html#faq-15.1">here</a>). Therefore, we <strong>try</strong> to open a file by invoking <code>ifstream f (&quot;file&quot;)</code>. To <strong>try</strong> to get a line from the file, we use <code>std::getline(f, line)</code>, where <code>line</code> is a <code>std::string</code> in which the data is stored. In a loop we want to process the data read from the file, line by line, via <code>process(line)</code>. Of course, we only want to call <code>process(line)</code> if the last <code>getline()</code> stored meaningful data in <code>line</code>. We also want to get the <strong>last</strong> line of the file, even if it is not terminated by a newline character.</p><p>After the investigation described below, I am pretty sure that the simplest rock-solid language construct for this task is:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="cpp"><pre class="de1">string line<span class="sy4">;</span>
ifstream f <span class="br0">&#40;</span><span class="st0">&quot;file&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
<span class="kw1">while</span><span class="br0">&#40;</span>getline<span class="br0">&#40;</span>f, line<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
    process<span class="br0">&#40;</span><span class="sy3">&amp;</span>line<span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="br0">&#125;</span></pre></div></div></div></div></div></div></div><p>This construct is both simple and robust, because it is the shortest approach that obeys the <strong>two basic rules we must follow</strong> when using <code>std::getline()</code> (or any other IO operation):</p><ul><li><strong>Before</strong> processing data obtained from the stream, check for errors reported by <code>getline()</code> (or other IO operations).</li></ul><ul><li><strong>If <code>getline()</code> (or &#8230;) has set <code>failbit</code> or <code>badbit</code>, do not process the data. <code>eofbit</code> does not need to be checked in the loop and does not necessarily have to prevent data processing</strong>.</li></ul><p>The origin of these rules will become clearer while reading the rest of the article.</p><p>Only two rules to follow: isn&#8217;t that easy? Still, they are often ignored, as you can infer from the links in the introduction. In fact, not following these rules led to the bug in POV-Ray that is also mentioned in the introduction.</p><p>How does the simple code snippet above follow these rules? The loop first tries to obtain data from the stream via the IO operation <code>getline()</code>. It is totally okay to try this even on a bad/empty/non-existing file, because it just <strong>tries</strong> and afterwards sets the stream&#8217;s error bits correctly, as defined <a
href="http://www.cplusplus.com/reference/string/getline/">here</a>. <strong>After</strong> <code>getline()</code>, <code>failbit</code> and <code>badbit</code> are checked via the ifstream&#8217;s <a
href="http://www.cplusplus.com/reference/iostream/ios/operatornot/">bool operator</a> (this works because <code>getline()</code> returns the stream object). Only if neither of these bits is set can you be sure that <code>getline()</code> stored a valid line in <code>line</code>. In this case the loop body is evaluated and processes the data obtained from the stream. Then the loop tries to read the next line&#8230; and so on. <strong>The point is that for each iteration the chronological order of IO operation, error check, and data processing is correct.</strong> If you wonder why we do not have to check the <code>eofbit</code> within the loop: this will be answered in the discussion below.</p><p>Now, what happens if the file is empty? Or if it does not even exist? If it is a directory? If we are not allowed to access it? <strong>The code snippet above deals with all of these error cases transparently: the loop body is simply never entered. So far, so good. But what can we do if we additionally want meaningful error messages?</strong> This is the best we can do:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="cpp"><pre class="de1">string line<span class="sy4">;</span>
ifstream f <span class="br0">&#40;</span><span class="st0">&quot;file&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
<span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy3">!</span>f.<span class="me1">is_open</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;error while opening file&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
<span class="kw1">while</span><span class="br0">&#40;</span>getline<span class="br0">&#40;</span>f, line<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
    process<span class="br0">&#40;</span><span class="sy3">&amp;</span>line<span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="br0">&#125;</span>
<span class="kw1">if</span> <span class="br0">&#40;</span>f.<span class="me1">bad</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;error while reading file&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span></pre></div></div></div></div></div></div></div><p>Why? That is discussed in the next part.</p><h2>How to catch errors specifically? Testing ifstream&#8217;s behavior</h2><p>Let me start with</p><h3>Two important things to know:</h3><ul><li>Consider a call to <code>std::getline()</code> that detects the end of the file. It then sets <code>eofbit</code>. But: &#8220;Notice that some eofbit cases will also set failbit.&#8221; (<a
href="http://www.cplusplus.com/reference/string/getline/">reference</a>). This will be very important, and we will figure out exactly in which cases only <code>eofbit</code> is set and in which cases both <code>eofbit</code> and <code>failbit</code> are set.</li></ul><ul><li><code>perror()</code> evaluates the current value of <code>errno</code> and prints a corresponding error message. <code>errno</code> is a global error variable which is set by low-level functions of your operating system. The <code>errno</code> value is sticky: it persists until the next error occurs and overwrites it. Therefore, <code>perror()</code> must only be called in a context in which <code>errno</code> is certain to have been updated right before. Otherwise, the printed error message may not make any sense at all in the current context.</li></ul><p>As you can already imagine, providing meaningful error messages requires understanding exactly when <code>eofbit</code>, <code>failbit</code>, and <code>badbit</code> are set. One also has to know exactly when it is safe to call <code>perror()</code> in the context of stream methods. Unfortunately, at this point we enter system-dependent territory, and the documentation is poor. Hence, I implemented a test to see how my test system, a Linux 2.6.27 machine, behaves. I provide all source files, so it is very easy for you to reproduce this test on your system.</p><h3>The test:</h3><p>The test can be summarized as follows:<br
/> Consider <code>ifstream s (&quot;file&quot;)</code>. The test checks the state of</p><ul><li><code>s.is_open()</code></li><li><code>s.fail()</code> (same as <code>!s</code>: checks for <code>failbit</code> and <code>badbit</code>)</li><li><code>s.bad()</code> (checks for <code>badbit</code> only)</li><li><code>s.eof()</code> (checks for <code>eofbit</code> only)</li><li><code>errno</code> (evaluated via <code>perror()</code>)</li></ul><p>while opening/reading</p><ul><li>a non-existent file</li><li>an empty file</li><li>an existing file with content</li><li>an existing file whose last line is <strong>not</strong> terminated by a newline character (this can be considered an invalid file format, since lines are usually considered to be newline-terminated, not newline-separated)</li><li>a file with content that is opened by another process for reading</li><li>a file with content that is opened by another process for writing</li><li>a file that the test program has no access to</li><li>a directory</li></ul><p>Basically, the test evaluates the named quantities at all interesting points and especially after calls to <code>std::getline()</code>.</p><p>Technically, the test consists of:</p><ul><li>The C++ source of a test program with a lot of debug output, expecting an input filename as its first command-line argument.</li></ul><ul><li>A shell script that compiles the C++ source code of the test program and sets up the test files. It then runs the test program for each input filename.</li></ul><p>This is the source of the shell script <code>readfile_tests.sh</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="co0">#!/bin/bash</span>
&nbsp;
<span class="re2">COMPILATION_SOURCE</span>=<span class="re4">$1</span>
<span class="re2">NE_FILE</span>=<span class="st0">&quot;na&quot;</span>
<span class="re2">EMPTY_FILE</span>=<span class="st0">&quot;empty_file&quot;</span>
<span class="re2">ONE_LINE_FILE</span>=<span class="st0">&quot;one_line_file&quot;</span>
<span class="re2">INVALID_LINE_FILE</span>=<span class="st0">&quot;invalid_line_file&quot;</span>
<span class="re2">FILE_READ</span>=<span class="st0">&quot;file_read&quot;</span>
<span class="re2">FILE_WRITTEN</span>=<span class="st0">&quot;file_written&quot;</span>
<span class="re2">FILE_DENIED</span>=<span class="st0">&quot;/root/.bashrc&quot;</span>
<span class="re2">DIR</span>=<span class="st0">&quot;dir&quot;</span>
&nbsp;
<span class="co0"># compile test program, resulting in a.out executable</span>
<span class="kw2">g++</span> <span class="re1">$COMPILATION_SOURCE</span>
&nbsp;
<span class="co0"># create test files / directories and put them in the desired state</span>
<span class="kw2">touch</span> <span class="re1">$EMPTY_FILE</span>
<span class="kw1">if</span> <span class="br0">&#91;</span><span class="br0">&#91;</span> <span class="sy0">!</span> <span class="re5">-d</span> <span class="re1">$DIR</span> <span class="br0">&#93;</span><span class="br0">&#93;</span>; <span class="kw1">then</span>
    <span class="kw2">mkdir</span> <span class="re1">$DIR</span>
<span class="kw1">fi</span>
<span class="kw3">echo</span> <span class="st0">&quot;rofl&quot;</span> <span class="sy0">&gt;</span> <span class="re1">$ONE_LINE_FILE</span>
<span class="kw3">echo</span> <span class="re5">-ne</span> <span class="st0">&quot;validline<span class="es1">\n</span>invalidline&quot;</span> <span class="sy0">&gt;</span> <span class="re1">$INVALID_LINE_FILE</span>
<span class="kw3">echo</span> <span class="st0">&quot;i am opened to read from&quot;</span> <span class="sy0">&gt;</span> <span class="re1">$FILE_READ</span>
python <span class="re5">-c</span> <span class="st_h">'import time; f = open(&quot;'</span><span class="re1">$FILE_READ</span><span class="st_h">'&quot;); time.sleep(4)'</span> <span class="sy0">&amp;</span>
<span class="kw3">echo</span> <span class="st0">&quot;i am opened to write to&quot;</span> <span class="sy0">&gt;</span> <span class="re1">$FILE_WRITTEN</span>
python <span class="re5">-c</span> <span class="st_h">'import time; f = open(&quot;'</span><span class="re1">$FILE_WRITTEN</span><span class="st_h">'&quot;, &quot;a&quot;); time.sleep(4)'</span> <span class="sy0">&amp;</span>
&nbsp;
<span class="co0"># execute test cases</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on non-existent file..&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$NE_FILE</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on empty file..&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$EMPTY_FILE</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on valid file with one line content&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$ONE_LINE_FILE</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on a file with one valid and one invalid line&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$INVALID_LINE_FILE</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on a file that is read by another process&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$FILE_READ</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on a file that is written to by another process&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$FILE_WRITTEN</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on a /root/.bashrc (access should be denied)&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$FILE_DENIED</span>
<span class="kw3">echo</span>
<span class="kw3">echo</span> <span class="st0">&quot;******** testing on a directory&quot;</span>
.<span class="sy0">/</span>a.out <span class="re1">$DIR</span></pre></div></div></div></div></div></div></div><p>This is the source of the C++ program <code>readfile_debug.cpp</code>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="cpp"><pre class="de1"><span class="co2">#include &lt;iostream&gt;</span>
<span class="co2">#include &lt;fstream&gt;</span>
<span class="co2">#include &lt;string&gt;</span>
<span class="kw2">using</span> <span class="kw2">namespace</span> std<span class="sy4">;</span>
&nbsp;
&nbsp;
<span class="kw4">int</span> check_error_bits<span class="br0">&#40;</span>ifstream<span class="sy2">*</span> f<span class="br0">&#41;</span> <span class="br0">&#123;</span>
    <span class="kw4">int</span> stop <span class="sy1">=</span> <span class="nu0">0</span><span class="sy4">;</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span>f<span class="sy2">-</span><span class="sy1">&gt;</span>eof<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;stream eofbit. error state&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
        <span class="co1">// EOF after std::getline() is not the criterion to stop processing</span>
        <span class="co1">// data: In case there is data between the last delimiter and EOF,</span>
        <span class="co1">// getline() extracts it and sets the eofbit.</span>
        stop <span class="sy1">=</span> <span class="nu0">0</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span>f<span class="sy2">-</span><span class="sy1">&gt;</span>fail<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;stream failbit (or badbit). error state&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
        stop <span class="sy1">=</span> <span class="nu0">1</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span>f<span class="sy2">-</span><span class="sy1">&gt;</span>bad<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;stream badbit. error state&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
        stop <span class="sy1">=</span> <span class="nu0">1</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="kw1">return</span> stop<span class="sy4">;</span>
    <span class="br0">&#125;</span>
&nbsp;
&nbsp;
<span class="kw4">int</span> main<span class="br0">&#40;</span><span class="kw4">int</span> argc, <span class="kw4">char</span><span class="sy2">*</span> argv<span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
    string line<span class="sy4">;</span>
    <span class="kw4">int</span> getlinecount <span class="sy1">=</span> <span class="nu0">1</span><span class="sy4">;</span>
    <span class="kw1">if</span><span class="br0">&#40;</span>argc <span class="sy3">!</span><span class="sy1">=</span> <span class="nu0">2</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">cerr</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;provide one argument&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        <span class="kw1">return</span> <span class="nu0">1</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* trying to open and read: &quot;</span> <span class="sy1">&lt;&lt;</span> argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    ifstream f <span class="br0">&#40;</span>argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;error state after ifstream constructor&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy3">!</span>f.<span class="me1">is_open</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="st0">&quot;is_open() returned false. error state&quot;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">else</span>
        <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;is_open() returned true.&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* checking error bits once before first getline&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    check_error_bits<span class="br0">&#40;</span><span class="sy3">&amp;</span>f<span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">while</span><span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* perform getline() # &quot;</span> <span class="sy1">&lt;&lt;</span> getlinecount <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        getline<span class="br0">&#40;</span>f, line<span class="br0">&#41;</span><span class="sy4">;</span>
        <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* checking error bits after getline&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        <span class="kw1">if</span> <span class="br0">&#40;</span>check_error_bits<span class="br0">&#40;</span><span class="sy3">&amp;</span>f<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
            <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* skip operation on data, break loop&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
            <span class="kw1">break</span><span class="sy4">;</span>
            <span class="br0">&#125;</span>
        <span class="co1">// This is the actual operation on the data obtained and we want to</span>
        <span class="co1">// protect it from errors during the last IO operation on the stream</span>
        <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;data line &quot;</span> <span class="sy1">&lt;&lt;</span> getlinecount <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;: &quot;</span> <span class="sy1">&lt;&lt;</span> line <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        getlinecount<span class="sy2">++</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    f.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">return</span> <span class="nu0">0</span><span class="sy4">;</span>
    <span class="br0">&#125;</span></pre></div></div></div></div></div></div></div><p>Let&#8217;s run it:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="co4">$ </span>.<span class="sy0">/</span>readfile_tests.sh readfile_debug.cpp</pre></div></div></div></div></div></div></div><p>The output:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">******** testing on non-existent file..
* trying to open and read: na
error state after ifstream constructor: No such file or directory
is_open() returned false. error state: No such file or directory
* checking error bits once before first getline
stream failbit (or badbit). error state: No such file or directory
* perform getline() # 1
* checking error bits after getline
stream failbit (or badbit). error state: No such file or directory
* skip operation on data, break loop
&nbsp;
******** testing on empty file..
* trying to open and read: empty_file
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
stream eofbit. error state: Success
stream failbit (or badbit). error state: Success
* skip operation on data, break loop
&nbsp;
******** testing on valid file with one line content
* trying to open and read: one_line_file
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
data line 1: rofl
* perform getline() # 2
* checking error bits after getline
stream eofbit. error state: Success
stream failbit (or badbit). error state: Success
* skip operation on data, break loop
&nbsp;
******** testing on a file with one valid and one invalid line
* trying to open and read: invalid_line_file
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
data line 1: validline
* perform getline() # 2
* checking error bits after getline
stream eofbit. error state: Success
data line 2: invalidline
* perform getline() # 3
* checking error bits after getline
stream eofbit. error state: Success
stream failbit (or badbit). error state: Success
* skip operation on data, break loop
&nbsp;
******** testing on a file that is read by another process
* trying to open and read: file_read
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
data line 1: i am opened to read from
* perform getline() # 2
* checking error bits after getline
stream eofbit. error state: Success
stream failbit (or badbit). error state: Success
* skip operation on data, break loop
&nbsp;
******** testing on a file that is written to by another process
* trying to open and read: file_written
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
data line 1: i am opened to write to
* perform getline() # 2
* checking error bits after getline
stream eofbit. error state: Success
stream failbit (or badbit). error state: Success
* skip operation on data, break loop
&nbsp;
******** testing on a /root/.bashrc (access should be denied)
* trying to open and read: /root/.bashrc
error state after ifstream constructor: Permission denied
is_open() returned false. error state: Permission denied
* checking error bits once before first getline
stream failbit (or badbit). error state: Permission denied
* perform getline() # 1
* checking error bits after getline
stream failbit (or badbit). error state: Permission denied
* skip operation on data, break loop
&nbsp;
******** testing on a directory
* trying to open and read: dir
error state after ifstream constructor: Success
is_open() returned true.
* checking error bits once before first getline
* perform getline() # 1
* checking error bits after getline
stream failbit (or badbit). error state: Is a directory
stream badbit. error state: Is a directory
* skip operation on data, break loop</pre></div></div></div></div></div></div></div><h3>The test results (important things to know 2):</h3><p>There are many things to learn from this output. The following conclusions are only a subset. All of this makes a lot of sense:</p><ul><li>The <code>ifstream s (&quot;file&quot;)</code> constructor sets <code>errno</code> if the file cannot be opened (in the test: non-existent file and permission denied).</li></ul><ul><li><code>is_open()</code> does not set <code>errno</code>.</li></ul><ul><li><code>is_open()</code> does not catch the case of trying to open a directory.</li></ul><ul><li><code>is_open()</code> only catches failures to open the file: in the test, the non-existent file and the file without access permission.</li></ul><p><strong>-> <code>is_open()</code> after the <code>ifstream</code> constructor in combination with <code>perror()</code> works, but only for catching failures to open the file (non-existent file, permission denied), not for the directory case.</strong></p><ul><li><code>getline()</code> only changes <code>errno</code> when trying to read from a directory. In all other test cases it does not change <code>errno</code>.</li></ul><ul><li>In almost every case, the <code>eofbit</code> has been set <strong>at the same time</strong> as the <code>failbit</code> (which may happen, as stated above). A closer look reveals that, in these tests, the <code>failbit</code> is only set by <code>getline()</code> if it did not manage to extract any data at all. 
The <code>eofbit</code>, on the other hand, means that <code>getline()</code> reached EOF while searching for the next line delimiter: if there is data between the last delimiter and EOF, <code>getline()</code> extracts this data and sets <code>eofbit</code>.</li></ul><ul><li>The <code>badbit</code> was only set for the directory case.</li></ul><p><strong>-> <code>!s</code> or <code>s.fail()</code> (equivalent, both are sensitive to <code>badbit</code> and <code>failbit</code>) must not be used in combination with <code>perror()</code>, because <code>perror()</code> may then print error messages which are wrong in the current context.</strong></p><p><strong>-> Only a <code>badbit</code> check, and therefore <code>s.bad()</code>, seems to be safe to use together with <code>perror()</code>.</strong></p><p><strong>-> In order to process residual data between the last line delimiter and EOF, a set <code>eofbit</code> must not prevent data processing.</strong></p><h2><a
name="ideal">Ideal solutions</a></h2><p>With the knowledge from above, ideal code solutions in the form of ready-to-compile examples can be proposed for two cases:</p><ul><li>one in which error messages are not important</li></ul><ul><li>one in which we do the best we can to extract error messages</li></ul><h3>Ideal solution including meaningful error messages</h3><p>Basically, the result of the test above is that, with C++&#8217;s standard means, there is not much we can do to catch specific errors. The following source of <code>readfile_stable_errors.cpp</code> seems to be the best:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="cpp"><pre class="de1"><span class="co2">#include &lt;iostream&gt;</span>
<span class="co2">#include &lt;fstream&gt;</span>
<span class="co2">#include &lt;string&gt;</span>
<span class="kw2">using</span> <span class="kw2">namespace</span> std<span class="sy4">;</span>
&nbsp;
&nbsp;
<span class="kw4">void</span> process<span class="br0">&#40;</span>string<span class="sy2">*</span> line<span class="br0">&#41;</span> <span class="br0">&#123;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;line read: &quot;</span> <span class="sy1">&lt;&lt;</span> <span class="sy2">*</span>line <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    <span class="br0">&#125;</span>
&nbsp;
&nbsp;
<span class="kw4">int</span> main<span class="br0">&#40;</span><span class="kw4">int</span> argc, <span class="kw4">char</span><span class="sy2">*</span> argv<span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
    string line<span class="sy4">;</span>
    <span class="kw1">if</span><span class="br0">&#40;</span>argc <span class="sy3">!</span><span class="sy1">=</span> <span class="nu0">2</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">cerr</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;provide one argument&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        <span class="kw1">return</span> <span class="nu0">1</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    string filename<span class="br0">&#40;</span>argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* trying to open and read: &quot;</span> <span class="sy1">&lt;&lt;</span> filename <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    ifstream f <span class="br0">&#40;</span>argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="co1">// After this opening attempt, we can safely use perror() in case </span>
    <span class="co1">// f.is_open() returns false.</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy3">!</span>f.<span class="me1">is_open</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="br0">&#40;</span><span class="st0">&quot;error while opening file &quot;</span> <span class="sy2">+</span> filename<span class="br0">&#41;</span>.<span class="me1">c_str</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="co1">// Now read the file via std::getline, while trying to provide as</span>
    <span class="co1">// meaningful error messages as possible. Rules obeyed:</span>
    <span class="co1">//   - first the IO operation, then error check, then data processing</span>
    <span class="co1">//   - failbit and badbit prevent data processing, eofbit does not</span>
    <span class="kw1">while</span><span class="br0">&#40;</span>getline<span class="br0">&#40;</span>f, line<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        process<span class="br0">&#40;</span><span class="sy3">&amp;</span>line<span class="br0">&#41;</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="co1">// Only in case of the badbit, we can assume that some lower</span>
    <span class="co1">// level system API function has set errno. Then, perror() can be used.</span>
    <span class="kw1">if</span> <span class="br0">&#40;</span>f.<span class="me1">bad</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="kw3">perror</span><span class="br0">&#40;</span><span class="br0">&#40;</span><span class="st0">&quot;error while reading file &quot;</span> <span class="sy2">+</span> filename<span class="br0">&#41;</span>.<span class="me1">c_str</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    f.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">return</span> <span class="nu0">0</span><span class="sy4">;</span>
    <span class="br0">&#125;</span></pre></div></div></div></div></div></div></div><p>Of course this can be run against the test shellscript from above:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">.<span class="sy0">/</span>readfile_tests.sh readfile_stable_errors.cpp</pre></div></div></div></div></div></div></div><p>The output:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">******** testing on non-existent file..
* trying to open and read: na
error while opening file na: No such file or directory
&nbsp;
******** testing on empty file..
* trying to open and read: empty_file
&nbsp;
******** testing on valid file with one line content
* trying to open and read: one_line_file
line read: rofl
&nbsp;
******** testing on a file with one valid and one invalid line
* trying to open and read: invalid_line_file
line read: validline
line read: invalidline
&nbsp;
******** testing on a file that is read by another process
* trying to open and read: file_read
line read: i am opened to read from
&nbsp;
******** testing on a file that is written to by another process
* trying to open and read: file_written
line read: i am opened to write to
&nbsp;
******** testing on a /root/.bashrc (access should be denied)
* trying to open and read: /root/.bashrc
error while opening file /root/.bashrc: Permission denied
&nbsp;
******** testing on a directory
* trying to open and read: dir
error while reading file dir: Is a directory</pre></div></div></div></div></div></div></div><p>Congratulations, &#8220;no such file or directory&#8221;, &#8220;is a directory&#8221;, and &#8220;permission denied&#8221; are caught. Also, the data in the &#8220;invalid&#8221; line was read.</p><h3>Ideal solution without printing error messages:</h3><p>The following source of <code>readfile_stable_no_errors.cpp</code> deals with all errors transparently and extracts residual data from an &#8220;invalid&#8221; last line:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="cpp"><pre class="de1"><span class="co2">#include &lt;iostream&gt;</span>
<span class="co2">#include &lt;fstream&gt;</span>
<span class="co2">#include &lt;string&gt;</span>
<span class="kw2">using</span> <span class="kw2">namespace</span> std<span class="sy4">;</span>
&nbsp;
&nbsp;
<span class="kw4">void</span> process<span class="br0">&#40;</span>string<span class="sy2">*</span> line<span class="br0">&#41;</span> <span class="br0">&#123;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;line read: &quot;</span> <span class="sy1">&lt;&lt;</span> <span class="sy2">*</span>line <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    <span class="br0">&#125;</span>
&nbsp;
&nbsp;
<span class="kw4">int</span> main<span class="br0">&#40;</span><span class="kw4">int</span> argc, <span class="kw4">char</span><span class="sy2">*</span> argv<span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
    string line<span class="sy4">;</span>
    <span class="kw1">if</span><span class="br0">&#40;</span>argc <span class="sy3">!</span><span class="sy1">=</span> <span class="nu0">2</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        <span class="kw3">cerr</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;provide one argument&quot;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
        <span class="kw1">return</span> <span class="nu0">1</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    <span class="kw3">cout</span> <span class="sy1">&lt;&lt;</span> <span class="st0">&quot;* trying to open and read: &quot;</span> <span class="sy1">&lt;&lt;</span> argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span> <span class="sy1">&lt;&lt;</span> endl<span class="sy4">;</span>
    ifstream f <span class="br0">&#40;</span>argv<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="co1">// Note that we can omit checking for f.is_open(), because</span>
    <span class="co1">// all errors will be caught correctly by f.fail() (!f) and</span>
    <span class="co1">// we do not want to print error messages here.</span>
    <span class="co1">// Also note that during the loop, the following rules are obeyed:</span>
    <span class="co1">//   - first the IO operation, then error check, then data processing</span>
    <span class="co1">//   - failbit and badbit prevent data processing, eofbit does not</span>
    <span class="kw1">while</span><span class="br0">&#40;</span>getline<span class="br0">&#40;</span>f, line<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
        process<span class="br0">&#40;</span><span class="sy3">&amp;</span>line<span class="br0">&#41;</span><span class="sy4">;</span>
        <span class="br0">&#125;</span>
    f.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy4">;</span>
    <span class="kw1">return</span> <span class="nu0">0</span><span class="sy4">;</span>
    <span class="br0">&#125;</span></pre></div></div></div></div></div></div></div><p>The test:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">.<span class="sy0">/</span>readfile_tests.sh readfile_stable_no_errors.cpp</pre></div></div></div></div></div></div></div><p>Output:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">******** testing on non-existent file..
* trying to open and read: na
&nbsp;
******** testing on empty file..
* trying to open and read: empty_file
&nbsp;
******** testing on valid file with one line content
* trying to open and read: one_line_file
line read: rofl
&nbsp;
******** testing on a file with one valid and one invalid line
* trying to open and read: invalid_line_file
line read: validline
line read: invalidline
&nbsp;
******** testing on a file that is read by another process
* trying to open and read: file_read
line read: i am opened to read from
&nbsp;
******** testing on a file that is written to by another process
* trying to open and read: file_written
line read: i am opened to write to
&nbsp;
******** testing on a /root/.bashrc (access should be denied)
* trying to open and read: /root/.bashrc
&nbsp;
******** testing on a directory
* trying to open and read: dir</pre></div></div></div></div></div></div></div><p>The intention of this program is to never fail due to stream opening/IO errors. This succeeds: whenever there is data to extract, it extracts data. All other cases are transparently handled correctly.</p><p>Remember, all code shown here can be <a
href="http://gehrcke.de/files/perm/20110706_cpp_readfile_test.tar.gz">downloaded</a>.</p><p>I hope that this article helps someone. Please let me know if I have to correct some things and if we can do better (thanks again to Alexandre at this point).</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/reading-files-in-c-using-ifstream-dealing-correctly-with-badbit-failbit-eofbit-and-perror/feed/</wfw:commentRss> <slash:comments>12</slash:comments> </item> <item><title>PyMOL: Remove hydrogens and water</title><link>http://gehrcke.de/2011/06/pymol-remove-hydrogens-and-water/</link> <comments>http://gehrcke.de/2011/06/pymol-remove-hydrogens-and-water/#comments</comments> <pubDate>Mon, 20 Jun 2011 08:48:28 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[PhD thesis]]></category> <category><![CDATA[PyMOL]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1625</guid> <description><![CDATA[<p>It took me much too long to find a proper solution while searching for &#8220;pymol remove hydrogens&#8221; and &#8220;pymol remove water&#8221;. It&#8217;s pretty easy&#8230;</p><p>Although there is a command for adding hydrogens (h_add), there is no counterpart h_remove. To remove hydrogens, use a selection such as</p> remove (hydro)<p>or</p> remove hydrogens<p>The same works for [...]]]></description> <content:encoded><![CDATA[<p>It took me much too long to find a proper solution while searching for &#8220;<strong>pymol remove hydrogens</strong>&#8221; and &#8220;<strong>pymol remove water</strong>&#8221;. It&#8217;s pretty easy&#8230;<span
id="more-1625"></span></p><p>Although there is a command for adding hydrogens (<code>h_add</code>), there is no counterpart <code>h_remove</code>. To remove <strong>hydrogens</strong>, use a selection such as</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">remove (hydro)</pre></div></div></div></div></div></div></div><p>or</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">remove hydrogens</pre></div></div></div></div></div></div></div><p>The same works for <strong>water</strong>. You have at least the following options:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">remove resn hoh</pre></div></div></div></div></div></div></div><p>or, to remove <strong>solvent</strong> more generally:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">remove solvent</pre></div></div></div></div></div></div></div> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/pymol-remove-hydrogens-and-water/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Flash socket policy server in Python based on gevent</title><link>http://gehrcke.de/2011/06/flash-socket-policy-server-in-python-based-on-gevent/</link> <comments>http://gehrcke.de/2011/06/flash-socket-policy-server-in-python-based-on-gevent/#comments</comments> <pubDate>Sat, 18 Jun 2011 16:25:30 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Python]]></category> <category><![CDATA[Technical Stuff]]></category> <category><![CDATA[WebSockets]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1612</guid> <description><![CDATA[<p>Currently, I am looking into gevent &#8212; a nicely performing networking library for Python, based on the brilliant idea of greenlets. I try to use this for WebSocket communication with browsers. As of today, WebSockets are still not available or disabled in some browsers. That&#8217;s why there are Javascript implementations like web-socket-js providing a transparent [...]]]></description> <content:encoded><![CDATA[<p>Currently, I am looking into <a
href="http://www.gevent.org/">gevent</a> &#8212; a <a
href="http://nichol.as/benchmark-of-python-web-servers">nicely performing</a> networking library for Python, based on the<a
href="http://www.stackless.com/pipermail/stackless-dev/2004-March/000022.html"> brilliant idea</a> of <a
href="http://packages.python.org/greenlet/">greenlets</a>. I try to use this for <a
href="http://de.wikipedia.org/wiki/WebSocket">WebSocket</a> communication with browsers. As of today, WebSockets are still not available or <a
href="http://hacks.mozilla.org/2010/12/websockets-disabled-in-firefox-4/">disabled</a> in some browsers. That&#8217;s why there are Javascript implementations like <a
href="https://github.com/gimite/web-socket-js">web-socket-js</a> providing a transparent &#8220;Flashbridge&#8221; as a fallback for the WebSocket communication. But the Flash plugin in the client&#8217;s browser does not simply communicate with any server on any port in the world. As a first step, it always tries to receive a so-called <a
href="http://www.adobe.com/devnet/flashplayer/articles/fplayer9_security.html">socket policy file</a> on port 843 from the same server the SWF file was received from. This policy file tells the client <a
href="http://www.adobe.com/devnet/flashplayer/articles/cross_domain_policy.html">from whom it is allowed to receive data</a>. Today, I present a simple gevent-based server that provides this policy file to connecting Flash clients.<span
id="more-1612"></span></p><p>By using gevent, you automatically get one of the highest performing Python servers available for this purpose. Thanks to the greenlet concept, it can handle a huge number of concurrent connections without any threading overhead. According to benchmarks <a
href="http://oddments.org/?p=494">like this</a>, the server should be faster than comparable Python solutions (based on e.g. <a
href="http://twistedmatrix.com">Twisted</a> or <a
href="http://eventlet.net/">Eventlet</a>) while almost reaching the level of optimized C/C++ solutions. Of course, all this performance is not really necessary for a Flash policy server, but it is good to know what we are dealing with.</p><p>Now, the code. It is already &#8220;bloated&#8221; with comments and logging functionality. Due to the use of gevent, the server code itself is very simple:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1"><span class="co1">#!/usr/bin/env python</span>
&nbsp;
<span class="co1"># Flash access policy server based on gevent</span>
<span class="co1"># Jan-Philip Gehrcke, June 2011</span>
&nbsp;
<span class="co1"># Listen on port 843; send access policy to client; disconnect.</span>
&nbsp;
&nbsp;
<span class="kw1">from</span> gevent.<span class="me1">server</span> <span class="kw1">import</span> StreamServer
<span class="kw1">import</span> <span class="kw3">datetime</span>
<span class="kw1">import</span> <span class="kw3">socket</span>
&nbsp;
&nbsp;
<span class="co1"># Should we log something? Where to?</span>
LOG <span class="sy0">=</span> <span class="nu0">1</span>
LOGFILE <span class="sy0">=</span> <span class="st0">&quot;flash_access_policy_server.log&quot;</span>
&nbsp;
<span class="co1"># The policy that is sent to the clients.</span>
POLICY <span class="sy0">=</span> <span class="st0">&quot;&quot;&quot;&lt;cross-domain-policy&gt;&lt;allow-access-from domain=&quot;*&quot; to-ports=&quot;*&quot; /&gt;&lt;/cross-domain-policy&gt;<span class="es0">\0</span>&quot;&quot;&quot;</span>
&nbsp;
<span class="co1"># The string the client has to send in order to receive the policy.</span>
POLICYREQUEST <span class="sy0">=</span> <span class="st0">&quot;&lt;policy-file-request/&gt;&quot;</span>
&nbsp;
&nbsp;
<span class="co1"># This function is called for each incoming connection</span>
<span class="co1"># (in a non-blocking fashion in a greenlet)</span>
<span class="kw1">def</span> client_handle<span class="br0">&#40;</span>sock<span class="sy0">,</span> address<span class="br0">&#41;</span>:
    log<span class="br0">&#40;</span><span class="st0">&quot;%s:%s: Connection accepted.&quot;</span> % address<span class="br0">&#41;</span>
    <span class="co1"># send and read functions should not wait longer than three seconds</span>
    sock.<span class="me1">settimeout</span><span class="br0">&#40;</span><span class="nu0">3</span><span class="br0">&#41;</span>
    <span class="kw1">try</span>:
        <span class="co1"># try to receive at most 128 bytes (`POLICYREQUEST` is shorter)</span>
        <span class="kw2">input</span> <span class="sy0">=</span> sock.<span class="me1">recv</span><span class="br0">&#40;</span><span class="nu0">128</span><span class="br0">&#41;</span>
        <span class="kw1">if</span> <span class="kw2">input</span>.<span class="me1">startswith</span><span class="br0">&#40;</span>POLICYREQUEST<span class="br0">&#41;</span>:
            sock.<span class="me1">sendall</span><span class="br0">&#40;</span>POLICY<span class="br0">&#41;</span>
            log<span class="br0">&#40;</span><span class="st0">&quot;%s:%s: Policy sent. Closing connection.&quot;</span> % address<span class="br0">&#41;</span>
        <span class="kw1">else</span>:
            log<span class="br0">&#40;</span><span class="st0">&quot;%s:%s: Crap received. Closing connection.&quot;</span> % address<span class="br0">&#41;</span>
    <span class="kw1">except</span> <span class="kw3">socket</span>.<span class="me1">timeout</span>:
        log<span class="br0">&#40;</span><span class="st0">&quot;%s:%s: Timed out. Closing.&quot;</span> % address<span class="br0">&#41;</span>
    sock.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
&nbsp;
&nbsp;
<span class="co1"># Write `msg` to file and stdout, prepended by a date/time string</span>
<span class="kw1">def</span> log<span class="br0">&#40;</span>msg<span class="br0">&#41;</span>:
    <span class="kw1">if</span> LOG:
        l <span class="sy0">=</span> <span class="st0">&quot;%s: %s&quot;</span> % <span class="br0">&#40;</span><span class="kw3">datetime</span>.<span class="kw3">datetime</span>.<span class="me1">now</span><span class="br0">&#40;</span><span class="br0">&#41;</span>.<span class="me1">isoformat</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">,</span> msg<span class="br0">&#41;</span>
        lf.<span class="me1">write</span><span class="br0">&#40;</span><span class="st0">&quot;%s<span class="es0">\n</span>&quot;</span> % l<span class="br0">&#41;</span>
        <span class="kw1">print</span> l
&nbsp;
&nbsp;
<span class="kw1">if</span> __name__ <span class="sy0">==</span> <span class="st0">'__main__'</span>:
    <span class="kw1">if</span> LOG:
        lf <span class="sy0">=</span> <span class="kw2">open</span><span class="br0">&#40;</span>LOGFILE<span class="sy0">,</span> <span class="st0">&quot;a&quot;</span><span class="br0">&#41;</span>
    server <span class="sy0">=</span> StreamServer<span class="br0">&#40;</span><span class="br0">&#40;</span><span class="st0">'0.0.0.0'</span><span class="sy0">,</span> <span class="nu0">843</span><span class="br0">&#41;</span><span class="sy0">,</span> client_handle<span class="br0">&#41;</span>
    log<span class="br0">&#40;</span><span class="st0">'Starting server...'</span><span class="br0">&#41;</span>
    server.<span class="me1">serve_forever</span><span class="br0">&#40;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div><p>Note that the policy in this case is not very sophisticated and basically allows everything;-)</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/flash-socket-policy-server-in-python-based-on-gevent/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Compile grace with PDF support (via pdflib)</title><link>http://gehrcke.de/2011/06/compile-grace-with-pdf-support-via-pdflib/</link> <comments>http://gehrcke.de/2011/06/compile-grace-with-pdf-support-via-pdflib/#comments</comments> <pubDate>Thu, 09 Jun 2011 12:55:40 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1595</guid> <description><![CDATA[<p>If you build (xm)grace without the presence of libpdf, you won&#8217;t be able to save figures as PDF files. I compiled both &#8212; libpdf and grace &#8212; as non-privileged user:</p> Build libpdf<p>If you want to keep it cheap, you should use the lite version of libpdf. I used version 7.0.5 lite. The build process [...]]]></description> <content:encoded><![CDATA[<p>If you build <a
href="http://plasma-gate.weizmann.ac.il/Grace">(xm)grace</a> without the presence of <a
href="http://www.pdflib.com">libpdf</a>, you won&#8217;t be able to save figures as PDF files. I compiled both &#8212; libpdf and grace &#8212; as a non-privileged user:<span
id="more-1595"></span></p><h5>Build libpdf</h5><p>If you want to keep it cheap, you should use the lite version of libpdf. I used version <a
href="http://www.pdflib.com/binaries/PDFlib/705/PDFlib-Lite-7.0.5.tar.gzf">7.0.5 lite</a>. The build process was straightforward. As the installation prefix, I chose a directory to which I have full write access and where I already keep other self-built software.</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">.<span class="sy0">/</span>configure <span class="re5">--prefix</span>=<span class="sy0">/</span>home<span class="sy0">/</span>bioinfp<span class="sy0">/</span>jang<span class="sy0">/</span>apps
<span class="kw2">make</span>
<span class="kw2">make</span> <span class="kw2">install</span></pre></div></div></div></div></div></div></div><h5>Build grace</h5><p>I chose to build the <a
href="ftp://plasma-gate.weizmann.ac.il/pub/grace/src/grace5/grace-5.1.22.tar.gz">5.1.22 version</a>. Now, we have to make sure that</p><ul><li>the test scripts during <code>configure</code> and</li><li>the compiler and linker</li></ul><p>will find the newly built pdflib (namely, the header and shared object files). Therefore, we will make use of several environment variables. First, store the actual paths, which looked like this for me:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="re2">PDFINCLUDE</span>=<span class="sy0">/</span>home<span class="sy0">/</span>bioinfp<span class="sy0">/</span>jang<span class="sy0">/</span>apps<span class="sy0">/</span>include
<span class="re2">PDFLIB</span>=<span class="sy0">/</span>home<span class="sy0">/</span>bioinfp<span class="sy0">/</span>jang<span class="sy0">/</span>apps<span class="sy0">/</span>lib</pre></div></div></div></div></div></div></div><p>Then, during <code>configure</code>, we will tell the compiler and linker to look there for header and library files via the environment variables <code>CFLAGS</code>, <code>CPPFLAGS</code>, and <code>LDFLAGS</code>. Additionally, we will add the path stored in PDFLIB to the search path for libraries during binary execution via <code>LD_LIBRARY_PATH</code>. This is required already during <code>configure</code> due to some specific testing method for the existence of pdflib. The resulting <code>configure</code> command, as it worked for me, looks like:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1">.<span class="sy0">/</span>configure <span class="re5">--prefix</span>=<span class="sy0">/</span>home<span class="sy0">/</span>bioinfp<span class="sy0">/</span>jang<span class="sy0">/</span>apps<span class="sy0">/</span>grace <span class="re5">--enable-pdfdrv</span> <span class="re2">CPPFLAGS</span>=<span class="st0">&quot;-I <span class="es2">$PDFINCLUDE</span>&quot;</span> <span class="re2">CFLAGS</span>=<span class="st0">&quot;-I <span class="es2">$PDFINCLUDE</span>&quot;</span> <span class="re2">LDFLAGS</span>=<span class="st0">&quot;-L <span class="es2">$PDFLIB</span>&quot;</span> <span class="re2">LD_LIBRARY_PATH</span>=<span class="re1">$PDFLIB</span>:<span class="re1">$LD_LIBRARY_PATH</span></pre></div></div></div></div></div></div></div><p>If <code>configure</code> was successful, compile:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="kw2">make</span></pre></div></div></div></div></div></div></div><p>Test it:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="re2">LD_LIBRARY_PATH</span>=<span class="re1">$PDFLIB</span>:<span class="re1">$LD_LIBRARY_PATH</span> .<span class="sy0">/</span>src<span class="sy0">/</span>xmgrace <span class="re5">-version</span></pre></div></div></div></div></div></div></div><p><strong>Side note:</strong> As you can see here, from now on, whenever you want to start the grace executable, you first have to add the path of libpdf&#8217;s shared object files to the environment variable <code>LD_LIBRARY_PATH</code>. This is because we were not able to install the libpdf library files in a standard location (such as <code>/usr/local/lib</code>). You can read up on this topic <a
href="http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html">here</a>. And <a
href="http://xahlee.org/UnixResource_dir/_/ldpath.html">here</a> you can read why this approach is, in general, not a good one &#8212; but we have no other choice here.</p><p>So, the output of the command above should contain something like this:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">Registered devices:
X11 PostScript EPS PDF MIF SVG PNM JPEG PNG Metafile</pre></div></div></div></div></div></div></div><p>There should definitely be the PDF device now <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> For me it obviously worked, so I proceeded with the installation:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="bash"><pre class="de1"><span class="kw2">make</span> <span class="kw2">install</span></pre></div></div></div></div></div></div></div><p>To automatically set <code>LD_LIBRARY_PATH</code> correctly whenever you call <code>xmgrace</code> (or any of the other executables the installation created &#8212; yes, there are some more: <code>convcal, fdf2fit, gracebat, grconvert</code>), you can follow two approaches:</p><ul><li>edit your <code>~/.bashrc</code> file (or the comparable file for other shells) and change your <code>LD_LIBRARY_PATH</code> globally. This is considered to be the &#8220;worst&#8221; approach, but should work fine under normal circumstances.</li><li>for each executable, create a &#8220;wrapper shell script&#8221; that first sets <code>LD_LIBRARY_PATH</code> and then executes the original executable. Make sure that these wrapper scripts come earlier in <code>PATH</code> than the directory containing the original executables.</li></ul> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/compile-grace-with-pdf-support-via-pdflib/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Rename an object in PyMOL</title><link>http://gehrcke.de/2011/06/rename-an-object-in-pymol/</link> <comments>http://gehrcke.de/2011/06/rename-an-object-in-pymol/#comments</comments> <pubDate>Tue, 07 Jun 2011 13:57:34 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[PhD thesis]]></category> <category><![CDATA[PyMOL]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1529</guid> <description><![CDATA[<p>It has been a long time since my last blog post&#8230; maybe I should just write the small things down. For my PhD, I&#8217;m currently working with PyMOL and was wondering how to rename an object there. So I searched for &#8220;pymol rename object&#8221; and did not find the proper solution on the first page. But there [...]]]></description> <content:encoded><![CDATA[<p>It has been a long time since my last blog post&#8230; maybe I should just write the small things down. For my PhD, I&#8217;m currently working with <a
href="http://www.pymol.org/">PyMOL</a> and was wondering how to rename an object there. So I <del
datetime="2011-06-07T13:41:54+00:00">googled</del> searched for <strong>&#8220;pymol rename object&#8221;</strong> and did not find the proper solution on the first page. But there is one. Maybe this post will show up for the next one searching for &#8220;pymol rename object&#8221;.<span
id="more-1529"></span></p><p>Just use the command <code>set_name</code>, as it is described in the <a
href="http://www.pymolwiki.org/index.php/Set_Name">PyMOL wiki</a>:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">set_name old_name, new_name</pre></div></div></div></div></div></div></div> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/06/rename-an-object-in-pymol/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Average date strings using Python</title><link>http://gehrcke.de/2011/02/average-date-strings-using-python/</link> <comments>http://gehrcke.de/2011/02/average-date-strings-using-python/#comments</comments> <pubDate>Sun, 06 Feb 2011 19:31:23 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Python]]></category> <category><![CDATA[Technical Stuff]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1501</guid> <description><![CDATA[<p>Assume you&#8217;ve measured values over time and now you want to average your data. This means you have to a) average your measured values, which is a trivial task, and b) average your points in time. Here, I present a way to average arbitrary date strings in Python.</p><p>Let&#8217;s talk about a [...]]]></description> <content:encoded><![CDATA[<p>Assume you&#8217;ve measured values over time and now you want to average your data. This means you have to a) average your measured values, which is a trivial task, and <strong>b) average your points in time</strong>. Here, I present a way to average arbitrary date strings in Python.<span
id="more-1501"></span></p><p>Let&#8217;s talk about a specific example. This is the content of <code>dates.dat</code>, our input:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">20110130_195243
20110130_200003
20110130_200803
20110130_200909
20110130_201003
20110130_202004
20110130_203003
20110130_204003
20110130_205003
20110130_210003
20110130_211003
20110130_212004
20110130_213003
20110130_214003
20110130_215003
20110130_220003
20110130_221003</pre></div></div></div></div></div></div></div><p>Each of the 17 lines contains a string representing a point in time in a fixed format (<code>YYYYMMDD_HHMMSS</code>).</p><p><strong>Now, let&#8217;s say that the goal is to compute the mean of every 3 consecutive points in time</strong>. An output date string representing a mean time should have the same format as the date strings in <code>dates.dat</code>. Hence, the output file <code>dates_meanof3values.dat</code> should look like this:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="text"><pre class="de1">20110130_200016
20110130_201305
20110130_204003
20110130_211003
20110130_214003</pre></div></div></div></div></div></div></div><p>These 5 date strings represent the average points in time of the first 5*3 date strings in our input. The remaining 2 input lines do not form a complete group of 3 and are therefore ignored.</p><p>The following Python code accomplishes this:</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><ol><li
class="li1"><pre class="de1"><span class="kw1">import</span> <span class="kw3">time</span></pre></li><li
class="li1"><pre class="de1">&nbsp;</pre></li><li
class="li1"><pre class="de1">meanof <span class="sy0">=</span> <span class="nu0">3</span> <span class="co1"># number of dates taken into account for averaging</span></pre></li><li
class="li1"><pre class="de1">inputfile <span class="sy0">=</span> <span class="kw2">open</span><span class="br0">&#40;</span><span class="st0">'dates.dat'</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">outputfile <span class="sy0">=</span> <span class="kw2">open</span><span class="br0">&#40;</span><span class="st0">&quot;dates_meanof%svalues.dat&quot;</span> % meanof<span class="sy0">,</span><span class="st0">'w'</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">&nbsp;</pre></li><li
class="li1"><pre class="de1"><span class="kw1">def</span> datestring_to_timestamp<span class="br0">&#40;</span><span class="kw2">str</span><span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">    <span class="st0">&quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    Interpret `str` as a time in local time and convert it to a timestamp</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    (time as a floating point number expressed in seconds since the epoch, in</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    UTC) using the format given below.</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    &quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1">    <span class="kw1">return</span> <span class="kw3">time</span>.<span class="me1">mktime</span><span class="br0">&#40;</span><span class="kw3">time</span>.<span class="me1">strptime</span><span class="br0">&#40;</span><span class="kw2">str</span><span class="sy0">,</span> <span class="st0">&quot;%Y%m%d_%H%M%S&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">&nbsp;</pre></li><li
class="li1"><pre class="de1"><span class="kw1">def</span> timestamp_to_datestring<span class="br0">&#40;</span>timestamp<span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">    <span class="st0">&quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    Inverse of the function `datestring_to_timestamp`</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    &quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1">    <span class="kw1">return</span> <span class="kw3">time</span>.<span class="me1">strftime</span><span class="br0">&#40;</span><span class="st0">&quot;%Y%m%d_%H%M%S&quot;</span><span class="sy0">,</span> <span class="kw3">time</span>.<span class="me1">localtime</span><span class="br0">&#40;</span>timestamp<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></li><li
class="li1"><pre class="de1">&nbsp;</pre></li><li
class="li1"><pre class="de1"><span class="kw1">def</span> chunks<span class="br0">&#40;</span><span class="kw2">list</span><span class="sy0">,</span> n<span class="sy0">,</span> strict<span class="sy0">=</span><span class="kw2">False</span><span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">    <span class="st0">&quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    Split `list` in sub-lists: yield successive `n`-sized chunks from `list`.</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    If `strict` is True, the last chunk is only yielded if its length is `n`.</span></pre></li><li
class="li1"><pre class="de1"><span class="st0">    &quot;&quot;&quot;</span></pre></li><li
class="li1"><pre class="de1">    <span class="kw1">for</span> i <span class="kw1">in</span> <span class="kw2">range</span><span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span> <span class="kw2">len</span><span class="br0">&#40;</span><span class="kw2">list</span><span class="br0">&#41;</span><span class="sy0">,</span> n<span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">        <span class="kw1">if</span> <span class="kw1">not</span> strict <span class="kw1">or</span> <span class="kw2">len</span><span class="br0">&#40;</span><span class="kw2">list</span><span class="br0">&#91;</span>i:i+n<span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="sy0">==</span> n:</pre></li><li
class="li1"><pre class="de1">            <span class="kw1">yield</span> <span class="kw2">list</span><span class="br0">&#91;</span>i:i+n<span class="br0">&#93;</span></pre></li><li
class="li1"><pre class="de1">&nbsp;</pre></li><li
class="li1"><pre class="de1"><span class="co1"># read lines from file and strip surrounding whitespace; skip empty lines</span></pre></li><li
class="li1"><pre class="de1">cleanlines <span class="sy0">=</span> <span class="br0">&#91;</span>line.<span class="me1">strip</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw1">for</span> line <span class="kw1">in</span> inputfile.<span class="me1">readlines</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw1">if</span> line.<span class="me1">strip</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#93;</span></pre></li><li
class="li1"><pre class="de1"><span class="co1"># divide data into chunks and iterate over them</span></pre></li><li
class="li1"><pre class="de1"><span class="kw1">for</span> <span class="kw3">chunk</span> <span class="kw1">in</span> chunks<span class="br0">&#40;</span>cleanlines<span class="sy0">,</span> meanof<span class="sy0">,</span> <span class="kw2">True</span><span class="br0">&#41;</span>:</pre></li><li
class="li1"><pre class="de1">    <span class="co1"># build timestamp of each datestring in current chunk and compute the mean</span></pre></li><li
class="li1"><pre class="de1">    mean_timestamp <span class="sy0">=</span> <span class="kw2">sum</span><span class="br0">&#40;</span><span class="kw2">map</span><span class="br0">&#40;</span>datestring_to_timestamp<span class="sy0">,</span> <span class="kw3">chunk</span><span class="br0">&#41;</span><span class="br0">&#41;</span> / meanof</pre></li><li
class="li1"><pre class="de1">    <span class="co1"># convert mean timestamp back to datestring and write this to file</span></pre></li><li
class="li1"><pre class="de1">    outputfile.<span class="me1">write</span><span class="br0">&#40;</span><span class="st0">&quot;%s<span class="es0">\n</span>&quot;</span> % timestamp_to_datestring<span class="br0">&#40;</span>mean_timestamp<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></li></ol></div></div></div></div></div></div></div><p>A timestamp is a plain number of seconds since the epoch, so points in time can be averaged like any other numbers: sum the timestamps and divide by their count. The functions <code>datestring_to_timestamp()</code> and <code>timestamp_to_datestring()</code> convert date strings to timestamps and back, using a user-defined date string format (you can edit lines 13 and 19 according to <a
href="http://docs.python.org/library/time.html#time.strftime">these format specifiers</a>).</p><p>The function <code>chunks()</code>, which makes use of <a
href="http://wiki.python.org/moin/Generators">Python generators</a>, divides a given list into sub-lists (&#8220;chunks&#8221;). This is very useful here: the mean date string of a chunk can then be calculated as simply as</p><div
class="wp-geshi-highlight-wrap5"><div
class="wp-geshi-highlight-wrap4"><div
class="wp-geshi-highlight-wrap3"><div
class="wp-geshi-highlight-wrap2"><div
class="wp-geshi-highlight-wrap"><div
class="wp-geshi-highlight"><div
class="python"><pre class="de1">timestamp_to_datestring<span class="br0">&#40;</span><span class="kw2">sum</span><span class="br0">&#40;</span><span class="kw2">map</span><span class="br0">&#40;</span>datestring_to_timestamp<span class="sy0">,</span> <span class="kw3">chunk</span><span class="br0">&#41;</span><span class="br0">&#41;</span> / meanof<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2011/02/average-date-strings-using-python/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Website update</title><link>http://gehrcke.de/2010/11/website-update/</link> <comments>http://gehrcke.de/2010/11/website-update/#comments</comments> <pubDate>Mon, 22 Nov 2010 00:57:31 +0000</pubDate> <dc:creator>Jan-Philip Gehrcke</dc:creator> <category><![CDATA[Technical Stuff]]></category> <category><![CDATA[Wordpress]]></category> <guid
isPermaLink="false">http://gehrcke.de/?p=1459</guid> <description><![CDATA[<p>The appearance of this website just changed, as you may have noticed.</p><p>I did this today as part of updating the Atahualpa theme from version 3.3.3 to 3.5.3. It was a big undertaking, including a vanilla WordPress test installation, PHP code changes, manual database changes etc. Furthermore, this update introduced some incompatibilities with my existing CSS code. [...]]]></description> <content:encoded><![CDATA[<p>The appearance of this website just changed, as you may have noticed.<br
/> <span
id="more-1459"></span></p><p>I did this today as part of updating the <a
href="http://wordpress.bytesforall.com/?cat=9">Atahualpa theme</a> from version 3.3.3 to 3.5.3. It was a big undertaking, including a <a
href="http://en.wikipedia.org/wiki/Vanilla_software">vanilla</a> WordPress test installation, PHP code changes, manual database changes etc. Furthermore, this update introduced some incompatibilities with my existing CSS code, so I had to adjust that manually, too. While doing so, I optimized the menu structure and made use of the new <a
href="http://en.blog.wordpress.com/2010/05/29/new-custom-menus-feature-and-a-few-others/">customisable WordPress3 menus</a>. And I took the chance to make the page look cleaner. Perhaps better. Hopefully <img
src='http://gehrcke.de/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> I like the blue. But I would have liked green. And pink? <a
href="http://jan.mendzigall.de">Jan</a> said <code>blau = seriös</code> (German for &#8220;blue = respectable&#8221;). Whatever..</p> ]]></content:encoded> <wfw:commentRss>http://gehrcke.de/2010/11/website-update/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>
