<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>CleverBlogName - Technical</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/</link>
    <description>poo propelling primate</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.5.5 - http://www.s9y.org/</generator>
    <pubDate>Fri, 27 Apr 2012 19:53:19 GMT</pubDate>

    <image>
        <url>http://israel.diaspora.gen.nz/~rodgerd/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: CleverBlogName - Technical - poo propelling primate</title>
        <link>http://israel.diaspora.gen.nz/~rodgerd/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>The Weirdness of Hardware, or Why the Whole Stack Matters</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1419-The-Wierdness-of-Hardware,-or-Why-the-Whole-Stack-Matters.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1419-The-Wierdness-of-Hardware,-or-Why-the-Whole-Stack-Matters.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1419</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1419</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Over the last wee while I&amp;#8217;ve been testing JBoss apps virtualised under RHEV, and this week I had a bizarre experience: my team-mates and I had been puzzling over the high standard deviations (and hence eccentric behaviour) of our web app, which wasn&amp;#8217;t even using all the available JVM heap or virtual processors assigned to it.  While I was off in meetings, the rest of the team doubled the number of vCPUs, and the SD improved significantly, but more importantly, the utilisation of each vCPU improved.  This was odd, and, on the face of it, inexplicable.  If you&amp;#8217;re only half-to-three-quarters utilising 4 vCPUs, why would you get better utilisation when you doubled that number?  And if you weren&amp;#8217;t CPU-bound before, why would increasing the number of virtual processors improve matters?&lt;/p&gt;

&lt;p&gt;We threw around some hypotheses and worked up some lines of investigation, which boiled down to &amp;#8220;more stats from the hypervisors, please&amp;#8221;, when I had a thought.&lt;/p&gt;

&lt;p&gt;These symptoms tickled my memory banks: a few weeks ago I&amp;#8217;d been reading about bizarre misbehaviour of large MySQL instances on modern x86 NUMA architectures, when the processes grew larger than the bank of memory with affinity for a given processor; there are some write-ups &lt;a href=&quot;http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/&quot;&gt;here&lt;/a&gt;, but it boils down to this: if you don&amp;#8217;t tell the kernel that you want it to ignore its normal best-guess behaviour about the penalties involved in the NUMA topologies, you&amp;#8217;ll see weird performance problems.  So, for shits and giggles, I suggested we shrink the JVM and guest under the size of a single bank of memory: we almost halved the heap and guest sizes, and, at the same time, took the number of vCPUs back to 4.  Result?  It ran faster, and with a substantially better standard deviation of results.&lt;/p&gt;

&lt;p&gt;(Of course, to confirm this theory the real test will be what happens if we use numactl hints to force the KVM process to behave as we want.)&lt;/p&gt;
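
&lt;p&gt;A minimal sketch of the sizing rule of thumb above, assuming a hypothetical host with two NUMA nodes of 48 GiB each (the real layout would come from something like numactl --hardware).  The function name, the headroom allowance, and all the figures are made up for illustration; they are not measurements from our hosts.&lt;/p&gt;

&lt;pre&gt;
```python
def fits_in_one_node(guest_gib, node_sizes_gib, headroom_gib=2):
    """Return True if the guest plus some headroom fits inside a single NUMA node.

    headroom_gib is an assumed allowance for the hypervisor process itself.
    """
    needed = guest_gib + headroom_gib
    return any(size >= needed for size in node_sizes_gib)


# Hypothetical host: two nodes with 48 GiB of local memory each.
nodes = [48, 48]
for guest in (64, 32):
    verdict = "fits in one node" if fits_in_one_node(guest, nodes) else "spans nodes"
    print(f"{guest} GiB guest: {verdict}")
```
&lt;/pre&gt;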

&lt;p&gt;Less, it would appear, is more.  And you need to understand what lies beneath your virtual layer.&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Sat, 28 Apr 2012 07:53:19 +1200</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1419-guid.html</guid>
    
</item>
<item>
    <title>Speaking for the first time at linux.conf.au</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1418-Speaking-for-the-first-time-at-linux.conf.au.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1418-Speaking-for-the-first-time-at-linux.conf.au.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1418</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1418</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;I wasn&amp;#8217;t a keynoter.  Or even a regular presenter.  I was just doing a talk at a miniconf.  It was still an unnerving enough experience that I went to see my doctor on the Thursday before I flew to Tullamarine to make sure the chest pains I was having weren&amp;#8217;t the onset of a heart attack.  It almost would have been a relief if they had been.&lt;/p&gt;

&lt;p&gt;As you &lt;a href=&quot;http://www.youtube.com/watch?v=hF3jHrCod3U&quot;&gt;can see&lt;/a&gt;, I did make it, and I didn&amp;#8217;t drop dead on stage.&lt;/p&gt;

&lt;p&gt;Normally I&amp;#8217;m pretty comfortable about speaking in front of people.  To the point where, for example, last year I needed to double the time
I&amp;#8217;d been told I would be allocated, and spoke extemporaneously from the bullet-points I&amp;#8217;d listed on a bit of paper, only looking at them
once.  Or, 4 years ago, spoke at a funeral after leaving my speech at home.  Give me a run-up and I can usually stand up and talk about most
anything I care about on short notice, and probably for longer than you envisaged when you asked.&lt;/p&gt;

&lt;p&gt;So why so nervous?  There&amp;#8217;s a simple reason: the audience.&lt;/p&gt;
 &lt;br /&gt;&lt;a href=&quot;http://israel.diaspora.gen.nz/~rodgerd/archives/1418-Speaking-for-the-first-time-at-linux.conf.au.html#extended&quot;&gt;Continue reading &quot;Speaking for the first time at linux.conf.au&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Wed, 25 Jan 2012 22:12:29 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1418-guid.html</guid>
    <category>lca2012</category>

</item>
<item>
    <title>Bloat: How and Why UNIX Grew Up (and Out)</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1417-Bloat-How-and-Why-UNIX-Grew-Up-and-Out.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1417-Bloat-How-and-Why-UNIX-Grew-Up-and-Out.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1417</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1417</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Rusty Russell &amp;amp; Matt Evans &lt;/p&gt;

&lt;h4&gt;Three Cool Projects&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Spark - command-line tool that generates sparklines&lt;/li&gt;
&lt;li&gt;Plover - An open-source stenography alternative.&lt;/li&gt;
&lt;li&gt;Homebrew Cray-1A http://chrisfenton/homebrew-cray-1a/&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;In The Beginning&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&amp;#8220;The year was 1976, the hair was long, the shoes were tall.&amp;#8221;&lt;/li&gt;
&lt;li&gt;It&amp;#8217;s a multi-user machine, it has &lt;strong&gt;two&lt;/strong&gt; teletype machines in front of it!&lt;/li&gt;
&lt;li&gt;Bring on the PDP-11 simulator!&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Comparisons&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;cat, grep, and ls are the punching bags.&lt;/li&gt;
&lt;li&gt;&amp;#8220;Bigger is better.  grep is 20 times bigger.&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;cd only came in in V7.&amp;#8221; &amp;#8220;I edited the shell so I could have cd.&amp;#8221;&lt;/li&gt;
&lt;li&gt;cat was written in assembler in 1976.&lt;/li&gt;
&lt;li&gt;In V6 arguments were in-line.&lt;/li&gt;
&lt;li&gt;The first thing you notice is that they are memory-conscious.  Although Rusty points out that they don&amp;#8217;t bother with system calls; they also just use assembler because it&amp;#8217;s natural.&lt;/li&gt;
&lt;li&gt;Rusty re-implemented cat with the same bugs/behaviour and it was only twice as big in C.  Modern cat is big in part because of more features, error messages, and so on.&lt;/li&gt;
&lt;li&gt;We pay a 30% memory penalty if we use -O2 instead of -Os.&lt;/li&gt;
&lt;li&gt;But -Os is slower by about 6% for these simple utilities.&lt;/li&gt;
&lt;li&gt;Automated runtime analysis tells us 99% of the instructions are used at some point, with only one instruction never being used.  1% bloat!&lt;/li&gt;
&lt;li&gt;Even going to V7 in 1979 ls has doubled in size.  cat only uses 57% of its instructions.&lt;/li&gt;
&lt;li&gt;...but if you build cat statically instead of against shared libraries it pulls in another 700KB of glibc dependencies!&lt;/li&gt;
&lt;li&gt;There&amp;#8217;s a dependency graph.  It looks like scribble.&lt;/li&gt;
&lt;li&gt;It includes TLS (in case you need to fetch from Reddit).&lt;/li&gt;
&lt;li&gt;When we instrument cat on x86 we find that we use... um... 2% of it.  Bugger.&lt;/li&gt;
&lt;li&gt;On a whole-system analysis there&amp;#8217;s 33 MB of wasted RAM.  Not much compared to all the memory.&lt;/li&gt;
&lt;li&gt;But there may be a TLB hit.&lt;/li&gt;
&lt;li&gt;Of course, 16-bit vs 64-bit is unfair.  So Rusty guesstimated the change in Text and Data segments.  There&amp;#8217;s some big growth, around 50%.&lt;/li&gt;
&lt;li&gt;By way of comparison 32-bit to 64-bit Ubuntu is only 9%.&lt;/li&gt;
&lt;li&gt;If you pull the old code to an Ubuntu system you actually cut the text segment (if you&amp;#8217;ve got a stripped-down glibc).  ELF, on the other hand, adds stuff, mostly to force on-page alignment.  Even so, cat is only marginally bigger than 64-bit PDP cat.  Not too bad.&lt;/li&gt;
&lt;li&gt;grep embiggened significantly.&lt;/li&gt;
&lt;li&gt;ls was already complex, with 10 flags. We also have to grow buffers for moving from 14 byte filenames to 255 bytes.  It doesn&amp;#8217;t use malloc, doing funky magic to grow the program when it starts running out.  You kind of need it to use malloc() nowadays.  So you grow 120%, because of a combination of the changes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Backporting&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Turns out GNU ls has 60-odd options.  A survey of Rusty&amp;#8217;s friends says 11 of them were never used.&lt;/li&gt;
&lt;li&gt;So some of this size is down to the extra options.&lt;/li&gt;
&lt;li&gt;For cat it&amp;#8217;s easy: remove all the options and error reporting.&lt;/li&gt;
&lt;li&gt;cat does some odd malloc() behaviour to have aligned, page-size buffers.&lt;/li&gt;
&lt;li&gt;Backported cat is still bigger than forward-ported cat.&lt;/li&gt;
&lt;li&gt;ls required Vast Surgery.&lt;/li&gt;
&lt;li&gt;It grabs the system time to the nanosecond so it can show entries more than 6 months old differently.&lt;/li&gt;
&lt;li&gt;It&amp;#8217;s much faster, although it&amp;#8217;s probably down to LOCALE complexity slowing up non-backported.&lt;/li&gt;
&lt;li&gt;There&amp;#8217;s a 60% penalty.  But that&amp;#8217;s for portability, 64-bit and so on.&lt;/li&gt;
&lt;li&gt;400-odd% bloat?  That&amp;#8217;s the extra features.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Most people aren&amp;#8217;t prepared to go to the same lengths to keep things small.&lt;/li&gt;
&lt;li&gt;asmutils - reimplementation of *ix utils, but it&amp;#8217;s not actually that efficient: it loses all the gains in BSS bloat that they don&amp;#8217;t bother measuring.  Bummer.&lt;/li&gt;
&lt;li&gt;Features are the reason for growth.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Fri, 20 Jan 2012 14:08:07 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1417-guid.html</guid>
    <category>lca2012</category>

</item>
<item>
    <title>A (Mostly) Gentle Introduction to Computer Security</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1416-A-Mostly-Gentle-Introduction-to-Computer-Security.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1416-A-Mostly-Gentle-Introduction-to-Computer-Security.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1416</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1416</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;ul&gt;
&lt;li&gt;Security needs to be a first-class design concern.&lt;/li&gt;
&lt;li&gt;You don&amp;#8217;t need to fix all the bugs, you just need to be better than the other guy.&lt;/li&gt;
&lt;li&gt;Code is growing in complexity (11 million+ lines in the kernel) and more people are using the tools (750,000+ Android devices are activated per day).&lt;/li&gt;
&lt;li&gt;Security is an arms race, where good guys and bad guys compete.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Why do Hackers Attack?&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Build botnets.  Botnets can do things you can&amp;#8217;t do otherwise; bitcoin generation, spam, etc.&lt;/li&gt;
&lt;li&gt;Gain control of useful private information (credit cards)&lt;/li&gt;
&lt;li&gt;Punish/embarrass people, e.g. Sony, Church of Scientology, etc&lt;/li&gt;
&lt;li&gt;To educate and advocate, e.g. Firesheep.&lt;/li&gt;
&lt;li&gt;Earn a reputation.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Lulz.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Win the Bear Race!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;The attackers are the bear.  Don&amp;#8217;t be something the bear wants to eat.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Attacks&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Buffer Overflows: the daddy of attacks.  The classic stack-smasher.&lt;/li&gt;
&lt;li&gt;The cure is to refuse to allow execution on the stack; e.g. the NX bit.&lt;/li&gt;
&lt;li&gt;The countermeasure to that is the &amp;#8220;heap spray&amp;#8221; attack; the buffer overwrite gets initial access, but instead of stack-smashing, you inject huge amounts of data into the heap, which is also valid code.  Then you jump to a random address in the heap space, which will give you a decent chance of hitting executable code in the heap, and away you go.&lt;/li&gt;
&lt;li&gt;People have tried having ROM-only execution, refusing to execute out of memory.&lt;/li&gt;
&lt;li&gt;The counter to this is &amp;#8220;return oriented programming&amp;#8221;.  You overrun, inspect the ROM, and then use the functions and function fragments in the ROM, chaining them together to build what you want. e.g. one team were able to implement a Turing-complete VM for themselves.&lt;/li&gt;
&lt;li&gt;Hardware attacks.  e.g. Adding 1400 gates as a rogue designer to make a hardware Linux backdoor; bug in Intel processor to make an OS-independent attack.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Defences&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;No-execute stacks.  Can be thwarted by heap-spray in a naive implementation; that is countered by heap address randomisation.  Implemented in recent Linux, Windows, and MacOS.  This means you need to jump to the right address for the heap overflow on the first time, greatly reducing the likelihood.&lt;/li&gt;
&lt;li&gt;&amp;#8220;Stack Canaries&amp;#8221; are embedded in the stack frame, as a random value embedded in each function.  If the canary is wrong, the execution simply halts.&lt;/li&gt;
&lt;li&gt;Only fully available in OpenBSD.&lt;/li&gt;
&lt;li&gt;Encrypted pointers with StackGuard.  Every pointer is encrypted with a different, simple XOR on every execution.&lt;/li&gt;
&lt;li&gt;All of these techniques have a cost in memory or CPU.&lt;/li&gt;
&lt;li&gt;Sandboxing.  Classic technique from virtual machines (in the Smalltalk/Java sense).  For example, Chrome, Firefox, and IE 9 all implement this.  All tabs talk to a policy manager, rather than to the underlying operating system.&lt;/li&gt;
&lt;li&gt;The &amp;#8220;Browser in the Middle&amp;#8221; attack - an attack where visiting a car forum would trigger JS that would check for an open tab to an HSBC internet banking session, and would do a $1 funds transfer.&lt;/li&gt;
&lt;li&gt;What can you do about buffer overflows? Safe languages.  They eliminate spatial and temporal buffer overflows.&lt;/li&gt;
&lt;li&gt;But in May 2011 the two top attack vectors were Flash and Oracle Java, both of which are managed code.  Ooops.&lt;/li&gt;
&lt;li&gt;Languages can be safe.  But implementations can be unforgivably broken.&lt;/li&gt;
&lt;/ul&gt;
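
&lt;p&gt;The stack canary idea from the notes above can be sketched as a toy simulation.  Real canaries are inserted by the compiler and checked when a function returns; the Frame class, its layout, and the 4-byte canary size here are assumptions made purely for illustration.&lt;/p&gt;

&lt;pre&gt;
```python
import secrets


class Frame:
    """Toy stack frame: a fixed-size buffer followed by a canary and a return address."""

    def __init__(self, size, return_address):
        self.size = size
        self.canary = secrets.token_bytes(4)  # fresh random value per frame
        self.memory = bytearray(size) + bytearray(self.canary) + return_address

    def unchecked_copy(self, data):
        """Simulate an unbounded copy: writes past the buffer if data is too long."""
        self.memory[0:len(data)] = data

    def canary_intact(self):
        """The check a protected function would run before returning."""
        return bytes(self.memory[self.size:self.size + 4]) == self.canary


frame = Frame(8, b"\x00\x40\x10\x00")
frame.unchecked_copy(b"A" * 8)    # exactly fills the buffer
print("after in-bounds write:", frame.canary_intact())
frame.unchecked_copy(b"A" * 16)   # runs over the canary and the return address
print("after overflow:", frame.canary_intact())
```
&lt;/pre&gt;

&lt;p&gt;If the check fails, a real implementation halts rather than returning to the overwritten address.&lt;/p&gt;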

&lt;h4&gt;Developers Need to Adopt Better Development Strategies&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&amp;#8220;For every piece of software there is a trail of abused users.&amp;#8221;&lt;/li&gt;
&lt;li&gt;So employ tools to find vulnerabilities.&lt;/li&gt;
&lt;li&gt;We fix earlier before we release, and we release cleaner code.&lt;/li&gt;
&lt;li&gt;Constrain inputs. If you can&amp;#8217;t put garbage into the program, you&amp;#8217;ve greatly reduced the possibility of an attack.&lt;/li&gt;
&lt;li&gt;The tools that can check and produce recommendations on vulnerabilities can be used by the bad guys.&lt;/li&gt;
&lt;li&gt;Dynamic taint checking can be used to uncover paths of external inputs that need to be sanity-checked.&lt;/li&gt;
&lt;/ul&gt;
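
&lt;p&gt;A minimal sketch of the &amp;#8220;constrain inputs&amp;#8221; advice: accept only what matches an allowlist and reject everything else.  The username policy and the function name are hypothetical, not something from the talk.&lt;/p&gt;

&lt;pre&gt;
```python
import re

# Hypothetical policy: lowercase letter first, then 2 to 15 letters, digits or underscores.
USERNAME = re.compile(r"[a-z][a-z0-9_]{2,15}")


def parse_username(raw):
    """Accept only input matching the allowlist; reject everything else."""
    if not isinstance(raw, str) or USERNAME.fullmatch(raw) is None:
        raise ValueError("rejected input")
    return raw


for candidate in ("rodgerd", "x; rm -rf /"):
    try:
        print("accepted:", parse_username(candidate))
    except ValueError:
        print("rejected:", repr(candidate))
```
&lt;/pre&gt;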

&lt;h4&gt;Dataflow Analysis 101&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;We build a graph of how the data flows through the program, propagating the taint information.&lt;/li&gt;
&lt;li&gt;This will only work on buffer overflows.&lt;/li&gt;
&lt;/ul&gt;
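
&lt;p&gt;Taint propagation over a dataflow graph can be sketched as a plain graph walk.  This assumes the graph is already built; the node names and the flows table are invented for illustration, and real tools work on compiler intermediate representations rather than dictionaries.&lt;/p&gt;

&lt;pre&gt;
```python
from collections import deque


def propagate_taint(flows, sources):
    """Breadth-first walk marking every value reachable from a tainted source."""
    tainted = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for successor in flows.get(node, ()):
            if successor not in tainted:
                tainted.add(successor)
                queue.append(successor)
    return tainted


# Hypothetical program: edges point from a value to the values computed from it.
flows = {
    "argv[1]": ["length"],
    "length": ["memcpy.size"],
    "config_default": ["timeout"],
}
print(sorted(propagate_taint(flows, ["argv[1]"])))
```
&lt;/pre&gt;

&lt;p&gt;Anything that ends up in the tainted set and feeds a sensitive operation (here the size passed to a copy) is a candidate for a sanity check.&lt;/p&gt;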

&lt;h4&gt;Side-Channel Attacks and Protections&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Depressing because your system qua system is secure, well-thought out, and well-implemented.  But someone creates e.g. an environmental condition that violates your assumptions.&lt;/li&gt;
&lt;li&gt;For example, a power fluctuation can indicate whether you are mostly doing 1s or 0s.&lt;/li&gt;
&lt;li&gt;The classic RSA attack was based on how long it required to encrypt information.&lt;/li&gt;
&lt;li&gt;Keyboard noise detection.  Tempest devices can read your screen.&lt;/li&gt;
&lt;li&gt;Punishing a system.&lt;/li&gt;
&lt;li&gt;Cash.  Blackmail.&lt;/li&gt;
&lt;li&gt;Other human engineering, e.g. the Red-Headed League.&lt;/li&gt;
&lt;li&gt;An old attack on DES relied on the fact that DES would flush the cache when it was processing data.  Because the placement of the code and data in the cache is consistent, forcing invalidation of the cache can reveal enough timing information to let you work out the 1s and 0s.&lt;/li&gt;
&lt;li&gt;Differential Power Analysis.  You know XOR takes less power than ADD, while MUL requires more.  So based on the operations you can derive the key; for example, AES is almost all XOR, so watching the power fluctuations will tell you if it&amp;#8217;s XOR 0,0, XOR 1,1, or XOR 0,1.&lt;/li&gt;
&lt;li&gt;Timing-based attacks.  Daniel Bernstein demonstrated a timing attack against AES based on cache timings.&lt;/li&gt;
&lt;li&gt;Fault-based attacks: this is Valeria&amp;#8217;s attack from the morning.&lt;/li&gt;
&lt;/ul&gt;
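
&lt;p&gt;To make the timing side channel concrete, here is a small sketch that counts steps instead of measuring time.  The early-exit comparison does more work the more of the guess is right, which is exactly what an attacker measures; hmac.compare_digest from the Python standard library is the usual constant-time alternative.  The secret and the guesses are made-up values.&lt;/p&gt;

&lt;pre&gt;
```python
import hmac


def leaky_equal(a, b):
    """Early-exit comparison: the step count depends on how much of the guess is right."""
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return len(a) == len(b), steps


secret = b"hunter2!"
for guess in (b"xxxxxxxx", b"hunxxxxx", b"hunter2x"):
    matched, steps = leaky_equal(secret, guess)
    print(guess, "steps:", steps, "constant-time result:", hmac.compare_digest(secret, guess))
```
&lt;/pre&gt;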

&lt;h4&gt;Tools for More Secure Software&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Valgrind is a platform that runs as a VM on Linux, with a suite of plugins that find memory leaks, dangling pointers, races, and so on.&lt;/li&gt;
&lt;li&gt;Programs run slooooooooooooooower while running under Valgrind.&lt;/li&gt;
&lt;li&gt;Apparently it&amp;#8217;s pronounced Val-grin-d not Val-grind.&lt;/li&gt;
&lt;li&gt;OpenBSD ProPolice.  Also becoming available in GCC.&lt;/li&gt;
&lt;li&gt;Fuzz testing will find all sorts of crap, albeit shallow crap.&lt;/li&gt;
&lt;li&gt;Google&amp;#8217;s browser fuzz tester found hundreds of defects when it was first released.&lt;/li&gt;
&lt;li&gt;Klee is a more thorough fuzz tester, checking the code coverage as it fuzzes, rather than operating purely randomly.  It tries to execute every path that exists, allowing it to come up with deep bugs.&lt;/li&gt;
&lt;li&gt;Metasploit will package up attacks and go after your machine(s).  &lt;/li&gt;
&lt;li&gt;NMAP.&lt;/li&gt;
&lt;/ul&gt;
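
&lt;p&gt;A purely random fuzzer of the &amp;#8220;shallow&amp;#8221; kind fits in a few lines.  The parser below is a hypothetical target with a deliberately planted bug, and the run count and input lengths are arbitrary; coverage-guided tools like the ones mentioned above do far more than this.&lt;/p&gt;

&lt;pre&gt;
```python
import random


def parse_record(data):
    """Hypothetical parser with a shallow bug: it trusts the length byte."""
    length = data[0]
    payload = data[1:1 + length]
    return payload[length - 1]  # IndexError when the payload is shorter than claimed


def fuzz(target, runs=1000, seed=0):
    """Throw random byte strings at target and collect the inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(runs):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            target(data)
        except Exception as exc:
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record)
print(len(crashes), "crashing inputs, e.g.", crashes[0])
```
&lt;/pre&gt;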

&lt;h4&gt;Q&amp;amp;A&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The embedded space is incredibly immature.  Headed a panel with teams who have remote-owned cars, UAVs, and Pacemakers.  He described it as the most terrifying panel he&amp;#8217;s run.&lt;/li&gt;
&lt;li&gt;There are even PHP static analysers now.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Thu, 19 Jan 2012 17:09:44 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1416-guid.html</guid>
    
</item>
<item>
    <title>Torturing OpenSSL</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1415-Torturing-OpenSSL.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1415-Torturing-OpenSSL.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1415</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1415</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Valeria Bertacco&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Valeria has a talk, and a demo, but of course the hardware isn&amp;#8217;t co-operating.&lt;/li&gt;
&lt;li&gt;Cryptography is pervasive.  It&amp;#8217;s also big business.  The direct value of companies like RSA and Verisign is tens of billions.  The value of ecommerce companies is hundreds of billions.&lt;/li&gt;
&lt;li&gt;Asymmetric cryptography, RSA keys, rely on two large primes, with which you perform clever maths.&lt;/li&gt;
&lt;li&gt;Cryptanalysis: poking the cryptography with a stick.&lt;/li&gt;
&lt;li&gt;In 2009 we proved you could brute-force 768-bit keys, but it required computation-years to do reliably.&lt;/li&gt;
&lt;li&gt;Side-channel attacks: you measure the time required to encrypt, and guess the key from that.  We now pad encryption to avoid this.&lt;/li&gt;
&lt;li&gt;Fault-based: a faulty CPU may leak information in the form of errors.&lt;/li&gt;
&lt;li&gt;Attacks via Transient Faults: when transistors give the wrong values intermittently.&lt;/li&gt;
&lt;li&gt;These are normal events that normally last &amp;lt;1 clock cycle, but in bad cases they will propagate up the stack to the software.&lt;/li&gt;
&lt;li&gt;The probability is very low.  But they can be triggered by solar particles (alpha particles, which is dependent on altitude), and are non-predictable.&lt;/li&gt;
&lt;li&gt;As transistors have shrunk, they have become more susceptible to this sort of fault, because they&amp;#8217;ve become more fragile.&lt;/li&gt;
&lt;li&gt;As we get smaller we may even get to the point where a single alpha particle can flip many transistors.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Forcing Faults Reliably&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;A transient fault that occurs when performing the handshake may leak information through the corrupt response.  If you can do enough corrupted handshakes, you may get a lot of information.&lt;/li&gt;
&lt;li&gt;Our testbed is an FPGA board running a SPARC v8@40MHz, running Debian.  We munt it with a voltage controller to induce faults.&lt;/li&gt;
&lt;li&gt;When you drop voltage, the multiplier, which is the most sensitive component, will have problems.  If we go from 1.5V to 1.0V it always fails.  But if you drop it to, say, 1.3V, it will fail intermittently.&lt;/li&gt;
&lt;li&gt;OpenSSL uses a fast algorithm to encrypt; it then verifies the encrypted data, and falls back to a slower, more reliable algorithm if that doesn&amp;#8217;t work.&lt;/li&gt;
&lt;li&gt;The attacker collects the faulty signatures over time.  It relies on the fact the slow, reliable algorithm breaks the message into windows.  The attacker can collect leaked information window-by-window.&lt;/li&gt;
&lt;li&gt;This ends as a window-by-window brute force.&lt;/li&gt;
&lt;li&gt;But it means you are only brute-forcing, e.g., 4 bits instead of 1024 bits.  100 seconds per check, 2^6 checks in the worst case.&lt;/li&gt;
&lt;li&gt;This makes it quite a lot easier to break the server&amp;#8217;s private key.&lt;/li&gt;
&lt;li&gt;In the example, 8,800 signatures were collected in a few hours, and then analyzed.  The 1024-bit private key was cracked in 100 HOURS.&lt;/li&gt;
&lt;li&gt;Apparently 60 degrees C is about optimal for creating key-cracking errors with the hardware she&amp;#8217;s using.  Temperature is harder to control than voltage, though.&lt;/li&gt;
&lt;li&gt;OpenSSL 0.9.8i was the victim in this case; before giving the talk, Valeria supplied a patch to stop using the fall-back algorithm, which helps avoid that specific attack.&lt;/li&gt;
&lt;/ul&gt;
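
&lt;p&gt;To show what &amp;#8220;breaks the message into windows&amp;#8221; means, here is a sketch of fixed-window modular exponentiation that consumes the private exponent 4 bits at a time.  This is not the OpenSSL code and not the attack itself; the 4-bit window is taken from the notes above, and the key is a textbook-sized toy.  The point is only that each window takes one of 16 values, so a leak lets you guess a window at a time rather than the whole key.&lt;/p&gt;

&lt;pre&gt;
```python
def windowed_pow(base, exponent, modulus):
    """Fixed-window modular exponentiation, consuming the exponent 4 bits at a time."""
    table = [pow(base, digit, modulus) for digit in range(16)]  # one entry per window value
    result = 1
    for hex_digit in format(exponent, "x"):      # each hex digit is one 4-bit window
        result = pow(result, 16, modulus)        # shift the accumulator by one window
        result = (result * table[int(hex_digit, 16)]) % modulus
    return result


# Toy numbers only; a real key has a 1024-bit (or larger) exponent.
n, d, message = 3233, 2753, 65
print("matches built-in pow:", windowed_pow(message, d, n) == pow(message, d, n))
print(1024 // 4, "windows, each with", 2 ** 4, "possible values")
```
&lt;/pre&gt;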

&lt;h4&gt;General Advice&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Keep crypto libraries up to date.&lt;/li&gt;
&lt;li&gt;Overclocking &lt;em&gt;IS A SECURITY RISK&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Overheating &lt;em&gt;IS A SECURITY RISK&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Unreliable power &lt;em&gt;IS A SECURITY RISK&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Brilliant session.&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Thu, 19 Jan 2012 11:50:55 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1415-guid.html</guid>
    <category>lca2012 openssl rsa cryptography crypto</category>

</item>
<item>
    <title>The Samba tour of scripting languages</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1414-The-Samba-tour-of-scripting-languages.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1414-The-Samba-tour-of-scripting-languages.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1414</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1414</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Andrew Bartlett and Amitay Isaacs&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Samba has hardcore portability requirements.&lt;/li&gt;
&lt;li&gt;m4, sh, and other bare-bones tools.  autoconf gone mad: 4,000 lines of m4 code.&lt;/li&gt;
&lt;li&gt;Scripting language of the month club: Python, then TCL, then Lua were all put in and pulled out.  None of them were loved or portable enough.&lt;/li&gt;
&lt;li&gt;Then Perl went in.&lt;/li&gt;
&lt;li&gt;Became used for all manner of build and testing tasks.&lt;/li&gt;
&lt;li&gt;awk was then used to try and develop an IDL to autogenerate code to spec.  It didn&amp;#8217;t really work as you&amp;#8217;d like, needing tweaking by hand.&lt;/li&gt;
&lt;li&gt;This led to PIDL, the Perl IDL compiler.  It worked far better, with IDL code being used as-is, no tweaking required.  Use is now pervasive, generating both server and client code.&lt;/li&gt;
&lt;li&gt;This has been hugely productive and important in allowing significant, rapid change.&lt;/li&gt;
&lt;li&gt;Then it caught JavaScript before it was cool.  Tridge gave many convincing-sounding reasons as to why it&amp;#8217;s a great idea.&lt;/li&gt;
&lt;li&gt;It was very easy to embed, with minimal dependencies to make it work.&lt;/li&gt;
&lt;li&gt;JS could even make RPC calls.&lt;/li&gt;
&lt;li&gt;But something went wrong.  The cool kids were using Python.  So they went back to Python.  There may have been chloroform and lies to subdue Tridge.&lt;/li&gt;
&lt;li&gt;These lies may have revolved around embedded python and debugging.&lt;/li&gt;
&lt;li&gt;Tridge is now a python fanboi.  There&amp;#8217;s a general love of Python permeating the project.&lt;/li&gt;
&lt;li&gt;IDL generated bindings are everywhere, with bindings into every component: ldb, tdb, and so on.  If it&amp;#8217;s a useful part of Samba, you can probably access it directly from within Python.&lt;/li&gt;
&lt;li&gt;Things that were being done in C are being migrated to Python; e.g. samba-tool has migrated from a pure C tool to a Python tool with C extensions.&lt;/li&gt;
&lt;li&gt;Many small tasks are now fork()ed from the core Samba processes and run as Python tools - which makes it trivial to debug bad cases by running the tool from the command line with the same parameters.&lt;/li&gt;
&lt;/ul&gt;
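
&lt;p&gt;The &amp;#8220;run small tasks as separate Python tools&amp;#8221; pattern can be sketched like this.  It is not Samba code: the helper is a stand-in one-liner, and the arguments are invented.  The useful property is that the exact command line is available, so a bad case can be re-run by hand.&lt;/p&gt;

&lt;pre&gt;
```python
import subprocess
import sys

# Stand-in for a real helper tool: upper-cases its arguments and prints them.
HELPER = "import sys; print(' '.join(sys.argv[1:]).upper())"


def run_helper(args):
    """Run a small task as a separate Python process and return what it printed."""
    command = [sys.executable, "-c", HELPER] + list(args)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return command, completed.stdout.strip()


command, output = run_helper(["join", "example.test"])
print(output)
print("to debug, re-run by hand:", command)
```
&lt;/pre&gt;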

&lt;h4&gt;Some Examples&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The Samba3 migration tools were clunky; in 2 weeks they were (re)-written in Python with C bindings.  The business logic was re-written in Python.&lt;/li&gt;
&lt;li&gt;Python is now the core of the build system, via WAF.&lt;/li&gt;
&lt;li&gt;Does ABI checking: checks that all the contracts are consistent, and alerts developers when they aren&amp;#8217;t.  Maps all the dependencies.&lt;/li&gt;
&lt;li&gt;Testing Samba: both unit testing and environment testing.  The latter is the more challenging, because it requires a running server.  And there are many, many different options for Samba 4 when running as an AD server.  So it ends up creating 7, 8, or more environments for the test suites to run.&lt;/li&gt;
&lt;li&gt;These tests are now run as part of the commit process - continuous integration.  9,000 tests in 1,300 test suites.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Wed, 18 Jan 2012 19:11:26 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1414-guid.html</guid>
    <category>lca2012 samba python</category>

</item>
<item>
    <title>I Can't Believe This is Butter! A tour of btrfs.</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1412-I-Cant-Believe-This-is-Butter!-A-tour-of-btrfs..html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1412-I-Cant-Believe-This-is-Butter!-A-tour-of-btrfs..html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1412</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1412</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Avi Miller&lt;/p&gt;

&lt;p&gt;Some key points (having lost many notes due to Firefox being fucking useless).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There&amp;#8217;s a bunch of stuff still working badly or not at all, and optimisations still to be done.&lt;/li&gt;
&lt;li&gt;e.g. metadata is fixed at 4K blocks, and that hurts performance.  This is being fixed.&lt;/li&gt;
&lt;li&gt;RAID is block redundancy across disks.  So a RAID-1 mirror with 5 different-sized disks will simply make sure that blocks are duped &lt;strong&gt;somewhere&lt;/strong&gt; in the array.&lt;/li&gt;
&lt;li&gt;Scrubbing is great, and will auto-fix on read.  There are some important caveats, though; the biggest is that btr prefers to always read from the same device if it can.  This means that if you don&amp;#8217;t force scrubs occasionally you can have a drive crap itself, pull the drive, and then discover your alternate block was corrupt.  And be unable to find a good copy.  Oops.&lt;/li&gt;
&lt;li&gt;Chris M recommends scrubbing periodically with the sum tool (say once a week for a busy filesystem).&lt;/li&gt;
&lt;li&gt;You can mount any device in an array and everything mounts.&lt;/li&gt;
&lt;li&gt;No idea what happens if you try mounting multiple devices in the array.&lt;/li&gt;
&lt;li&gt;Disk replacement is working smoothly, and Just Works.&lt;/li&gt;
&lt;li&gt;btr send/receive is working.  It sends a &amp;#8220;neutral&amp;#8221; stream, so it ought to scrub and dump errors.&lt;/li&gt;
&lt;li&gt;btr is friendlier to small machines than ZFS, but not to small disks - it tends to allocate heaps of metadata.&lt;/li&gt;
&lt;li&gt;RAID 0, 1, and 10 are there, but RAID 5, 6 and triple mirroring are still sitting in the merge queue, thanks to Intel.&lt;/li&gt;
&lt;li&gt;You can mix RAID levels in the same disks, because, hey, it&amp;#8217;s just block duplication.&lt;/li&gt;
&lt;li&gt;Unfortunately df and the like just Don&amp;#8217;t Work.  e.g. until you force sync, the filesystem will report the wrong utilisation, and it will always tell you the FS size is the sum of all the disks in an array.&lt;/li&gt;
&lt;/ul&gt;
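
&lt;p&gt;To make the &amp;#8220;blocks are duped somewhere&amp;#8221; point concrete, here is a toy model in Python.  It is a sketch, not btrfs code: it assumes disk sizes are whole numbers of chunks and that each mirrored chunk goes onto the two devices with the most free space.  The function name &lt;code&gt;mirrored_chunks&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```python
def mirrored_chunks(device_sizes):
    """Count how many mirrored chunks fit on devices of the given sizes.

    Toy model only: sizes are in whole chunks, and each chunk is written to
    the two devices that currently have the most free space.
    """
    free = list(device_sizes)
    chunks = 0
    while True:
        # Rank device indexes by remaining free space, largest first.
        ranked = sorted(range(len(free)), key=lambda i: free[i], reverse=True)
        top_two = ranked[:2]
        # Stop when there are not two devices left with room for a copy.
        if len(top_two) != 2 or free[top_two[1]] == 0:
            break
        for index in top_two:
            free[index] -= 1
        chunks += 1
    return chunks


if __name__ == "__main__":
    # Five different-sized disks, as in the example above.
    print(mirrored_chunks([5, 4, 3, 2, 1]))
```

&lt;p&gt;Under this model the usable space depends on how the sizes are spread, which is one reason a simple sum of the disks is a poor estimate of capacity.&lt;/p&gt;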

&lt;h4&gt;When Bad Things Happen to Good Data&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;There&amp;#8217;s a read-only btrfs tool, so you can try and save your data when btr goes bad.  It works well.&lt;/li&gt;
&lt;li&gt;Chris Mason will be talking about btrfs on Saturday.  You may choose to assume that btrfsck will be announced then.  If you want.&lt;/li&gt;
&lt;li&gt;Oracle have publicly stated that they will take it into production with btrfs.&lt;/li&gt;
&lt;li&gt;Even if the filesystem isn&amp;#8217;t changing, the metadata rolls its root backup (every 30 seconds).  You can switch this off.&lt;/li&gt;
&lt;li&gt;Avi has some amusing tools to corrupt files and filesystems.&lt;/li&gt;
&lt;li&gt;And &amp;#8220;mount -o recovery&amp;#8221; just fixed the checksum corruption he inflicted on his test filesystem.  Worst case scenario you&amp;#8217;ve lost 30 seconds of data per write.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Beeeellions of files&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;ext4, xfs, and btrfs all have problems with lots of files.&lt;/li&gt;
&lt;li&gt;ext4 is journal-bound&lt;/li&gt;
&lt;li&gt;xfs has fixed this in head.  It spams files all over the place and gets generally good performance, but generates many seeks.&lt;/li&gt;
&lt;li&gt;btr load-levels across the disk, so it isn&amp;#8217;t seek-thrashing the disk.&lt;/li&gt;
&lt;li&gt;btr and xfs are both CPU-limited on SSDs.&lt;/li&gt;
&lt;li&gt;&amp;#8220;seekwatcher&amp;#8221; is one of Chris M&amp;#8217;s tools that shows what&amp;#8217;s going on.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;yum upgrade and snapshots&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Requires btrfs root, and allows you to snapshot on upgrade and rollback in one hit.&lt;/li&gt;
&lt;li&gt;It&amp;#8217;s easier to use Fedora than OEL to convert the FS from ext4.  Since the ext4 filesystem is stored as a conversion snapshot, you can roll back to ext4 later.&lt;/li&gt;
&lt;li&gt;Avi no longer uses the 3D accelerators for VirtualBox so he never has to use GNOME 3.&lt;/li&gt;
&lt;li&gt;When you convert ext4 -&gt; btrfs remember to edit /etc/fstab and change the FS type!&lt;/li&gt;
&lt;li&gt;You need the yum snapshot plugin to be installed.&lt;/li&gt;
&lt;li&gt;Then yum install just creates a snapshot.&lt;/li&gt;
&lt;li&gt;New Fujitsu logging has improved the speed of apt-get and yum, both of which generate a lot of fsync() calls.&lt;/li&gt;
&lt;/ul&gt;
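
&lt;p&gt;The /etc/fstab reminder above is easy to forget, so here is a rough Python sketch of the edit.  It only rewrites text you hand it and never touches the real file; &lt;code&gt;set_fstab_type&lt;/code&gt; is a made-up helper, and it assumes entries are whitespace-separated with the type in the third field.  Any ext4-only mount options would still need checking by hand.&lt;/p&gt;

```python
def set_fstab_type(fstab_text, mount_point, new_type):
    """Return fstab_text with the filesystem type changed for mount_point."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blanks and comments; an entry has device, mount point, type
        # and options, optionally followed by the dump and pass fields.
        is_entry = len(fields) in (4, 5, 6) and not line.lstrip().startswith("#")
        if is_entry and fields[1] == mount_point:
            fields[2] = new_type
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# root filesystem\nUUID=abc / ext4 defaults 1 1\n"
    print(set_fstab_type(sample, "/", "btrfs"), end="")
```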

&lt;h4&gt;Questions&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Some people do md-raid and btr-RAID.&lt;/li&gt;
&lt;li&gt;Dedupe?  Not on the roadmap right now.  Disks are so big; the cost of CPU and RAM to dedupe is huge.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Wed, 18 Jan 2012 12:50:08 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1412-guid.html</guid>
    <category>lca2012 btrfs</category>

</item>
<item>
    <title>Where is Your Data Cached and Where Should It Be Cached?</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1411-Where-is-Your-Data-Cached-and-Where-Should-It-Be-Cached.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1411-Where-is-Your-Data-Cached-and-Where-Should-It-Be-Cached.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1411</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1411</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Sarah Novotny&lt;/p&gt;

&lt;p&gt;Origin of the talk was when a customer rang with a complaint that a site was wrong, but Sarah couldn&amp;#8217;t find a problem, and this provoked her into thinking about where data can and should be cached.&lt;/p&gt;

&lt;h4&gt;Why Cache?&lt;/h4&gt;

&lt;p&gt;We want to move data as close to the end user as possible, while retaining ACID-style guarantees.  The abandonment rate after 7 seconds is huge.  We need reliable speed.&lt;/p&gt;

&lt;h4&gt;Count Them.&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;CPU Caches - L1, L2, L3.&lt;/li&gt;
&lt;li&gt;Filesystem/VM caches.&lt;/li&gt;
&lt;li&gt;Controller caching.&lt;/li&gt;
&lt;li&gt;Disk caches.  Great, but disks now lie about whether the write is there or on the platter.&lt;/li&gt;
&lt;li&gt;SSD hybrids.&lt;/li&gt;
&lt;li&gt;RAM on disks.&lt;/li&gt;
&lt;li&gt;DB caching.&lt;/li&gt;
&lt;li&gt;memcached or other application level caching.&lt;/li&gt;
&lt;li&gt;Protocol level caching, e.g. HTTP, DNS.&lt;/li&gt;
&lt;li&gt;Transparent proxies.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Browser caching.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A lot of this is transparent to you.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Sometimes stuff ignores semantics around caching, too.&lt;/li&gt;
&lt;li&gt;Users often don&amp;#8217;t have the knowledge to bypass bad caching by e.g. doing a browser force-refresh.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;A Short Diversion&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;DBA/SA background means Sarah cares a lot about ACID semantics around data.&lt;/li&gt;
&lt;li&gt;Will therefore focus on the DB.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Which Caches are Redundant&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Some caching is redundant.&lt;/li&gt;
&lt;li&gt;They tackle the same functions, but are either redundant or even harmful.  Battery-backed controller caches are good and cache disks.  Disk caches cache, but are unlikely to be &amp;#8220;safe&amp;#8221;.&lt;/li&gt;
&lt;li&gt;You need to ensure durability in those cases.&lt;/li&gt;
&lt;li&gt;For MySQL you also have InnoDB query caches and buffer caches.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why do we keep doing this?  Because we want things to go faster!  But there&amp;#8217;s a conflict between the DB cache and filesystem cache, too.  You&amp;#8217;re double-buffering.  They aren&amp;#8217;t particularly dangerous on modern filesystems, but it&amp;#8217;s an inefficient use of memory and CPU to manage both sets of caches.&lt;/p&gt;

&lt;h4&gt;Which Caches are Risky&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Expiries not set well on memcached will result in data being lost; Sarah is of the opinion you should only use this for temp data.&lt;/li&gt;
&lt;li&gt;Hypervisors often cache disk in memory, without advising the guest what happens here.&lt;/li&gt;
&lt;li&gt;Disks lie!  They are reporting writes succeeding when they aren&amp;#8217;t on the platter for reals.&lt;/li&gt;
&lt;li&gt;RAID controllers lie, but at least they lie with battery backup (if you spent the money), so you&amp;#8217;re probably OK.&lt;/li&gt;
&lt;li&gt;The last two are really toxic, because you can end up losing data on power failure.  Sarah recommends controlled power failures to test this.&lt;/li&gt;
&lt;li&gt;TURN YOUR DISK CACHE OFF IF YOU VALUE YOUR DATA.&lt;/li&gt;
&lt;li&gt;MySQL generally does better if it bypasses the FS cache for direct-attached storage.  However, for SAN-attached disk you should leave FS caching on.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Benchmarking&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;You need to be careful when benchmarking, but in general it&amp;#8217;s good and you can never do enough.&lt;/li&gt;
&lt;li&gt;It&amp;#8217;s not magic.  You just need to do it right.&lt;/li&gt;
&lt;li&gt;Don&amp;#8217;t do bad benchmarks that just e.g. exercise your cache.&lt;/li&gt;
&lt;li&gt;You need to touch the slowest part of the system.  Force pessimistic scenarios, e.g. when your controller cache goes offline.&lt;/li&gt;
&lt;li&gt;You also want to test the normal production case, with real data sets and a workload that looks like production behaviour.&lt;/li&gt;
&lt;li&gt;You can test in prod, but you should use proper staging hardware that&amp;#8217;s similar.&lt;/li&gt;
&lt;li&gt;Benchmarking with real data also exercises your backups if you populate from them.&lt;/li&gt;
&lt;li&gt;You can also use a replica/DR server on a short-term basis.  Breaking replication and then restoring is good practice for this.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;Monitoring&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Only monitor the stuff you want.&lt;/li&gt;
&lt;li&gt;Test multiple layers in your infrastructure, and make sure you test both what the end customer sees and each touch point along the way.&lt;/li&gt;
&lt;li&gt;Monitoring is an evolving case; treat it like you&amp;#8217;d treat unit testing in software.&lt;/li&gt;
&lt;li&gt;There&amp;#8217;s no boilerplate.  Every system is unique.  &lt;/li&gt;
&lt;li&gt;So many tools.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Tue, 17 Jan 2012 18:39:26 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1411-guid.html</guid>
    <category>lca2012</category>

</item>
<item>
    <title>Extracting Metrics from Logs for Realtime Trending and Alerting</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1409-Extracting-Metrics-from-Logs-for-Realtime-Trending-and-Alerting.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1409-Extracting-Metrics-from-Logs-for-Realtime-Trending-and-Alerting.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1409</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1409</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Jamie Wilkinson&lt;/p&gt;

&lt;h3&gt;The Problem&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Cluster of slapd/bind/rsync/etc machines.&lt;/li&gt;
&lt;li&gt;How do we monitor these systems?&lt;/li&gt;
&lt;li&gt;Google has their own proprietary monitoring system.  It&amp;#8217;s basically pervasive to everything they write and any internal libraries etc you use.&lt;/li&gt;
&lt;li&gt;They wanted to maximise reuse.&lt;/li&gt;
&lt;li&gt;Whitebox: the app produces enough data to let you inspect the internal state of the application.&lt;/li&gt;
&lt;li&gt;Most open source apps are good about doing this.&lt;/li&gt;
&lt;li&gt;LDAP gives you lots of data.  Too much data.&lt;/li&gt;
&lt;li&gt;BUT they&amp;#8217;re all special snowflakes.  Bugger.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;emtail&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;BUT they have stuff in the logs.  So we use emtail: &amp;#8220;exporting, modular tail&amp;#8221;.  Reads logs, runs modules/plugins for extracting useful data, and produces a standardised set of metrics.&lt;/li&gt;
&lt;li&gt;A &amp;#8220;metric&amp;#8221; is stuff-over-time.&lt;/li&gt;
&lt;li&gt;The Google version exports to the Google DB, the open source version exports JSON dumps.&lt;/li&gt;
&lt;li&gt;The data is exported via an HTTP server, using JSON or CSV, discarding the historical data; storage is the problem of your collecting tool: cacti, collectd, etc.&lt;/li&gt;
&lt;li&gt;Current version is written in Python, which is the closed version, and the open source rewrite is written in Go.  Not up on code.google yet, but &amp;#8220;by the end of the day&amp;#8221;.  It&amp;#8217;s not complete yet.&lt;/li&gt;
&lt;li&gt;An awk-like language to express the matching/aggregation rules.&lt;/li&gt;
&lt;/ul&gt;
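
&lt;p&gt;The general idea (match log lines against rules, keep counters, export the current values) can be sketched in a few lines of Python.  This is not emtail and does not use its awk-like rule language; the rule names, patterns and sample log lines are all made up for illustration.&lt;/p&gt;

```python
import json
import re
from collections import Counter

# Hypothetical rules: metric name mapped to a pattern that marks one event.
RULES = {
    "ldap_binds": re.compile(r"\bBIND\b"),
    "ldap_searches": re.compile(r"\bSRCH\b"),
}


def extract_metrics(lines, rules=RULES):
    """Count how many log lines match each rule."""
    counts = Counter({name: 0 for name in rules})
    for line in lines:
        for name, pattern in rules.items():
            if pattern.search(line):
                counts[name] += 1
    return dict(counts)


def export_json(metrics):
    """Serialise the current counters; history is left to the collector."""
    return json.dumps(metrics, sort_keys=True)


if __name__ == "__main__":
    sample = [
        "conn=1001 op=0 BIND dn=cn=admin method=128",
        "conn=1001 op=1 SRCH base=dc=example scope=2",
        "conn=1002 op=1 SRCH base=dc=example scope=2",
    ]
    print(export_json(extract_metrics(sample)))
```

&lt;p&gt;Only the latest counter values are exported, which matches the note above that storage and trending are the collecting tool&amp;#8217;s problem.&lt;/p&gt;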

&lt;p&gt;Editorial: This seems kind of dead in the water to me.  Kind of NIH-ish.&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Tue, 17 Jan 2012 13:29:46 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1409-guid.html</guid>
    <category>lca2012 analysis</category>

</item>
<item>
    <title>Smashing a square peg into a round hole</title>
    <link>http://israel.diaspora.gen.nz/~rodgerd/archives/1408-Smashing-a-square-peg-into-a-round-hole.html</link>
            <category>Technical</category>
    
    <comments>http://israel.diaspora.gen.nz/~rodgerd/archives/1408-Smashing-a-square-peg-into-a-round-hole.html#comments</comments>
    <wfw:comment>http://israel.diaspora.gen.nz/~rodgerd/wfwcomment.php?cid=1408</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://israel.diaspora.gen.nz/~rodgerd/rss.php?version=2.0&amp;type=comments&amp;cid=1408</wfw:commentRss>
    

    <author>nospam@example.com (Rodger Donaldson)</author>
    <content:encoded>
    &lt;p&gt;Smashing a square peg into a round hole: Automagically building and configuring Linux systems that are wildly different.&lt;/p&gt;

&lt;p&gt;David Basden and Chris Collins&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why is this a big deal?  Don&amp;#8217;t Puppet et al deal with these?&lt;/li&gt;
&lt;li&gt;Well, with many wildly varying customers things vary so much you can&amp;#8217;t rely on Puppet.&lt;/li&gt;
&lt;li&gt;make-magic: automate your automation.&lt;/li&gt;
&lt;li&gt;We &lt;strong&gt;are&lt;/strong&gt; using puppet.  It&amp;#8217;s so big it&amp;#8217;s basically sentient.  It&amp;#8217;s not adequate in and of itself.&lt;/li&gt;
&lt;li&gt;PXE, debconf, etc.&lt;/li&gt;
&lt;li&gt;A &amp;#8220;simple&amp;#8221; new customer server build took over a day for an experienced sysadmin.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now it&amp;#8217;s 10 minutes, most of which is Debian install.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&amp;#8220;Simple steps&amp;#8221; are more complex than they appear, you just don&amp;#8217;t think about the steps you&amp;#8217;re used to taking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Dependencies help solve this problem.&lt;/li&gt;
&lt;li&gt;Things can be represented as an acyclic directed graph; this means some steps can never be retraced.&lt;/li&gt;
&lt;li&gt;The Anchor graph looks like a plot of the Milky Way Galaxy.&lt;/li&gt;
&lt;li&gt;When you don&amp;#8217;t do ALL THE THINGS, you can prune the graph to a manageable amount.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;So make-magic &lt;strong&gt;filters&lt;/strong&gt; the steps, then &lt;strong&gt;orders&lt;/strong&gt; the steps, then &lt;strong&gt;tracks&lt;/strong&gt; the steps.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;So what&amp;#8217;s a step?  A unit of work: create a VM; Allocate an IP address; run puppet; Accounting/Billing System records.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Things like the Billing System are outside of the sorts of things most automation tools (like Puppet) can handle.&lt;/li&gt;
&lt;li&gt;So what does ALL THE THINGS?!&lt;/li&gt;
&lt;li&gt;Used to be Hu-mans.&lt;/li&gt;
&lt;li&gt;make-magic, slaved by Hu-mans.&lt;/li&gt;
&lt;li&gt;mudpuppy is an agent that talks to make-magic.&lt;/li&gt;
&lt;li&gt;This then talks to things like Orchestra, Internal APIs, and Python code.&lt;/li&gt;
&lt;/ul&gt;
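
&lt;p&gt;The filter / order / track idea maps neatly onto a dependency graph.  What follows is a rough Python sketch of that idea and not make-magic&amp;#8217;s actual code: the step names and the helper functions are invented, and it leans on &lt;code&gt;graphlib.TopologicalSorter&lt;/code&gt; from the standard library (Python 3.9 or later) for the ordering.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical step catalogue: each step maps to the steps it depends on.
ALL_STEPS = {
    "allocate_ip": set(),
    "create_vm": {"allocate_ip"},
    "install_debian": {"create_vm"},
    "run_puppet": {"install_debian"},
    "billing_record": {"create_vm"},
    "configure_backups": {"run_puppet"},
}


def filter_steps(steps, wanted):
    """Keep only the wanted steps plus everything they depend on."""
    keep = set()
    stack = list(wanted)
    while stack:
        step = stack.pop()
        if step not in keep:
            keep.add(step)
            stack.extend(steps[step])
    return {step: steps[step] for step in keep}


def order_steps(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


def track(order, run_step):
    """Run steps in order, recording each outcome; stop at the first failure."""
    state = {}
    for step in order:
        state[step] = "done" if run_step(step) else "failed"
        if state[step] == "failed":
            break
    return state


if __name__ == "__main__":
    pruned = filter_steps(ALL_STEPS, ["run_puppet", "billing_record"])
    print(track(order_steps(pruned), lambda step: True))
```

&lt;p&gt;Pruning before ordering is what keeps the graph manageable: steps nobody asked for never reach the sorter.&lt;/p&gt;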

&lt;h4&gt;Orchestra&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Does ALL THE THINGS in ALL THE PLACES.&lt;/li&gt;
&lt;li&gt;Management of async tasks.&lt;/li&gt;
&lt;li&gt;Most existing systems that perform this task assume everything lives in the same security domain.  This does not describe David and Chris&amp;#8217; life.&lt;/li&gt;
&lt;li&gt;Multiple destination dispatch.&lt;/li&gt;
&lt;li&gt;Heterogeneous environments.&lt;/li&gt;
&lt;li&gt;Conventional, DB centric + agents can scale poorly, and more to the point the agents don&amp;#8217;t have segregated access to different bits of the DB.&lt;/li&gt;
&lt;li&gt;Orchestra uses a conductor (queue manager) which doesn&amp;#8217;t have complete knowledge of the system.&lt;/li&gt;
&lt;li&gt;The Player is privileged code that hands off tasks, which themselves are forked into a clean context.&lt;/li&gt;
&lt;li&gt;Security: Minimum disclosure, minimum trust, minimum features.&lt;/li&gt;
&lt;li&gt;Defining tasks: simple definition = task name, an executable, and a set of arguments.&lt;/li&gt;
&lt;li&gt;The audience protocol is simple and well-defined, a JSON-over-Unix domain socket protocol, and is very polling friendly.&lt;/li&gt;
&lt;li&gt;Tasks are pretty simple to spin up and write JSON for.&lt;/li&gt;
&lt;/ul&gt;
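
&lt;p&gt;As a sketch of how small a task definition can be, the Python below encodes a task as JSON (name, executable, arguments) and runs it with an empty environment.  This is not Orchestra&amp;#8217;s real wire format: the field names &lt;code&gt;task&lt;/code&gt;, &lt;code&gt;exec&lt;/code&gt; and &lt;code&gt;args&lt;/code&gt; are assumptions, and the Unix domain socket transport is left out entirely.&lt;/p&gt;

```python
import json
import subprocess
import sys


def encode_task(name, executable, arguments):
    """Serialise a task definition; the field names here are made up."""
    return json.dumps({"task": name, "exec": executable, "args": list(arguments)})


def run_task(message):
    """Decode one task message and run it with an empty environment."""
    task = json.loads(message)
    completed = subprocess.run(
        [task["exec"], *task["args"]],
        env={},  # nothing is inherited from the caller's environment
        capture_output=True,
        text=True,
        check=False,
    )
    return {
        "task": task["task"],
        "returncode": completed.returncode,
        "stdout": completed.stdout,
    }


if __name__ == "__main__":
    message = encode_task("hello", sys.executable, ["-c", "print('hello')"])
    print(run_task(message))
```

&lt;p&gt;Passing an explicit argument list rather than a shell string keeps with the &amp;#8220;minimum features&amp;#8221; line above: there is no shell to interpret anything the sender did not intend.&lt;/p&gt;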

&lt;h4&gt;Demo&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Generating root passwords on untrusted systems for secure systems and passing them securely to a third party is kind of annoying.&lt;/li&gt;
&lt;li&gt;Demo Just Works.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Questions&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Do you manage systems on an ongoing basis, or is it chucked over the fence? They manage a lot of things, most of what they do is management-on-behalf.&lt;/li&gt;
&lt;li&gt;Do you use Orchestra? Nope, this is a one-off bring-everything-up, and then they use tools like Puppet and &amp;#8220;Other Stuff&amp;#8221;.&lt;/li&gt;
&lt;li&gt;Do you have systems that are dependent upon one another (e.g. web and DB server)? Yes, they are currently set up as two separate builds.&lt;/li&gt;
&lt;li&gt;How much of this feeds into the monitoring services? Monitoring is autoconfigured for all standard services and configs, but custom monitoring is by hand.&lt;/li&gt;
&lt;li&gt;Is any of this available? Yes, yes it all is.  It&amp;#8217;s on &lt;a href=&quot;https://github.com/anchor&quot;&gt;GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Can it produce an automated build document?  Yes, yes it can.  You can pull the JSON history out, and they can report stuff automagically, but you&amp;#8217;ll have to tweak them to get the PDFs and whatnot.&lt;/li&gt;
&lt;li&gt;What&amp;#8217;s your workflow around Puppet code management? Git, syntax checking, you can ask for functional review.  They rely on smart staff.&lt;/li&gt;
&lt;li&gt;Multiple servers?  A for loop around the command-line tool.&lt;/li&gt;
&lt;/ul&gt;
 
    </content:encoded>

    <pubDate>Tue, 17 Jan 2012 11:58:55 +1300</pubDate>
    <guid isPermaLink="false">http://israel.diaspora.gen.nz/~rodgerd/archives/1408-guid.html</guid>
    <category>lca2012 automation</category>

</item>

</channel>
</rss>
