Jekyll2019-02-24T13:34:48+00:00http://blog.jamiejquinn.com/feed.xmlJamieJQuinnPhD Student | Programmer | MusicianJamie QuinnRunning Fluid Simulations in WebGL I - Simple Convection2018-07-29T12:09:13+00:002018-07-29T12:09:13+00:00http://blog.jamiejquinn.com/webgl-fluid-1<p>Years ago I worked my way through Lorena Barba’s <a href="http://lorenabarba.com/blog/cfd-python-12-steps-to-navier-stokes/">12 steps to Navier-Stokes</a> in Python, but recently I’ve been getting more and more into GPU programming and figured that it would be an interesting exercise to redo the steps in WebGL. Really when I say GPU programming I mean using general purpose tech like CUDA, but CUDA and WebGL are similar enough (the boilerplate is of course totally different but the idea of writing a kernel to act on many pixels/fluid cells is the same). Plus you get easy, automatic visualisation with WebGL!</p>
<p>If you want to dive straight in, check out the simulation <a href="http://jamiejquinn.com/George-GL/01-non-linear-convection/">here</a> or the code <a href="https://github.com/JamieJQuinn/George-GL">here</a>.</p>
<h2 id="who-is-this-for">Who Is This For?</h2>
<p>Just like Barba has for her course, I’m going to assume everyone reading this has a very basic understanding of fluid mechanics, partial differential equations, and numerical methods. By this I mean you should know what a partial derivative is, how you can model fluid behaviour using partial derivatives, and why numerical methods are used to solve fluid equations.</p>
<p>The code will be fairly simple as well. There’s not too much software engineering that goes into these small numerical experiments, although I’ll say now that the amount of code required to set up even a simple WebGL program is reasonably involved. As a benchmark, as someone who had never used WebGL before, I managed to set all this up in around 3 hours. I’ll be using a little bit of:</p>
<ul>
<li>Javascript - A web programming language. A good basic learning resource is the classic <a href="https://www.w3schools.com/jS/default.asp">w3schools</a>.</li>
<li>WebGL - A JavaScript API for GPU-accelerated rendering in the browser. Check out <a href="https://webgl2fundamentals.org/">https://webgl2fundamentals.org/</a> for a good tutorial and reference.</li>
<li>GLSL - This is the shader language used by WebGL, see <a href="https://webgl2fundamentals.org/">https://webgl2fundamentals.org/</a> again.</li>
</ul>
<p>By the end of this, you should have the basic understanding of fluid equations, WebGL programming, and numerical methods to create something that looks unfortunately rather boring:</p>
<div style="text-align: center"><video src="/assets/videos/george-gl/simple-convection.webm" width="300" height="300" autoplay="" loop="" preload=""></video></div>
<p>However, don’t despair: simulating simple linear convection in 1 dimension is just the foundation of computational fluid dynamics. We will very quickly and very easily move into more interesting 1D models, and then finally into simulating the full 2D Navier-Stokes equations, the main set of fluid equations still used today. Beyond that, and beyond Barba’s course, we might even end up exploring more complex methods, such as <a href="http://www.dgp.toronto.edu/people/stam/reality/Research/pdf/ns.pdf">Stam’s classic method</a> for quickly and stably simulating fluids in real time.</p>
<h2 id="the-fluid-mechanics">The Fluid Mechanics</h2>
<p>As in Prof. Barba’s <a href="http://nbviewer.jupyter.org/github/barbagroup/CFDPython/blob/master/lessons/01_Step_1.ipynb">first lesson</a> I’ll take the 1-dimensional linear convection (or advection) equation, where we have a quantity <script type="math/tex">u</script> sitting in a fluid that’s advected at a speed <script type="math/tex">c</script>:</p>
<script type="math/tex; mode=display">\frac{\partial u}{\partial t} + c\frac{\partial{u}}{\partial x} = 0.</script>
<p>For simplicity we’ll only deal with this fluid between <script type="math/tex">x=0</script> and <script type="math/tex">x=1</script>.</p>
<p>The simplest way to transform this equation into a problem solvable by a computer is to <strong>discretise</strong> the derivatives. We start by splitting our 1D space into <script type="math/tex">N_x</script> points, each separated by <script type="math/tex">\Delta x=1/N_x</script>, and then splitting time in a similar way, stepping forward by a <strong>timestep</strong> <script type="math/tex">\Delta t</script>. The solution <script type="math/tex">u</script> at a point <script type="math/tex">x=i\Delta x</script> and a time <script type="math/tex">t=n\Delta t</script> can then be written as <script type="math/tex">u_i^n</script>. Then, using a <strong>backward difference</strong> formula, the spatial derivative can be approximated as,</p>
<script type="math/tex; mode=display">\frac{\partial{u}}{\partial x} \approx \frac{u_i^n - u_{i-1}^n}{\Delta x},</script>
<p>and, using <strong>forward differences</strong>, the time derivative becomes,</p>
<script type="math/tex; mode=display">\frac{\partial{u}}{\partial t} \approx \frac{u_i^{n+1} - u_{i}^{n}}{\Delta t}.</script>
<p>It should be fairly obvious why these difference formulae are called forward and backward differences. Forward differencing calculates derivatives from the <script type="math/tex">k</script> and <script type="math/tex">k+1</script> states, while backward differences are calculated from the <script type="math/tex">k</script> and <script type="math/tex">k-1</script> states. Central differences are also an option but create strange numerical oscillations when applied to this particular problem; I’ll discuss them when dealing with a diffusion problem.</p>
<p>There is an <a href="https://en.wikipedia.org/wiki/Numerical_partial_differential_equations">entire field of mathematics</a> dedicated to the development and analysis of these approximations of derivatives. This particular technique of combining temporal forward differences with spatial backward differences is a form of the <strong>first-order upwind method</strong>. It can be shown to be <strong>stable</strong> when the fluid is moving to the right and the timestep is small enough that <script type="math/tex">c\Delta t/\Delta x \le 1</script> (it can be modified so the correct spatial difference is chosen based on which direction the fluid is moving), and it can be shown to have <strong>first-order accuracy</strong>, meaning the errors that appear will be roughly the same size as <script type="math/tex">\Delta t</script> and <script type="math/tex">\Delta x</script>. Methods of <script type="math/tex">j</script>-th order accuracy have errors proportional to <script type="math/tex">\Delta x^j</script> and/or <script type="math/tex">\Delta t^j</script> and higher powers. The upwind method, though not used much in practice because it isn’t terribly accurate, is certainly useful for playing about with. After reading the rest of this post, try implementing <a href="https://en.wikipedia.org/wiki/Upwind_scheme#First-order_upwind_scheme">the scheme</a> for both right-moving and left-moving fluids, or increasing the accuracy using the <a href="https://en.wikipedia.org/wiki/Upwind_scheme#Second-order_upwind_scheme">second-order upwind scheme</a>.</p>
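<p>As a rough illustration of choosing the spatial difference by flow direction, here is a minimal JavaScript sketch. The helper name <code class="highlighter-rouge">upwindDerivative</code> and the sample data are my own assumptions for illustration, not part of the George-GL code.</p>

```javascript
// Sketch: take the finite difference on the upwind side of point i.
// `upwindDerivative` is a hypothetical helper, not part of the post's code.
function upwindDerivative(u, i, c, dx) {
  if (c >= 0) {
    // Fluid moves right: information arrives from the left (backward difference).
    return (u[i] - u[i - 1]) / dx;
  }
  // Fluid moves left: information arrives from the right (forward difference).
  return (u[i + 1] - u[i]) / dx;
}

// Example: u = 2x sampled on 5 points, so du/dx = 2 whichever side we use.
const dx = 0.25;
const u = [0, 0.5, 1.0, 1.5, 2.0];
const right = upwindDerivative(u, 2, 1.0, dx);
const left = upwindDerivative(u, 2, -1.0, dx);
```

<p>For a straight line both differences agree; they only differ once the profile has curvature or jumps, which is exactly where picking the upwind side matters.</p>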
<p>Using these approximations the partial differential equation becomes a finite difference equation,</p>
<script type="math/tex; mode=display">\frac{u_i^{n+1} - u_{i}^{n}}{\Delta t} + c\frac{u_i^n - u_{i-1}^n}{\Delta x} = 0,</script>
<p>which can be rearranged to give,</p>
<script type="math/tex; mode=display">u_i^{n+1} = u_i^{n} - c\frac{\Delta t}{\Delta x}(u_i^n - u_{i-1}^n).</script>
<p>So, if we know an initial state <script type="math/tex">u^0_i</script> at every point <script type="math/tex">i</script>, we can work out <script type="math/tex">u^1_i</script>, then <script type="math/tex">u^2_i</script>, and so forth. The upwind scheme is part of a family of methods called <strong>explicit schemes</strong>. This means the future state <script type="math/tex">u_i^{n+1}</script> can be written <strong>explicitly</strong> in terms of current or past states <script type="math/tex">u_i^{n,n-1,n-2,...}</script>. The alternative is to incorporate future states into the calculation in a more intricate way, producing an <strong>implicit</strong> scheme. These are usually much more stable, often more accurate, but certainly a little harder to understand and code, and are most straightforward to apply to <strong>linear</strong> problems. I’ll discuss implicit schemes a little more in future posts!</p>
<p>Now it should be quite clear that we’ve finally found an equation suitable for a computer to solve, an equation that ultimately helps us simulate how a 1D fluid will behave, as long as we know the initial state <script type="math/tex">u^0_i</script> and the boundary behaviour. It should be said our original PDE does actually have a known <strong>analytical</strong> solution, that is, a fully accurate solution we can write as a function <script type="math/tex">u(x, t)</script> for any time <script type="math/tex">t</script>, at any point <script type="math/tex">x</script>. It can be found using the <a href="https://services.math.duke.edu/education/joma/sarra/sarra2.html">method of characteristics</a>. However, this is a particularly simple case and as we add complexity to the fluid equation, it usually becomes impossible to find anything other than a <strong>numerical</strong> solution through the application of this kind of numerical method.</p>
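<p>Before moving to the GPU, the update formula can be sketched on the CPU in a few lines of JavaScript. This is only a sketch under some assumptions of mine: the name <code class="highlighter-rouge">upwindStep</code>, the grid size, and holding the left boundary value fixed are all illustrative choices rather than the George-GL implementation.</p>

```javascript
// Sketch of one explicit upwind update, assuming c > 0.
// `upwindStep` is a hypothetical name used for illustration only.
function upwindStep(u, c, dt, dx) {
  const next = new Float64Array(u.length);
  next[0] = u[0]; // the left boundary value is simply held fixed in this sketch
  for (let i = 1; i < u.length; i++) {
    next[i] = u[i] - c * (dt / dx) * (u[i] - u[i - 1]);
  }
  return next;
}

const nx = 10;
const dx = 1 / nx;
const c = 1;
const dt = dx / c; // chosen so that c * dt / dx = 1

// A block of "ink" in cells 1 to 3, zero elsewhere.
let u = Float64Array.from({ length: nx }, (_, i) => (i >= 1 && i <= 3 ? 1 : 0));
u = upwindStep(u, c, dt, dx);
```

<p>With <script type="math/tex">c\Delta t/\Delta x = 1</script> the formula reduces to <script type="math/tex">u_i^{n+1} = u_{i-1}^n</script>, so the block moves exactly one cell to the right, which matches the analytical solution. Smaller timesteps stay stable but gradually smear the edges of the block.</p>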
<h2 id="the-code">The Code</h2>
<p>Now technically, I could take that final equation, give myself a starting state, and manually, by hand, work out every little calculation until I run out of time, food or any kind of semblance of sanity, but I won’t because computers exist. I’m not going to go through the code in detail because, frankly, it’s not very interesting and mostly copied from the <a href="https://webgl2fundamentals.org/webgl/lessons/webgl-image-processing.html">WebGL2 Fundamentals image processing tutorial</a> anyway.</p>
<p>We’re going to represent the grid of points as a texture in WebGL, with one texture representing the state at time <script type="math/tex">n-1</script> and another at <script type="math/tex">n</script>, so by rendering between them and applying our finite difference formula we simulate the fluid from one time to the next. Of course a texture is a 2D grid of points, good for later when we finally move to interesting 2D simulations, but our problem right now is only 1D, so I’m only going to deal with the <script type="math/tex">x</script> direction right now.</p>
<p>The only part that differs greatly from a more typical use of textures is that, because we’re trying to run a decently accurate simulation, I’ve instructed the texture to be created with an internal format of <code class="highlighter-rouge">RGBA32F</code>, which stores a 32-bit float per colour channel. Not as precise as the standard of 64 bits in true high performance computing, but good enough for us!</p>
<p>Every part of our simulation then uses shaders that act on these textures. The initial conditions are encoded in a shader, the finite difference formula that advances the simulation is a shader, even the boundary conditions are encoded in a shader that acts only at the edges of the texture.</p>
<p>The object that actually gets rendered is just a square of size <script type="math/tex">2\times2</script>, linked to the shaders via a <a href="https://webgl2fundamentals.org/webgl/lessons/webgl-fundamentals.html">vertex array object</a>. Most of the shaders render this square using two triangles that cover the entire square. The odd shader out is the one encoding the boundary conditions, which simply draws lines around the square.</p>
<p>The full pseudocode looks a little like this:</p>
<ol>
<li>Set up WebGL context</li>
<li>Create rendering surfaces:
<ol>
<li>Create two textures of simulation size</li>
<li>Link to two framebuffers for rendering</li>
</ol>
</li>
<li>Compile and link shaders:
<ol>
<li>Initial condition shader</li>
<li>Simulation shader</li>
<li>Boundary shader</li>
<li>Screen rendering shader</li>
</ol>
</li>
<li>Load initial conditions into a texture</li>
<li>Run main loop:
<ol>
<li>Render using main simulation shader from one texture to another</li>
<li>Render boundary conditions</li>
<li>Render result to screen</li>
<li>Swap textures</li>
</ol>
</li>
</ol>
<p>The actual code can be found on <a href="https://github.com/JamieJQuinn/George-GL/tree/master/01-non-linear-convection">github</a>. Start reading <code class="highlighter-rouge">main.js</code> and it should be fairly self-explanatory.</p>
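<p>The main loop’s “render from one texture to the other, then swap” pattern can be sketched without any WebGL at all. In the sketch below plain typed arrays stand in for the two textures; <code class="highlighter-rouge">simulate</code> and <code class="highlighter-rouge">shiftRight</code> are hypothetical names of mine, and the real code renders through framebuffers rather than looping on the CPU.</p>

```javascript
// Sketch of the ping-pong loop, using plain arrays in place of
// WebGL textures and framebuffers. All names here are hypothetical.
function simulate(initial, step, nSteps) {
  let read = Float32Array.from(initial);     // state at time n
  let write = new Float32Array(read.length); // will hold state at time n+1
  for (let n = 0; n < nSteps; n++) {
    step(read, write);             // "render" from the read buffer into the write buffer
    [read, write] = [write, read]; // swap roles for the next step
  }
  return read; // the most recently written state
}

// A stand-in step that shifts everything one cell to the right.
function shiftRight(src, dst) {
  dst[0] = 0;
  for (let i = 1; i < src.length; i++) dst[i] = src[i - 1];
}

const result = simulate([1, 0, 0, 0], shiftRight, 2);
```

<p>Two buffers are needed because every cell reads its neighbour’s <em>old</em> value; updating a single buffer in place would mix old and new states.</p>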
<h3 id="the-shaders">The Shaders</h3>
<p>We have 3 interesting shaders and 1 very boring shader. The screen rendering shader simply copies a texture directly to the screen, allowing us to see what state our simulation is in. It’s common to visualise the fluid motion using something that flows around with the fluid like ink. In this example I’ve set the ink to be the variable <script type="math/tex">u</script>. The only thing the screen shader is used for is interpolating from the colour of the background to the colour of this ink.</p>
<p>Since we’re constantly rendering a square with four simple vertices, the vertex shader isn’t doing anything particularly interesting; the meat is all in the fragment shaders. The only thing the vertex shader does is pass along the texture coordinate, which is then interpolated across the square for use in the fragment shader.</p>
<h4 id="initial-conditions">Initial Conditions</h4>
<p>The first interesting shader is the initial conditions fragment shader:</p>
<div class="language-glsl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#version 300 es
</span><span class="k">precision</span> <span class="kt">mediump</span> <span class="kt">float</span><span class="p">;</span>
<span class="k">in</span> <span class="kt">vec2</span> <span class="n">vTextureCoord</span><span class="p">;</span>
<span class="k">out</span> <span class="kt">vec4</span> <span class="n">outColour</span><span class="p">;</span>
<span class="kt">void</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">vec2</span> <span class="n">pos</span> <span class="o">=</span> <span class="n">vTextureCoord</span><span class="p">.</span><span class="n">xy</span><span class="p">;</span>
<span class="k">if</span><span class="p">(</span><span class="n">pos</span><span class="p">.</span><span class="n">x</span> <span class="o">></span> <span class="mi">0</span><span class="p">.</span><span class="mi">1</span> <span class="o">&&</span> <span class="n">pos</span><span class="p">.</span><span class="n">x</span> <span class="o"><</span> <span class="mi">0</span><span class="p">.</span><span class="mi">3</span><span class="p">)</span> <span class="p">{</span>
<span class="n">outColour</span> <span class="o">=</span> <span class="kt">vec4</span><span class="p">(</span><span class="mi">1</span><span class="p">.</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">outColour</span> <span class="o">=</span> <span class="kt">vec4</span><span class="p">(</span><span class="mi">1</span><span class="p">.</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">.</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Here we can see how the texture is being used to store the state of the system,</p>
<script type="math/tex; mode=display">(R,G,B) = (c, 0, u),</script>
<p>where we’re using the red, green and blue channels to store the <script type="math/tex">x</script>-velocity or <script type="math/tex">c</script> (set to <script type="math/tex">1</script> for simplicity), <script type="math/tex">y</script>-velocity (<script type="math/tex">0</script> for now), and the ink level, <script type="math/tex">u</script>. The alpha channel is unused for now.</p>
<p>What this shader does is it sets the <script type="math/tex">x</script>-velocity to be <script type="math/tex">1</script> everywhere, and creates a little pocket of ink between <script type="math/tex">x=0.1</script> and <script type="math/tex">x=0.3</script>, letting the ink level be <script type="math/tex">0</script> everywhere else. That is, it encodes the function</p>
<script type="math/tex; mode=display">% <![CDATA[
u(x, 0) = \begin{cases}
1, & x \in (0.1, 0.3) \\[2ex]
0, & \text{otherwise}
\end{cases} %]]></script>
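<p>To make the channel layout concrete, here is a small JavaScript sketch that builds one row of texel data with the same contents the shader produces. It assumes texel <script type="math/tex">i</script> is centred at <script type="math/tex">x = (i + 0.5)/\text{width}</script>; the function name <code class="highlighter-rouge">initialRow</code> is hypothetical.</p>

```javascript
// Sketch: build one row of RGBA texel data matching the initial-condition shader.
// Assumes texel i is centred at x = (i + 0.5) / width; `initialRow` is hypothetical.
function initialRow(width) {
  const data = new Float32Array(width * 4);
  for (let i = 0; i < width; i++) {
    const x = (i + 0.5) / width;
    const ink = x > 0.1 && x < 0.3 ? 1 : 0;
    data.set([1, 0, ink, 0], i * 4); // (R, G, B, A) = (c, 0, u, unused)
  }
  return data;
}

const row = initialRow(10);
```

<p>With a width of 10 the texel centres sit at 0.05, 0.15, 0.25 and so on, so only texels 1 and 2 receive ink.</p>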
<h4 id="main-simulation">Main Simulation</h4>
<p>The main simulation fragment shader is written as</p>
<div class="language-glsl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#version 300 es
</span>
<span class="k">precision</span> <span class="kt">highp</span> <span class="kt">float</span><span class="p">;</span>
<span class="k">in</span> <span class="kt">vec2</span> <span class="n">vTextureCoord</span><span class="p">;</span>
<span class="k">uniform</span> <span class="kt">sampler2D</span> <span class="n">uSampler</span><span class="p">;</span>
<span class="k">uniform</span> <span class="kt">float</span> <span class="n">dt</span><span class="p">;</span>
<span class="k">uniform</span> <span class="kt">vec2</span> <span class="n">dxy</span><span class="p">;</span>
<span class="k">out</span> <span class="kt">vec4</span> <span class="n">outColour</span><span class="p">;</span>
<span class="kt">void</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Get variables
</span> <span class="kt">vec2</span> <span class="n">u</span> <span class="o">=</span> <span class="n">texture</span><span class="p">(</span><span class="n">uSampler</span><span class="p">,</span> <span class="n">vTextureCoord</span><span class="p">).</span><span class="n">xy</span><span class="p">;</span>
<span class="kt">float</span> <span class="n">ink</span> <span class="o">=</span> <span class="n">texture</span><span class="p">(</span><span class="n">uSampler</span><span class="p">,</span> <span class="n">vTextureCoord</span><span class="p">).</span><span class="n">z</span><span class="p">;</span>
<span class="kt">float</span> <span class="n">inkmx</span> <span class="o">=</span> <span class="n">texture</span><span class="p">(</span><span class="n">uSampler</span><span class="p">,</span> <span class="kt">vec2</span><span class="p">(</span><span class="n">vTextureCoord</span><span class="p">.</span><span class="n">x</span> <span class="o">-</span> <span class="n">dxy</span><span class="p">.</span><span class="n">x</span><span class="p">,</span> <span class="n">vTextureCoord</span><span class="p">.</span><span class="n">y</span><span class="p">)).</span><span class="n">z</span><span class="p">;</span>
<span class="c1">// Perform numerical calculation
</span> <span class="n">ink</span> <span class="o">=</span> <span class="n">ink</span> <span class="o">-</span> <span class="n">u</span><span class="p">.</span><span class="n">x</span> <span class="o">*</span> <span class="n">dt</span> <span class="o">/</span> <span class="n">dxy</span><span class="p">.</span><span class="n">x</span> <span class="o">*</span> <span class="p">(</span><span class="n">ink</span> <span class="o">-</span> <span class="n">inkmx</span><span class="p">);</span>
<span class="c1">// Output results
</span> <span class="kt">float</span> <span class="n">alpha</span> <span class="o">=</span> <span class="n">texture</span><span class="p">(</span><span class="n">uSampler</span><span class="p">,</span> <span class="n">vTextureCoord</span><span class="p">).</span><span class="n">w</span><span class="p">;</span>
<span class="n">outColour</span> <span class="o">=</span> <span class="kt">vec4</span><span class="p">(</span><span class="n">u</span><span class="p">,</span> <span class="n">ink</span><span class="p">,</span> <span class="n">alpha</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Recall the difference formula</p>
<script type="math/tex; mode=display">u_i^{n+1} = u_i^{n} - c\frac{\Delta t}{\Delta x}(u_i^n - u_{i-1}^n).</script>
<p>The shader encodes this formula, using the previous values <code class="highlighter-rouge">ink</code> and <code class="highlighter-rouge">inkmx</code> (as in ink-minus-x) as <script type="math/tex">u^n_i</script> and <script type="math/tex">u^n_{i-1}</script> sampled from the texture storing the previous state. As can be seen from the code, we find the <script type="math/tex">i-1</script> value by moving the sample coordinate a single <script type="math/tex">dx</script> to the left, the distance <script type="math/tex">dx</script> calculated using <code class="highlighter-rouge">1.0/gl.canvas.width</code>.</p>
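<p>It may help to see why subtracting <code class="highlighter-rouge">dxy.x</code> lands on the neighbouring texel. The sketch below mimics texture lookup on the CPU, assuming <code class="highlighter-rouge">NEAREST</code> filtering and <code class="highlighter-rouge">CLAMP_TO_EDGE</code> wrapping (I haven’t checked which settings the real code uses); <code class="highlighter-rouge">texelIndex</code> is a hypothetical helper.</p>

```javascript
// Sketch: which texel does a texture coordinate land in?
// Assumes NEAREST filtering and CLAMP_TO_EDGE wrapping; `texelIndex` is hypothetical.
function texelIndex(x, width) {
  const i = Math.floor(x * width);
  return Math.min(Math.max(i, 0), width - 1);
}

const width = 8;
const dx = 1.0 / width;
const centre = (3 + 0.5) / width;                 // texture coordinate of texel 3
const here = texelIndex(centre, width);           // 3
const left = texelIndex(centre - dx, width);      // 2
const edge = texelIndex(0.5 / width - dx, width); // off the left side, clamps to 0
```

<p>Under these assumptions the leftmost texel ends up sampling itself as its own left neighbour, which is one reason a separate boundary pass is needed.</p>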
<h4 id="boundary-conditions">Boundary Conditions</h4>
<p>The boundary shader is a little more interesting because we’re not rendering (or simulating) over the full domain, since the boundary conditions should only affect the pixels around the edges. So instead of rendering to a square, we simply render lines around the domain using <code class="highlighter-rouge">gl.LINE_LOOP</code>. The shader itself sets the velocity and ink level at the boundary to <script type="math/tex">0</script> by outputting <code class="highlighter-rouge">vec4(0,0,0,0)</code>.</p>
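<p>The effect of the boundary pass can be sketched on the CPU as zeroing every texel on the perimeter of the grid. This assumes the rendered lines cover exactly the outermost ring of texels, which is my simplification; <code class="highlighter-rouge">applyBoundary</code> is a hypothetical name.</p>

```javascript
// Sketch: zero all four channels of every texel on the edge of the grid.
// Assumes the boundary lines cover exactly the outermost ring of texels.
function applyBoundary(data, width, height) {
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const onEdge = x === 0 || y === 0 || x === width - 1 || y === height - 1;
      if (onEdge) {
        const start = (y * width + x) * 4;
        data.fill(0, start, start + 4);
      }
    }
  }
  return data;
}

// A 3x3 grid filled with ones: only the centre texel should survive.
const grid = applyBoundary(new Float32Array(3 * 3 * 4).fill(1), 3, 3);
```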
<h2 id="conclusion">Conclusion</h2>
<p>So, to wrap up, we’ve gone over the basics of fluid mechanics, written through the language of partial differential equations. We’ve transformed those equations into finite difference formulae that can be readily calculated by a computer. Finally, we’ve figured out a way we can write those formulae using javascript and WebGL to simulate the fluid in real time on any modern graphics card, all within the browser!</p>
<p>Next up, we’ll explore moving to 2D, and simulating a few more interesting 1D fluid equations.</p>Jamie QuinnYears ago I worked my way through Lorena Barba’s 12 steps to Navier-Stokes in Python, but recently I’ve been getting more and more into GPU programming and figured that it would be an interesting exercise to redo the steps in WebGL. Really when I say GPU programming I mean using general purpose tech like CUDA, but CUDA and WebGL are similar enough (the boilerplate is of course totally different but the idea of writing a kernel to act on many pixels/fluid cells is the same). Plus you get easy, automatic visualisation with WebGL!Mr. Julia2017-06-08T16:36:55+00:002017-06-08T16:36:55+00:00http://blog.jamiejquinn.com/julia-fractals<p>Going on my theme of wonderfully fractal images, I wrote a little simulation to introduce myself to webGL. Go have a wee play about with it <a href="http://jamiejquinn.com/webGL-Julia-Fractal/">here</a>.</p>
<h3 id="the-maths">The Maths</h3>
<p>You can find lots of information about Julia fractals all around the web so I won’t go into much detail at all here. All I’ll say is that the fractals, named for Gaston Julia, come about by iterating a complex number through the formula
<script type="math/tex">z_{n+1} = z_n^2 + c</script>,
where <script type="math/tex">c</script> is some complex number. There’s no need to take the square but it produces some very nice images without going into the complexity of trying to take powers of complex numbers.</p>
<p>The pictures are produced by taking each pixel, turning its location (<script type="math/tex">x, y</script>) into a location in the complex plane <script type="math/tex">z = x+ iy</script> and using that as the starting point for the iteration. The location of the mouse on the screen gives the value of <script type="math/tex">c</script> and changes the entire nature of the fractal. Take a look at some of the fascinating patterns you can get below.</p>
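<p>For a single pixel, the iteration can be sketched in JavaScript as below. The escape radius of 2, the <code class="highlighter-rouge">maxIter</code> cap and the name <code class="highlighter-rouge">juliaIterations</code> are common illustrative choices of mine rather than the exact values used in the WebGL version.</p>

```javascript
// Sketch of the escape-time iteration z -> z^2 + c for one starting point.
// `juliaIterations`, the escape radius 2 and `maxIter` are illustrative choices.
function juliaIterations(zx, zy, cx, cy, maxIter) {
  let n = 0;
  while (n < maxIter && zx * zx + zy * zy <= 4) {
    const nextX = zx * zx - zy * zy + cx; // real part of z^2 + c
    zy = 2 * zx * zy + cy;                // imaginary part of z^2 + c
    zx = nextX;
    n++;
  }
  return n;
}

const inside = juliaIterations(0, 0, 0, 0, 50);  // z stays at 0 and never escapes
const outside = juliaIterations(3, 0, 0, 0, 50); // starts outside the escape radius
```

<p>The returned count is what typically gets mapped to a colour: points that never escape form the dark body of the fractal, and the rest are shaded by how quickly they run away.</p>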
<h3 id="examples">Examples</h3>
<p><img src="/assets/img/julia-fractals/1.png" alt="" /></p>
<p><img src="/assets/img/julia-fractals/3.png" alt="" /></p>
<p><img src="/assets/img/julia-fractals/4.png" alt="" /></p>
<p><img src="/assets/img/julia-fractals/5.png" alt="" /></p>Jamie QuinnGoing on my theme of wonderfully fractal images, I wrote a little simulation to introduce myself to webGL. Go have a wee play about with it here.Regularly Expressional Fractalations2017-06-01T12:07:43+00:002017-06-01T12:07:43+00:00http://blog.jamiejquinn.com/regex-fractals<p>There’s something about fractals that humans find fascinating. They manage to contain a beautiful impression of infinity despite being not very difficult to create. These fractals have been produced by a very simple recipe:</p>
<p><img src="/assets/img/regex-fractals/quadrant_numbering2.png" alt="quadrant_numbering2" style="width: 300px" /></p>
<ol>
<li>Split a big square of pixels into 4 quadrants and label them 1 to 4</li>
<li>Repeat this process for each of the smaller squares and add the quadrant number to the label</li>
<li>Keep cutting the squares into 4 until you’ve gone as small as you want</li>
<li>Now we can use regular expressions to find and mark boxes with certain labels</li>
</ol>
<p>I wrote a little bit of javascript that creates such fractals from regular expressions involving the digits 1, 2, 3 or 4. Have a play <a href="http://jamiejquinn.com/regex-fractals/">here</a> and check out some examples below.</p>
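<p>The recipe above can be sketched in a few lines of JavaScript. The quadrant numbering here (1 top-left, 2 top-right, 3 bottom-left, 4 bottom-right) is an assumption on my part, since the real numbering is only shown in the image, and <code class="highlighter-rouge">quadrantLabel</code> and <code class="highlighter-rouge">render</code> are hypothetical names.</p>

```javascript
// Sketch: label each pixel by its path of quadrants, then mark labels matching a regex.
// Assumed numbering: 1 top-left, 2 top-right, 3 bottom-left, 4 bottom-right.
function quadrantLabel(x, y, depth) {
  let label = "";
  let size = 2 ** depth;
  for (let level = 0; level < depth; level++) {
    size /= 2;
    const right = x >= size;
    const bottom = y >= size;
    label += 1 + (right ? 1 : 0) + (bottom ? 2 : 0);
    if (right) x -= size;   // move into the chosen quadrant's local coordinates
    if (bottom) y -= size;
  }
  return label;
}

// Draw a 2^depth by 2^depth picture as rows of '#' (match) and '.' (no match).
function render(regex, depth) {
  const n = 2 ** depth;
  const rows = [];
  for (let y = 0; y < n; y++) {
    let row = "";
    for (let x = 0; x < n; x++) {
      row += regex.test(quadrantLabel(x, y, depth)) ? "#" : ".";
    }
    rows.push(row);
  }
  return rows;
}

const picture = render(/1/, 2);
```

<p>Even at a depth of 2 the pattern for <code class="highlighter-rouge">1</code> already shows the self-similar structure: the whole top-left quadrant is marked, plus the top-left cell of each remaining quadrant.</p>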
<h3 id="1"><code class="highlighter-rouge">1</code></h3>
<p><img src="/assets/img/regex-fractals/1.png" alt="1" /></p>
<h3 id="132413"><code class="highlighter-rouge">[13][24][13]</code></h3>
<p><img src="/assets/img/regex-fractals/132413.png" alt="1" /></p>
<h3 id="12213443"><code class="highlighter-rouge">12|21|34|43</code></h3>
<p><img src="/assets/img/regex-fractals/12or21or34or43.png" alt="1" /></p>
<h3 id="1324"><code class="highlighter-rouge">13|24</code></h3>
<p><img src="/assets/img/regex-fractals/13or24.png" alt="1" /></p>
<h3 id="13312442"><code class="highlighter-rouge">13|31|24|42</code></h3>
<p><img src="/assets/img/regex-fractals/13or31or24or42.png" alt="1" /></p>
<h3 id="342"><code class="highlighter-rouge">[34]+2</code></h3>
<p><img src="/assets/img/regex-fractals/34plus2.png" alt="1" /></p>Jamie QuinnThere’s something about fractals that humans find fascinating. They manage to contain a beautiful impression of infinity despite being not very difficult to create. These fractals have been produced by a very simple recipe:How I lost a Day to OpenMPI Being Mental2017-05-19T12:28:58+00:002017-05-19T12:28:58+00:00http://blog.jamiejquinn.com/how-i-lost-a-day-to-openmpi-being-mental<p>So at Glasgow Uni we have this little cluster for the maths department which happens to include about ten machines set up to work with torque (a job scheduling system). I discovered that these machines hadn’t had <em>anything</em> run on them for literally months, what a waste of resources! To rectify this atrocity I decided to try and run my MPI enabled code on <em>all ten machines</em>.</p>
<h3 id="problem-one">Problem One</h3>
<p>Turns out two of the machines have eight cores and the other eight have twelve cores. I can’t mix and match with core number (AFAIK) so I guess I’ll just run the code on <em>all eight machines</em>.</p>
<h3 id="problem-two">Problem Two</h3>
<p>The vendor supplied fortran compiler and openmpi library are pretty old and pretty useless. They produce binaries that run about 3x slower than current versions so I guess I’ll have to build the latest versions myself.</p>
<h3 id="problem-three">Problem Three</h3>
<p>I run the code on a single machine. Works great, not so great performance but I can perhaps sort that out later. I run the code on multiple machines and <strong>instant</strong> crash.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ERROR: received unexpected process identifier
</code></pre></div></div>
<p>Google doesn’t help, the only result is someone trying to run openmpi on one machine with multiple IP addresses on a single interface. So I spend <em>six hours</em> looking through the verbose MPI output to discover that upon MPI setting itself up, each MPI process contacts every other process.</p>
<p>Contacting another process on the same machine? Absolutely fine, nae bother.</p>
<p>Contacting another machine over the network? Not fine, definitely bother.</p>
<p>Why? It was contacting the other machines through the interface <code class="highlighter-rouge">virbr0</code> which happens to have the <em>same IP address</em> on <em>every single machine</em>. So the conversation between a machine <em>and itself</em> was going something like this:</p>
<p>Machine A: Hey, 192.168.1.1, you’re running this MPI thing?<br />
Machine A: Do I know you?<br />
Machine A: Apparently not.<br />
Machine A: Well, let’s decide to produce a very confusing error message and hang.<br />
Machine A: Agreed.</p>
<p>Machine B: Me too.</p>
<p>After I found this out simply passing <code class="highlighter-rouge">--mca btl_tcp_if_exclude virbr0,lo</code> to <code class="highlighter-rouge">mpirun</code> suppressed any contact through that odd interface and loopback and it worked perfectly.</p>
<h3 id="problem-four">Problem Four</h3>
<p>It crashes on more than four machines.</p>
<h3 id="problem-five">Problem Five</h3>
<p>It’s about 7 times slower than running on a single one of the newer machines in the cluster.</p>
<h3 id="lesson-learned">Lesson Learned:</h3>
<p>Sometimes clusters are better left unused.</p>Jamie QuinnSo at Glasgow Uni we have this little cluster for the maths department which happens to include about ten machines set up to work with torque (a job scheduling system). I discovered that these machines hadn’t had anything run on them for literally months, what a waste of resources! To rectify this atrocity I decided to try and run my MPI enabled code on all ten machines.4 Tips on Making Simulations Bug Resistant2017-05-19T12:02:00+00:002017-05-19T12:02:00+00:00http://blog.jamiejquinn.com/making-simulations-bug-resistant<p>Having written and used a decent number of simulations over the past few years I’ve come to understand that preventing bugs in scientific software is just a wee bit different from how it’s usually done in more standard software development.</p>
<p>For one thing, many of the simulations come under the category of high performance computing (HPC) simulations, so it can take a long time to build and run a test case, leading to iteration speeds that are <em>painfully</em> slow. Another feature of simulations is that sometimes the code isn’t that complex in software terms: the program simply iterates over a large number of equations, which itself comes with its own pitfalls. Debugging is also a nightmare, not because something goes terribly wrong and the program crashes, that’s not too hard to fix, but what are you meant to do when a graph your simulation produces is just a little bit too tall? Or the fluid you’re simulating inexplicably only goes one way? Is the problem in your theory or in your code?</p>
<p>In light of this, here are a few tips that have saved me a lot of bother in the past, and might save you in the future.</p>
<h2 id="1-unit-test">1. Unit Test</h2>
<p>I’m going to get this one out of the way quickly. I know unit testing is an obvious one to most software developers, but I also know that many scientists who program do not consider themselves developers. At any rate, unit testing is kind of hard, especially when common HPC languages like fortran and C aren’t exactly the most accommodating of languages for unit testing, and even more so when you consider how many parts of a simulation just aren’t unit testable.</p>
<p>Taking that into account my advice is this:</p>
<ul>
<li>Set up a unit testing library in your language of choice</li>
<li>Learn how to write a basic test in it</li>
<li>Test the living crap out of every bit you can, e.g.
<ul>
<li>Derivatives and mathematical helper functions</li>
<li>System of equation solvers</li>
<li>Functions that deal with data in/out</li>
</ul>
</li>
</ul>
<p>Not only will you be able to show that much of your code works fine, but you’ll hopefully write better, simpler functions.</p>
<p>Here’s a reasonably simple example from a small 1D fluid simulation that I wrote. It tests the ability of a variable structure, <code class="highlighter-rouge">ModelVariables</code>, to save and load itself. It’s just a simple, effective test, and it came in useful a few times when I made some breaking code change and knew about it <em>immediately</em> when this test failed.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">TEST_CASE</span><span class="p">(</span> <span class="s">"ModelVariables save and load correctly"</span><span class="p">,</span> <span class="s">"[variables]"</span> <span class="p">)</span> <span class="p">{</span>
<span class="c1">// Setup
</span> <span class="kt">int</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">filePath</span> <span class="o">=</span> <span class="s">"ModelVariableSaveTest.dat"</span><span class="p">;</span>
<span class="k">const</span> <span class="n">Constants</span> <span class="n">c</span><span class="p">(</span><span class="mf">0.0001</span><span class="n">f</span><span class="p">,</span> <span class="mf">2.0</span><span class="n">f</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mf">2.0</span><span class="n">f</span><span class="p">,</span> <span class="mf">3.0</span><span class="n">f</span><span class="p">);</span>
<span class="n">ModelVariables</span> <span class="n">vars</span><span class="p">(</span><span class="n">c</span><span class="p">);</span>
<span class="n">ModelVariables</span> <span class="n">vars2</span><span class="p">(</span><span class="n">c</span><span class="p">);</span>
<span class="c1">// Load in some test data
</span> <span class="n">real</span><span class="o">*</span> <span class="n">data</span> <span class="o">=</span> <span class="n">vars</span><span class="p">.</span><span class="n">pressure</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="mi">5</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mf">4.0</span><span class="n">f</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Save it
</span> <span class="n">vars</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="n">filePath</span><span class="p">.</span><span class="n">c_str</span><span class="p">());</span>
<span class="c1">// Load it
</span> <span class="n">vars2</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">filePath</span><span class="p">.</span><span class="n">c_str</span><span class="p">(),</span> <span class="n">c</span><span class="p">);</span>
<span class="c1">// Check it
</span> <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">vars</span><span class="p">.</span><span class="n">len</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">CHECK</span><span class="p">(</span><span class="n">vars</span><span class="p">.</span><span class="n">pressure</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="n">vars2</span><span class="p">.</span><span class="n">pressure</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="n">CHECK</span><span class="p">(</span><span class="n">vars</span><span class="p">.</span><span class="n">density</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="n">vars2</span><span class="p">.</span><span class="n">density</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="n">CHECK</span><span class="p">(</span><span class="n">vars</span><span class="p">.</span><span class="n">velocity</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="n">vars2</span><span class="p">.</span><span class="n">velocity</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
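<p>The same idea works for the first item in that list, derivatives and mathematical helpers. Here’s a minimal sketch, not taken from my simulation: <code class="highlighter-rouge">centralDifference</code> and <code class="highlighter-rouge">derivativeOfSinMatchesCos</code> are hypothetical names, and a plain boolean check stands in for a proper test framework so the snippet is self-contained.</p>

```cpp
#include <cmath>
#include <functional>

// Hypothetical helper (not from the original code): second-order
// central difference approximation of f'(x) with step h.
double centralDifference(const std::function<double(double)>& f,
                         double x, double h) {
  return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Compares the numerical derivative of sin against the analytic
// answer, cos, and reports whether they agree to within tol.
bool derivativeOfSinMatchesCos(double x, double h, double tol) {
  double numerical =
      centralDifference([](double t) { return std::sin(t); }, x, h);
  return std::fabs(numerical - std::cos(x)) < tol;
}
```

<p>In a real project the body of <code class="highlighter-rouge">derivativeOfSinMatchesCos</code> would probably live inside a test case of whichever unit testing library you set up, with the tolerance chosen to match the expected order of accuracy of the scheme.</p>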
<h2 id="2-line-equations-up">2. Line Equations Up</h2>
<p>Say you’ve got multiple equations that have a symmetry to them, for example if you’re working in 3D and you’re calculating a similar thing in all three dimensions. Make it easy for your eyes to spot an error and line up your equations.</p>
<p>The following three lines of almost language-agnostic code should all be identical except for any <code class="highlighter-rouge">x</code> being replaced with <code class="highlighter-rouge">y</code> and <code class="highlighter-rouge">z</code>, a reasonably typical situation in a fluid dynamics simulation. There are 3 errors. Can you see them?</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bxx</span> <span class="o">=</span> <span class="n">c</span><span class="o">*</span><span class="n">bxx</span> <span class="o">-</span> <span class="n">a</span><span class="o">*</span><span class="n">bxx</span> <span class="o">+</span> <span class="mi">2</span><span class="o">*</span><span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">pbxx</span>
<span class="n">byy</span> <span class="o">=</span> <span class="n">a</span><span class="o">*</span><span class="n">byy</span> <span class="o">+</span> <span class="mi">2</span><span class="o">*</span><span class="n">s</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">pdyy</span> <span class="o">-</span> <span class="n">c</span><span class="o">*</span><span class="n">byy</span>
<span class="n">bzz</span> <span class="o">=</span> <span class="mi">2</span><span class="o">*</span><span class="n">z</span><span class="o">**</span><span class="mi">2</span> <span class="o">*</span> <span class="n">pbzz</span> <span class="o">+</span> <span class="n">o</span> <span class="o">*</span> <span class="n">bzz</span> <span class="o">-</span> <span class="n">a</span> <span class="o">*</span> <span class="n">bzz</span>
</code></pre></div></div>
<p>What if I do this?</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bxx</span> <span class="o">=</span> <span class="o">-</span><span class="n">a</span><span class="o">*</span><span class="n">bxx</span> <span class="o">+</span> <span class="mi">2</span><span class="o">*</span><span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">pbxx</span> <span class="o">+</span> <span class="n">c</span><span class="o">*</span><span class="n">bxx</span>
<span class="n">byy</span> <span class="o">=</span> <span class="n">a</span><span class="o">*</span><span class="n">byy</span> <span class="o">+</span> <span class="mi">2</span><span class="o">*</span><span class="n">s</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">pdyy</span> <span class="o">-</span> <span class="n">c</span><span class="o">*</span><span class="n">byy</span>
<span class="n">bzz</span> <span class="o">=</span> <span class="o">-</span><span class="n">a</span><span class="o">*</span><span class="n">bzz</span> <span class="o">+</span> <span class="mi">2</span><span class="o">*</span><span class="n">z</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">pbzz</span> <span class="o">+</span> <span class="n">o</span><span class="o">*</span><span class="n">bzz</span>
</code></pre></div></div>
<p>Immediately you can see most of the errors clearly! The minus sign is in the wrong place in the <code class="highlighter-rouge">yy</code> equation, there’s an <code class="highlighter-rouge">s</code> instead of a <code class="highlighter-rouge">y</code> too (whoops, copy and paste is the devil’s work), there’s an <code class="highlighter-rouge">o</code> instead of a <code class="highlighter-rouge">c</code> and a <code class="highlighter-rouge">d</code> instead of a <code class="highlighter-rouge">b</code>. You’ll note that was actually 4 errors, because how often do you know how many errors you have in a piece of code?</p>
<p>Now certainly this is bad code as well because the names don’t make immediate sense (see tip #3) and perhaps I should have brackets around the power, and yes the compiler might pick up some of these errors for you, but just look how much easier it is to spot little formatting errors that <em>could</em> cause you major headaches when a little time is taken to properly format the code.</p>
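<p>For reference, here’s roughly what the corrected, lined-up version could look like as compilable C++. It’s only a sketch: the <code class="highlighter-rouge">Field</code> struct and the <code class="highlighter-rouge">update</code> function are hypothetical wrappers I’ve added so the three lines have somewhere to live, and <code class="highlighter-rouge">x*x</code> replaces the <code class="highlighter-rouge">**</code> power operator, which C++ doesn’t have.</p>

```cpp
// Hypothetical container for the illustration; the member names
// follow the snippet above.
struct Field {
  double bxx, byy, bzz;     // quantities being updated
  double pbxx, pbyy, pbzz;  // companion terms in each direction
};

// Corrected and aligned: reading down any column, the only thing
// that should change between lines is x -> y -> z.
void update(Field& f, double a, double c, double x, double y, double z) {
  f.bxx = -a*f.bxx + 2*x*x*f.pbxx + c*f.bxx;
  f.byy = -a*f.byy + 2*y*y*f.pbyy + c*f.byy;
  f.bzz = -a*f.bzz + 2*z*z*f.pbzz + c*f.bzz;
}
```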
<h2 id="3-name-your-goddamn-variables">3. Name Your Goddamn Variables</h2>
<p>I did a maths degree at undergraduate level so I totally understand that when maths is written down it looks better as \(a = b_n + c\) than \(ants = bread_n + corn\). It’s mainly because when you ram a bunch of symbols together like \(xyz\) you know that just means \(x\), \(y\) and \(z\) all multiplied together. This falls a bit short when an equation is translated into code. It looks like incomprehensible garbage unless you have a symbol chart in a comment somewhere (which will end up being wrong some time down the line) or you actually have the book or paper with the equation in it in front of you. Plus having one-letter names makes refactoring or even just searching for a variable <em>extremely</em> difficult.</p>
<p>Here’s a real life example of when this habit of short variable names led me to spend nearly an entire week debugging a small 2D fluid simulation.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="o">=</span><span class="mi">1</span><span class="p">;</span> <span class="n">n</span><span class="o"><</span><span class="n">Nn</span><span class="p">;</span> <span class="o">++</span><span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span><span class="p">;</span> <span class="n">k</span><span class="o"><</span><span class="n">Nn</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="o">++</span><span class="n">k</span><span class="p">)</span> <span class="p">{</span>
<span class="p">...</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In line 2, that’s not meant to be an <code class="highlighter-rouge">Nn</code>. That’s meant to be an <code class="highlighter-rouge">Nz</code>, but it’s nearly impossible to see!</p>
<p>A better way to do this is to name those variables something a little more meaningful. The variable <code class="highlighter-rouge">n</code> is actually counting the number of spectral modes in the code, so why not call <code class="highlighter-rouge">Nn</code> something better like <code class="highlighter-rouge">n_modes</code>? It’s readable and it’s not going to get mixed up with <code class="highlighter-rouge">Nz</code>, especially if <code class="highlighter-rouge">Nz</code> is named something more readable like <code class="highlighter-rouge">n_points</code>, now a meaningful name since <code class="highlighter-rouge">k</code> is actually counting through the grid points.</p>
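<p>With more descriptive names, the loop from earlier might look something like the following. It’s only a sketch: the function wrapper and the counter are hypothetical, there purely so the snippet compiles and runs, and the loop indices <code class="highlighter-rouge">mode</code> and <code class="highlighter-rouge">point</code> are my own renaming of <code class="highlighter-rouge">n</code> and <code class="highlighter-rouge">k</code>.</p>

```cpp
// Hypothetical wrapper: counts how many (mode, grid point) pairs the
// nested loop visits, standing in for the real loop body.
int countInteriorUpdates(int n_modes, int n_points) {
  int updates = 0;
  for (int mode = 1; mode < n_modes; ++mode) {
    // The inner bound is now obviously about grid points; writing
    // n_modes here by mistake would stand out immediately.
    for (int point = 1; point < n_points - 1; ++point) {
      ++updates;
    }
  }
  return updates;
}
```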
<p>Sometimes it can go too far however, especially when transcribing equations into code. Check out the next tip.</p>
<h2 id="4-do-as-little-as-possible">4. Do as Little as Possible</h2>
<blockquote>
<p>Every step between your equations and code is a source of error.</p>
</blockquote>
<p>As in many things, when writing simulations there is little scope for error, especially considering the results of the simulation are often untestable or the code takes so long to run that it’s just impractical to test against real data. Even then, with complex simulations it’s difficult to know whether differences between your simulation results and your experimental data are due to numerical inaccuracy, experimental issues, problems with the theory (but not the code) or straight up bugs in the code.</p>
<p>If you’ve ever done any kind of mathematical derivation or proof or just some long-winded calculation, you know that you’re going to make a mistake somewhere, which means that when turning any equations or theory into code you want to do it in as few steps as possible. That means, at least for a first version, the equations should be written in code as close to the originals as possible. There are two ways to do this and pros and cons to each.</p>
<ol>
<li>Write using the same (or close to the same) symbols. \(a = b \times c\) becomes <code class="highlighter-rouge">a = b*c</code>.</li>
<li>Write the equation in terms of what it means. \(a = b \times c\) becomes <code class="highlighter-rouge">apples = bananas*cantaloupe</code>.</li>
</ol>
<p>I tend to use the second style now, although I was very much a fan of the first style for years. The second style feels more natural to program in. Not only is it easier to search for <code class="highlighter-rouge">apples</code> than <code class="highlighter-rouge">a</code> but you can spot errors much more easily (see tip #3). The issue with it is that it’s slightly more difficult to check against the original equation and formatting can get a little harder.</p>
<p>But you can make your own decision. I’m not your mother.</p>
<h1 id="the-round-up">The Round Up</h1>
<p>In short,</p>
<ol>
<li>Unit test what you can</li>
<li>Use good formatting</li>
<li>Name your variables well</li>
<li>Don’t over-complicate your equations</li>
</ol>
<p>Hopefully I’ve presented some food for thought here. If you disagree with anything or just have any examples or comments, let me know!</p>
<p>Many of these ideas are inspired by Joel Spolsky’s philosophy of making bad code look wrong. Do check out his <a href="https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/">blog post</a> for more ideas and philosophy on writing code that just looks wrong when it is. Code Complete is also a fantastic tome with some empirical evidence on how to improve your code style.</p>Jamie QuinnHaving written and used a decent number of simulations over the past few years I’ve come to understand that preventing bugs in scientific software is just a wee bit different from how it’s usually done in more standard software development.Parallelising Wondrous Numbers in C++2017-02-14T00:00:00+00:002017-02-14T00:00:00+00:00http://blog.jamiejquinn.com/parallelising-collatz<p>The <a href="https://en.wikipedia.org/wiki/Collatz_conjecture">Collatz conjecture</a>, named for Lothar Collatz, goes as follows.</p>
<blockquote>
<p>Take any positive integer \(n\). If \(n\) is even, halve it, or if it’s odd, multiply it by three and add one. Repeating the process will always bring you back to 1.</p>
</blockquote>
<p>The sequence of numbers generated by repeating the process is sometimes called the hailstone sequence due to the strange way the sequence bounces around, as you can see below if we start the sequence at 19. The numbers generated are also called the wondrous numbers, a name I found so enticing that I named this post for it.</p>
<p><img src="/assets/img/collatz/hailstone_sequence_19.png" alt="hailstone-sequence-19" class="align-center" /></p>
<p>This problem is deceptively simple, isn’t it? You can definitely see that every power of 2 must come back to 1, that’s pretty easy to see after playing about with the process for a while. It’s this deceptive simplicity of the problem that really makes it a kind of <a href="https://rjlipton.wordpress.com/2009/11/04/on-mathematical-diseases/">mathematical disease</a>, so well described in the xkcd comic below.</p>
<p><img src="https://imgs.xkcd.com/comics/collatz_conjecture.png" alt="xkcd-collatz" class="align-center" style="width: 300px" /></p>
<p>It’s almost infuriating that this seemingly straightforward problem hasn’t been solved, but check out some of the complex maths needed to even try and think about proving this at Terry Tao’s <a href="https://terrytao.wordpress.com/2011/08/25/the-collatz-conjecture-littlewood-offord-theory-and-powers-of-2-and-3/">blog</a>.</p>
<h2 id="stopping-times">Stopping Times</h2>
<p>Fortunately I didn’t catch the disease of obsessively trying to prove this. I did however become slightly obsessed with the <strong>stopping times</strong> of the sequences. Simply put, the stopping time is the number of steps the process takes to get back to 1. For example, 1 has a stopping time of 0, since it’s already at 1, and any power of 2 will have a stopping time of precisely the power. We can count the stopping time of the hailstone sequence above to be about 20, depending on if you’re counting 19 and/or 1 in the steps. Calculating each of the stopping times for numbers up to ten thousand gives a fascinating pattern, as you can see below.</p>
<p><img src="/assets/img/collatz/scatter_stopping_times_1e4.png" alt="stopping_times_1e4" class="align-center" /></p>
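<p>If you’d like to check that count for yourself, here’s a small sketch that writes out a whole hailstone sequence. The function name <code class="highlighter-rouge">hailstoneSequence</code> is made up for this illustration and it does no overflow checking, so it’s only suitable for small starting values.</p>

```cpp
#include <cstdint>
#include <vector>

// Illustrative helper: returns the hailstone sequence starting at n
// and ending at 1, both endpoints included. Assumes n >= 1.
std::vector<std::uint64_t> hailstoneSequence(std::uint64_t n) {
  std::vector<std::uint64_t> sequence{n};
  while (n > 1) {
    n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    sequence.push_back(n);
  }
  return sequence;
}
```

<p>The stopping time is then the length of the sequence minus one. Starting from 19 the sequence has 21 entries, so the stopping time is 20 if you count steps rather than numbers.</p>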
<p>So this pattern is nice but what makes the calculation of all these stopping times so lovely is that it’s an example of a <a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel">pleasingly parallel</a> problem, that is a problem that’s so easy to split up and solve in parallel that it’s just satisfying in its simplicity. Calculating the stopping time for one number doesn’t depend <em>at all</em> on the calculation of another, hence we can split the domain up very simply and calculate each one independently in whichever way we choose. What better way to introduce myself to the various methods of parallel programming, specifically in C++?</p>
<p>The problem I put forth was simply this:</p>
<blockquote>
<p>What is the longest stopping time possible (for integers below some limit)?</p>
</blockquote>
<p>Just looking at the stopping times graphed above, you can see that the number with the longest stopping time under 10000 is about 6250 or so, having a stopping time of about 260 (the exact answer is 6171, with a stopping time of 261).</p>
<p>Finding the maximum stopping time gives a nice number that can be used to test and verify results from different implementations of the solution, but we still have to calculate (almost) every single stopping time. That “almost” appears because if you consider a hailstone sequence starting from \(n\), every other number in that sequence must have a lower stopping time than \(n\), since that sequence is entirely inside the sequence starting from \(n\). That means that once we’ve calculated the stopping time for some number \(n\), we can forget about all the numbers in \(n\)’s hailstone sequence. However, even though taking advantage of this could theoretically speed up the solution of the problem, the mere act of taking numbers out of the domain takes some computation time so it’s not obvious whether a speed up would actually be gained. I may experiment with this but that’s a topic for another post.</p>
<h2 id="the-serial-solution">The Serial Solution</h2>
<p>You’ll find all the code, scripts and bits and pieces on <a href="https://github.com/JamieJQuinn/Collatz-Conjecture">my github</a>.</p>
<p>Although many simulations shouldn’t be built serially and then parallelised, since this is a pleasingly parallel problem, it shouldn’t be an issue, so let’s think about solving this in a serial fashion first. We can code a function that finds the path length for a given starting number as follows.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">calcCollatzStoppingTime</span><span class="p">(</span><span class="n">big_int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">pathLength</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span>
<span class="k">while</span><span class="p">(</span><span class="n">n</span><span class="o">></span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span><span class="p">(</span><span class="n">n</span><span class="o">%</span><span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">n</span><span class="o">=</span><span class="n">n</span><span class="o">*</span><span class="mi">3</span><span class="o">+</span><span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">n</span><span class="o">/=</span><span class="mi">2</span><span class="p">;</span>
<span class="p">}</span>
<span class="o">++</span><span class="n">pathLength</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">pathLength</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>You’ll find in the actual code the definition of <code class="highlighter-rouge">big_int</code> as a typedef defining whatever we’re considering to be a large integer (I usually used 64-bit integers, i.e. <code class="highlighter-rouge">long long</code>) and also some little bits that catch overflow problems. For ease of programming I chose to simply ignore those sequences that overflowed. Of course, that does mean we might miss the true maximum path length, but I’m fine with that. <em>Total</em> accuracy is not the main point of this whole exercise.</p>
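<p>The overflow handling isn’t shown above, so here’s a minimal sketch of one way such a guard could look. This is an assumption for illustration rather than the code in the repository: <code class="highlighter-rouge">calcStoppingTimeChecked</code> is a hypothetical name, <code class="highlighter-rouge">big_int</code> is assumed to be a signed 64-bit integer, and a return value of -1 is my own convention for “this sequence overflowed, ignore it”.</p>

```cpp
#include <cstdint>
#include <limits>

using big_int = std::int64_t;  // assumed: signed 64-bit integer

// Returns the stopping time of n, or -1 if computing 3n+1 would
// overflow big_int somewhere along the sequence.
int calcStoppingTimeChecked(big_int n) {
  // Largest n for which 3*n + 1 still fits in big_int.
  const big_int limit = (std::numeric_limits<big_int>::max() - 1) / 3;
  int stoppingTime = 0;
  while (n > 1) {
    if (n % 2 == 1) {
      if (n > limit) {
        return -1;  // would overflow: signal the caller to skip it
      }
      n = 3 * n + 1;
    } else {
      n /= 2;
    }
    ++stoppingTime;
  }
  return stoppingTime;
}
```

<p>The check happens before the multiplication because signed overflow is undefined behaviour in C++, so testing the result afterwards isn’t reliable.</p>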
<p>Another point to make is that this could have been implemented recursively instead of using a while loop, and it might be interesting to compare the efficiency of the loop against recursion. However, OpenCL kernels (which we’ll get to later) don’t support recursion, since GPU cores typically don’t have a conventional call stack to work with. For simplicity I just stuck to the while loop.</p>
<p>To find the maximum, we apply our <code class="highlighter-rouge">calcCollatzStoppingTime</code> function over the range of integers we’re interested in.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">maxCollatzPath</span><span class="p">(</span>
<span class="n">big_int</span> <span class="n">N</span><span class="p">,</span>
<span class="n">big_int</span> <span class="n">M</span><span class="p">,</span>
<span class="c1">// Return values
</span> <span class="kt">int</span> <span class="o">&</span><span class="n">maxStoppingTime</span><span class="p">,</span>
<span class="n">big_int</span> <span class="o">&</span><span class="n">maxN</span>
<span class="p">)</span> <span class="p">{</span>
<span class="n">maxStoppingTime</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span>
<span class="n">maxN</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span>
<span class="k">for</span><span class="p">(</span><span class="n">big_int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">N</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">stoppingTime</span> <span class="o">=</span> <span class="n">calcCollatzStoppingTime</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span> <span class="n">maxStoppingTime</span> <span class="o"><</span> <span class="n">stoppingTime</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">maxStoppingTime</span><span class="o">=</span><span class="n">stoppingTime</span><span class="p">;</span>
<span class="n">maxN</span><span class="o">=</span><span class="n">i</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And that’s basically that…</p>
<h2 id="openmp-aka-the-lazy-mans-multithreading">OpenMP (aka the lazy man’s multithreading)</h2>
<p>OpenMP is wonderful. By simply installing the libraries and adding the appropriate compiler directive your code is instantly multithreaded. So all we need to do to allow OpenMP to do its thing in our code is add the <code class="highlighter-rouge">parallel for</code> directive just before the for loop in our serial implementation, as you can see below. What an absolute belter.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#pragma omp parallel for schedule(dynamic)
</span><span class="k">for</span><span class="p">(</span><span class="n">big_int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">N</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">stoppingTime</span> <span class="o">=</span> <span class="n">calcCollatzStoppingTime</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span> <span class="n">maxStoppingTime</span> <span class="o"><</span> <span class="n">stoppingTime</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">maxStoppingTime</span><span class="o">=</span><span class="n">stoppingTime</span><span class="p">;</span>
<span class="n">maxN</span><span class="o">=</span><span class="n">i</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>You can find more about OpenMP in general from <a href="http://bisqwit.iki.fi/story/howto/openmp/">this</a> wonderful tutorial, but in short, the pragma shown above simply tells OpenMP to create a bunch of threads specifically to handle the for loop and dole out the inputs using the dynamic schedule, which allows better distribution of the work.</p>
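<p>One thing to watch with a loop like this: every thread reads and writes the shared <code class="highlighter-rouge">maxStoppingTime</code> and <code class="highlighter-rouge">maxN</code>, so two threads can interleave their compare-and-update. Here’s a minimal sketch of one way around that, assuming each thread keeps its own running maximum and merges it once at the end inside a critical section. The names <code class="highlighter-rouge">maxCollatzPathOmp</code>, <code class="highlighter-rouge">localMax</code> and <code class="highlighter-rouge">localN</code> are mine, and <code class="highlighter-rouge">big_int</code> is assumed to be a signed 64-bit integer.</p>

```cpp
#include <cstdint>

using big_int = std::int64_t;  // assumed: signed 64-bit integer

int calcCollatzStoppingTime(big_int n) {
  int stoppingTime = 0;
  while (n > 1) {
    n = (n % 2 == 1) ? 3 * n + 1 : n / 2;
    ++stoppingTime;
  }
  return stoppingTime;
}

// Finds the longest stopping time for integers in [N, M).
void maxCollatzPathOmp(big_int N, big_int M,
                       int& maxStoppingTime, big_int& maxN) {
  maxStoppingTime = 0;
  maxN = 0;
  #pragma omp parallel
  {
    // Private to each thread, so no synchronisation inside the loop.
    int localMax = 0;
    big_int localN = 0;
    #pragma omp for schedule(dynamic) nowait
    for (big_int i = N; i < M; ++i) {
      int stoppingTime = calcCollatzStoppingTime(i);
      if (localMax < stoppingTime ||
          (localMax == stoppingTime && i < localN)) {
        localMax = stoppingTime;
        localN = i;
      }
    }
    // Merge once per thread; ties go to the smaller starting number
    // so the answer doesn't depend on thread timing.
    #pragma omp critical
    {
      if (maxStoppingTime < localMax ||
          (maxStoppingTime == localMax && localN < maxN)) {
        maxStoppingTime = localMax;
        maxN = localN;
      }
    }
  }
}
```

<p>Compiled without OpenMP support the pragmas are simply ignored and this runs as ordinary serial code, which makes it handy for checking the parallel result against the serial one.</p>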
<h2 id="c-threads">C++ Threads</h2>
<p>Originally this was threaded using <a href="http://www.boost.org/doc/libs/1_63_0/doc/html/thread.html">boost threads</a>, but since C++11 the language has had its own implementation of threads, modelled on Boost’s implementation. The implementation is simple: create a number of threads and give each of those threads an equal portion of the work. You can see the implementation in C++11 thread parlance below.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">threadedCollatz</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">stoppingTimes</span><span class="p">[</span><span class="n">N_THREADS</span><span class="p">];</span>
<span class="n">big_int</span> <span class="n">maxNs</span><span class="p">[</span><span class="n">N_THREADS</span><span class="p">];</span>
<span class="n">big_int</span> <span class="n">nPerThread</span> <span class="o">=</span> <span class="n">big_int</span><span class="p">(</span><span class="n">upperLimit</span><span class="o">/</span><span class="n">N_THREADS</span><span class="p">);</span>
<span class="c1">// Create and run threads
</span> <span class="n">std</span><span class="o">::</span><span class="kr">thread</span> <span class="n">threads</span><span class="p">[</span><span class="n">N_THREADS</span><span class="p">];</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">N_THREADS</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">threads</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="kr">thread</span><span class="p">(</span><span class="n">maxCollatzPath</span><span class="p">,</span> <span class="n">i</span><span class="o">*</span><span class="n">nPerThread</span><span class="p">,</span> <span class="p">(</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">nPerThread</span><span class="p">,</span>
<span class="n">std</span><span class="o">::</span><span class="n">ref</span><span class="p">(</span><span class="n">stoppingTimes</span><span class="p">[</span><span class="n">i</span><span class="p">]),</span> <span class="n">std</span><span class="o">::</span><span class="n">ref</span><span class="p">(</span><span class="n">maxNs</span><span class="p">[</span><span class="n">i</span><span class="p">]));</span>
<span class="p">}</span>
<span class="c1">// Get max out of all thread return values
</span> <span class="kt">int</span> <span class="n">maxStoppingTime</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">big_int</span> <span class="n">maxN</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">N_THREADS</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">threads</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">join</span><span class="p">();</span>
<span class="k">if</span><span class="p">(</span><span class="n">stoppingTimes</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">></span> <span class="n">maxStoppingTime</span><span class="p">)</span> <span class="p">{</span>
<span class="n">maxStoppingTime</span> <span class="o">=</span> <span class="n">stoppingTimes</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
<span class="n">maxN</span> <span class="o">=</span> <span class="n">maxNs</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">printMax</span><span class="p">(</span><span class="n">maxStoppingTime</span><span class="p">,</span> <span class="n">maxN</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Thinking about the problem for a little while, you might think to say something along the lines of,</p>
<blockquote>
<p>Ahh, but Jamie, won’t the numbers that have the longest stopping time cluster around the top end? Thus, the thread that receives the last block of numbers will inevitably run much slower than the others.</p>
</blockquote>
<p>It’s perfectly plausible, but when I tried interlacing the numbers given to the threads, and even handing them out at random, it just didn’t seem to help much. Looking at a sample of the stopping times below, for numbers up to 10,000,000, you can see why: the stopping times are already pretty well dispersed. We absolutely <em>should</em> see little speed up, if any at all, from interlacing, because each thread ends up with much the same mix of work either way!</p>
<p><img src="/assets/img/collatz/scatter_stopping_times_1e7.png" alt="scatter_stopping_times_1e7" /></p>
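<p>For the curious, the interlaced split can be sketched roughly as below. This is a minimal sketch rather than the exact code I benchmarked: <code class="highlighter-rouge">stoppingTime</code>, <code class="highlighter-rouge">scanInterlaced</code> and <code class="highlighter-rouge">maxCollatzInterlaced</code> are hypothetical names, and thread <code class="highlighter-rouge">i</code> is assumed to take the numbers <code class="highlighter-rouge">i+1</code>, <code class="highlighter-rouge">i+1+nThreads</code>, and so on.</p>

```cpp
#include <cstdint>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical serial helper: number of Collatz steps for n to reach 1.
int stoppingTime(std::uint64_t n) {
    int steps = 0;
    while (n > 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Worker: scans first, first + stride, first + 2*stride, ... up to upperLimit
// and records the longest stopping time it sees.
void scanInterlaced(std::uint64_t first, std::uint64_t stride, std::uint64_t upperLimit,
                    int& bestTime, std::uint64_t& bestN) {
    for (std::uint64_t n = first; n <= upperLimit; n += stride) {
        const int t = stoppingTime(n);
        if (t > bestTime) {
            bestTime = t;
            bestN = n;
        }
    }
}

// Returns {longest stopping time, n that achieves it} for 1 <= n <= upperLimit.
std::pair<int, std::uint64_t> maxCollatzInterlaced(std::uint64_t upperLimit, unsigned nThreads) {
    std::vector<int> bestTimes(nThreads, -1);
    std::vector<std::uint64_t> bestNs(nThreads, 0);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < nThreads; ++i) {
        // Thread i starts at i + 1 and strides by the number of threads.
        threads.emplace_back(scanInterlaced, std::uint64_t{i} + 1, std::uint64_t{nThreads},
                             upperLimit, std::ref(bestTimes[i]), std::ref(bestNs[i]));
    }
    // Join each thread, then fold its result into the overall maximum.
    std::pair<int, std::uint64_t> best{-1, 0};
    for (unsigned i = 0; i < nThreads; ++i) {
        threads[i].join();
        if (bestTimes[i] > best.first) {
            best = {bestTimes[i], bestNs[i]};
        }
    }
    return best;
}
```

<p>Calling <code class="highlighter-rouge">maxCollatzInterlaced(UPPER_LIMIT, N_THREADS)</code> would then stand in for the blocked version above. Each thread writes only to its own slot, so no locking is needed.</p>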
<h2 id="opencl-with-boost-compute">OpenCL with Boost Compute</h2>
<p>This was the fun one. I’d never done much GPU programming before so when I actually managed to get it to work I was so elated I had to take a wee break and eat a digestive. The code is remarkably simple thanks to the <a href="http://www.boost.org/doc/libs/1_63_0/libs/compute/doc/html/index.html">Boost compute library</a>.</p>
<p>The code that actually runs on each of the GPU cores is identical to the serial version, but we wrap it in the <code class="highlighter-rouge">BOOST_COMPUTE_FUNCTION</code> macro to let Boost convert it properly. Note the use of <code class="highlighter-rouge">cl_ulong</code>, OpenCL’s 64-bit unsigned integer type, in place of our previously used <code class="highlighter-rouge">long long</code>.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BOOST_COMPUTE_FUNCTION(int, gpu_calc_stopping_time, (cl_ulong n), {
/* SERIAL CODE */
});
</code></pre></div></div>
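<p>As a reminder of what goes in place of <code class="highlighter-rouge">/* SERIAL CODE */</code>, the body is along these lines. This is a plain C++ sketch rather than a copy of the earlier listing: the name <code class="highlighter-rouge">stoppingTime</code> and the use of <code class="highlighter-rouge">std::uint64_t</code> (standing in for <code class="highlighter-rouge">cl_ulong</code>) are assumptions for illustration.</p>

```cpp
#include <cstdint>

// Hypothetical stand-in for the serial routine.
// Counts the Collatz steps needed for n to reach 1.
int stoppingTime(std::uint64_t n) {
    int steps = 0;
    while (n > 1) {
        // Halve even numbers; send odd numbers to 3n + 1.
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}
```

<p>For example, <code class="highlighter-rouge">stoppingTime(6)</code> follows 6, 3, 10, 5, 16, 8, 4, 2, 1 and so returns 8.</p>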
<p>Then, we set up the device, load the starting numbers into the GPU and tell it to compute the stopping time for each one using the transform function, similar to a functional map. Then we find the maximum stopping time (still on the GPU) and return the value. It’s a little bit of boilerplate code to copy data to and from the GPU and this is a very rough implementation, but I just find this code amazingly simple.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void boostCompute() {
  // Setup device
  compute::device device = compute::system::default_device();
  compute::context context(device);
  compute::command_queue queue(context, device);

  // Fill array with increasing values
  std::vector<cl_ulong> starting_points(UPPER_LIMIT);
  for(int i=0; i<UPPER_LIMIT; ++i) {
    starting_points[i] = i+1;
  }

  // Copy numbers over
  compute::vector<cl_ulong> device_vector(UPPER_LIMIT, context);
  compute::copy(
    starting_points.begin(), starting_points.end(), device_vector.begin(), queue
  );

  // Calculate stopping time on each value (results overwrite the inputs)
  compute::transform(
    device_vector.begin(),
    device_vector.end(),
    device_vector.begin(),
    gpu_calc_stopping_time,
    queue
  );

  // Find max
  compute::vector<cl_ulong>::iterator max =
    compute::max_element(device_vector.begin(), device_vector.end(), queue);

  // Get max back
  compute::copy(max, max + 1, starting_points.begin(), queue);
  int maxStoppingTime = starting_points[0];
  // The iterator offset is zero-based, but the values start at 1
  int maxN = max - device_vector.begin() + 1;
  printMax(maxStoppingTime, maxN);
}
</code></pre></div></div>
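<p>If the map-then-reduce shape isn’t obvious from the GPU version, here is roughly the same pipeline on the CPU using only the standard library. It’s a sketch under assumptions: <code class="highlighter-rouge">stoppingTime</code> and <code class="highlighter-rouge">maxStoppingTimeHost</code> are hypothetical names, and nothing here touches Boost Compute or the GPU.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical serial helper: number of Collatz steps for n to reach 1.
int stoppingTime(std::uint64_t n) {
    int steps = 0;
    while (n > 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Returns {longest stopping time, n that achieves it} for 1 <= n <= upperLimit.
std::pair<int, std::uint64_t> maxStoppingTimeHost(std::uint64_t upperLimit) {
    if (upperLimit == 0) {
        return {0, 0};  // nothing to search
    }
    // Fill with 1, 2, ..., upperLimit (the "starting points").
    std::vector<std::uint64_t> values(upperLimit);
    std::iota(values.begin(), values.end(), std::uint64_t{1});

    // Map: overwrite each starting point with its stopping time.
    std::transform(values.begin(), values.end(), values.begin(),
                   [](std::uint64_t n) { return static_cast<std::uint64_t>(stoppingTime(n)); });

    // Reduce: locate the largest stopping time.
    const auto maxIt = std::max_element(values.begin(), values.end());

    // The offset is zero-based, the starting points are one-based.
    const auto n = static_cast<std::uint64_t>(maxIt - values.begin()) + 1;
    return {static_cast<int>(*maxIt), n};
}
```

<p>The structure is the same as above (fill, transform, max element), just without the copies to and from the device.</p>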
<h2 id="the-benchmark">The Benchmark</h2>
<p>A little benchmarking script runs each implementation ten times and averages the timings, so we can compare the parallel implementations with the serial version.</p>
<table>
<thead>
<tr>
<th> </th>
<th>Serial</th>
<th>OpenMP</th>
<th>Threaded</th>
<th>Boost Compute</th>
</tr>
</thead>
<tbody>
<tr>
<td>Average Time</td>
<td>5.306</td>
<td>1.620</td>
<td>1.595</td>
<td>0.214</td>
</tr>
<tr>
<td>Speedup</td>
<td>1</td>
<td>3.27</td>
<td>3.33</td>
<td>24.79</td>
</tr>
</tbody>
</table>
<p>This benchmark was performed on a quad-core Intel i5 processor, with the GPU code running on an Nvidia GTX 570 (480 CUDA cores).</p>
<h2 id="conclusion">Conclusion</h2>
<p>This blog post is a few things: a comparison of parallel methods, an example of a nice pleasingly parallel mathematics problem, and a log of my introduction to parallel programming in general. To conclude the comparison of the methods, the wonderful speed up of between 3 and 4 times from the OpenMP and threading implementations is about what you’d expect on a quad-core machine. What is really important to note is that, at least for this simple parallel problem, OpenMP was just as effective at speeding up the code as the handwritten threading stuff, but OpenMP is <em>just one line of code</em>. That’s just amazing. With just a little bit more work, the speedup from implementing this on the GPU is undoubtedly worth it in this case. 25 times faster is just crazy, especially on a 5-6 year old graphics card.</p>
<p>Please do send me any feedback you have on this, be it about the maths, the code or even just the writing. This blog and my writing style are still in their infancy so any feedback at all is incredibly helpful. Cheers for reading!</p>
Jamie Quinn
The Making of the Fiddle Synth I
2017-01-29T12:02:00+00:00
http://blog.jamiejquinn.com/violin-synth
<p>My idea for a violin synthesizer came about from a Lau concert I recently went to.</p>
<p>Before the concert there were a few different workshops, one of which was a synthesizer making workshop run by Martin Green, the accordion player from Lau. Unfortunately I didn’t make it along, but I did manage to see the concert where he uses synthesizers in just the right places to create an amazing deep sound. The kind of synths he was using were not just keyboard synths though; he played some kind of mental wooden frame with a couple of strings and springs attached. The noise that came out of that when pumped through the keyboard to add pitch to the sound was incredible. During the entire gig I started to really feel like I wanted a way to produce these kinds of sounds but without taking all the time to actually learn a keyboard instrument. I figured, I’m already a decent fiddle player, why can’t I construct a fiddle shaped synth?</p>
<p>In prototyping this idea it seems like there are a few distinct parts to the instrument: how it detects my finger placement on the instrument, how it processes that information, and how it produces the sound.</p>
<h3 id="detecting-finger-placement">Detecting Finger Placement</h3>
<p>I want to get the synth as close to playing a fiddle as possible. That means I need very good accuracy in where the synth thinks my fingers are placed. A discrete system of buttons might do, but I also need microtones for musical features like vibrato and slides. So I need some way of continuously and accurately detecting where my fingers are.</p>
<p>My very first idea was to simply use the violin string itself as a variable resistor, passing a current from the endpoint of the string to an electrical contact on my finger and using the resistance of the string itself as a measure of the length. However, using a multimeter I found the resistance per length of the string to be under a hundred milliohms per centimeter. A little research into how the Arduino could detect such small resistances suggests it’s a difficult task, especially if we want to detect changes in length as small as a millimeter. This idea has certain advantages in that it wouldn’t require building a whole new instrument! It could just be attached to a regular fiddle. Because of that I’m not giving up on this idea, but it needs a little more testing of different types of string and ways to measure small resistances.</p>
<p>After failing to really get anywhere with the string idea, I came across Spectra Symbol’s <a href="http://www.spectrasymbol.com/product/softpot/">softpots</a>. These are simply long flat potentiometers whose resistance varies linearly with where you touch them. Perfect, right? Well I’ve bought some and we’ll see how it goes.</p>
<p>I had also heard of people using <a href="https://www.bareconductive.com/shop/electric-paint-50ml/?gclid=Cj0KEQiAw_DEBRChnYiQ_562gsEBEiQA4LcssuHbBHbOonK1rWwtI1zLbZnkc8qW16UWxkNb6rlz9UoaAvzy8P8HAQ">electrical paint</a> for interesting projects so naturally I thought about perhaps using it for this. It’s literally what it sounds like, conductive paint that you apply to a surface and the resistance varies over whatever you’ve painted it on. This would allow me to create my own (possibly linear) potentiometers in basically any shape I want. I’ll see how the softpots work but this is probably an option I’ll explore as well.</p>
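<p>Whichever sensor wins, a linear position reading then has to become a pitch. A minimal sketch of that step is below, assuming an ideal string (frequency inversely proportional to the vibrating length), a 10-bit reading from 0 to 1023, and equal temperament with A4 = 440 Hz as MIDI note 69. All the function names are hypothetical, and a real fiddle string won’t be perfectly ideal.</p>

```cpp
#include <cmath>

// Converts a raw 10-bit ADC reading (0..1023) into a distance along the
// sensor, assuming the reading is linear in finger position.
double fingerPositionMm(int adcReading, double sensorLengthMm) {
    return sensorLengthMm * adcReading / 1023.0;
}

// Frequency of an ideal string stopped fingerMm from the nut.
// The vibrating length is (stringLengthMm - fingerMm), so fingerMm must be
// strictly less than stringLengthMm.
double stoppedFrequencyHz(double openHz, double stringLengthMm, double fingerMm) {
    return openHz * stringLengthMm / (stringLengthMm - fingerMm);
}

// Fractional MIDI note number in equal temperament (69 is A4 = 440 Hz).
double midiNoteNumber(double hz) {
    return 69.0 + 12.0 * std::log2(hz / 440.0);
}
```

<p>Stopping a string at half its length doubles the frequency, so an open A at 440 Hz becomes 880 Hz, which is MIDI note 81. The fractional part of the note number is what would carry vibrato and slides, for example as a pitch bend on top of the nearest whole note.</p>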
<h3 id="producing-the-sound">Producing the Sound</h3>
<p>Because I have very little electronics experience I figured using a nice simple Arduino board would be the best way to at least prototype the whole thing. Having previously gone through a short course on the Arduino I knew roughly what I was doing, but it was about 7 years ago so I decided to buy one of the <a href="https://www.amazon.co.uk/gp/product/B01D8KOZF4/ref=oh_aui_detailpage_o01_s00?ie=UTF8&psc=1">starter packs</a>. Ultimately I’d like to be able to take the notes that the Arduino produces, convert them into MIDI, probably following <a href="https://www.arduino.cc/en/Tutorial/Midi">this</a> tutorial, and play that through a regular keyboard synthesizer.</p>
Jamie Quinn
Setting Up This Site
2016-04-09T22:44:51+00:00
http://blog.jamiejquinn.com/this-site
<p>First there was a purely html site. I was about eleven.</p>
<p>Then, there was the WordPress blog (or maybe Blogger). I was probably about thirteen.</p>
<p>After that came the social networks, Myspace, Bebo, Twitter, Facebook and Instagram, and then I decided I needed an actual blog for some reason. In my naivety I went for a full blown CMS stack, Mezzanine, a Django project. It took about a week solid, buying server space, setting up the entire site, finding a theme, tweaking the theme, breaking the theme, etc. I didn’t post a single thing.</p>
<p>Finally today I decided I would knuckle down, build a nice <em>simple</em> blog and actually post to it. I feel like I’ve actually got things to say now, things about maths and programming and the different projects I’ve done over the last few years. But I needed to keep it simple. Enter Jekyll.</p>
<h3 id="jekyll">Jekyll</h3>
<p>I’d heard about Jekyll when I was last researching how I might want to build my blog. It basically works in the same way as large scale CMSs like WordPress and Django-CMS by using templates to serve up information like blog posts. While CMSs typically store their information in a database and interact with it as the user needs it, Jekyll is a static site generator, meaning all the information is stored in (typically) markdown files and the entire site is then generated as a set of static pages. When I first thought of doing a blog, Jekyll seemed too simple, maybe a bit inflexible without an interactive client-server thing going on. Having now built the blog in a <em>day</em> (about 12 hours of work), I can say that I was wrong, and the thing that took the longest was actually settling on a theme (because there are a <a href="http://jekyllthemes.io">lot</a>).</p>
<p>There is a good reason why I like Jekyll now. It is to blogs what a cafetiere is to coffee makers. Sure, you could go down the route of buying a fancy electric coffee maker that automatically grinds the beans and makes you coffee that’s ready just before you knew you wanted it, and that might be exactly what’s needed for an office. I mean can you imagine thirty people using the same cafetiere? But what if it’s just you, and you don’t <em>really</em> know how fancy coffee makers work, and you don’t <em>really</em> have the time to figure them out, and you frankly just want some coffee? Why would you go for the option that’s overkill?</p>
<p>Although the analogy to the coffee makers is a bit tongue-in-cheek, hopefully you can appreciate where I’m coming from in that, for a simple blog, Jekyll is all I need. As a plus, I haven’t spent a thing on hosting or domain names this time around since I’ve decided that <a href="https://pages.github.com/">GitHub Pages</a>, with its free hosting and domain name, is absolutely fine.</p>
<p>The workflow I’ve settled on begins with modifying a local version of the site, which I set up following the GitHub Pages <a href="https://help.github.com/articles/setting-up-your-github-pages-site-locally-with-jekyll/">tutorial</a>. I can test it by running from the site root directory the command</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jekyll serve
</code></pre></div></div>
<p>and checking that the site is working fine in my browser. Then I simply push the changes to GitHub and usually within minutes the site is live.</p>
<h3 id="minimal-mistakes">Minimal Mistakes</h3>
<p>Once I’d settled on Jekyll, I had to find a theme to base my design on. I’m no web designer so I worked primarily by finding a base theme, <a href="http://mmistakes.github.io/minimal-mistakes/">Minimal Mistakes</a> by Michael Rose, then looking through blogs that I like the look of and trying to emulate the bits that I really liked. For instance, I felt that posts needed something to make major headings stand out a little more, to really highlight separate sections. I didn’t want to use lines across the page, so after seeing the little solid bars to the left of the headings in <a href="http://gameprogrammingpatterns.com/game-loop.html#motivation">Game Programming Patterns</a>, I went ahead and implemented them. It looks a little clunky still in my opinion so I’ll probably change it later, but in keeping with my simple philosophy for the rest of the site-building project, it’ll do for my needs right now.</p>
Jamie Quinn