Project Euler in Clojure: The First 50 Problems - Problems 10-14

posted Apr 13, 2017, 10:46 PM by Frank Adrian

Problem 10

This problem asks for the sum of the prime numbers less than 2000000. A simple transliteration of the problem using take-while and reduce does the trick, yielding the solution:
(defn euler-10 []
  (->> (gen-primes)
       (take-while #(< % 2000000))
       (reduce +)))

Problem 11

This problem starts by defining a 20x20 numeric grid:
(def e11grid [[ 8 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91  8]
              [49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00]
              [81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65]
              [52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91]
              [22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80]
              [24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50]
              [32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70]
              [67 26 20 68 02 62 12 20 95 63 94 39 63  8 40 91 66 49 94 21]
              [24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72]
              [21 36 23  9 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95]
              [78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14  9 53 56 92]
              [16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57]
              [86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58]
              [19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40]
              [04 52  8 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66]
              [88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69]
              [04 42 16 73 38 25 39 11 24 94 72 18  8 46 29 32 40 62 76 36]
              [20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16]
              [20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54]
              [01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48]])
Here, we've defined a corresponding matrix using an array of Clojure arrays as our data structure. The problem asks for the maximum of all products of four contiguous linear elements, oriented up-down, left-right, and diagonally, over the entire matrix. Let us suppose we are sitting at an interior point [i, j]. We can imagine four four-element rays emanating from where we stand, one heading to our east, one to our south, one to the southeast, and one to the southwest. Now map these four rays across the entire surface of the matrix, it's clear that, ignoring edge effects, we have covered all linear quadruples in the array. If we map all of these quadruples to their products and take the maximum, we have the answer to out problem. And this is essentially how we will solve our problem. To start, we need to deal with points of our rays that lie outside the bounds of the matrix. If we assume that any element outside the array bounds is zero, then rays that lie outside our matrix will have a corresponding product of zero, and will not interfere with the maximum we want to take. As such, we write a helper function that returns the element at matrix index [i, j] and zero for any coordinate outside the array bounds:
(defn grid-at [[i j]]
  (cond (or (< i 0) (< j 0) (> i 19) (> j 19)) 0
        :else (get (get e11grid i) j)))
Next we define offsets from an origin defining our four rays:
(def quad-deltas [[[0 0] [1 1] [2 2] [3 3]]         ;diagonal right
                  [[0 0] [1 0] [2 0] [3 0]]         ;left-to-right
                  [[0 0] [0 1] [0 2] [0 3]]         ;up-down
                  [[0 0] [-1 1] [-2 2] [-3 3]]])    ;diagonal left
We now define a helper function quad-products-at:
(defn quad-products-at [[i j]]
  (->> quad-deltas
      (map #(map (fn [[x y]] [(+ i x) (+ j y)]) %))
      (map #(->> %
                 (map grid-at)
                 (reduce *)))))
This function takes the four offset rays and uses the first map function to add them to each of the rays to give the four sets of four squares emanating from [i, j]. The second map maps the squares to their values using grid-at function and then produces their product via reduction. This produces four numbers for each square - the products along each of the four rays. To obtain the final answer to the problem, we map this function across each square of the matrix, obtaining the four products for each square, use concat to get the entire list of numbers, and use max to find the maximum:
(defn euler-11 []
  (->> (combo/cartesian-product (range 20) (range 20))
       (map quad-products-at)
       (apply concat)
       (apply max)))

Problem 12

The twelveth Project Euler problems wants the first triangular number with more than 500 factors. Remember that the n'th triangular number is given by Tn= n(n+1)/2. We define the function triangle:
(defn triangle [N]
  (/ (* N (inc N)) 2))
The function prime-factors uses the previously-defined function smallest-prime-factor to peel away factors of N, leaving it's factorization:
(defn prime-factors [N]
  (let [p (smallest-prime-factor N)]
    (if (= p N) [N] (conj (prime-factors (/ N p)) p))))
The function factorization uses the frequencies function to put the factorization of N into a standard form - a factor map where the keys are the prime factors of the number and the value associated with the key is the multiplicity of that prime in N:
(defn factorization [N]
  (frequencies (prime-factors N)))
factor-count uses the factorization to compute the number of factors of N:
(defn factor-count [N]
  (reduce * (map (fn [[k v]] (inc v)) (factorization N))))
Combining these functions gives the solution to this problem:
(defn euler-12 []
  (->> (range 2 1000000)
       (map triangle)
       (drop-while #(< (factor-count %) 500))

Problem 13

Problem 13 gives us a series of very large numbers and asks for us to return the integer produced by taking the first 10 digits of the sum of this series:
(def e13numbers [37107287533902102798797998220837590246510135740250
The solution is transliterated from the problem description:
(defn euler-13 []
  (let [s (reduce + e13numbers)]
    (read-string (apply str (take 10 (str s))))))

Problem 14

The Collatz sequence for a number N is defined by the process N -> N/2 (if N even) and N -> 3N+1 (if N odd). The Collatz Conjecture states that for any positive integer, the Collatz sequence eventually reaches 1. The conjecture is still unproven. The problem asks for the number less than 1000 with the longest Collatz sequence.

We start with a function for calculating a Collatz step:

(defn collatz [N]
  (if (even? N) (/ N 2) (inc (* 3 N))))
The next function returns a lazy Collatz sequence for the given number:
(defn collatz-seq [N]
  (if (= N 1) [1] (conj (lazy-seq (collatz-seq (collatz N))) N)))
The following code solves the problem:
(defn euler-14 []
  (->> (range 1 1000000)
       (map (fn [x] [x (count (collatz-seq x))]))
       (reduce (fn [[i cc :as a] [i2 cc2 :as b]] (if (> cc cc2) a b)))
Here, the range is passed to a map which returns the pair of the number and the length of it's Collatz sequence. The reduction selects the pair with the longest sequence, and the final call strips the number out of the pair.

Project Euler in Clojure: The First 50 Problems - Problems 5-9

posted Mar 23, 2017, 9:37 AM by Frank Adrian   [ updated Mar 23, 2017, 10:25 AM ]

Problems five through nine all have relatively simple solutions in Clojure. My apologies for those wanting more length.

Problem 5

Problem 5 asks for the smallest number cleanly divisible by all numbers 1 through 20. To meet the cleanly divisible criteria, our number must include all of the prime factors of the numbers from 1 through 20. To be the smallest, it must be the product of the minimum number of prime factors that meets the previous criterion.

We'll start by looking through the numbers one-by-one. Our number must be divisible by 2 and then 3. So the number must be divisible by 2*3. The next divisor, 4=22, meaning that to keep the number minimal, we need have only another single factor of 2 in the number or that the number must be divisible by 2*3*2. Five is a new prime divisor and so the number must be divisible by 2*3*2*5. Continuing in this manner, we can see that divisibility by 6 can be satisfied by the existing factors of 2 and 3, seven can be added to the list, and 8 can be accounted for by adding another factor of 2 to the product, meaning that the number must be divisible by 2*3*2*5*7*2. Nine brings in the second factor of 3 and continuing writing factors in this manner up to 20 (whose divisibility is already accounted for by 2 factors of 2 and one of 5 at that point) leaves us with a number that must be divisible by 2*3*2*5*7*2*3*11*13*2*17*19, which having all the factors of numbers less than or equal to 20 is the solution to the problem.

Using Clojure as a desk calculator gives us the answer:

(defn euler-5 "Expand numbers by prime factors and read off non-dupe factors."
  (* 2 3 2 5 7 2 3 11 13 2 17 19))

Problem 6

This problem is another simple calculation problem - find the difference between the square of the sum of the numbers from 1 to 100 and the sum of the squares of these same numbers. Using reduce to calculate the sums leads to another fairly self-explanatory function:
(defn euler-6 []
  (let [to100 (range 1 101)]
    (- (* (reduce + to100) (reduce + to100)) (reduce + (map #(* % %) to100)))))

Problem 7

What is the 10001'st prime number? To determine this, we use a helper function, nth-prime:
(defn nth-prime [n]
  (nth (gen-primes) (dec n)))
Note the (dec n) - Clojure's counting is zero-based, the problem counts from 1. This allows us to write our solution for Problem 7:
(defn euler-7 []
  (nth-prime 10001))

Problem 8

Problem 8 gives us a very big number and asks us for the largest product of thirteen consecutive digits in this number. If we can find all thirteen-character substrings of this number, then we need only map a function over this collection that finds the product of the digits and then take the maximum of these mapped products. Lets start by finding the thirteen-character substrings. This is easily accomplished using Clojure's partition function:
(def ce8 7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450)

(defn get-partitions [] (partition 13 1 (str ce8)))
This function returns a lazy sequence of lazy sequences of characters:
((\7 \3 \1 \6 \7 \1 \7 \6 \5 \3 \1 \3 \3) (\3 \1 \6 \7 \1 \7 \6 \5 \3 \1 \3 \3 \0) (\1 \6 \7 \1 \7 \6 \5 \3 \1 \3 \3 \0 \6) ...)
Each sub-sequence contains the thirteen characters of the substring it corresponds to. To convert these to their products, we must convert the character sequences to integer sequences and multiply. To do the first, we construct an auxiliary function char->int:
(defn char->int [c] (- (int c) (int \0)))
Once this function is available, we can solve the problem as follows:
(defn euler-8 []
  (->> (get-partitions)
       (map #(apply * (map char->int %)))
       (apply max)))
In this solution, we take the partitions, use the map function to get the products, and take the max. The internal function in the map form takes the sequence, converts its characters to integers, and multiplies them together.

Problem 9

There is a single Pythagorean triple (i.e., [a, b, c] such that a2+b2=c2) where a+b+c=1000. Our problem asks for a*b*c for this number. Assuming we can find all triples summing to 1000 and have a function that tests whether or not a triple is Pythagorean, we can write our solution as:
(defn euler-9 []
  (->> (triples-summing 1000)
       (filter pythagorean-triple?)
       (map #(apply * %))
The test for being a Pythagorean triple is simple:
(defn pythagorean-triple? [[a b c]]
  (= (* c c) (+ (* a a) (* b b))))
The code for producing a set of all positive integer pairs summing to N is:
(defn doubles-summing [N]
  (let [rangeN (range 1 (inc N))]
    (take (dec N) (map (fn [x] [x (- N x)]) rangeN))))
Here, we take the numbers i from 1 to N, and use the map function to pair them with N-i. We only use the first N-1 of these because we do not want to consider the number pair [N, 0]. We will use these doubles to find triples summing to N:
(defn triples-summing [N]
  (apply concat
         (let [rangeN (range 1 (inc N))]
           (map (fn [k] 
                  (let [doubles (doubles-summing (- N k))]
                    (map (fn [[a b]] [a b k]) doubles)))
                (take (dec N) rangeN)))))
In this function, the first map is applied to the numbers k from 1 to N, where for each it maps the doubles [a, b] summing to N-k and packages them into a triple [a, b, k] summing to N. The sequences of triples produced by this map are then concatenated to form the final list of all triples. Again, note the (take (dec N) ...) clause. This is again because we do not want to consider solutions of the form [a, b, 0], where a+b = N.

Project Euler in Clojure: The First 50 Problems - Problems 2-4

posted Mar 17, 2017, 12:17 PM by Frank Adrian   [ updated Mar 21, 2017, 9:18 AM ]

Problem 2

The second Project Euler problem asks us to find the sum of the even Fibonacci numbers less than 4000000. As such, the simple generate, filter, and sum solution will look like that of Problem 1, but with a couple twists:
(defn euler-2 []
   (->> (fib-seq)
        (take-while #(< % 4000000))
        (filter even?)
        (reduce +)))
Here, we take from the sequence of Fibonacci numbers in order until the next Fibonacci number is greater than or equal to 4000000, filter the resulting collection to get the even elements, and sum them by reducing - a straightforward translation of the problem. The main trick would seem to be writing the function fib-seq. Our first attempt is as follows:
(defn fib-seq-bad
  "Returns a sequence of Fibonacci numbers."
     (fib-seq 0N 1N))
  ([a b]
     (cons b (fib-seq-bad b (+ a b)))))
In this code, a represents the previous Fibonacci number and b represents the current Fibonacci number in the sequence. Assume we're at Fk in the sequence. The next time we call fib-seq-bad, Fk, the current value, will be the next previous value, and Fk+1=Fk+Fk-1(the current previous value)=a+b will be the next value. This gives us the recursion relation shown in the function. The zero arity function seeds the 2 arity function with initial values of 0 and 1 (previous and current values for the first time the function is called). The list is built by consing each current value to the list generated by the next values. The only problem comes when we actually try to run the function - it fails with a stack overflow error. The reason why is inherent in the solution - in order to cons the current Fibonacci number onto the tail of the sequence, you first need to build that sequence. And to build this infinite sequence, you must recurse infinitely. The good news about Clojure is that instead of forming this infinite list, we can instead use lazy-seq to form a lazy sequence.

So what is lazy-seq?

According to its Clojure documentation, lazy-seq:

Takes a body of expressions that returns an ISeq or nil, and yields a Seqable object that will invoke the body only the first time seq is called, and will cache the result and return it on all subsequent seq calls.
In our case, we'd use it like this:
(defn fib-seq
  "Returns a lazy sequence of Fibonacci numbers"
     (fib-seq 0N 1N))
  ([a b]
      (cons b (fib-seq b (+ a b))))))
In this code, the first time seq is called to get the contents of the list, the body is evaluated, leading to a list holding the current Fibonacci number, followed by the rest of the list. But when we attempt to build the rest of the list, our call to fib-seq immediately hits a lazy-seq call, which does not get invoked until the next seq is called. Each time this happens, another item is consed onto the end of the list, allowing us to realize the list in order and only as needed. As such, we've created a lazy sequence of Fibonacci numbers which can be processed by our solution. This is not the only lazy sequence we shall be building. Problem 3 also utilizes lazy-seq to generate a sequence of prime numbers.

Problem 3

In this problem, a key piece is generating a lazy sequence of primes. To do this, we use a sieve - in a map we hold associations between the next number that is going to be sieved out by the current set of primes and the primes that number will be sieved out by. To find the next prime, we look in the table at the next number, d. If we find d in the table, we know it is not prime and that the next number(s) sieved by the algorithm will be the sum of that number and the prime(s) that sieves the number. If d's not in the table, then it's prime. We'll step through the algorithm to demonstrate.

Start with an empty map and initial number d=2. Since 2 is not a key in the map, 2 must not be sieved yet and is thus a prime - add it to the list of primes and add to the map [d2, (d)]. The map now holds {4 (2)}. The next number is 3. We check and it is not in the map - again this number is prime so add it to the list of primes. Add [9 (3)] to the map, giving a map of {4 (2), 9 (3)}. The next number is 4. It is in the map so it is not prime. For each of the sieve factors of d, add the sieve factor for the number and update the map. This gives us a map of {6 (2), 9 (3)}. The next number, 5, is not in the map and prime, leading to the resulting map {6 (2), 9 (3), 25 (5)}. Six is in the map and is not prime. We increment it in the sieve, resulting in the map {8 (2), 9 (3), 25 (5)}. We continue this way until we come to the number 10, when our map contains {10 (2), 12 (3), 25 (5), 49 (7) }. Ten is not prime, but when we add it's sieving factor to it, we find that the number 12 is already in the map. To fix this, we add both sieve factors under 12, giving the map  {12 (2 3), 25 (5), 49 (7)}. Eleven is prime so we augment the map again giving {12 (2 3), 25 (5), 49 (7), 121 (11)}. Twelve, the next number, is not prime and we increment the number by both factors in the sieve, re-splitting the factors giving: {14 (2), 15 (3), 25 (5), 49 (7), 121 (11)}. We continue, adding primes to our list and updating the sieve properly depending on whether the currently examined number is prime or not.. The algorithm is shown here:

(defn gen-primes "Generates an infinite, lazy sequence of prime numbers"
  (let [reinsert (fn [table x prime]
                   (update-in table [(+ prime x)] conj prime))]
    (defn primes-step [table d]
                 (if-let [factors (get table d)]
                   (recur (reduce #(reinsert %1 d %2) (dissoc table d) factors)
                                 (inc d))
                   (lazy-seq (cons d (primes-step (assoc table (* d d) (list d))
                                                 (inc d))))))
    (primes-step {} 2)))
In this code, reinsert does the job of updating the table with the sieved numbers and their sieving primes. The call to recur (when the number is in the table) handles the case where the number is not prime using reinsert to update the table (repeatedly for each prime sieving factor, using reduce), while the cons branch of the code handles the case when the current number is prime, adding of the current number being checked to the list of primes and updating the sieve with the square of the prime. The lazy-seq call in the cons branch ensures that the list will be delivered lazily. And, of course, we start at 2 with an empty map.

Why is the first sievable number for a given prime initialized to that prime's square? One can see that earlier multiples of this prime will be caught by composites and primes less than this number. So 2*5 will be caught by 2's sieve, 3*5 by 3's etc. The first number not caught by a number's sieve is that number squared (and after that, its increments take care of the sieving).

So what does the problem ultimately ask? It wants the maximal prime factor of 600851475143. How do we find the maximal factor? Assume a number N can be written as a product of prime numbers p0ip1j...pnk, with factors ordered from low to high. By dividing by p0, we obtain a smaller number. Continuing removing smallest factors in this manner, we are eventually left with N=pk, the maximal prime factor. To find the smallest prime factor of the number at any iteration, test against the sequence of primes until the first one is found that divides N. This will be the smallest prime factor.

The code to accomplish all this is:

(defn smallest-prime-factor "Iterate through the sequence of primes until you find a divisor."
  (first (drop-while #(not (factor? % N)) (gen-primes))))

(defn max-prime-factor "Divide by smaller factors until you're left with the largest."
  (let [p (smallest-prime-factor N)]
    (if (= N p) p (recur (/ N p)))))
This allows the solution to Problem 3 to be written simply as:
(defn euler-3 []
  (max-prime-factor 600851475143))

Problem 4

This problem (like several from the Project Euler corpus) deals with palindromic numbers (numbers whose digits read the same backward and forward). A prime requirement is to test whether a given number is a palindrome. For this, we check if the reverse of the number's string equals the number's string:
(defn palindrome? [N]
  (= (str N) (apply str (reverse (str N)))))
The problem asks for the maximal palindromic number formed from the product of two three digit numbers. The set of three digit numbers can be generated by:
(def three-digits (range 100 1000))
This returns the numbers from 100 to 999. The remainder of the problem can be written as follows:
(defn euler-4 []
  (->> (combo/cartesian-product three-digits three-digits)
       (filter (fn [[i j]] #(>= i j)))
       (map (fn [[i j]] (* i j)))
       (filter palindrome?)
       (apply max)))
In it, we generate all pairs of three digit numbers using combo/cartesian-product. We take those pairs and, since i*j=j*i, we get rid of half of them by trying only the ones where i>=j. Next we map these pairs to their products and filter them on whether the products are palindromic. Finally, we take the max of all of the palindromic products to find the answer.

Next post, we'll tackle problems five through nine.

Project Euler in Clojure: The First 50 Problems - Problem 1

posted Mar 16, 2017, 9:50 AM by Frank Adrian   [ updated Mar 16, 2017, 10:26 AM ]

I enjoy programming. In addition to being my livelihood, it serves as one of my hobbies. I enjoy code kata and am always searching for new exercises. I had heard of Project Euler, but had not participated yet, and decided that this would be my next challenge. I found a site in github purporting to be Clojure solutions of these problems, but they were neither particularly complete nor well-documented. I thought I could do better, learn more Clojure, and refresh my knowledge of number theory along the way. In addition, the solutions are an interesting way to exercise a language.

I expect that these blog posts will be most useful to beginning to intermediate Clojure programmers. I've searched for the most straightforward solutions I could find for these problems and most of them have relatively simple solutions. Note that these solutions are spoilers for the problems. If you want the maximum benefits from these problems, you might want to solve them first yourself. My code has been checked for correctness, but you might find better solutions! Please share in the comments section any solutions you think are better, why, etc.

The Namespace Declaration

(ns euler.core
     (:require [clojure.math.combinatorics :as combo])
     (:require [clojure.math.numeric-tower :as nt])
     (:require [clojure.set :as set]))
We require three Clojure libraries, which are used for a few minor bits of code in the problems - from combo we get combo/cartesian-product, from nt we obtain nt/expt, and from set we use set/difference. These are useful libraries for anyone doing numeric and/or combinatorial problems (which many programming problems ultimately come down to).

Project Euler - Problem 1

The first PE problem wants the sum of all numbers not divisible by three or five less than 1000. The solution would seem to be so simple that it writes itself:
(defn euler-1a []
   (reduce + (filter #(or (zero? (mod % 3)) (zero? (mod % 5))) (range 1 1000))))
This is a perfectly fine solution (and gets the correct answer) - we take a collection of numbers from 1 to 999 (the range function), filter the ones divisible by 3 or 5, and sum the remainder using reduce. However, the filter function is a bit opaque; its meaning somewhat obscure. The repetition in the test function is a code smell, as well. We can refactor this code a bit to obtain an easier to understand version:
(defn factor? [n k] "Is n a factor of k?" (zero? (mod k n)))

(defn euler-1b []
   (reduce + (filter #(or (factor? 3 %) (factor? 5 %)) (range 1 1000))))
This version is not only clearer, but provides the function factor? which will be useful in upcoming problems. But can we do a bit better? The problem description talks about ignoring numbers divisible by 3 or 5. Is there a clearer way to capture that? One solution can be obtained by defining a new function divisible-by?. This function, when passed an integer, returns a function to determine if another number is divisible by that integer:
(defn divisible-by? "Return a function that tests if a number is divisible by the argument."
  (partial factor? n))

(defn euler-1c []
  (reduce + (filter #(or ((divisible-by? 3) %) ((divisible-by? 5) %)) (range 1 1000))))
This solution is a bit clearer, using the same terminology as the original Project Euler problem. The trade off is that a function call in the function position of a form is a bit unusual. The good news is that this unusual usage calls out the special nature of divisible-by? function which, with any luck, will be examined by anyone looking at the code. This function will also be useful in future problems.

The final version of the code uses the threading macro ->>:

(defn factor? [n k] "Is n a factor of k?" (zero? (mod k n)))

(defn divisible-by? "Return a function that tests if a number is divisible by the argument."
  (partial factor? n))

(defn euler-1 []
  (->> (range 1 1000)
       (filter #(or ((divisible-by? 3) %) ((divisible-by? 5) %)))
       (reduce +)))
This version looks a bit more Clojure-ish, even though to a classic Lisper, like me, the string of forms still looks a bit too prog-ish for comfort. In general, the longer the set of nested calls, the clearer the threaded solution becomes. Another choice would be to use an internal let to call out intermediate results like this:
(defn euler-1d []
  (let [count999 (range 1 1000)
        nums-to-add (filter #(or ((divisible-by? 3) %) ((divisible-by? 5) %)) count999)]
    (reduce + nums-to-add)))
This form is useful when the intermediate results have some significance (not like in this problem) and naming them makes the algorithm easier to understand (also not like this problem). Generally speaking, a competent Clojure programmer should understand the usage of all of these forms of sequencing calls, be able to convert from one form to another, and select the one that is clearest for use. In the next post, we'll cover problems two through four, touching on lazy-seq, prime generation, and factorization.

Heartbleed and code opacity

posted Apr 10, 2014, 5:47 PM by Frank Adrian

There is an old open source saying that "Many eyes make all bugs shallow". So what happened with the now infamous heartbleed bug? Supposedly, open source code is less prone to these kind of security breaches because so many eyes can look at it. In addition, open source is supposed to have better fix properties because fixes can be turned around and disseminated quickly. However, something happened with heartbleed.

One issue is that most eyes are shallow, not deep. If code is opaque (as is the case for Open SSL) than it doesn't matter how many eyes are on it - bugs can be hiding in the submerged depths where only the deepest eyes are likely to stumble upon them. And it is likely that the number of deep eyes looking at the code is still as small as is usual in any other software, closed or open.

The second issue is that the decentralized nature of open source movement actually harms as much as helps with dissemination. Commercial software, with its lists of registered users at least has a starting point to make issues known and patches distributed. Open source projects tend to have much more chaotic and imprecise means of the same. No one, commercial nor open source, can be sure that software will be patched - but at least commercial software has a starting point.

So, what can the open source community learn from this issue?

First, make sure that the code is written in the clearest possible way. Make sure that as many eyes can peer into its depths without obscuring murk. It may hurt performance, especially with ubiquitous software like Open SSL, but the alternative, as we have seen, sucks. An old wise man once said that "Premature optimization is the root of all evil." Although this is true, code that needs to be secure cannot afford any optimization - the code must be lucid enough to be easily understood and checked, lest hard-to-detect security defects sneak in. God knows you probably shouldn't be wrapping OS calls (the putative source of this defect).

Second, for code as ubiquitous and as important for security as this code is, independent review by people qualified to do so is vital. Most projects have only people related to the project reviewing the code. This is probably fine for the latest Zul Zebra Zaga game, or other kind of fart app, but not so good for a piece of code upon which most of the current internet's security rests.

Third, it is almost certain that the crucial parts of the Open SSL code, had it been written in a simple intelligible form would have been small enough to be formally verified. And, again, given the criticality of this code to secure operation of the internet, it should have been done.

Do all of these ideas have costs? Yes. But so do bugs that have as high an impact as heartbleed appears to. We've known how to not come a cropper when the stakes are high for at least the past thirty years. The fact that these kinds of issues are still arriving (and in such crucial code) is an embarrassment to the whole industry. 

Software Engineering: 8 - Align process with your environment

posted Nov 5, 2013, 10:49 AM by Frank Adrian   [ updated Nov 5, 2013, 10:55 AM ]

You wouldn't wear a parka in July would you? If you were in Antarctica, you would, because there at that time of year, it's winter... Plus, let's be clear - it's Antarctica, and the average temperatures, even at the height of Antarctic summer, don't get above zero. Your environment matters. A lot.

It matters when designing processes for your organization, as well. A good process must be aligned with four important features of your professional environment - the market into which you're selling your products, the nature of the products themselves, your organization, and the skills and abilities of the teams within it.

Let's look at market influences first. If my market is consumer-oriented, dominated by social media and moving at lightning speed, I don't want to wait for a traditional process to "get it right". I'm going to be putting my best agile practices into place and releasing every two weeks. On the other hand, if I'm releasing into a more static enterprise environment, I probably don't need frequent releases (in fact, my rather conservative IT customers are probably unhappy if they have a new version of the product more than once a year). As such, a traditional process with decent change control mechanisms might bring better results with less uncertainty on which features will be delivered.

It's the same with product alignment - a web-based product can have more frequent releases than one that needs to be updated on every client's system. The process for the first product should target a minimal time to release, whereas this factor is less important in the process used to develop the latter. The process for the second should take the essential difficulty of the upgrade process into account and, if possible, use this essential issue to its advantage (not that one shouldn't make the upgrade process as simple as possible in any case). If my product needs language translation, I probably (for cost reasons) don't want to have new features translated for each two-week release. I'd cut the releases to once each quarter (even if my internal releases were on a two-week cadence) and integrate a sprint/iteration/whatever to have the product translated, rechecked, and released.

Similarly, organization counts. If I have a company that boxes both time and features and that doesn't keep teams stable, my process should look different from one that boxes only time and has stable teams. If my marketing department needs a half a year to plan a product launch, it doesn't matter much if my team can release on two week's notice and my process should take this into account. It's the same with the skills and abilities of the team. An inexperienced team will need a lot more design and code reviews in its definition of done than an experienced team - maybe even more QA cycles. And your process needs to take this into account.

That being said, if your process isn't aligned with these four facets of your environment, it doesn't mean there's a total disaster brewing. A good team can make almost any process work (For varying values of "work". N.B.: The converse is not true - a bad team can make hash even if you have a good process in place). A poor match between environment and process usually just means that it will be more work to get the results you want (And who wants that?).

The other thing to realize is that any standardized process you put into place will vary in its alignment from release-to-release, product-to-product, team-to-team, etc. Markets change; products change; teams change. As such, one must be ready to adapt one's process. And, again, failure to do so won't necessarily bring disaster, just more work.

So, when putting a process into place consider carefully the four environmental attributes listed above - market, product, organization, and team. If you don't you might find yourself in a situation analogous to being at the bottom of the world in shorts, sandals, and a tee shirt. And nobody wants that.

Getting ready to record the third This, Not This! EP

posted Jul 25, 2013, 10:04 AM by Frank Adrian   [ updated Jul 25, 2013, 10:05 AM ]

The headline says it all. The band is looking for a more organic feel with all of us recording basic tracks at the same time. We'll be recording this upstairs in my house, again, trying for a more organic feel. In all of this, I'm taking the philosophy that simpler is better and that's reflected in way I'm miking the drums.

I started by using my bedroom for the drums - it's the room with the best sound in the house. It's basic sizes are not multiples giving it complex room modes and it also a gabled ceiling to add reverberant space and further reduce room modes. It also has a bed and some furniture in it for reflection and absorption. Finally, it opens into the hallway that opens onto the bottom floor, giving it a huge ambiance, if needed.

We started miking by getting a pretty good sound for the drums in the overheads (a pair of matched Mike Joly modded MXL 990s). Next, a Shure Beta 52 was placed to reinforce the kick and a couple of packing blankets tossed over that to reduce bleed. Then, I added a Shure SM61 (a venerable old dynamic vocal mike with a nice high-end sizzle) to reinforce the snare. I balanced the mix and put up an Audio Technica AT3035 as a room mike and added that in for some more ambiance (since it's going to be smashed to hell when I mix, any LDC mike would work here, but the AT3035 is familiar to me). Until this setup doesn't give us what we need, this will be the default.

In the past, I've used another mike on the hi-hat, but that always seemed to lead to nothing but snare bleed and heartache, so I'm not using one this time. Another omission? Individual tom mikes. In general, I find that using these only added to the ringiness of the toms (not my favorite sound) and that boosting the overheads in the mix during the fills works just as well. This simpler miking technique should also reduce (if not completely obviate) phase issues, too. So, simple, simple, simple and it sounds pretty good.

The guitars will be recorded using Cascade FatHead II ribbon mikes. The engineer on our first album used them and I had good luck with them on the last album, as well. They just seem to sound good for this purpose. Gary will go in the laundry room, Phil in the guest bedroom. The bass will be DI'ed from the control room, as I can muck about with it as much as I need with my standard plugins.

As for vocals, Debbie will be using an Ear Trumpet Labs Edwina in the guest bathroom. This is an LDC built by a local microphone maker named Phil Graham, optimized for live use. Intention notwithstanding, it sounded very good for Debbie's vocals on the last album and it has a certain mojo in it's look that seems to make people comfortable while recording. Pretty straightforward setup here - pop screen in front, an absorbing screen in back to try to tame some room reflections.

You notice that I'm not mentioning preamps here. They say that clean is the new color. If I really need color, I'll put it in by re-amping, not preamping. I'm keeping everything clean by using solid state preamps - a combination an M-Audio Octane, a PreSonus Digimax, and three Symetrix 202s. I may break down here and use a UA 610 as a preamp for Debbie's vocals (Yes, clean is the new color, but we can go too far with everything, can't we?).

I've also been taking time to set up separate headphone mixes for all involved. The four channels for Dave, Gary, Phil, and myself will go through my Furman HDS, while Debbie's channel will be a separate full mix (controlled by the recordist - i.e., me) fed into a discrete headphone amp.

My upstairs looks like the cable fairy came to visit. All tidied and zip-tied around the edges of the rooms, but still a quite unique look. Have I mentioned that I have a very tolerant and loving wife? I do.

So, on to recording... Of course, the main battle in recording an album like this is getting everyone together at the same time to do it, so I don't see it starting until the first week of August. The songs that we will be recording include Goodbyes, Takedown (not to be confused with Takeaway), Proof of Life, and a new one with the working title Hardhitter. In the mean time, I'll probably have Dave do some drumming for some tracks on my solo album.

Software Engineering: 7 - Beware the purple squirrel

posted Jul 25, 2013, 8:37 AM by Frank Adrian   [ updated Nov 5, 2013, 10:56 AM ]

In HR circles, "purple squirrel" is a term that is used to describe the candidate who perfectly fits a multifaceted job description. In theory, the purple squirrel would be up to speed in an instant, providing immediate value in all facets of the role. However, there are many problems with hunting the purple squirrel...

First, you will take a very long time finding such an animal. Months will be spent on the search, interviewing a plethora of rodents just to find that the squirrels in question are run-of-the-mill brown squirrels (or even worse, rats with a tail hairpiece) holding exaggerated resumes. This effort will distract you and your team from other tasks that they might be doing, putting you farther behind.

Next, assume you do find a somewhat purplish-tinted rodent. Your team's desire for the perfect purple squirrel will engender endless debate over whether the shade of purple is too blue or too red or whether the rodent in question is a tree squirrel or a ground squirrel and whether that makes any difference. Decisions on whether to use this rodent will take too long and, quite rightly, many purple squirrels will escape your grasp because you grab at them too late. And, still, all the while, your position will go unfilled and things will fall farther and farther behind.

When you can find a purple squirrel that your team agrees on, you will find that purple squirrels generally understand their rarity and will demand top compensation for their uniqueness. Even if you can find the purple squirrel, you may not have the budget to pay for them. If you do, you can assume that there are others out there also looking for your purple squirrel, ready to whisk him or her away for a few more peanuts.

Now, assume you can and do pay for a purple squirrel. Bring it into your environment and you'll notice that their shade isn't quite the shade you needed anyway - there are enough differences in your environment that you end up training the squirrel to be your shade of purple anyway - the squirrel will not provide the immediate impact that you expect. Plus, to keep the purple squirrel effective, you'll have to feed it a special diet of training to keep its unique skills in shape. If you don't do this, you may find your purple squirrel taking on a decidedly brown shade after a while.

Finally, three months from now, when the environment changes and you need a green squirrel instead of a purple one, you'll find that your purple squirrel, having invested many years in eating the proper food and finding the proper environments to allow him to survive as a purple squirrel, will object to your attempt to dye him green. More insidiously, if you attempt to modify your environment towards a different color, you will find the purple squirrel more likely to sabotage than to support the change.

Seriously, you'd be better off hiring an ordinary brown squirrel who has shown a willingness to be dyed whatever shade of color you need.

In reality, many job descriptions are written in too detailed a fashion, not examining how work could be reassigned among other team members or assuming a candidate needs immediate capabilities in tasks that need to be done sporadically. Employees with unique skill sets decrease your "bus factor". And the managers who look for people with all of these qualifications fail because they either don't want to take the time to train or because they don't recognize the difference between satisficing and optimizing. Ultimately, their projects fall behind or fail, too.

You can avoid purple squirrel hunts by performing regular skills assessment for your team and making cross-training a priority. The latter will also improve team-member engagement by increasing their sense of mastery. If you do end up with a purple squirrel on your team, add one more requirement to your list - that your purple squirrel can teach other squirrels how to be sort of purple. By doing this, you'll reduce the need to hire a purple squirrel the next time.

So, structure your team to reduce the need for purple squirrels. In the end, you'll find that a group of happy, harmonious, flexible, and (most importantly) high-quality but somewhat ordinary rodents will probably fulfill your needs just fine.

Software Engineering: 6 - Observe the 10-60-30 rule

posted Jun 3, 2013, 9:10 AM by Frank Adrian   [ updated Jun 12, 2013, 8:21 AM ]

"Ninety percent of everything is crap." -- Theodore Sturgeon

Ted Sturgeon was talking about writing when he came up with his law and I think he was correct in setting his cut point for excellence at this point. However, when you talk most things - and specifically things like software - I think that his law is overly critical.

Ten percent is probably the proper cut-point between excellence or genius or amazing or whatever other superlative adjective you want to use and everything else, and you probably need to meet that high bar in writing or any other artistic endeavor to be seen as original and making an impact. However, just because something isn't brilliant or original doesn't mean it isn't serviceable or useful - which is the bar we set in most engineering fields. As such, I propose an updated form of Sturgeons Law for engineering that I call the 10-60-30 Rule: 10% of everything is excellent, 60% of everything is useful and/or serviceable, and 30% of everything is crap.*

Original and brilliant need little explanation. However, there are many parts of our systems that don't require great genius - just good solid code that does what it's supposed to. This good code comprises 60% of most systems (and I would presume 60% of writing, art, and music creations). The final 30% is bad - it needs rework or excision and replacement to move it to the categories above.

One other aspect of this rule? The corollary that, even after refactoring and removal of the bad code, the ratios above are still descriptive to the body of code - it's just that you've raised the bar a bit. And this is why code maintenance is so important - it's hard to make the code better by improving either the good or excellent code. But by improving the bad code (which is easier to do), you've improved the quality of the whole code base, and (almost certainly) uncovered new issues in the good portions of the code that can now be improved.

So watch for that 30% of bad code. It's a high enough percentage that you almost certainly know where it's at. Then take the initiative to fix it. It won't get rid of bad code (remember, once that this bad code is fixed, another 30% becomes the worst in the code base), but it will raise the bar on your code's quality. And the people who find the code that you have left will be grateful.

*Actually, you can break down the larger categories further: 10% is excellent, 20% is really quite nice, 40% is good, solid work, 20% is bad but salvageable, and 10% reeks and should be replaced. However, "10-20-40-20-10 Rule" doesn't roll off the tongue as nicely as "10-60-30 Rule" does and, since what you do with the two classes of good and bad code are the same - leave alone and rework ruthlessly, respectively - the latter seems to be better. True aficionados of software numerology may have fun with the fuller definition.

CD Release Party

posted Mar 6, 2013, 8:49 AM by Frank Adrian

The latest CD from This, Not This!, Waiting for the World, has been released! To celebrate, we're holding a CD Release Party at the Lounge of the Hawthorne Theater in Portland at 9 pm on 29 March. Our old friend Mike Ailes will be there with his band, Aina Haina, to open for us. It's going to be a great show, with lots of songs from both our first and the new CD. If you'd like to have your own copy of the new CD (or the first one), they'll be on sale there for $5. Of course, you can also buy it from iTunes or from CDBaby. I'm really proud of this album, as it's the first I've recorded and mixed for the band. And, of course, come out to see us and Aina Haina play!

1-10 of 52