Learning Clojure: Part 2

In part 1 of my “Learning Clojure” series, I created a simple program to calculate salary based on how many years someone worked. For this post, I’m going to be attempting something a bit more complicated.

Project Gutenberg

One of my favorite websites in the entire world is Project Gutenberg (PG). PG is an archive of books that have passed into the public domain, which makes it a great resource for text mining. I use it almost every time I need some words to parse, and you should too! So why does this matter right now? I’m glad that you asked.

Outline

Given how simple the last program was, I decided that I should probably take this one up a notch. It’s going to involve fetching a file, writing it to disk, reading the file, and processing command line args. In order, here’s what the program needs to do:

  1. Validate command line args – We’re going to accept two arguments: the word that should be counted and a URL that points to a .txt file on my web host (Project Gutenberg apparently doesn’t like crawlers) for processing.
  2. Download the file – It could be large and might fail. We’ll need to be careful here.
  3. Split the file into a vector – Split the file’s contents on whitespace and load the words into a vector.
  4. Print – Print to standard output how many times the word was found. If none, make it known.
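
As a rough illustration of step 1, argument validation could look something like the sketch below. The function name valid-args? and the URL pattern are assumptions made for this example, not code taken from the repository.

```clojure
(require '[clojure.string :as cljstr])

;; Hypothetical helper: accepts exactly two arguments, a non-blank word
;; and an http(s) URL that ends in .txt.
(defn valid-args?
  [args]
  (and (= 2 (count args))
       (let [[word url] args]
         (and (not (cljstr/blank? word))
              (boolean (re-matches #"https?://\S+\.txt" url))))))
```

Calling it with anything other than two arguments, or with a URL that does not end in .txt, returns false, so -main can print a usage message and stop before attempting a download.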

The Program

The full source code can be found at https://github.com/vital101/learn-clojure-wordcount. It looks pretty simple, but I did expand my Clojure knowledge quite a bit with this one. Some of the things I did:

  • Used a 3rd party library
  • Messed around with vectors (split word data) and sequences (args).
  • Wrote to a file.
  • Refactored constantly

I do want to highlight one bit of code that I wrote, because it’s pretty straightforward but does a lot of stuff.

(defn process
    [url word]
    (write-file (get-source-file url))
    (log "Info" "Processing File...")
    (let [data (cljstr/split (slurp filename) #"\s+")]
        (log "Result" (str (count (filter #{word} data)) " occurrences of '" word "'"))))
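
The counting expression is the densest part of that function, so here it is pulled out into a self-contained sketch. The name count-word is made up for this example, and it takes the text as a string instead of reading it from a file.

```clojure
(require '[clojure.string :as cljstr])

;; Hypothetical helper isolating the counting idiom used in process.
(defn count-word
  [text word]
  (let [data (cljstr/split text #"\s+")]
    ;; #{word} is a one-element set. Calling a set on a value returns the
    ;; value when it is a member and nil otherwise, so it works as a predicate.
    (count (filter #{word} data))))

(count-word "the cat and the hat" "the") ;; => 2
```

The match is exact and case-sensitive, so tokens such as "The" or "the," are not counted. Lower-casing the text and stripping punctuation first would be one way to loosen that.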

My next program needs to be more complicated from a data perspective, so that I’m forced to use things like “map”, “reduce”, and other functional elements on data sets.

Learning Clojure: Part 1

Clojure is a functional programming language based on Lisp and written to run on top of the JVM. I’ve tried learning it in the past, but have failed mostly due to biting off more than I could chew. But not this time! I’m taking my time, reading lots of code, and doing 1st year computer science assignments with it. I figure this worked well when I first learned how to program, so it will probably work well now.

The Return to Trivial Programs

After spending the past 6 years neck deep in non-trivial professional programming, I’m returning to trivial toy programs to learn Clojure. My first task is to write a program that takes user input from the terminal and calculates their salary after a number of years that they enter. More specifically:

  • Starting salary is $1000
  • Salary doubles every year
  • Validate input to make sure it is a number.
  • Write history to file called: salary_history.txt
  • In the format: [years_working]:$[salary]

All in all it’s pretty straightforward. I could currently write this program in a handful of different languages (Python, PHP, Java, JavaScript [Node], Ruby), but am struggling with one bit of the Clojure implementation.

(ns salary.core
  (:gen-class))
 
(defn get-integer
    "Returns the input string parsed as an integer, else false."
    [input]
    (try
        (Integer/parseInt input)
        (catch Exception e false)))
 
;; Incomplete.  Will eventually write to a file.
(defn output
    "Takes the console input and error message and outputs them to file and console."
    [console-input message]
    (println (str console-input ": " message)))
 
;;
;; ????? WTF DO I DO HERE
;;
(defn calculate-salary
    [years]
    ())
 
(defn -main
  [& args]
  (println "How many years do you want to work?")
  (let [user-input (read-line)]
    (let [years (get-integer user-input)]
        (if years
            (calculate-salary (- years 1) 1000)
            (output user-input "This is NOT an integer.")))))
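
The output function above is marked as incomplete. One way the file-writing requirement might be handled is sketched below, using spit with :append true. The names history-file, history-line, and record-salary! are assumptions made for this example.

```clojure
;; Hypothetical names; only the file name and line format come from the spec.
(def history-file "salary_history.txt")

(defn history-line
  "Formats one history entry as [years_working]:$[salary]."
  [years salary]
  (str years ":$" salary))

(defn record-salary!
  "Appends one entry to the history file and echoes it to the console."
  [years salary]
  (let [line (history-line years salary)]
    ;; :append true adds to the end of the file instead of overwriting it.
    (spit history-file (str line "\n") :append true)
    (println line)
    line))
```

Keeping the formatting in its own pure function means the line format can be checked without touching the disk.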

The Python implementation of calculate salary would look something like this:

def calculate_salary(years):
    salary = 1000
    for i in range(years-1):
        salary = salary * 2
    return salary

But in Clojure things are a bit more complicated. In Clojure values are immutable. I can’t just loop over the years and keep doubling the salary while storing it in the same variable. I need to use recursion. Or reduce. Or map. Hell, I don’t know. I need to use something functional, unless I want the Clojure experts to laugh at me. I need something that will call a function that doubles whatever value comes into it, then returns. Then I need to call said function up to N times (where N is the number of years that the person enters).

Any ideas?

EDIT
With the help of Ryan (below), I came up with:

(defn calculate-salary
    [years salary]
    (if (= years 0)
        salary
        (calculate-salary (- years 1) (* salary 2))))
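
For comparison, here are a few other ways the same doubling could be written. These are sketches with made-up function names, not necessarily what an experienced Clojure programmer would pick. The doublings argument plays the role of (- years 1) from -main, and the starting salary of 1000 is hard-coded in the last two.

```clojure
;; Same shape as the recursive version, but recur reuses the current
;; stack frame instead of growing the call stack.
(defn salary-recur
  [years salary]
  (if (zero? years)
    salary
    (recur (dec years) (* salary 2))))

;; iterate builds the lazy sequence 1000, 2000, 4000, ...
;; and nth picks the entry after the given number of doublings.
(defn salary-iterate
  [doublings]
  (nth (iterate #(* 2 %) 1000) doublings))

;; reduce folds over a throwaway range, doubling once per element.
(defn salary-reduce
  [doublings]
  (reduce (fn [salary _] (* salary 2)) 1000 (range doublings)))

(salary-recur 2 1000) ;; => 4000
(salary-iterate 2)    ;; => 4000
(salary-reduce 2)     ;; => 4000
```

All three use plain *, which throws on long overflow, so very large inputs would need something like *' to promote to arbitrary precision.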