building a currency exchange rates api

While working on umsatz I had to build a tiny currency exchange rates api in go.
While the original API works fine, it clocks in at more than 400 lines of code. Let’s write a shorter version!

In this post we’ll rebuild the api in three steps:

  1. download, parse & cache the EUR exchange rates history xml, provided by the ECB [1]
  2. add a tiny HTTP JSON API to request rates
  3. periodically update the cache with new data

This will leave us with a tiny HTTP currency exchange rates api.

Let’s start by downloading, parsing & caching the EUR exchange xml:

package main

import (
  "encoding/xml"
  "fmt"
  "io"
  "net/http"
)

// these structs reflect the eurofxref xml data structure
type envelop struct {
  Subject string `xml:"subject"`
  Sender  string `xml:"Sender>name"`
  Cubes   []cube `xml:"Cube>Cube"`
}
type cube struct {
  Date      string     `xml:"time,attr"`
  Exchanges []exchange `xml:"Cube"`
}
type exchange struct {
  Currency string  `xml:"currency,attr"`
  Rate     float32 `xml:"rate,attr"`
}

// EUR is not present because all exchange rates are a reference to the EUR
var desiredCurrencies = map[string]struct{}{
  "USD": struct{}{},
  "GBP": struct{}{},
}
var eurHistURL = "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml"
var exchangeRates = map[string][]exchange{}

func downloadExchangeRates() (io.Reader, error) {
  resp, err := http.Get(eurHistURL)
  if err != nil {
    return nil, err
  }

  if resp.StatusCode != http.StatusOK {
    return nil, fmt.Errorf("HTTP request returned %v", resp.Status)
  }

  return resp.Body, nil
}

func filterExchangeRates(c *cube) []exchange {
  var rates []exchange
  for _, ex := range c.Exchanges {
    if _, ok := desiredCurrencies[ex.Currency]; ok {
      rates = append(rates, ex)
    }
  }
  return rates
}

func updateExchangeRates(data io.Reader) error {
  var e envelop
  decoder := xml.NewDecoder(data)
  if err := decoder.Decode(&e); err != nil {
    return err
  }

  for _, c := range e.Cubes {
    if _, ok := exchangeRates[c.Date]; !ok {
      exchangeRates[c.Date] = filterExchangeRates(&c)
    }
  }

  return nil
}

func init() {
  if reader, err := downloadExchangeRates(); err != nil {
    fmt.Printf("Unable to download exchange rates. Is the URL correct?")
  } else {
    if err := updateExchangeRates(reader); err != nil {
      fmt.Printf("Failed to update exchange rates: %v", err)
    }
  }
}

func main() {
  fmt.Println("%v", exchangeRates)
}

There are a few things to note:

  • we’re using a map[string]struct{} to define which currencies we’re interested in.
    This adds a little more code since we have to filter the exchange rates, but also cuts down memory usage.
  • we cache all exchange rates in memory and never update them. Since we’re dealing with historic data only this shouldn’t be a problem.
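
To make the mapping concrete, here’s roughly what the feed looks like. The XML below is a hand-written, abbreviated sample in the shape of the eurofxref feed (the real feed carries namespaces and many more currencies; the USD and GBP rates are taken from the example responses further down, the JPY entry is made up to show the filtering). It decodes through the same updateExchangeRates function and additionally needs the "strings" import:

// abbreviated, hand-written sample of the eurofxref structure - not the real feed
const sampleXML = `
<Envelope>
  <subject>Reference rates</subject>
  <Sender><name>European Central Bank</name></Sender>
  <Cube>
    <Cube time="2014-12-12">
      <Cube currency="USD" rate="1.2450"/>
      <Cube currency="GBP" rate="0.7925"/>
      <Cube currency="JPY" rate="147.96"/>
    </Cube>
  </Cube>
</Envelope>`

func decodeSample() {
  // decode the sample through the same code path as the real feed;
  // JPY is dropped because it is not listed in desiredCurrencies
  if err := updateExchangeRates(strings.NewReader(sampleXML)); err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(exchangeRates["2014-12-12"]) // [{USD 1.245} {GBP 0.7925}]
}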

Next, we add a tiny HTTP wrapper:

// in addition to the imports above, the following snippets use "encoding/json", "log", "regexp", "time" and "runtime"

// accept strings like /1986-09-03 and /1986-09-03/USD
var routingRegexp = regexp.MustCompile(`/(\d{4}-\d{2}-\d{2})/?([A-Za-z]{3})?`)

func exchangeRatesByCurrency(rates []exchange) map[string]float32 {
  var mappedByCurrency = make(map[string]float32)
  for _, rate := range rates {
    mappedByCurrency[rate.Currency] = rate.Rate
  }
  return mappedByCurrency
}

func newCurrencyExchangeServer() http.Handler {
  r := http.NewServeMux()

  r.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
    if !routingRegexp.MatchString(req.URL.Path) {
      w.WriteHeader(http.StatusBadRequest)
      return
    }

    parts := routingRegexp.FindAllStringSubmatch(req.URL.Path, -1)[0]
    requestedDate := parts[1]
    requestedCurrency := parts[2]

    enc := json.NewEncoder(w)
    if _, ok := exchangeRates[requestedDate]; !ok {
      w.WriteHeader(http.StatusNotFound)
      return
    }

    var exs = exchangeRates[requestedDate]
    if requestedCurrency == "" {
      enc.Encode(exchangeRatesByCurrency(exs))
    } else {
      for _, rate := range exs {
        if rate.Currency == requestedCurrency {
          enc.Encode(rate)
          return
        }
      }

      w.WriteHeader(http.StatusNotFound)
    }
  })

  return r
}

func main() {
  log.Printf("listening on :8080")
  log.Fatal(http.ListenAndServe(":8080", newCurrencyExchangeServer()))
}

We can now run the API and it’ll work just fine:

$ curl http://127.0.0.1:8080/2014-12-12
{"GBP":0.7925,"USD":1.245}
$ curl http://127.0.0.1:8080/2014-12-12/USD
{"currency":"USD","rate":1.245}

Adding a periodic cache updater is quickly done:

func updateExchangeRatesCache() {
  if reader, err := downloadExchangeRates(); err != nil {
    fmt.Printf("Unable to download exchange rates. Is the URL correct?")
  } else {
    if err := updateExchangeRates(reader); err != nil {
      fmt.Printf("Failed to update exchange rates: %v", err)
    }
  }
}

func updateExchangeRatesPeriodically() {
  for {
    time.Sleep(1 * time.Hour)

    updateExchangeRatesCache()
  }
}

func init() {
  updateExchangeRatesCache()
}

func main() {
  go updateExchangeRatesPeriodically()

  log.Printf("listening on :8080")
  log.Fatal(http.ListenAndServe(":8080", newCurrencyExchangeServer()))
}

The API will populate the cache on startup and update it once per hour afterwards.
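
One detail the post glosses over: once the updater goroutine is running, exchangeRates is written by that goroutine while the HTTP handlers read it, which strictly speaking needs synchronization. A minimal sketch of guarding the map with a sync.RWMutex - ratesFor and setRates are hypothetical helpers, not part of the original code - could look like this:

// hypothetical guard around the shared cache; requires the "sync" import
var exchangeRatesMu sync.RWMutex

// ratesFor is what the HTTP handler would call instead of indexing the map directly
func ratesFor(date string) ([]exchange, bool) {
  exchangeRatesMu.RLock()
  defer exchangeRatesMu.RUnlock()
  rates, ok := exchangeRates[date]
  return rates, ok
}

// setRates is what updateExchangeRates would call instead of assigning to the map
func setRates(date string, rates []exchange) {
  exchangeRatesMu.Lock()
  defer exchangeRatesMu.Unlock()
  exchangeRates[date] = rates
}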

How much memory does this consume? Let’s check:

func updateExchangeRates(data io.Reader) error {
  var e envelop
  decoder := xml.NewDecoder(data)
  if err := decoder.Decode(&e); err != nil {
    return err
  }

  for _, c := range e.Cubes {
    if _, ok := exchangeRates[c.Date]; !ok {
      exchangeRates[c.Date] = filterExchangeRates(&c)
    }
  }

  runtime.GC()

  return nil
}

func printMemoryUsage() {
  var memStats runtime.MemStats
  runtime.ReadMemStats(&memStats)

  fmt.Printf("total memory usage: %2.2f MB\n", float32(memStats.Alloc)/1024./1024.)
}

func main() {
  go updateExchangeRatesPeriodically()

  printMemoryUsage()
  log.Printf("listening on :8080")
  log.Fatal(http.ListenAndServe(":8080", newCurrencyExchangeServer()))
}

Note the new call to runtime.GC() which forces a garbage collection. This matters for an accurate memory usage report; without it the numbers would vary from run to run depending on when the garbage collector last ran.

Turns out the memory footprint is acceptable, without any optimizations:

  • all data since 1999, all currencies: 5.137 MB (full-history feed, see the note below)
  • all data since 1999, only USD and GBP: 0.836 MB
  • last 90 days, all currencies: 0.211 MB
  • last 90 days, only USD and GBP: 0.137 MB
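
A note on the since-1999 numbers: the code above fetches the 90-day feed, so measuring the full history means pointing eurHistURL at the ECB’s full-history variant of the same feed (URL as I remember it; double-check that it still exists):

// full-history feed instead of the last-90-days one
var eurHistURL = "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist.xml"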

Let’s wrap it up:

In less than 200 lines of code we managed to create a fully functional currency exchange rates api. Compared to the original version we do not cache exchange rates to disk, in favor of keeping everything in memory. This reduces the total lines of code considerably and also removes the need for a separate importer binary.

The API is not perfect, however:

  • the data source does not contain data for weekends or holidays. For anything production-ready we’d want a fallback which serves the most recent earlier rates instead of just returning a 404 (see the sketch below).
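
A hypothetical fallback could simply walk backwards from the requested date until it finds a day with cached rates. The sketch below is one way to do that - ratesWithFallback is not part of the original code, and the handler would additionally need to decide how to signal the substituted date:

// hypothetical fallback: walk back day by day, up to maxDaysBack, until rates are found
func ratesWithFallback(date string, maxDaysBack int) ([]exchange, string, bool) {
  t, err := time.Parse("2006-01-02", date)
  if err != nil {
    return nil, "", false
  }
  for i := 0; i <= maxDaysBack; i++ {
    day := t.AddDate(0, 0, -i).Format("2006-01-02")
    if rates, ok := exchangeRates[day]; ok {
      return rates, day, true
    }
  }
  return nil, "", false
}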

However, I’ll leave it for now. You can find the entire source in this gist.

[1]: If anyone knows a higher-precision, open data source for historic currency exchange rates I’d love to know. Leave a comment.


present a project: umsatz

In the last post regarding open source side projects I presented traq, a CLI time tracking application I use for my everyday work.

Today I decided to present and walk you through the setup of umsatz, my open source accounting application.
But let’s first introduce umsatz:

umsatz was written to ensure that my bookkeeping information is kept safe - that is, only accessible locally, not from the internet.

It’s not that my information is particularly sensitive. It’s just that I like having control over my data and I do not trust third parties like Google or Apple to keep it safe.

I use umsatz to track all my freelance-related income and expenses, organize them by account, and get a basic overview of what’s due.
Some more details about umsatz are available at umsatz.deployed.eu.

Now, let’s set umsatz up.

Assuming you want to run umsatz on a Raspberry Pi and you’ve got all rpi accessories at hand, all you need is an empty USB stick as secondary backup storage.


The Raspberry Pi needs to run Raspbian - a modified version of Debian wheezy. You can download a copy at raspberrypi.org/downloads/. If your rpi isn’t up and running yet, take a look at the Raspberry Pi documentation.

To set up umsatz you need Ansible, which takes care of the entire rpi configuration, including provisioning umsatz.

It boils down to the following bash script:

# install ansible using Homebrew
brew install ansible
# clone the umsatz provisioning locally
git clone https://github.com/umsatz/provisioning.git
cd provisioning
# make sure to replace ip.of.rasp.berry
echo "ip.of.rasp.berry" > raspberry
# execute the release installation
ansible-playbook -i raspberry -u pi release-install.yml

After installing Ansible, make sure to write the correct IP into the raspberry file, otherwise the provisioning won’t work.

Also note that at the beginning of the provisioning you will be prompted for the password of the pi user on your rpi.

The entire process might take about an hour. After Ansible finishes, your rpi will run PostgreSQL, nginx and multiple other API services written in Go.

Pointing your browser at the pi’s IP will display the umsatz frontend, and you can start using it.


To finish up, here’s the current feature list as of December 2014:

  • support for double-entry bookkeeping
  • support for multiple currencies (right now USD, GBP, EUR)
  • support for multiple languages (only DE translated)
  • support for arbitrary accounts
  • support for arbitrary fiscal periods (year, quarter, month)
  • support for quick search and filter
  • automatic backups to multiple attached devices

That’s about it.
You can check it out at github.com/umsatz.

I’ll explore the development setup for umsatz in a later post.


cleaning multiple databases with DatabaseCleaner

TL;DR: the DatabaseCleaner gem can easily be used to clean multiple databases at once.

I work on and maintain some Rails projects which connect to multiple databases, some legacy (read: not managed by ActiveRecord), some not.

In order to write proper integration tests you might need to load test data into multiple databases, which in turn means you need clean databases for every test run.

The following DatabaseCleaner snippet (taken from a project which uses RSpec) sets up cleaning runs for multiple databases:

# spec/support/database_cleaner.rb
database_cleaners = []

RSpec.configure do |config|
  config.before(:suite) do
    database_cleaners << DatabaseCleaner::Base.new(:active_record, { :connection => :db1 })
    database_cleaners << DatabaseCleaner::Base.new(:active_record, { :connection => :db2 })
    database_cleaners.each { |cleaner| cleaner.strategy = :truncation }
  end

  config.before(:each, :js => true) do
    DatabaseCleaner.strategy = :truncation
  end

  config.before(:each) do
    database_cleaners.each(&:start)
  end

  config.after(:each) do
    database_cleaners.each(&:clean)
  end
end

# spec_helper.rb:
Dir[Rails.root.join("spec", "support", "**", "*.rb")].each { |f| require f }

A similar setup can be used with Minitest as well.


present a project: traq

In the last post regarding open source side projects I presented revisioneer, an API to track application deployments.

Today I’ll be showcasing traq, a command line application which tracks your times. I’ve also blogged about it before.

The initial proof of concept was entirely written in bash with a focus on simplicity and understandability. This meant that traq uses a simple folder structure and plain text files to store the tracked times.

As a programmer I’m always working in the terminal, and being forced to use my mouse and browse a GUI annoys the hell out of me. It just takes far too much time. So my goal was to have a tool which could manage all my times, both personal and work related. It should be easy to use and offer basic evaluation features.

Here’s an example usage, taken from my average work day:

[08:10] $ traq -p mindmatters work    # some time later…
[10:00] $ traq -p mindmatters meeting # …off to something completely different…
[10:17] $ traq -p mindmatters work    # done /w the meeting. back to work
[12:00] $ traq -p mindmatters stop    # short break
[12:32] $ traq -p mindmatters work
[15:11] $ traq -p mindmatters stop    # playing table football
[15:23] $ traq -p mindmatters work
[16:59] $ traq -p mindmatters stop    # …that's it. done for today

Under the hood, this generates a text file like this:

Fri Aug 15 08:10:39 +0200 2014;#work;
Fri Aug 15 10:00:41 +0200 2014;#meeting;
Fri Aug 15 10:17:42 +0200 2014;#work;
Fri Aug 15 12:00:45 +0200 2014;stop;
Fri Aug 15 12:32:15 +0200 2014;#work;
Fri Aug 15 15:11:00 +0200 2014;stop;
Fri Aug 15 15:23:11 +0200 2014;#work;
Fri Aug 15 16:59:51 +0200 2014;stop;

Notice the -p flag which allows me to switch to different projects.

At the end of the day I need to enter the times into Harvest because we use it at work. traq is able to sum everything up properly and display a short summary:

$ traq -p mindmatters -e
2014-08-15
#work:7.8083
#meeting:0.2836
%%
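
The evaluation boils down to parsing those semicolon-separated lines and attributing the time between consecutive entries to the tag of the earlier entry, with stop ending a block. A rough Go sketch of the idea - not traq’s actual code, and the file name and output format are simplified:

package main

import (
  "bufio"
  "fmt"
  "os"
  "strings"
  "time"
)

// layout matching timestamps like "Fri Aug 15 08:10:39 +0200 2014"
const traqLayout = "Mon Jan 2 15:04:05 -0700 2006"

func main() {
  file, err := os.Open("2014-08-15") // hypothetical path to a traq day file
  if err != nil {
    fmt.Fprintln(os.Stderr, err)
    os.Exit(1)
  }
  defer file.Close()

  totals := map[string]time.Duration{}
  var lastTag string
  var lastTime time.Time

  scanner := bufio.NewScanner(file)
  for scanner.Scan() {
    parts := strings.SplitN(scanner.Text(), ";", 3)
    if len(parts) < 2 {
      continue
    }
    ts, err := time.Parse(traqLayout, parts[0])
    if err != nil {
      continue
    }
    // attribute the time since the previous entry to that entry's tag
    if lastTag != "" && lastTag != "stop" {
      totals[lastTag] += ts.Sub(lastTime)
    }
    lastTag, lastTime = parts[1], ts
  }

  for tag, total := range totals {
    fmt.Printf("%s:%.4f\n", tag, total.Hours())
  }
}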

Since time tracking is a sensitive topic, traq had tests from the very beginning. Initially I was using bats for this.

Using bash was a good start, but it had some limitations:

  • evaluation quickly became slow, even for small datasets (e.g. a week’s worth of data)
  • portability was painful. I wanted traq to work on Linux and Darwin, but traq used date internally to convert & generate timestamps, and the parameters are completely different on both OSes.

After about a year I decided to rewrite it using Go - which turned out to be a really good decision.

The limitations went away. I was able to add more test cases while at the same time making the code base more concise and easier to understand.

Having used traq for all my time tracking for nearly 2 years now, the simple, folder-based structure as a data store has also proved to be a good decision.

There are only two minor things I’d like to change, and the data storage allows for easy workarounds for both of them:

  • make the data storage pluggable, to allow sharing of times between devices.
    I sometimes use my personal notebook at work, and switching between notebooks always forces me to keep the folder in sync. There are workaround options available for this (e.g. Dropbox).

  • add support for a plugin structure to handle timing.
    This would allow me to easily push data to external services like Harvest automatically. It’s a convenience feature which can also be worked around with a combination of atd and some scripting.

I’ve started a code spike regarding both ideas, but since traq works just fine it might take some time to finish them.

That’s about it. Next up: umsatz - the financial accounting app you can host yourself on a Raspberry Pi.