A year with ZeroMQ + Ruby

It’s been a year since my first commit to rbczmq, a Ruby bindings gem for ZeroMQ that wraps the higher-level CZMQ library.

I don’t recall exactly how I first heard about ZeroMQ, but needless to say, I got interested and started reading up on it. At the time, I discovered there were three gems for using ZeroMQ in ruby:

  • zmq – the original official zmq library. This is old. It tracks ZeroMQ version 2 (currently version 4) and has not been updated for nearly 3 years.
  • ffi-rzmq – a binding using FFI, which is compatible with Ruby, Rubinius and JRuby.
  • rbczmq – a native extension (no JRuby support) binding using the czmq library.

I didn’t spend a lot of time with the zmq gem since it was so old. The ffi-rzmq gem worked well, but its interface didn’t feel very Ruby-like. For example, to receive a message you pass a buffer in as a parameter (C style) and the return value is an error code. This is quite un-Ruby-like: I would expect receive to return the received data, and to raise an exception on error, in keeping with Ruby’s built-in socket and file I/O calls.
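To illustrate, here’s roughly what receiving looks like in each gem. This is a sketch from memory, so treat the ffi-rzmq method names as approximate:

# ffi-rzmq: C style, error code return plus an output parameter
rc = socket.recv_string(message = '')
raise "recv failed" unless ZMQ::Util.resultcode_ok?(rc)
puts message

# rbczmq: Ruby style, returns the data (and raises on error)
puts socket.recv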

So I started to explore rbczmq. Initially, I wasn’t so interested in the CZMQ wrapping part, I just wanted something that was more ruby-like to use. And it was. And it was faster. And the CZMQ part actually helps too.

In raw ZeroMQ, each message part, or “frame”, is sent and received as its own message: to read a multi-frame message you check the “more” flag and read the next part. CZMQ wraps this up as a single message holding a number of “frames”, and rbczmq neatly exposes these as the Ruby classes ZMQ::Message and ZMQ::Frame. You can still send and receive raw frames (as strings), but the classes are a nice wrapper.
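A quick sketch of what multi-frame messages look like in rbczmq. I’m writing this from memory of the API, so the exact method names may differ slightly:

require 'rbczmq'

# assumes `socket` is any connected ZMQ::Socket
msg = ZMQ::Message.new
msg.add ZMQ::Frame.new("header")   # append frames to the message
msg.add ZMQ::Frame.new("body")
socket.send_message(msg)           # frames travel together as one message

reply = socket.recv_message        # => ZMQ::Message
reply.size                         # => number of frames
frame = reply.pop                  # => ZMQ::Frame
frame.data                         # => frame contents as a String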

And to boot, it turns out that it was way faster than the ffi gem. I seem to have lost track of the comparison I did, but I recall it was convincing.

What’s changed?

During this year, rbczmq has received a number of updates and new features, major ones including:

  • Upgrade to ZeroMQ 4
  • Upgrade to CZMQ 2
  • Support for SmartOS platform
  • Fixes to memory management

Major things still to do:

  • Add support for new authentication interface.
  • Ship binary gems (like libv8) to save compilation time on deploy / install.

Hard bits

The hardest bit of work I contributed to this project was fixing bugs in the memory management. In particular, CZMQ has specific rules about ownership of memory. Ruby is a garbage collected environment, which also has its own set of rules about ownership of memory. The two do not match.

Most calls to ZeroMQ are done outside of the Ruby “GVL” (global lock) which allows the ruby VM to continue processing ruby code in other threads while one is doing a synchronous/blocking read on a socket, for example. When you combine this with Ruby threads, things can get hairy. The solution was two-fold:

  1. Use an ownership flag. When ownership was known to have been transferred to ZeroMQ, mark the Ruby object as no longer owned by Ruby. This meant that the Ruby garbage collection callbacks would know whether they were ultimately responsible for freeing the memory used by an object. There was also some tricky interplay between contexts and sockets: a socket is owned by its context, and destroying a context also destroys its sockets, so a socket is only owned by Ruby if it has not been closed and its context has not been destroyed.
  2. A socket-closing mutex: socket closing and context closing are asynchronous. If a socket is still open when a context is destroyed, all sockets belonging to that context will be closed. This happens outside the Ruby GVL, which means a race condition exists where the Ruby garbage collector may collect the socket while it is still closing. ZeroMQ’s socket close is not threadsafe, so a mutex was the only solution to make this safe.

Using a mutex for socket close may result in a performance hit for an application which opens and closes sockets rapidly, but from what I understand, that is a bad thing to do anyway.

Looking forward

I have a few projects in the wild now using the rbczmq gem, and am very happy with its stability and performance. I haven’t used all of the APIs in anger (such as loops or beacons), but I’m sure the time will come. I look forward to another year of contributions to this project to keep it up to date with what’s happening in the ZeroMQ and CZMQ projects.

I’d love to hear from other people using this gem, so give me a shout!

Ruby Tuples (and file scanning)

I enjoyed Andrew Pontious’s recent episode of Edge Cases podcast talking about tuples. I’m doing a lot of Ruby these days, so I thought I’d add my two cents worth about using tuples in Ruby.

It’s true that there is no separate tuple class, but Ruby arrays can do everything that tuples in Python can do.

To assign two variables, you can do:

a, b = 1, 2

Which is equivalent to:

a, b = [1, 2]

Which is equivalent to:

a = 1
b = 2

Elements not present are treated as nil, so a, b = 1 assigns the value 1 into a and nil into b.

Functions can return arrays like so:

def f(x)
  [1, 2]
end

def g(x)
  return 1, 2
end  
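Either way, the caller destructures the result identically:

a, b = f(0)   # a == 1, b == 2
c, d = g(0)   # c == 1, d == 2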

The Ruby way to iterate a list of items is with the each method that takes a block:

[1,2,3].each { |x| puts x }

This calls the block 3 times, with x taking the values 1, 2 and 3 from the list. If the items are themselves arrays, then the items in those sub-arrays can be expanded out into the block variables, like so:

[[1,2], [3,4], [5,6]].each { |a, b| puts "a = #{a}, b = #{b}" }
# outputs:
# a = 1, b = 2
# a = 3, b = 4
# a = 5, b = 6

Hashes can also be enumerated this way, where each key value pair is represented as an array with 2 items:

{a: 1, b: 2, c: 3}.each { |key, value| puts "#{key} => #{value}"}
# outputs:
# a => 1
# b => 2
# c => 3

Python’s list comprehensions are really great. In Python you might write the following to select only the items of a list that satisfy some condition g(x), and collect f(x) for those items:

results = [f(x) for x in source_list if g(x)]

Ruby achieves the same with select and map methods, which can be composed in either order according to your needs. The Ruby equivalent would be:

results = source_list.select { |x| g(x) }.map { |x| f(x) }

Python’s list comprehension can only do these two things, in that order. By making the select step and the map steps separate in Ruby, they can be composed in any order. To reverse the map and select order in Ruby:

results = source_list.map { |x| f(x) }.select { |x| g(x) }

This is not so easy in python:

results = [y for y in [f(x) for x in source_list] if g(y)]

Ruby also contains many more useful operations that can be done on any enumerable sequence (for example readlines from a file), just take a look at the Enumerable module docs: http://www.ruby-doc.org/core-2.1.0/Enumerable.html

I’ve wandered a bit off the tuple track, so I’ll finish with yet another tangent related to the podcast episode: deep-searching a file hierarchy for files matching an extension. Try this for conciseness:

Dir.glob("**/*.json")

This returns an array of all the .json files anywhere under the current directory. Ruby is full of little treasures like this.

I used to do quite a bit of scripting in Python until I learnt Ruby. I’ve never looked back.

Happy Holidays

This is a quick happy holidays and thank you to all the people and companies that have done great things in 2013. In no particular order:

Podcasters:

I’ve enjoyed many a podcast episode this year. My favourites are Edge Cases, featuring Wolf Rentzsch and Andrew Pontious; Accidental Tech Podcast, featuring John Siracusa, Casey Liss and Marco Arment; and RailsCasts by Ryan Bates.
Thank you all for your hard work putting your respective shows together. Your efforts are greatly appreciated, and I hope you are getting enough out of it so that it’s worthwhile continuing in 2014!

Companies:

JetBrains, makers of RubyMine. These guys pump out great work. If you’re keen to get involved in the early access program, you can get nightly or weekly builds. Twice this year I’ve submitted a bug and within a week had it verified by JetBrains, fixed, in a build and in my hands. Their CI system even updates the bug with the build number that includes the fix. Seriously impressive. They set the bar so high, I challenge any company (including myself) to match their effective communication and rapid turnaround on issues.

Joyent for actually innovating in the cloud, and your contributions to open source projects such as NodeJS and SmartOS! Pretty impressive community engagement, not only in open source code, but community events too… What a shame I don’t live in San Francisco to attend and thank you guys in person.

Github for helping open source software and providing an awesome platform for collaboration. So many projects benefit from being on Github.

Apple, thanks for making great computers and devices. Well done on 64 bit ARM. The technology improvements in iOS 7 are great, however, my new iPhone 5S doesn’t feel a single bit faster than my previous iPhone 5 due to excessive use of ease out animations which have no place in a User Interface. Too many of my bug reports have been closed as “works as intended”, when the problem is in the design not the implementation. Oh well.

Products / Services:

Strava has helped me improve in my cycling and fitness. The website and iPhone apps are shining examples of a great user experience: works well, easy to use, functional and good looking. Thanks for a great product.

Reveal App is a great way to break down the UI of an iOS app. Awesome stuff.

Twitter has been good, mostly because of how people use it. I suppose it’s more thanks to the people on Twitter who I follow.

Black Star Coffee, it’s how I start my day! Great coffee.

Technologies:

ZeroMQ: This is awesome. Reading the ZeroMQ guide was simply fantastic. This has changed my approach to communications in computing. Say goodbye to mutexes and locks and hello to messages and event driven applications. Special thanks to Pieter Hintjens for his attention to the ZeroMQ mailing lists, and to all of the contributors to a great project.

SmartOS: Totally the best way to run a hypervisor stack. The web page says it all: ZFS + DTrace + Zones + KVM. Get into it. Use ZFS. You need a file system that can verify your data. Hard drives cannot be trusted. I repeat, use ZFS.

Using ZFS Snapshots on Time Machine backups

I use Time Machine because it’s an awesome backup program. However, I don’t really trust hard drives that much, and I happen to be a bit of a file system geek, so I back up my laptop and an iMac to another machine that stores the data on ZFS.

I first did this using Netatalk on OpenSolaris, then OpenIndiana, and now on SmartOS. Netatalk is an open source project for running AFP (Apple Filing Protocol) services on unix operating systems. It has great support for the new features in the protocol required for Time Machine. As far as I’m aware, all embedded NAS devices use this software.

Sometimes, Time Machine “eats itself”. A backup will fail with a message like “Verification failed”, and you’ll need to make a new one. I’ve never managed to recover the disk from this point using Disk Utility.

My setup is a RAIDZ of 3 x 2TB drives, giving a total of 4TB of storage space (and 2TB of redundancy). In the four years I’ve been running this, I have had 3 drives go bad and replaced them. They’re cheap drives, but I’ve never lost data to a bad disk needing replacement. I’ve also seen silent data corruptions, and know that ZFS has corrected them for me.

Starting a new backup is a pain, so what do I do?

ZFS Snapshots

I have a script, which looks like this:

#!/bin/bash
# Snapshot the ZFS file system that holds this Mac's Time Machine backup.
# An optional argument is appended to the snapshot name as a suffix.
ZFS=zones/MacBackup/MattBookPro
SERVER=vault.local
if [ -n "$1" ]; then
  SUFFIX="_$1"
fi
SNAPSHOT=$(date "+%Y%m%d_%H%M")$SUFFIX
echo "Creating zfs snapshot: $SNAPSHOT"
ssh -x "$SERVER" zfs snapshot "$ZFS@$SNAPSHOT"

This uses the zfs snapshot command to create a snapshot of the backup. There’s another one for my iMac backup. I run this script manually for the ZFS file system (directory) for each backup. I’m working on an automatic solution that listens to system logs to know when the backup has completed and the volume is unmounted, but it’s not finished yet (like many things). Running the script takes about a second.
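Assuming the script is saved as snapshot.sh, running it with an optional label appends that label to the timestamped snapshot name (the timestamp here is just an example):

$ ./snapshot.sh pre_upgrade
Creating zfs snapshot: 20131213_0930_pre_upgrade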

Purging snapshots

My current list of snapshots looks like this:

matt@vault:~$ zfs list -r -t all zones/MacBackup/MattBookPro
NAME                                      USED  AVAIL  REFER  MOUNTPOINT
zones/MacBackup/MattBookPro               574G   435G   349G  /MacBackup/MattBookPro
...snip...
zones/MacBackup/MattBookPro@20131124_1344 627M      -   351G  -
zones/MacBackup/MattBookPro@20131205_0813 251M      -   349G  -
zones/MacBackup/MattBookPro@20131212_0643 0         -   349G  -

The USED value at the top shows the space used by this file system and all of its snapshots. The USED column for each snapshot shows how much space is used by that snapshot on its own.

Purging old snapshots is a manual process for now. One day I’ll get around to keeping snapshots on a rule like Time Machine’s hourly, daily and weekly rules.
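For the record, purging a snapshot is a single command on the server (destructive, so double-check the snapshot name):

matt@vault:~$ zfs destroy zones/MacBackup/MattBookPro@20131124_1344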

Rolling back

So when Time Machine goes bad, it’s as simple as rolling back to the latest snapshot, which was a known good state.

My steps are:

  1. shut down netatalk service
  2. zfs rollback
  3. delete netatalk inode database files
  4. restart netatalk service
  5. rescan the directory to recreate inode numbers (using netatalk’s “dbd -r” command)

This process is a little more involved, but still much faster than making a whole new backup.
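Roughly, those steps look like this on the server. Consider it a sketch: the SMF service name, mount point and snapshot name are examples from my setup, and the CNID database location depends on how netatalk is configured:

svcadm disable netatalk                                 # 1. stop the AFP service
zfs rollback zones/MacBackup/MattBookPro@20131212_0643  # 2. roll back to the latest snapshot
rm -rf /MacBackup/MattBookPro/.AppleDB                  # 3. delete netatalk's inode (CNID) database
svcadm enable netatalk                                  # 4. restart the AFP service
dbd -r /MacBackup/MattBookPro                           # 5. rescan and recreate inode numbers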

The main reason for this is that HFS uses an “inode” number to uniquely identify each file on a volume. This is one trick that Mac Aliases use to track a file even if it changes name or moves to another directory. This concept doesn’t map directly onto other file systems, so Netatalk has to maintain a database of which numbers to use for which files. There are some rules: inode numbers can’t be reused, and they must not change for a given file.

Unfortunately, ZFS rollback, like any other operation on the server that changes files without netatalk knowing, ends up with files that have no inode number. The bigger problem seems to be deleting files and leaving their inodes in that database. This tends to make Time Machine quite unhappy about using that network share. So after a rollback, I have a rule that I nuke netatalk’s database and recreate it.

This violates the rule that inode numbers shouldn’t change (unless they magically come out the same, which I highly doubt), but this hasn’t seemed to cause a problem for me. Imagine plugging a new computer into a time machine volume, it has no knowledge of what the inode numbers were, so it just starts using them as is. It’s more likely to be an issue for Netatalk scanning a directory and seeing inodes for files that are no longer there.

Recreating the netatalk inode database can take an hour or two, but it’s local to the server and much faster than a complete network backup, which also loses your history.

Conclusion

This used to happen a lot. Say once every 3-4 months when I first started doing it. This may have been due to bugs in Time Machine, bugs in Netatalk or incompatibilities between them. It certainly wasn’t due to data corruptions.

Pros:

  • Time Machine, yay!
  • ZFS durability and integrity.
  • ZFS snapshots allow point in time recovery of my backup volume.
  • ZFS on disk compression to save backup space!
  • Netatalk uses the standard AFP protocol, so the Time Machine volume can be accessed from your restore partition or a new Mac – no extra software required on the Mac!

Cons:

  • Effort – complexity to manage, install & configure netatalk, etc.
  • Rollback time.
  • Network backups are slow.

As time has gone on, both Time Machine and Netatalk have improved substantially. I’ve also added an SSD cache to the server, and it is swimmingly fast and reliable; thanks to ZFS, it’s also durable and free of corruptions. I think I’ve had this happen only twice in the last year, and both times were on Mountain Lion. I haven’t had to do a single rollback since starting on the Mavericks betas around June.

Where to from here?

I’d still like to see a faster solution, and I have a plan: a network block device.

This would, however, require some software to be installed on the Mac, so it may not be as easy to use in a disaster recovery scenario.

ZFS has a feature called a “volume”. When you create one, it appears to the system (that’s running zfs) as another block device, just like a physical hard disk, or file. A file system can be created on this volume which can then be mounted locally. I use this for the disks in virtual machines, and can snapshot them and rollback just as if they were a file system tree of files.
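Creating one is a one-liner on the server (the name and size here are just examples):

zfs create -V 300G zones/MacBackup/MattBookVolume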

There’s an existing server module that’s been around for a while: http://nbd.sourceforge.net

If this volume could be mounted across the network on a Mac, it could be formatted as HFS+ and Time Machine could back up to it in local disk mode, skipping all the slow sparse bundle file system work. And there’s a lot of that work: my Time Machine backup of a Mac with a 256GB disk creates a whopping 57206 files in the bands directory of the sparse bundle. It’s a lot of work to traverse these files, even locally on the server.

This is my next best solution to actually using ZFS on the Mac. Whatever “reasons” Apple has for ditching it are not good enough, simply because we don’t know what they are. ZFS is a complex beast. Apple is good at simplifying things. It could be the perfect solution.

Time Machine Backups and silent data corruptions

I’ve recently heard many folk talking about Time Machine backup strategies. To do it well, you really do need to backup your backup, as Time Machine can “eat itself”, especially doing network backups.

Regardless of whether your Time Machine backup is to a locally attached disk or a network drive, when you make a backup of your backup, you want to make sure it’s valid, otherwise you’re propagating a corrupt backup.

So how do you know if your backup is corrupt? You could read it from beginning to end. But this would only protect you from data corruptions that can be detected by the drive itself. Disk verify, fsck, and others go further and validate the file system structures, but still not your actual data.

There are “silent corruptions”, which is where the data you wrote to the disk comes back corrupted (different data, not a read error). “That never happens”, you might say, but how would you know?

I have two servers running SmartOS using data stored on ZFS. I ran a data scrub on them, and both reported checksum errors. This is exactly the silent data corruption scenario.

ZFS features full checksumming of all data when stored, and if your data is in a RAIDZ or mirror configuration, it will also self-heal. This means that instead of returning an error, ZFS will go fetch the data from a good drive and also make another clean copy of that block so that its durability matches your setup.
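Kicking off a scrub and checking the result is simple (the pool is named 'zones' on SmartOS):

zpool scrub zones
zpool status -v zones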

Here’s the specifics of my corruptions:

On a XEON system with ECC RAM, the affected drive is a Seagate 1TB Barracuda 7200rpm, ST31000524AS, approximately 1 year old.

  pool: zones
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 72.4M in 0h48m with 0 errors on Mon Nov 18 13:28:16 2013
config:

        NAME          STATE     READ WRITE CKSUM
        zones         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0
            c1t0d0s0  ONLINE   2.61K  366k   635
            c1t4d0s1  ONLINE       0     0     0
        logs
          c1t2d0s0    ONLINE       0     0     0
        cache
          c1t2d0s1    ONLINE       0     0     0

errors: No known data errors

On a Celeron system with non-ECC RAM, the affected drive is a Samsung 2TB low power drive, approximately 2 years old.

  pool: zones
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 8K in 12h51m with 0 errors on Thu Nov 21 00:44:25 2013
config:

        NAME          STATE     READ WRITE CKSUM
        zones         ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c0t3d0    ONLINE       0     0     0
            c0t2d0p2  ONLINE       0     0     2
        logs
          c0t0d0s0    ONLINE       0     0     0
        cache
          c0t0d0s1    ONLINE       0     0     0

errors: No known data errors

Any errors are scary, but the checksum errors even more so.

I had previously seen thousands of checksum errors on a Western Digital Green drive. I stopped using it and threw it in the bin.

I have other drives that are HFS formatted. I have no way of knowing if they have any corrupted blocks.

So unless your data is being checksummed, you are not protected from data corruption, and making a backup of a backup could easily be propagating data corruptions.

I dream of a day when we can have ZFS natively on Mac. And if it can’t be done for whatever ‘reasons’, at least give us the features from ZFS that we can use to protect our data.

Network latency in SmartOS virtual machines

Today I decided to explore network latency in SmartOS virtual machines. Using the rbczmq ruby gem for ZeroMQ, I made two very simple scripts: a server that replies to each request, and a benchmark script that times how long it takes to send and receive 5000 messages after establishing the connection.

The server code is:

require 'rbczmq'
ctx = ZMQ::Context.new
sock = ctx.socket ZMQ::REP
sock.bind("tcp://0.0.0.0:5555")
loop do
  sock.recv
  sock.send "reply"
end

The benchmark code is:

require 'rbczmq'
require 'benchmark'

ctx = ZMQ::Context.new
sock = ctx.socket ZMQ::REQ
sock.connect(ARGV[0])

# establish the connection
sock.send "hello"
sock.recv

# run 5000 cycles of send request, receive reply.
puts Benchmark.measure {
  5000.times {
    sock.send "hello"
    sock.recv
  }
}

The test machines are:

  • Mac laptop – server & benchmarking
  • SmartOS1 (SmartOS virtual machine/zone) – server & benchmarking
  • SmartOS2 (SmartOS virtual machine/zone) – benchmarking
  • Linux1 (Ubuntu Linux in KVM virtual machine) – server & benchmarking
  • Linux2 (Ubuntu Linux in KVM virtual machine) – benchmarking

The results are:

Source      Dest        Connection      Time (sec)    Req-Rep/Sec
------      ----        ----------      ----------    -----------
Mac         Linux1      1Gig Ethernet   5.038577      992.3
Mac         SmartOS1    1Gig Ethernet   4.972102      1005.6
Linux2      Linux1      Virtual         1.696516      2947.2
SmartOS2    Linux1      Virtual         1.153557      4334.4
Linux2      SmartOS1    Virtual         0.952066      5251.8
Linux1      Linux1      localhost       0.836955      5974.0
Mac         Mac         localhost       0.781815      6395.4
SmartOS2    SmartOS1    Virtual         0.470290      10631.7
SmartOS1    SmartOS1    localhost       0.374373      13355.7

localhost tests use 127.0.0.1

SmartOS has an impressive network stack. Request-reply times from one SmartOS machine to another are over 3 times faster than when using Linux under KVM (on the same host). This mightn’t make much of a difference to web requests coming from slow mobile device connections, but if your web server is making many requests to internal services (database, cache, etc) this could make a noticeable difference.

Installing Jenkins-CI in SmartOS

I found a very helpful gist with a script and SMF service manifest for installing Jenkins CI on SmartOS. A few tweaks later, I have my own set up. Here’s my gist:

https://gist.github.com/mattconnolly/6264850

Now I have rvm, ruby 2.0 and the rbczmq gem all building and running tests, and emailing me if any fail. I’m polling git hourly for changes from the projects’ GitHub repositories.

Jenkins has a plethora of plugins available and integrates nicely with git. The only thing I found non-obvious was that build scripts run from Jenkins don’t inherit a normal shell environment, so you may need to set up environment variables such as TERM, PATH and a CA bundle for curl.
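For example, I set these at the top of each build script. The paths are what pkgsrc uses on SmartOS; the CA bundle variable and its location are examples you’ll need to adapt:

#!/opt/local/bin/bash
export TERM=xterm
export PATH=/opt/local/bin:/opt/local/sbin:/usr/bin:/usr/sbin
export CURL_CA_BUNDLE=/opt/local/share/mozilla-rootcerts/cacert.pem  # example path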

Here’s my build script for rvm, which installs it locally and verifies it installed.

#!/opt/local/bin/bash
#
# setup:
export HOME=/home/admin
export rvm_path="$HOME/.rvm"

echo "Destroying any current installation of RVM in $rvm_path ..."
rm -rf "$rvm_path"

echo "running install script:"
bash ./install

if test -x "$rvm_path/scripts/rvm"
then
  source "$rvm_path/scripts/rvm"
else
  echo "Failed to install rvm"
  exit 1
fi

rvm version

One nice thing I set up is dependent builds. I have “install rvm” => “install ruby 2.0” => “build & test rbczmq” as dependent builds, so if a dependency changes, I re-run the dependent projects to make sure there are no negative downstream side effects. It might be overkill, but if it picks up a change, I’ll be glad to know about it.

Next, I’d like a nice “Build passes tests in SmartOS” badge that I could stick on the repositories’ readme pages, just like travis-ci has.

Even better would be for Travis to run builds in SmartOS directly given their awesome integration with github!

ZeroMQ logging for ruby apps

I’ve been thinking for a while about using ZeroMQ for logging. This is especially useful with trends towards micro-services and scaling apps to multiple cloud server instances.

So I put thoughts into action and added a logger class to the rbczmq gem that logs to a ZeroMQ socket from an object that looks just like a normal ruby logger: https://github.com/mattconnolly/rbczmq/blob/master/lib/zmq/logger.rb

There’s not much to it, because, well, there’s not much to it. Here’s a simple app that writes log messages:

Log Writer:

require 'rbczmq'
require_relative './logger'
require 'benchmark'

ctx = ZMQ::Context.new
socket = ctx.socket(ZMQ::PUSH)
socket.connect('tcp://localhost:7777')
logger = ZMQ::Logger.new(socket)

puts Benchmark.measure {
  10000.times do |x|
    logger.debug "Hello world, #{x}"
  end
}

With benchmark results such as:

  0.400000   0.220000   0.620000 (  0.418493)

Log Reader:

And reading is even easier:

require 'rbczmq'

ctx = ZMQ::Context.new
socket = ctx.socket(ZMQ::PULL)
socket.bind('tcp://*:7777')
loop do
  msg = socket.recv
  puts msg
end

Voila. Multiple apps can connect to the same log reader. Log messages will be “fair queued” between the sources. In a test run on my 2010 MacBook Pro, I can send about 13000 log messages a second. I needed to run three of the log writers above in parallel before I maxed out the 4 cores and it slowed down. Each process used about 12 MB RAM. Lightweight and fast.

Log Broadcasting:

If we then need to broadcast these log messages for multiple readers, we could easily do this:

require 'rbczmq'

ctx = ZMQ::Context.new
socket = ctx.socket(ZMQ::PULL)
socket.bind('tcp://*:7777')
publish = ctx.socket(ZMQ::PUB)
publish.bind('tcp://*:7778')
loop do
  msg = socket.recv
  publish.send(msg)
end

Then we have many log sources connected to many log readers. And the log readers can also subscribe to a filtered stream of messages, so one reader could do something special with error messages, for example.
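A filtered reader might look like the sketch below. It assumes rbczmq exposes the standard ZeroMQ subscribe option, and relies on PUB/SUB prefix matching (with the default ruby Logger line format, error lines begin with "E,"):

require 'rbczmq'

ctx = ZMQ::Context.new
sub = ctx.socket(ZMQ::SUB)
sub.connect('tcp://localhost:7778')
sub.subscribe("E")   # prefix filter: only receive messages starting with "E"
loop do
  puts "error seen: #{sub.recv}"
end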

Ease out, you’re making me wait!

I hate waiting for ease out animations to finish.

When I was a sound designer, it was one of the most annoying things to sync a sound effect on. When a motion ends in an ease out curve, the point at which the animation ends has a speed of zero. This can look really pleasing, but doesn’t actually happen that often in reality. Think about closing a door: when the door closes its speed suddenly changes from moving to still when it hits the door frame. This is where the sound happens. This is the event. This is the thing that happened. Boom.

When there is an ease-out, there is no boom. No event. No done. No “the animation has finished you can start interacting with the machine again.” You just look and go, “Oh, that looks nice.” and then think “Are you done? When can I do…?”

Knowing when an animation is complete is really important because most user interfaces don’t transition to their next state until the animation completes, or don’t respond to any user input at all while the animation is in play, or even worse, send your user input to the wrong place until after the animation completes. Therefore it is CRITICAL for the user to know when the animation has finished. I want the animation. I want my boom.

Now, let’s back this up with some math. Take a graph with the vertical axis being position and the horizontal axis being time. I’m drawing the line downwards so that when the curve hits the X axis, the animation has completed:

[Graph: position vs. time, showing the linear (purple), ease-out (red) and blended (blue) curves described below]

The purple line is a linear function: y = 1-x.

The red line is an ease-out curve. I’ve taken a quarter of a sine curve, where the line is steepest at the start (maximum speed) and flat at the end (zero speed). The formula I’ve used is (1 - sin(pi*x/2))^2. (Adjust the power for the severity of the ease-out.) I don’t know exactly what curves people use for their ease-outs, but this demonstrates the issue nicely.

Notice that as the speed slows down, there is a period of time where the animation is still playing, yet the actual movement is very slight. This might look very pleasing, but as I mentioned above, it misses the vital cue for when the animation is finished. In this example, the last 40% of the animation’s duration corresponds to only 5% of the movement. Combined with the fact that it ends with zero speed, there is no visual indication of when the animation has actually finished.

I realise that this is the intention of the ease-out curve. I argue that it’s inappropriate for user interfaces. At least to end with zero speed.

A simple blend between the linear and ease-out curves, shown by the blue line in the graph above, gives a rate change consistent with the feel of ease-out (start fast, finish slow), but the finish speed IS NOT ZERO. This means there is still a bump. Still a bang. Still a cue. A little boom. This lets the user know that the animation has finished and that they can get on with using their device.
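Here’s a quick Ruby sketch of the three curves; the 50/50 blend weight is an arbitrary choice for illustration:

include Math

linear   = ->(x) { 1.0 - x }
ease_out = ->(x) { (1 - sin(PI * x / 2))**2 }
blend    = ->(x) { 0.5 * linear.(x) + 0.5 * ease_out.(x) }  # non-zero end speed

[0.0, 0.25, 0.5, 0.6, 0.75, 1.0].each do |x|
  printf("x=%.2f  linear=%.3f  ease_out=%.3f  blend=%.3f\n",
         x, linear.(x), ease_out.(x), blend.(x))
end
# at x=0.6 the ease-out curve has already covered ~96% of its travel:
# the last 40% of the duration moves only ~4-5% of the distance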

I want to use my device. I’ve got things to do with it. Don’t make me wait for something I cannot see.

Building Ruby 2 in SmartOS

Ruby 2 has been out for a while, so let’s get it going in SmartOS!!

I’m working with the SmartMachine (base64 1.9.0).

Part 1: Requirements.

The SmartMachine has a whole bunch of useful packages already installed. Here are the additional packages needed to build Ruby 2.0.0 in SmartOS:

# pkgin install build-essential gcc47 gcc47-libs libyaml 

Part 2: Configure hacks

There are a few issues with Ruby 2.0.0, but fortunately all of them have a command line workaround.

1. Ruby bug #5384 https://bugs.ruby-lang.org/issues/5384

Actually an upstream bug in (Open)Solaris/Illumos/OpenIndiana/SmartOS. Workaround is to add `ac_cv_func_dl_iterate_phdr=no` to the configure line.

See also: https://www.illumos.org/issues/1587

2. Ruby bug #8071 https://bugs.ruby-lang.org/issues/8071

Non-portable code in ruby’s configure script. Easy workaround: prepend the configure command with a shell that can handle the current state of the configure script, i.e. bash. (A fix has already been submitted, and should be in the next ruby patch.)

3. Ruby bug #8268 https://bugs.ruby-lang.org/issues/8268

“-ggdb3” C flags issue (logged with Ruby, but not necessarily a bug in ruby itself. Please help if you can on this one!)

Workaround 1: Add `debugflags="-ggdb"` to the configure line. Caveat: It appears this will add gdb debug info to the built ruby binaries, which may not be desired.

Workaround 2: Add `CFLAGS="-R -fPIC"`. This introduces a make error: missing function 'signbit'. Boo.

Workaround 3: Add `CFLAGS="-R -fPIC" rb_cv_have_signbit=no`. Works!

So with these three taken care of, the following command line will configure ruby-2.0.0-p0 to compile on SmartOS (in 64 bit):

$ bash ./configure --prefix=$PREFIX --with-opt-dir=/opt/local --enable-shared ac_cv_func_dl_iterate_phdr=no CFLAGS="-R -fPIC" rb_cv_have_signbit=no
$ make && make install

And there we have it: Ruby 2.0.0-p0 building in SmartOS. Next challenge: making use of those built-in DTrace probes…
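When I get there, the starting point will be listing and tracing the probes with dtrace. These one-liners are a sketch, assuming the stock Ruby 2.0 “ruby” provider and a ruby process already running:

# list the probes exposed by running ruby processes
dtrace -l -n 'ruby*:::'

# print every method entry: arg0 is the class name, arg1 the method name
dtrace -q -n 'ruby*:::method-entry { printf("%s#%s\n", copyinstr(arg0), copyinstr(arg1)) }'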