How to Put Rewrite Rules in Your Ruby Code, Not Your Web Server

November 4, 2009 by Brian Morearty

Need to put some URL rewrite rules in your Rails app? Not too crazy about writing Apache mod_rewrite rules? Prefer writing Ruby code?

Refraction to the rescue.

Refraction is a new Rails plugin from Josh Susser and Pivotal Labs that helps you easily implement URL rewrites in Ruby, rather than writing the rules in web-server-specific lingo. Your code gets called by Rack, although you don’t need to know anything about Rack to write rewrite rules.

Here’s an example. Let’s say you want to duplicate Twitter’s /@username routes (did you know Twitter supports URLs like that?) but have found as I did that routes.rb doesn’t seem to work with a “@:username” rule. Instead, create a RESTful UsersController and then put this in your refraction_rules.rb file:

Refraction.configure do |req|
  # rewrite /@username as /users/username
  if req.path =~ %r{\A/@(.+)}
    req.rewrite! :path => "/users/#{$1}"
  end
end
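
To see what the rule does without installing the plugin, here is a minimal plain-Ruby sketch of the same path match. The method name rewrite_path is made up for illustration and is not part of Refraction. The pattern is anchored with \A so that only paths that begin with /@ get rewritten.

```ruby
# Hypothetical helper, not part of Refraction: returns the rewritten path,
# or the original path unchanged when the rule does not apply.
def rewrite_path(path)
  if path =~ %r{\A/@(.+)}
    "/users/#{$1}"
  else
    path
  end
end

puts rewrite_path("/@josh")   # prints /users/josh
puts rewrite_path("/about")   # prints /about
```

Keeping the match logic in a plain method like this also makes it easy to unit test a rule before wiring it into the Refraction.configure block.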

The notes in the blog post (see link above) mention that there is already a Rack rewrite plugin called Rack::Rewrite but “Rack::Rewrite only really gives you access to the request URI, whereas Refraction appears to give the entire request object… that can be much more useful for determining where to send someone.”

Define Your Own Custom Service Token for OAuth

October 8, 2009 by Brian Morearty

For anyone who consumes OAuth APIs with the excellent oauth-plugin gem: I’ve just submitted a patch so you can now create your own custom “service token” in your models folder.

See the “Creating your own wrapper tokens” section of the tutorial link above to see what I’m talking about. It didn’t work before but it does now. Or will, as soon as it gets pulled back into the source. Until then you can get it here. After it’s pulled back in I’ll probably delete my fork.

Update: my change was pulled back into the master.

Intuit Community makes Business Week

July 21, 2009 by Brian Morearty

Thanks to tjhanley for alerting me to this while I was on vacation: Business Week has written a short article about Intuit Community, the product I worked on last year and early this year, which is an in-product community where users can help each other. In addition to helping users it has apparently improved QuickBooks sales and reduced product support costs.

to_json => as_json?

July 20, 2009 by Brian Morearty

The Javafication of Ruby on Rails

May 21, 2009 by Brian Morearty

As you know, one of the nice things about using Ruby on Rails is the emphasis on code simplicity and readability. A month ago in “This Week in Edge Rails,” Mike Gunderloy posted something that makes me a little concerned—it feels like one small step toward the Javafication (a.k.a. complexification) of Ruby on Rails.

The change is called “Pluggable JSON Backends” and it was implemented by Rick Olson, a.k.a. Technoweenie. (I have the utmost respect for Technoweenie. I’ve used his code before and it’s wonderful stuff—it’s clean, it’s simple, and it works.)

Instead of this:

my_model.to_json

we are now encouraged to do this:

ActiveSupport::JSON.encode(my_model)

I Like Stuff that’s Simple

It’s a little thing and I don’t want to blow it out of proportion, but this is a pattern I would like to see the community avoid. Once people see this style of coding they start to copy it.

I think I understand the motivation behind the change, but I believe the new style of coding is harder to read and write. It reminds me of all the years I wrote C++ code and wished I could just extend a class outside its original definition. Ruby can do that and I love the fact that Rails isn’t shy about taking advantage of this capability to make programmers’ lives easier and code simpler.
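
Here is a small sketch of what I mean by extending a class outside its original definition. The Encoder module and the to_simple_json method are hypothetical names invented for this example (they are not Rails APIs). The point is only that the namespaced call can stay as the implementation while callers get the short form.

```ruby
require "json"

# Hypothetical stand-in for a namespaced, pluggable encoder.
module Encoder
  def self.encode(obj)
    JSON.generate(obj.to_h)
  end
end

Point = Struct.new(:x, :y)

# Reopen the class outside its original definition and add a
# convenience method that delegates to the namespaced function.
class Point
  def to_simple_json
    Encoder.encode(self)
  end
end

puts Point.new(1, 2).to_simple_json   # prints {"x":1,"y":2}
```

The backend can still be swapped inside Encoder.encode without touching any of the call sites that use the short form.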

It may be too late to have this debate when it comes to JSON. Maybe this was the only way to meet the requirements Technoweenie had. I don’t know. And I understand sometimes it’s necessary to do something a little ugly to achieve a goal that’s more important than simplicity or readability. But I am writing today to encourage the Rails core committers, and the rest of us in the Rails community, to please avoid this pattern when possible. If you are considering writing a method that looks like this:

ClassOrModule::namespacey_function.method_name(obj)

please pause and consider whether the users of your API would find it easier if you instead made it look like this:

obj.method_name

Thanks!

Dataflow: Erlang-Style Thread Safety in Ruby

April 26, 2009 by Brian Morearty

Larry Diehl, a.k.a. larrytheliquid, has just released Dataflow: a tiny and remarkable gem that helps Ruby programmers write thread-safe programs more easily by duplicating one of the main features of Erlang—and in my opinion the single most important feature that makes Erlang thread-safe. Dataflow makes all variables write-once (so the name “variable” isn’t really accurate any more). This limitation is really a feature. It makes it easier to write multithreaded programs without synchronization bugs because it’s no longer possible for two threads to write different values to the same variable, and thus there’s no need to synchronize writes. When you reference a variable that has not yet been assigned, Dataflow puts your thread to sleep automatically. It is reawakened automatically when the variable is assigned.

Before we continue, a word of caution: I’ve mentioned in this blog and in the Ruby on Rails Podcast that even though multithreading is really fun to think about and play with, I approach it with reluctance in real-life projects because it makes the code more complex, makes it a lot harder to debug problems, and is hard to manage when there are multiple programmers who all have to work in and understand the threaded code. But there are still some problems for which threading is the right solution.

I Like Stuff That’s Clean and Small

Dataflow is a beautiful bit of programming. It’s small, clean, and tested. It implements write-once variables with automated thread synchronization in just 52 lines of code. (Plus 120 lines of tests.) It supports:

  • instance variables
  • local variables
  • dynamic values loaded into data structures such as arrays

It doesn’t seem to support class variables, but I guess a constant can serve as a write-once class variable.

Here are some code samples, copied from the README.

# Local variables
include Dataflow

local do |x, y, z|
  # notice how the order automatically gets resolved
  Thread.new { unify y, x + 2 }
  Thread.new { unify z, y + 3 }
  Thread.new { unify x, 1 }
  z #=> 6
end

# Instance variables
class AnimalHouse
  include Dataflow
  declare :small_cat, :big_cat

  def fetch_big_cat
    Thread.new { unify big_cat, small_cat.upcase }
    unify small_cat, 'cat'
    big_cat
  end
end

AnimalHouse.new.fetch_big_cat #=> 'CAT'

# Data-driven concurrency
include Dataflow

local do |stream, doubles, triples, squares|
  unify stream, Array.new(5) { local {|v| v } }

  Thread.new { unify doubles, stream.map {|n| n*2 } }
  Thread.new { unify triples, stream.map {|n| n*3 } }
  Thread.new { unify squares, stream.map {|n| n**2 } }  

  Thread.new { stream.each {|x| unify x, rand(100) } }

  puts "original: #{stream.inspect}"
  puts "doubles:  #{doubles.inspect}"
  puts "triples:  #{triples.inspect}"
  puts "squares:  #{squares.inspect}"
end

It doesn’t take long to read the Dataflow code (it’s only 52 lines, after all) but it did take me a while stepping through it in the NetBeans debugger to wrap my head around how it works. Also, the name of the variable-assignment method is a little unintuitive to me. Assignment is done by calling the unify method. Apparently this name comes from the concept of unification, which I think means: provide a bunch of algorithms whose variables have dependencies on each other and let the system work out the dependencies and execute the algorithms in the correct order to assign values as they are needed. Anyway, using a method called unify for assignment takes a little getting used to.

Note: Larry’s README says Dataflow was inspired by the Oz programming language, not the Erlang programming language. But I’m more familiar with Erlang so that’s what I can compare it to. The primary difference between Ruby-with-Dataflow and Erlang is that in Dataflow you declare a variable and then assign it a value, whereas in Erlang you have to assign at the moment you declare it. That’s how Erlang makes variables write-once: if you can only assign a value when you declare a variable, obviously it will only be assigned once. Dataflow lets you assign to the same variable multiple times but raises an error if you assign different values, so it’s equivalent to write-once. (It uses the != operator to decide whether the values are equal.)
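
To make the write-once idea concrete, here is a minimal sketch of a write-once cell in plain Ruby. This is not Dataflow’s implementation: the class name WriteOnceCell and the explicit value reader are my own assumptions for illustration (Dataflow lets you use the variable directly instead of calling a reader). Reading blocks until a value has been assigned, assigning an equal value again is allowed, and assigning a different value raises.

```ruby
# Hypothetical write-once cell; a sketch of the idea, not Dataflow's code.
class WriteOnceCell
  # Private sentinel meaning "no value yet", so nil is an ordinary value.
  UNASSIGNED = Object.new.freeze

  def initialize
    @mutex = Mutex.new
    @assigned = ConditionVariable.new
    @value = UNASSIGNED
  end

  # Bind the cell. Re-binding to an equal value is a no-op;
  # binding to a different value raises.
  def unify(value)
    @mutex.synchronize do
      if @value.equal?(UNASSIGNED)
        @value = value
        @assigned.broadcast   # wake every thread waiting in #value
      elsif @value != value
        raise ArgumentError, "already unified with #{@value.inspect}"
      end
    end
    value
  end

  # Block the calling thread until the cell has been bound.
  def value
    @mutex.synchronize do
      @assigned.wait(@mutex) while @value.equal?(UNASSIGNED)
      @value
    end
  end
end

x = WriteOnceCell.new
y = WriteOnceCell.new
worker = Thread.new { y.unify(x.value + 2) }  # sleeps until x is bound
x.unify(1)
worker.join
puts y.value   # prints 3
```

Because a cell can only ever hold one value, readers never see it change, which is why the only synchronization needed is the wait for the first assignment.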

Interop with “Normal” Ruby

The README also says, “The nice thing is that many existing libraries/classes/methods can still be used, just avoid side-effects.”

It’s true that you can write a program that uses Dataflow for some variables but also interops with non-Dataflow code as long as that code is thread-safe. I’m not quite sure what he meant by “just avoid side-effects.”

But How Do You Assign New Values to Variables?

If a variable can only be written once, what do you do when you need to change it? Obviously programs need to deal with this. For example, what if you need to loop over an array and keep track of the index as you go? Erlang handles it by heavy use of the stack and threads, so whenever you need a new value you call a function (which spawns a thread) and the function declares a new variable, assigning the new value to it. So there’s a lot of copying of values.

In Ruby with Dataflow I imagine you would do something similar: either call a function or spawn a thread for each iteration, passing in the current value, and have the function or thread declare a new local variable which is value+1. This style of programming takes some time before it becomes natural. It’s not yet natural for me.
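
As a rough illustration of that style in plain Ruby (no Dataflow involved), here is a sketch that walks an array while tracking the index without ever reassigning a variable. The method name each_with_position is hypothetical. Each recursive call gets its own fresh index binding whose value is the previous one plus 1.

```ruby
# Walks the array without reassigning anything: every call receives a new
# `index` binding instead of incrementing a shared counter.
def each_with_position(items, index = 0, &block)
  return if index >= items.length
  block.call(items[index], index)
  each_with_position(items, index + 1, &block)
end

each_with_position(%w[a b c]) { |item, i| puts "#{i}: #{item}" }
```

Every element costs one stack frame here, so a very long array could exhaust the stack on an interpreter that does not optimize tail calls.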

There could also be performance implications. Erlang’s interpreter optimizes tail recursion and converts it to an iteration (really a GOTO) under the hood so the stack doesn’t blow. I don’t know if any Ruby interpreters do that. As of a few years ago they didn’t, according to my Google search. Johannes Friestad wrote in 2005, “Recursion, tail or no tail, works just as well as any other method call in Ruby. Plenty of [languages] thrive without optimizing for tail recursion, Java is one of them. The combination of a small stack and lack of tail recursion optimization does mean that in Ruby, recursion can hardly replace every other looping construct the way it can in Lisp. You’ll be the judge of whether that is important.”

Update: Larry writes in the comments that “this library makes JRuby shine over MRI due to its green threads + native thread pool implementation.” I’ve only used MRI and I didn’t know about that aspect of JRuby but it’s pretty nice. It sounds like if you’re going to use Dataflow you might want to use it with JRuby rather than MRI.

Possible Concerns

Dataflow is really cool but I do have a few potential concerns about it:

  1. Even though Dataflow makes it easier to write thread-safe code, it doesn’t fix the fact that it’s hard to debug multithreaded code. Stepping through multithreaded code in a debugger is complicated, especially when the code switches thread context on the fly.
  2. Speaking of debugging, if the debugger tries to show you the value of a Dataflow variable that hasn’t yet been assigned, the debugger thread itself will be put to sleep. In NetBeans this means the “locals” pane stops working (but you can still debug) and if you hover the mouse over an unassigned variable, you don’t see anything in the tooltip. In rdebug it’s worse–if you eval a variable that doesn’t yet have a value, rdebug hangs because its main thread gets put to sleep.
  3. You can’t assign nil to a Dataflow variable because nil is used to indicate that it hasn’t yet been assigned. I would like to be able to assign a value of nil and have that be different from “unassigned.” This would be a pretty easy fix to make to Dataflow without bloating memory, since all unassigned variables could reference the same constant:
    UNASSIGNED = Object.new
    I removed this concern because it’s been fixed. Dataflow now differentiates between nil and unassigned.
  4. Memory overhead: Dataflow is as efficient as possible with memory usage but it does incur some overhead on each variable. Compared to unthreaded programming, it is a lot. But compared to manual thread synchronization it’s probably about the same amount of memory you would have used for synchronization data structures anyway. It depends on how you do your manual synchronization. Each variable has, in addition to its value:
    1. a Mutex
    2. an Array (initially empty) of references to Threads that are waiting for it to be initialized
    3. a Monitor condition to wake up the Threads that are waiting for it to be initialized
    4. a Boolean to track whether it has a value yet (but cleverly, this boolean doesn’t get assigned until the variable is assigned, which saves some memory)
  5. More than the overhead per variable, I wonder about the memory overhead of constantly copying values rather than reassigning them. If the stack gets too deep you could run out of memory from all the copying. (See my description of looping above.) You also make the garbage collector work pretty hard. If you loop by spawning threads instead of using recursion, you incur a lot of overhead since threads are expensive compared to function calls. This is why Erlang’s interpreter has its own threading system instead of using the one in the operating system–threads have to be as cheap as function calls. In Ruby they are not.
  6. Related to memory overhead, I wonder about the performance overhead. In addition to deep stacks and lots of threads, every time you call a method on a variable it gets routed through method_missing and Mutex.synchronize even if you have already called that method on that variable. (It does this so its method_missing override can put your thread to sleep until the variable has a value.) This could be expensive but it’s impossible to know for sure without profiling it. If it turns out to be a problem, method_missing could rewrite itself the first time after the variable gets assigned a value so all subsequent calls don’t have to be synchronized.

That reads like a pretty big list of concerns but without actually using Dataflow I can’t tell how many of them will actually cause problems. I still think it’s cool. :-)

Try it if You Need Threading

I mentioned at the beginning that I’m cautious about using threads but there are some problems for which they are the right solution. Next time I’m confronted with such a problem on a Ruby project I will drop in the Dataflow gem and give it a try. It looks like a pretty good way to do threading in Ruby.

Listen to my interview on the Ruby on Rails Podcast

April 15, 2009 by Brian Morearty

On Friday, Geoffrey Grosenbach interviewed Tom Hanley and me for the Ruby on Rails Podcast. Woo hoo!

Add Optional SEO-Friendliness to link_to_remote

April 2, 2009 by Brian Morearty

link_to_remote_with_seo adds optional SEO-friendly goodness to the Rails link_to_remote function.  I wrote it for cases where I would have used link_to_remote in my Rails app but I wanted GoogleBot and other search engines to be able to follow the links.  In addition to setting onclick like the normal link_to_remote, it also sets html_options[:href] to the SAME URL that you pass in to options[:url]. (It only does this if you pass :seo => true and you do not explicitly set the href.)
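
The href rule described above can be sketched in a few lines of plain Ruby. This is not the plugin’s source: seo_html_options is a hypothetical helper name, and it copies options[:url] verbatim, whereas a real Rails helper would presumably turn the URL options into a path first.

```ruby
# Hypothetical sketch of the href-defaulting rule, not the plugin's code.
def seo_html_options(options, html_options = {})
  return html_options unless options[:seo]          # opt-in only
  return html_options if html_options.key?(:href)   # an explicit href wins
  html_options.merge(:href => options[:url])        # same URL as the AJAX call
end

p seo_html_options({ :seo => true, :url => "/home/next_page" })
p seo_html_options({ :seo => true, :url => "/home/next_page" },
                   { :href => "/confirm" })
```

Returning a merged copy rather than mutating html_options keeps the caller’s hash untouched.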

See the big honking warning at the bottom for an explanation of why this plugin doesn’t just override the behavior of link_to_remote.

I Like Stuff that’s SEO-Friendly

The following example shows a “Next” link in paginated output.  Clicking the link in a browser results in an AJAX call (using the POST method) that retrieves just the “page” partial and inserts it into the “results” div on the page with a highlight visual effect.  When a search engine sees the link, however, it will send a GET request to the same URL, and the entire page (not just the partial) will be sent in the response.

Putting this in the view (home/index.html.erb):

<div id="results">
  <%= render :partial => "page" -%></div>
<%= link_to_seo_remote "Next",
  { :update => "#results",
    :url => { :action => "next_page" },
    :complete => visual_effect(:highlight, "#results") } %>

Produces (pay attention to the href attribute):

<div id="results">
  <!-- first page of results shown here --></div>
<a href="/home/next_page"
  onclick="new Ajax.Updater('#results', '/home/next_page',
  {asynchronous:true, evalScripts:true,
  onComplete:function(request){new Effect.Highlight(&quot;#results&quot;,{});}}); return false;">
  Next
</a>

In the controller (home_controller.rb), render just the partial if called in an XHR (AJAX) request:

def next_page
  if request.xhr?
    render :partial => "page"
  else
    # Render the entire page, including the "results" section.
    render :action => "index"
  end
end

WARNING ABOUT INCORRECT USE OF THIS FUNCTION

Sorry but I have to yell for emphasis here.

When Google crawls your site it will follow all links on a page in advance, even before the user clicks on them.  Adding :confirm => "Are you sure?" WILL NOT HELP because it generates JavaScript that Google doesn’t execute.  So when you use link_to_seo_remote, DO NOT ALLOW destructive links to be placed in the href attribute.  Instead, override html_options[:href] to link to an intermediate page with “Are you sure?” and a BUTTON (not a link).  The crawler will not click the button, so the data will not be deleted.

See Using Rails AJAX Helpers to Create Safe State-Changing Links and search the page for “request.post?” for an explanation and some sample code.
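
The gist of that approach, as a framework-free sketch: only change state when the request is a POST, and send everything else to the confirmation page. FakeRequest and destroy_action are hypothetical names standing in for a real request object and controller action.

```ruby
# Stand-in for a framework request object; only the HTTP verb matters here.
FakeRequest = Struct.new(:verb) do
  def post?
    verb == "POST"
  end
end

# Hypothetical action: returns a symbol naming what a controller would do.
def destroy_action(request)
  if request.post?
    :delete_record            # reached only via the confirmation page's button
  else
    :show_confirmation_page   # a crawler following the href lands here
  end
end

p destroy_action(FakeRequest.new("GET"))    # prints :show_confirmation_page
p destroy_action(FakeRequest.new("POST"))   # prints :delete_record
```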

Does it Have Tests?

Why, yes. I’d like to thank the Rails Community for not tolerating code with no tests. It was soooo tempting just to release this without writing automated tests but the peer pressure got to me.

And I’d also like to thank Cake for awesome music.

To get the code

ruby script/plugin install http://github.com/BMorearty/link_to_remote_with_seo.git

Obfuscated ActionScript

April 1, 2009 by Brian Morearty

My brother Mike, a developer on the Flex team at Adobe, wrote a pretty impressive bit of Obfuscated ActionScript on his blog.

Check it out. See if you can figure out what the program does before you follow his link to the answer.

(This snippet is just a taste of it.)

[Image: "getset" code snippet]

My Favorite Quotes from the Yellowpages.com Ruby on Rails Talk

March 22, 2009 by Brian Morearty

I just watched a video from the 2008 QCon conference of a talk by John Straw about how and why Yellowpages.com rewrote their Java site to use Ruby on Rails. It’s a pretty good talk. He starts by describing the situation they were in that led them to consider a rewrite, then goes into the architectural decisions and some of the technical details.

Here are some choice quotes from the talk, along with my own commentary.

“All programmers want to rewrite the code they’re forced to maintain. They’re almost always wrong.”

Man, is that ever true. (Note that he said almost always. After all, his talk is about a successful rewrite.)

I’ve seen it again and again. Programmers tend to believe the code they’re maintaining (that someone else wrote) sucks and they could write it much better. Often that’s because they haven’t taken the time to understand the code base. As Joel on Software says, “It’s harder to read code than to write it.” I think usually (but not always) the cost of rewriting it far outweighs any benefits. What you’d typically end up with after a rewrite is:

  • A few years have passed
  • You’ve spent a ton of money on the rewrite
  • The app still has bugs–just a different set of bugs. (Another quote from the Joel article: “The idea that new code is better than old is patently absurd.”)
  • A new generation of programmers will join the team soon. They will complain that the code base sucks and needs to be rewritten.

Having said that, I know there are times when a rewrite is the right thing to do. But that’s a discussion for another day.

Something I think his team did correctly: they made a goal of finishing the rewrite in four months, not two years. A massive two-year rewrite has an extremely low chance of succeeding.

“EJB3 is a whole big boxcar full of crazy.”

Now that’s just funny. (He said that after saying EJB3 is much better than earlier generations of EJB, by the way.)

“At this point our performance architect will maintain that Apache is unsuitable for use in any production web serving environment, in general. (And only nginx with its polling model is the right way to go.)”

I don’t agree but it’s a great quote.

“I actually kind of like the thread-unsafety of Rails. I mean it simplifies the programming model quite a bit for simple web sites. You know: I’m handling one request; I understand how to scale that.”

I totally agree with that. As someone who loves writing software, I think threading is fun and awesome and there are situations where it’s a must–I once even thought about writing a book about threading on Win32. When I was first introduced to Ruby on Rails I had a kneejerk “are you kidding me?” reaction when I heard it wasn’t thread-safe. But I’ve since formed the opinion that single-threading is really nice when you can get away with it because of its simplicity. It helps developers focus on the task at hand rather than spending a lot of time debugging threading problems. In a multi-threading environment it’s too easy for developers who understand threading to introduce code that then gets broken by other developers–and it’s too hard to write tests that will catch the breakage the moment it occurs.

By the way, the speaker’s next sentence was “Obviously our fast service-side application is multi-threaded and we have good benefits from that.” So he’s not saying multi-threading should never be used.

“Testing was a big part of the decision. You know, that was actually one of the things which drew me so strongly to the platform once I started understanding it. I had spent years myself as a Java developer trying to figure out how in the heck to use JUnit to do anything useful on my web site. And maybe that was just a failure of imagination on my part, but when we started looking at Rails we didn’t have to figure it out. It was obvious how to test each level. Both the unit tests for the models, and the functional tests and the integration tests. It was all there in the framework. And not only was the framework built to make it easy, but the community expected it. You know, I’ve never seen a development community that was so involved and oriented towards writing test code–writing test automation–than this one. And so that was a big part of our decision.”

So true. I have found that when it’s obvious how to write effective tests and where to put them, I will write tons of tests. If the framework greases the wheels of test-writing and makes it pain-free, I will write a lot more and better tests. Rails does a lot better at this than other frameworks I’ve used, although I still think it could use improvement. And I love the emphasis placed on automated testing in the Rails community.

Well, that’s it. To see the whole talk, go to http://www.infoq.com/presentations/straw-yellowpages. And enjoy the grouchy comments by Java developers below the video.