Put HTML tags and apostrophes in fixtures and tests or a meanie will hack you.

January 15, 2009 by Brian Morearty

Here’s a good way to protect against cross-site scripting attacks and SQL injection attacks. This will help catch mistakes where you (well actually your teammate, since you’re perfect) forgot to call “h” in a <%= %> block, or accidentally passed a SQL statement to the database without escaping the values:

Sprinkle unclosed HTML tags and apostrophes all over your fixture data and test code.

Then use assert_select liberally, which will barf on the console if it sees unclosed HTML tags–even if you were selecting some other part of the document.

I Like Stuff that’s Safe

Here is what a posts.yml file might look like:

test_post:
  id: 1
  subject: <script> attack!
  detail: "sql injection: '; drop table posts;"

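(The detail string has to be quoted because it contains a colon followed by a space, which YAML would otherwise read as another key. A value that starts with an apostrophe needs quoting too.)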
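If you want to double-check that a fixture like this parses the way you expect, here is a minimal sketch in plain Ruby (no Rails). It only uses the standard YAML library; the inline fixture text is just a copy of the example above and the variable names are mine.

```ruby
require 'yaml'

# A copy of the fixture from posts.yml, inlined so the sketch is self-contained.
fixture = <<~FIXTURE
  test_post:
    id: 1
    subject: <script> attack!
    detail: "sql injection: '; drop table posts;"
FIXTURE

post = YAML.load(fixture)["test_post"]

# Both nasty values survive parsing untouched, ready to trip up unescaped views.
puts post["subject"]
puts post["detail"]
```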

So assert_select has this handy side-effect I mentioned where it tells you about your malformed HTML. Since Rails tests don’t actually run in a browser, you need some other way to know that you’ve forgotten to escape data. Unclosed HTML tags in your fixtures, yeah, that’s the ticket.

And remember, you don’t need to call assert_select on the element that contains the bad data. Just call assert_select on anything and it will parse the output to make sure it’s well-formed.

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "body"
  end

The idea is that by sprinkling XSS attacks through your fixtures and using assert_select whenever you’re testing other stuff, the XSS attacks will become apparent.

If you do need to assert that the output is correct, you can call CGI::escapeHTML:

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "span", :count => 1,
      :text => CGI::escapeHTML(post.detail)
  end
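To see what the escaped fixture value looks like outside of a test, here's a tiny plain-Ruby sketch. It only uses CGI from the standard library; the variable names are mine.

```ruby
require 'cgi'

subject = "<script> attack!"

# escapeHTML turns the angle brackets into entities, so the browser
# shows the text instead of treating it as a tag.
escaped = CGI.escapeHTML(subject)

puts escaped   # prints &lt;script&gt; attack!
```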

I can’t haz SQL injection attacks

I admit that putting SQL injection attacks in the fixtures is a bit contrived and may not help. A better way to catch SQL injection attacks is to pass apostrophes into the app from your test code, so go ahead and sprinkle your test code with beauties like this:

  def test_update
    post :update, :id => posts(:test_post).id,
      :detail => "sql injection: '; drop table posts;"
  end

The secret to making this work is:

  1. apostrophe
  2. semicolon
  3. SQL statement
  4. another semicolon

You want to use a SQL statement that will cause a test to fail. It would be coolio if there were some way to make the current test succeed and subsequent tests fail, but I’m not sure I know a way to do that consistently. But at least if you use a “drop table” statement, you’re going to cause subsequent tests to fail (if there are any subsequent tests that use that table) because a schema change does not happen in a transaction. So even if you’re using transactional fixtures, the next test will fail anyway cuz the dang table is gone.
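Here's a rough sketch of why the apostrophe matters, in plain Ruby with no database involved. The helper quote_sql_string is hypothetical and deliberately simplified (it only doubles apostrophes and ignores database-specific rules); in a real Rails app you would let ActiveRecord do the quoting rather than roll your own.

```ruby
# Hypothetical, simplified helper: wraps a value in quotes and doubles
# any apostrophes inside it, as standard SQL string literals require.
def quote_sql_string(value)
  "'" + value.gsub("'", "''") + "'"
end

detail = "sql injection: '; drop table posts;"

# Naive interpolation: the apostrophe in the data closes the string early,
# so "drop table posts" ends up outside the literal.
unsafe_sql = "UPDATE posts SET detail = '#{detail}' WHERE id = 1"

# Quoted: the apostrophe is doubled and the whole value stays inside the literal.
safe_sql = "UPDATE posts SET detail = #{quote_sql_string(detail)} WHERE id = 1"

puts unsafe_sql
puts safe_sql
```

The first statement is the kind of thing the test data is meant to flush out; the second is roughly what properly escaped SQL should look like.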

Fun with Ruby’s instance_eval and class_eval

January 9, 2009 by Brian Morearty

In an attempt to better understand instance_eval and class_eval, I just read Khaled’s post on Ruby reflection. It helped, and I came up with a memory crutch I can use to remember when to use each of them:

Use ClassName.instance_eval to define class methods.

Use ClassName.class_eval to define instance methods.

That’s right. Not a typo. Here are some examples, shamelessly stolen from his post:

# Defining a class method with instance_eval
Fixnum.instance_eval { def ten; 10; end }
Fixnum.ten #=> 10

# Defining an instance method with class_eval
Fixnum.class_eval { def number; self; end }
7.number #=> 7

I Like Stuff that’s Backwards

Why is it the reverse of what you might expect? Because Fixnum.instance_eval treats Fixnum as an instance (an instance of the Class class), thus any new functions you define can be called on that instance. So it’s equivalent to this:

class Fixnum
  def self.ten
    10
  end
end
Fixnum.ten #=> 10

Fixnum.class_eval treats Fixnum as a class and executes the code in the context of that class, thus any “def” statements are treated exactly as if they were in normal code without any reflection. It’s equivalent to this:

class Fixnum
  def number
    self
  end
end
7.number #=> 7
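Here's the same idea as a small sketch you can paste into irb. I'm using a made-up class called Widget (and made-up method names) instead of Fixnum so it doesn't depend on which built-in classes your Ruby version has.

```ruby
# Widget is a throwaway class just for this demonstration.
class Widget; end

# instance_eval: "def" lands on Widget itself, so it becomes a class method.
Widget.instance_eval { def ten; 10; end }

# class_eval: "def" lands inside the class body, so it becomes an instance method.
Widget.class_eval { def label; "widget"; end }

puts Widget.ten                         # prints 10
puts Widget.new.label                   # prints widget
puts Widget.new.respond_to?(:ten)       # prints false
puts Widget.respond_to?(:label)         # prints false
```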

There are still some things about Ruby reflection that mystify me but at least I think I’ve got this one nailed.

Generate guid ids 2100x faster for ActiveRecord models (but only if you use MySQL)

January 3, 2009 by Brian Morearty

The Rails project I’m working on (the Small Business Help Forums at the Intuit Community) has some tables that use GUIDs for their primary keys instead of autoincrement integers. To implement GUIDs we used the handy usesguid plugin. All you have to do is change your “id” column to a 22-character varchar (make sure it’s a binary varchar and uses binary collation, so upper and lower case are treated differently) and put this in your model:

class MyModel < ActiveRecord::Base
  usesguid
end

Pretty nice.

Just one problem.

It’s HECKA slow.

On my Windows machine it was taking a whopping 0.4 seconds to create a GUID with this plugin. On my Linux VM it was a lot faster, but still slower than it should be (0.0322 seconds–just 31 GUIDs per second).

Download the Faster Plugin

If you use MySQL for your database and you’d like to download my modified usesguid plugin which is way faster, type this from the main directory of your Rails app:

 script/plugin install git://github.com/BMorearty/usesguid.git

Or download it here and copy it into vendor/plugins/usesguid.

Then add the “usesguid” statement (see above) to any models that you want to have guid ids, migrate the id columns to binary varchar(22), and add this to your environment.rb file:

ActiveRecord::Base.guid_generator = :mysql

Here is a sample migration for creating a new table with guids, as opposed to changing an existing one to use them:

create_table :products, :id => false, :options => 'ENGINE=InnoDB' do |t|
  # This table uses guid ids
  t.binary :id,   :limit => 22, :null => false
  t.string :name, :limit => 50, :null => false
end
# Since the t.column syntax can't specify a character set and collation...
execute "ALTER TABLE `products` MODIFY COLUMN `id` VARCHAR(22) BINARY CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;"
execute "ALTER TABLE `products` ADD PRIMARY KEY (id)"

I Like Stuff that’s Fast

Read on to find out why the old code was so slow, and how the code got 2100 times faster.

I investigated to see why it takes so long, and found that every time it creates a GUID, it calls UUID.timestamp_create. This in turn calls UUID.get_mac_address, which spawns a new process (ipconfig on Windows; ifconfig on UNIX-based systems) and parses the output. The reason: to discover the network card’s MAC address. (Hey yeah, even Windows has a MAC address.)

But the MAC address never changes. It’s hard-wired into the network card. So why bother querying it every time you create a GUID? Launching a whole new process every time we need a GUID is overkill.

My first thought was to write a plugin on top of the plugin. My plugin would cache the result of UUID.get_mac_address. I tried it, but found a problem: there’s a bug in UUID.timestamp_create. If it executes too quickly on a system whose clock resolution is not high enough, it returns the same GUID multiple times in a row. Whoops! Kind of defeats the purpose of GUIDs.
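The caching idea itself is simple, and can be sketched in plain Ruby like this. MacAddressCache is a hypothetical class, not part of the plugin, and the block stands in for the expensive ipconfig/ifconfig call; the address is made up. Note that this only addresses the speed problem, not the duplicate-GUID bug.

```ruby
# Hypothetical wrapper that runs an expensive lookup once and remembers the answer.
class MacAddressCache
  def initialize(&lookup)
    @lookup = lookup
  end

  def mac_address
    # ||= only calls the block the first time; later calls reuse the stored value.
    @mac_address ||= @lookup.call
  end
end

lookups = 0
cache = MacAddressCache.new do
  lookups += 1            # stands in for spawning ipconfig/ifconfig
  "00:11:22:33:44:55"     # made-up address
end

3.times { cache.mac_address }
puts lookups   # prints 1
```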

So I decided to take advantage of the fact that MySQL has a “SELECT UUID()” syntax, and I wrote a new GUID creator in the UUID class that calls MySQL to generate GUIDs. (Obviously this only works if you have MySQL.) I called this new creator “UUID.mysql_create.” The first time it is called, it calls MySQL like this:

SELECT UUID(), UUID(), UUID(), UUID(), UUID(), ... ;

It selects 50 UUIDs in a single round-trip to the database and stores the results in memory. Each time a new GUID is required, it plucks one off the list. When the list is empty and another one is required, it goes and gets another 50.
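The batching approach can be sketched roughly like this. It is not the plugin's actual code: GuidPool is a hypothetical class, and SecureRandom.uuid stands in for the database round-trip so the sketch runs without MySQL.

```ruby
require 'securerandom'

BATCH_SIZE = 50

# The kind of statement described above: 50 UUID() calls in one SELECT.
batch_sql = "SELECT " + Array.new(BATCH_SIZE, "UUID()").join(", ")

# Hypothetical pool: hands out ids from memory and refills in batches.
class GuidPool
  attr_reader :round_trips

  def initialize(&fetch_batch)
    @fetch_batch = fetch_batch
    @pool = []
    @round_trips = 0
  end

  def next_guid
    if @pool.empty?
      @pool = @fetch_batch.call(BATCH_SIZE)   # one "round-trip" per batch
      @round_trips += 1
    end
    @pool.shift
  end
end

# Stand-in for running batch_sql against MySQL.
pool = GuidPool.new { |count| Array.new(count) { SecureRandom.uuid } }

guids = Array.new(120) { pool.next_guid }
puts pool.round_trips   # prints 3
```

Asking for 120 ids costs three round-trips instead of 120, which is where most of the speedup comes from.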

On my Windows machine, creating a GUID with UUID.mysql_create now takes 0.0001937 seconds, which is over 2100 times faster than the 0.4 seconds it used to take. On my Linux VM it’s 0.0001671 seconds, or 193 times faster than the 0.0322 seconds it used to take.

All these changes were made in a new file, uuid_mysql.rb. But I also made a number of changes to the usesguid.rb file:

  1. Added a configuration option so you can specify which creator to use. The default is still timestamp_create, but to use mysql_create you just put “ActiveRecord::Base.guid_generator = :mysql” in your environment.rb file.
  2. Fixed the code so it respects the :column option, which lets you override the column that stores the primary key.
  3. Delayed the assignment of a guid until just before creation (before_create) rather than just after “new” (after_initialize). This has two benefits:
    1. It more closely mimics the default behavior of autoincrement columns, which don’t assign an id until after creation
    2. It is faster. after_initialize gets called every time a model object is instantiated, including all objects returned by a call to find. (But don’t worry, it wasn’t generating GUIDs for all those objects; it was just being called and bailing out when it saw there was already an id).  before_create only gets called for newly created model objects.

I thought about making it even faster by calling CoCreateGuid() on Windows and calling a UNIX C function to create a GUID when on UNIX, but it’s so fast now that it hardly seemed worth the extra effort and the extra platform-specific code.

So that’s it. Enjoy it!

Find tests more easily in your Rails test.log

June 18, 2008 by Brian Morearty

Here’s a nice little trick to make it easier to search test.log for the results of a specific test that’s failing. This trick works with normal Rails unit tests and with Shoulda tests.

When a Rails test fails, I look for it in test.log to see if there are any clues there. But it’s pretty hard to find the portion of the log associated with the test that failed. In this sample section of a log, where does the processing begin for the test called test_should_require_email_on_signup?

[Screenshot: test.log without titles. Which test is which? Where does my test start?]

It’s hard to find. Now imagine running rake on all your tests and sifting through the whole test.log looking for one test whose name you know, but the test name isn’t in the log.

So the other day I wrote a bit of code in my test_helper.rb file to make the log a lot easier to sift through. Here’s what the above log looks like with this code in place:

[Screenshot: test.log with titles. Ooh, nice-n-clear]

Ahh, that’s more like it. Now it’s easy to tell where test_should_require_email_on_signup begins. If you scroll up and look at the first log again, you’ll see that there isn’t even a blank line separating that test from the previous one. (See how the test starts on the SELECT count(*) statement?)

Here’s the code. Drop it into test_helper.rb for your Rails project. To me this seems like a nice little example of how Ruby’s open classes can benefit developers (even though some understandably consider them harmful). In a language without monkey patching, I would have to resort to something more painful, like changing all my tests to derive from my own subclass of TestCase and putting this code in that class.

Enjoy!

class Test::Unit::TestCase
  # This extension prints to the log before each test.  Makes it easier to find the test you're looking for
  # when looking through a long test log.
  setup :log_test
 
  private
 
  def log_test
    if Rails::logger
      # When I run tests in rake or autotest I see the same log message multiple times per test for some reason.
      # This guard prevents that.
      unless @already_logged_this_test
        Rails::logger.info "\n\nStarting #{@method_name}\n#{'-' * (9 + @method_name.length)}\n"
      end
      @already_logged_this_test = true
    end
  end
end
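If you're curious what the header looks like without firing up Rails, here's a small stand-alone sketch that builds the same string. The method name test_banner is mine; the format string is copied from log_test above, where 9 is the length of "Starting ".

```ruby
# Builds the same header text that log_test writes to the log.
def test_banner(method_name)
  "\n\nStarting #{method_name}\n#{'-' * (9 + method_name.length)}\n"
end

banner = test_banner("test_should_require_email_on_signup")

# Prints two blank lines, the title, and an underline of matching width.
puts banner
```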

P.S. I didn’t spend the time to figure out why my callback was being called multiple times for each test. I just inserted the guard you see in the code above to prevent the same test title from being shown multiple times.

Drop me a line in the Reply section below and let me know what you think–especially if you’ve figured out why each one is called multiple times when running from rake or autotest.

Updated 6/18: I corrected the code above because WordPress automatically inserted a “mailto:” tag when it saw the @ sign.

How to Show Response Time in a Rails Page with Mongrel

May 21, 2008 by Brian Morearty

You’ve seen this on Google result pages, right?

You wanna do that in your Rails app that runs with Mongrel? I show you how. Sit down. And along the way I’ll show what I learned about writing custom Mongrel HttpHandlers and why you shouldn’t store instance variables in them.

I remember seeing it there long ago when Google was new. I liked it because:

  1. Faster is better
  2. It shows Google focuses on helping me go fast
  3. It reminds the Google developers to focus on helping me go fast

So I’m developing this awesome web site now and I’m using Ruby on Rails with Mongrel. I wanted to pull a Google and show the server response time as text content in my pages, as a reminder to myself and my co-developer that it’s super-important to keep things fast.

I can haz question

How do you put response time in a page using Rails?


What is your Zombie Escape Plan?

April 16, 2008 by Brian Morearty

Ok, so I was talking to my cool niece last month and she told me something that just cracked me up.

Are you ready?

Here goes:

Every teenage boy has a Zombie Escape Plan.

That’s what my niece told me. She was serious. And she thought it was just as weird as I do. (She doesn’t have one.)

Here’s how she found out about it. One day she was listening as two male friends of hers were comparing zombie escape plans. This was new to her. “Does every guy have a zombie escape plan?” she asked them.

“Well duh,” they both said, dead serious.

So she ran an unscientific survey of her teenage male friends to find out if it was true. And guess what.

It was.

Every boy she asked said yes, naturally he has a zombie escape plan.

By now I was busting up. I told her well, at least I have a fire escape ladder in my 2-story house. I could use that as my zombie escape plan too. Her dad (my brother-in-law) said no: that’s lame. A fire escape plan does not serve as a zombie escape plan. As evidence he pointed to his own zombie escape plan: he will dance a jig. Because everyone knows a zombie cannot resist dancing a jig if he sees someone else doing it. But it doesn’t make a very good fire escape plan.

My niece said her dad was right. One boy in her survey said his zombie escape plan involved climbing up on the roof, which is usually not a good idea in a fire. At this point her younger brother, who’s also a teenager, piped up and said that after all, his zombie escape plan is to use a flame thrower. And that never makes a very good fire escape plan.

When I got home I googled it and found that there is even a web site dedicated just to this (side note: what did we ever do before the web?): http://www.zombieescapeplan.com. Except it’s made by a girl. So I guess at least that’s good because when the zombies attack some of the girls will be prepared.

I need a good zombie escape plan. What’s yours? Please comment below so I can get some good ideas.

How to write case (switch) statements in Ruby

April 15, 2008 by Brian Morearty

If you’re like me, when you started coding in Ruby last year you found the “case” statement intriguing. After years of writing in C++ and C# it was hard for you to remember Ruby’s case syntax because it can do so much more than switch statements in those languages.

So you wrote these notes to yourself as you discovered its capabilities. Except you’re not that much like me so you didn’t. But I did. I hope you find them useful.

switch/case syntaxes
(remember: Ruby uses "case" and "when"
where others use "switch" and "case"):

# Basically if/elsif/else (notice there's nothing
# after the word "case"):
[variable = ] case
when bool_condition 
 statements
when bool_condition
 statements
else # the else clause is optional
 statements
end
# If you assigned 'variable =' before the case,
# the variable now has the value of the
# last-executed statement--or nil if there was
# no match.  variable=if/elsif/else does this too.

# It's common for the "else" to be a 1-line
# statement even when the cases are multi-line:
[variable = ] case
when bool_condition 
 statements
when bool_condition
 statements
else statement
end

# Case on an expression:
[variable = ] case expression
when nil
 statements execute if the expr was nil
when Type1 [ , Type2 ] # e.g. Symbol, String
 statements execute if the expr
  resulted in Type1 or Type2 etc.
when value1 [ , value2 ]
 statements execute if the expr
 equals value1 or value2 etc.
when /regexp1/ [ , /regexp2/ ]
 statements execute if the expr
 matches regexp1 or regexp2 etc.
when min1..max1 [ , min2..max2 ]
 statements execute if the expr is in the range
 from min1 to max1 or min2 to max2 etc.
 (use 3 dots min...max to go up to max-1)
else
 statements
end
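Here's a small runnable sketch of a case on an expression. The method name and the return strings are made up for illustration. It works because each "when" is compared using ===, which classes, regexps, and ranges all define in their own way.

```ruby
# Hypothetical method that classifies a value using the match types listed above.
def describe(value)
  case value
  when nil            then "nothing"
  when /\Ahttp/       then "url"
  when Symbol, String then "text: #{value}"
  when 0              then "zero"
  when 1..9           then "single digit"
  when 10...100       then "two digits"   # three dots: stops at 99
  else "something else"
  end
end

puts describe(nil)          # prints nothing
puts describe("http://x")   # prints url
puts describe(:sym)         # prints text: sym
puts describe(0)            # prints zero
puts describe(5)            # prints single digit
puts describe(99)           # prints two digits
puts describe(100)          # prints something else
```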

# When using case on an expression you can mix &
# match different types of expressions. E.g.,
[variable =] case expression
when nil, /regexp/, Type
 statements execute when the expression
 is nil or matches the regexp or results in Type
when min..max, /regexp2/
 statements execute when the expression is
 in the range from min to max or matches regexp2
end

# You can combine matches into an array and
# precede it with an asterisk. This is useful when
# the matches are defined at runtime, not when
# writing the code. The array can contain a
# combination of match expressions
# (strings, nil, regexp, ranges, etc.)
[variable =] case expression
when *array_1
 statements execute when the expression matches one
 of the elements of array_1
when *array_2
 statements execute when the expression matches one
 of the elements of array_2
end
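And a runnable sketch of the asterisk form, with the match lists built as ordinary arrays. The array contents and the method name are made up for illustration.

```ruby
# These arrays could just as easily be loaded from config at runtime.
weekend  = ["Saturday", "Sunday"]
weekdays = %w[Monday Tuesday Wednesday Thursday Friday]
blankish = [nil, "", /\A\s+\z/]   # a mix of nil, a string, and a regexp

# Hypothetical method: the asterisk expands each array into its elements.
def day_type(day, weekend, weekdays, blankish)
  case day
  when *blankish then :missing
  when *weekend  then :weekend
  when *weekdays then :weekday
  else :unknown
  end
end

puts day_type("Sunday", weekend, weekdays, blankish)    # prints weekend
puts day_type("Tuesday", weekend, weekdays, blankish)   # prints weekday
puts day_type("   ", weekend, weekdays, blankish)       # prints missing
puts day_type("Someday", weekend, weekdays, blankish)   # prints unknown
```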

# Compact syntax with 'then':
[variable =] case expression
when something then statement
when something then statement
else statement
end

# Compact syntax with semicolons:
[variable =] case expression
when something; statement
when something; statement
else statement # no semicolon required
end

# Compact syntax with colons
# (no longer supported in Ruby 1.9)
[variable =] case expression
when something: statement
when something: statement
else statement # no colon required
end

# 1-line syntax:
[variable = ] case expr when {Type|value}
 statements
end

# Formatting: it's common to indent the "when"
# clauses and it's also common not to:
case 
  when 
  when
  else
end

case
when
when
else
end

hello, world

April 13, 2008 by Brian Morearty

I am Brian Morearty. Welcome to my blog.

I’m a software engineer living with my wife and three kids in the San Francisco Bay Area.

I plan to blog about stuff like software development, things that strike me as funny, etc. The technology that currently has my interest is Ruby on Rails and Flex but that may change over time. For most of my career in software I’ve been working on Windows, and I’m also a fan of most Microsoft developer technology (WPF, ASP.NET, and so on.)