How I Made My Ruby Project 10x Faster
Contents
- 20 Seconds
- 8 Seconds
- 5.93 Seconds
- 5.09 Seconds
- 4.23 Seconds
- 2.94 Seconds
- 1.57 Seconds
- Results
- Other Things To Try
- References
This post is about how I got a 10x speedup for my Ruby gem, contracts.ruby.
contracts.ruby is my project to add code contracts to Ruby. Here's what it looks like:
Contract Num, Num => Num
def add(a, b)
  a + b
end
Now, whenever add is called, the parameters and return value will get checked. Cool!
20 seconds
This weekend I benchmarked this library and found that it has abysmal performance:
user system total real
testing add 0.510000 0.000000 0.510000 ( 0.509791)
testing contracts add 20.630000 0.040000 20.670000 ( 20.726758)
This is after running both functions 1,000,000 times on random input.
So adding contracts to a function results in a massive (40x) slowdown. I dug into why.
8 seconds
I got a big win right away. When a contract passes, I call a function called success_callback. The function is completely empty. Here's the full definition:
def self.success_callback(data)
end
That's something I had put in "just in case" (yay futureproofing!). Turns out, function calls are very expensive in Ruby. Just removing this call saved me 8 seconds!
user system total real
testing add 0.520000 0.000000 0.520000 ( 0.517302)
testing contracts add 12.120000 0.010000 12.130000 ( 12.140564)
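If you want to see what a no-op call costs on its own, here's a tiny stand-alone benchmark (illustrative only; noop is a made-up stand-in for success_callback, not the gem's code):
require 'benchmark'

# noop stands in for the empty success_callback
def noop(data)
end

n = 1_000_000
Benchmark.bm(12) do |bm|
  bm.report('no call')    { n.times { |i| i + 1 } }
  bm.report('empty call') { n.times { |i| noop(i); i + 1 } }
end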
Removing a lot of other extraneous function calls took me to 9.84 -> 9.59 -> 8.01 seconds.
This library is already more than twice as fast!
Now things get a little more complex.
5.93 seconds
There are multiple ways to define a contract: lambdas, classes, plain ol' values, etc. I had a big case statement that checked to see what kind of contract it was. Based on the type of contract, I would do different things. I saved some time by changing this to an if statement, but I was still spending unnecessary time going through this decision tree every time the function was called:
if contract.is_a?(Class)
  # check arg
elsif contract.is_a?(Hash)
  # check arg
...
I changed this to go through the tree once, when the contract is defined, and build lambdas:
if contract.is_a?(Class)
  lambda do |arg|
    # check arg
  end
elsif contract.is_a?(Hash)
  lambda do |arg|
    # check arg
  end
...
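Concretely, the idea looks something like this (a self-contained sketch; the checks and names are simplified stand-ins, not the gem's actual internals):
# Walk the decision tree once, when the contract is defined...
def build_validator(contract)
  if contract.is_a?(Class)
    lambda { |arg| arg.is_a?(contract) }
  elsif contract.is_a?(Proc)
    lambda { |arg| contract.call(arg) }
  else
    lambda { |arg| arg == contract }
  end
end

# ...and reuse the resulting lambda on every call, with no branching.
validator = build_validator(Integer)
validator.call(42)     # => true
validator.call("nope") # => false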
Then I would validate an argument by passing it into this pre-calculated lambda, bypassing the branching logic completely. This saved me another 1.2 seconds.
user system total real
testing add 0.510000 0.000000 0.510000 ( 0.516848)
testing contracts add 6.780000 0.000000 6.780000 ( 6.785446)
Pre-computing some other if statements saved me almost another second:
user system total real
testing add 0.510000 0.000000 0.510000 ( 0.516527)
testing contracts add 5.930000 0.000000 5.930000 ( 5.933225)
5.09 seconds
Switching out .zip for .times saved me almost another second:
user system total real
testing add 0.510000 0.000000 0.510000 ( 0.507554)
testing contracts add 5.090000 0.010000 5.100000 ( 5.099530)
Turns out,
args.zip(contracts).each do |arg, contract|
is slower than
args.each_with_index do |arg, i|
which is slower than
args.size.times do |i|
.zip spends unnecessary time copying and creating a new array. And I think .each_with_index is slower because it just yields to .each in the background, so it involves two yields instead of one.
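Here's a quick benchmark sketch you can run to compare the three iteration styles yourself (illustrative only, not the gem's code):
require 'benchmark'

args      = [1, 2, 3]
contracts = [Integer, Integer, Integer]
n = 1_000_000

Benchmark.bm(18) do |bm|
  bm.report('zip') do
    n.times { args.zip(contracts).each { |arg, contract| contract === arg } }
  end
  bm.report('each_with_index') do
    n.times { args.each_with_index { |arg, i| contracts[i] === arg } }
  end
  bm.report('times') do
    n.times { args.size.times { |i| contracts[i] === args[i] } }
  end
end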
4.23 seconds
Now we're getting into some nitty-gritty stuff. The way the contracts library works is, for every method, a new method is added using class_eval (class_eval is faster than define_method). This new method contains a reference to the old method. When the new method is called, it checks the arguments, then calls the old method with the arguments, then checks the return value, and then returns the return value. All of this happens with two method calls to the Contract class: check_args and check_result. I eliminated these two method calls and did the checks right in the new method instead. This got me another .9 seconds:
user system total real
testing add 0.530000 0.000000 0.530000 ( 0.523503)
testing contracts add 4.230000 0.000000 4.230000 ( 4.244071)
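Roughly speaking, the generated wrapper ends up shaped like this (a hedged sketch with made-up names and simplified checks, not the gem's real code): the argument and return-value checks sit directly in the wrapper body instead of going through Contract.check_args and Contract.check_result.
class Adder
  def add(a, b)
    a + b
  end

  # Keep a handle to the original method where the generated code can see it.
  ORIGINAL_ADD = instance_method(:add)

  class_eval %{
    def add(*args)
      # argument checks, inlined instead of calling Contract.check_args
      args.each { |arg| raise "contract violation" unless arg.is_a?(Numeric) }
      result = ORIGINAL_ADD.bind(self).call(*args)
      # return-value check, inlined instead of calling Contract.check_result
      raise "contract violation" unless result.is_a?(Numeric)
      result
    end
  }
end

Adder.new.add(1, 2) # => 3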
2.94 seconds
Earlier I had explained how I was creating lambdas based on the type of Contract and then using those to check arguments. I switched this out to generate code instead, which would then get eval'd when I used class_eval to create my new method. An awful hack! But it avoided a bunch of method calls and saved me another 1.25 seconds:
user system total real
testing add 0.520000 0.000000 0.520000 ( 0.519425)
testing contracts add 2.940000 0.000000 2.940000 ( 2.942372)
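Here's a hedged sketch of what "generate code instead" means (hypothetical names, simplified checks): the contract is turned into a string of Ruby at definition time, and that string is interpolated straight into the class_eval'd method body, so no lambda or extra call is left at run time.
# Turn a contract into a string of checking code, once, at definition time.
def validator_code(contract, var)
  if contract.is_a?(Class)
    "raise 'contract violation' unless #{var}.is_a?(#{contract})"
  else
    "raise 'contract violation' unless #{var} == #{contract.inspect}"
  end
end

class StringAdder
  def add(a, b)
    a + b
  end

  # The checks are spliced into the generated method as plain code.
  class_eval %{
    def checked_add(a, b)
      #{validator_code(Integer, 'a')}
      #{validator_code(Integer, 'b')}
      add(a, b)
    end
  }
end

StringAdder.new.checked_add(1, 2) # => 3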
1.57 seconds
Finally, I changed how I called the overridden method. My earlier approach used a method reference:
# simplification
old_method = method(name)
class_eval %{
  def #{name}(*args)
    old_method.bind(self).call(*args)
  end
}
I changed it to use alias_method:
alias_method :"original_#{name}", name
class_eval %{
  def #{name}(*args)
    self.send(:"original_#{name}", *args)
  end
}
This bought me a crazy 1.4 seconds. I have no idea why alias_method is so fast... I'm guessing because it skips a method call as well as a call to .bind.
user system total real
testing add 0.520000 0.000000 0.520000 ( 0.518431)
testing contracts add 1.570000 0.000000 1.570000 ( 1.568863)
Results
We managed to get from 20 seconds to 1.5 seconds! Is it possible to do better than this? I don't think so. I wrote this test script that shows that a wrapped add method will be 3x as slow as a regular add method, so these numbers are really good.
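The test script is roughly this shape (a reconstruction, not the original script): wrap add in the thinnest possible decorator, with no checks at all, and compare.
require 'benchmark'

def add(a, b)
  a + b
end

# The thinnest possible wrapper: just the extra call and argument splat.
def wrapped_add(*args)
  add(*args)
end

n = 1_000_000
Benchmark.bm(14) do |bm|
  bm.report('plain add')   { n.times { add(rand(1000), rand(1000)) } }
  bm.report('wrapped add') { n.times { wrapped_add(rand(1000), rand(1000)) } }
end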
But the slowdown is only 3x because the method is so simple that most of the time goes to method-call overhead rather than the method body. Here's a more realistic example: a function that reads a file, called 100,000 times:
user system total real
testing read 1.200000 1.330000 2.530000 ( 2.521314)
testing contracts read 1.530000 1.370000 2.900000 ( 2.903721)
Very little slowdown! And I think most functions will see very little slowdown... the add function is an outlier.
I decided not to use alias_method because it pollutes the namespace, and those aliased functions would show up everywhere (documentation, auto-complete for IDEs, etc.).
Some takeaways:
- Method calls are very slow in Ruby! I enjoy making my code very modular and reusable, but it might be time for me to start inlining more code.
- Benchmark your code! Removing a call to a single empty method took me from 20 seconds to 12 seconds.
Other things to try
Method combinators
One feature that didn't make it into Ruby 2.0 was method combinators, which would have allowed you to write:
class Foo
  def bar:before
    # will always run before bar, when bar is called
  end
  def bar:after
    # will always run after bar, when bar is called
    # may or may not be able to access and/or change bar's return value
  end
end
This would've made it easier to write decorators, and might have been faster too.
keyword old
Another feature that didn't make it into Ruby 2.0; it would have allowed you to reference an overwritten method:
class Foo
  def bar
    'Hello'
  end
end

class Foo
  def bar
    old + ' World'
  end
end

Foo.new.bar # => 'Hello World'
redefining a method with redef
This one got rejected by Matz, who said:
To eliminate alias_method_chain, we introduced Module#prepend. There's no chance to add redundant feature in the language.
So if redef is a redundant feature, maybe prepend could be used to write a decorator?
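For what it's worth, a prepend-based decorator could look something like this (a minimal sketch with made-up names; not benchmarked against the approaches above):
# A contract check written as a prepended module: the wrapper calls super
# to reach the original method, with no alias_method or bound method needed.
module AddContract
  def add(a, b)
    raise "contract violation" unless a.is_a?(Numeric) && b.is_a?(Numeric)
    result = super
    raise "contract violation" unless result.is_a?(Numeric)
    result
  end
end

class PrependedAdder
  prepend AddContract

  def add(a, b)
    a + b
  end
end

PrependedAdder.new.add(1, 2) # => 3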
Other implementations
So far, all of this has been tested on YARV. Maybe Rubinius would allow me to optimize more?
References
- 6 Optimization Tips for Ruby MRI
- When monkey patching a method, can you call the overridden method from the new implementation?
If you are already using contracts.ruby, upgrade to v0.2.1 to see the speedup.