At the risk of being redundant, let's repeat this one more time. Werner Vogels on the scalability challenge:
I like this challenge, given that 50X is likely to be able to make impact, where 2-4X in general can be easily compensated for by the next generation hardware. But something bugs me about the challenge and also about some of the demonstrations in the papers; 50X is still focused on scaling-up, just as many of the current database systems do, instead of scaling out, which is what the world really needs. The evidence in the paper is indeed about single box performance. This continuing N=1 thinking will never yield systems that can break through the current scalability limitations of enterprise software, regardless whether it runs 50 times faster or not.
Not quite there yet. Manuel has a nice summary of processes and threads in Ruby, although the telling chink in the armor comes in the last comment:
Current Ruby cannot take advantage of fork and COW since the garbage collector walks the object space to mark objects. This means that you can fork and use COW memory *until* the GC is run in the child process. When that happens all the COW pages are marked dirty, so the child processes take as much memory as the parent.
The biggest problem with Ruby is not the model but the implementation, and in 1.8.6 forking is not a free lunch.
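To see why this bites, here's a minimal sketch of the fork-then-GC scenario described in that comment (the array size is made up for illustration): the child initially shares the parent's memory pages copy-on-write, but MRI's mark phase writes a mark bit into every live object, which dirties those shared pages.

```ruby
# Sketch of the fork + COW problem in MRI 1.8 (hypothetical workload).
large = Array.new(100_000) { |i| "item #{i}" } # allocated in the parent

pid = fork do
  # At this point the child still shares the parent's pages
  # copy-on-write. But the mark phase below touches every live
  # object's header, dirtying those pages, so the child's resident
  # memory balloons toward the parent's.
  GC.start
  exit!(large.empty? ? 1 : 0)
end
Process.wait(pid)
```

On a Unix MRI this runs fine; the point is that after `GC.start` the child no longer enjoys the COW savings that made forking look cheap.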
Read consistency. And obviously you've read about Amazon's Dynamo by now. The short version: it repurposes the database to store data, key/value pairs in this case, rather than to execute any application logic; consistency moves into the application. The long version: while it talks about one particular implementation Amazon uses, which may not be right for you, the document is broad enough to discuss other potential implementations. Lots of good material there.
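A toy sketch of what "consistency moves into the application" means, and not Dynamo's actual protocol (no vector clocks, no gossip, names are made up): the store writes to some replicas, reads return whatever distinct versions the replicas hold, and the application decides how to reconcile them.

```ruby
# Toy Dynamo-flavored store (illustrative only): the store returns
# possibly-conflicting versions; reconciliation is the app's job.
class NaiveStore
  def initialize
    @replicas = [{}, {}, {}]
  end

  def put(key, value)
    # Write to two of three replicas (a crude "write quorum").
    @replicas.take(2).each { |r| r[key] = value }
  end

  def get(key)
    # Read from all replicas; return every distinct version seen.
    @replicas.map { |r| r[key] }.compact.uniq
  end
end

store = NaiveStore.new
store.put("cart", ["book"])
versions = store.get("cart")
# The application, not the database, merges the versions.
cart = versions.flatten.uniq
```

The interesting design point is in `get`: a traditional database would hide divergent replicas behind a single answer, while here the caller sees every version and owns the merge.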
All this new stuff is confusing me. Search:
In effect, Erlang seems to be moving locks scattered in various places throughout your code (e.g. synchronized blocks in Java) into one single bottleneck.
In effect, Java seems to be moving deallocations scattered in various places throughout your code (e.g. object lifecycles in C++) into one single garbage collector.
Speeding J. On the advice of Charles Oliver Nutter and Nick Sieger, I decided to try JRuby from trunk. Unfortunately, I couldn't get the compiler to work against my code and had no time to change it; it's still experimental (JRuby trunk and my code both), so that's understandable. But I did bump again into the -O option, and I have to remember that for future reference.
With the -O option (no ObjectSpace support), I got one of the tests to run at CRuby speed, and another test to run half as fast, though that too (regular expressions) will be sped up in future releases. The rest won’t run since I’m forking.
I’m personally more interested in JRuby tackling the harder problems first, and leaving the trivial cases like regular expressions compilation for later, so in my opinion the twice-as-slow-today is plenty of fast, given the speed at which new JRuby releases come out.
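For the curious, a micro-benchmark of the sort the regular-expression test above might contain (hypothetical; my actual test suite isn't shown). Run it under both `ruby` and `jruby -O` and compare the wall-clock times.

```ruby
# Tiny regexp micro-benchmark (illustrative; numbers will vary by
# interpreter and machine).
require 'benchmark'

time = Benchmark.realtime do
  10_000.times { "hello, world" =~ /w(or)ld/ }
end
puts format("10k regexp matches: %.3fs", time)
```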