A collection of thoughts related to the challenges of software engineering


I, destructor

Although we publish NoSQL software, I have always disliked calling our product a NoSQL database. I know the third paragraph of our web page says that wrpme is a NoSQL database, but see, right after that it says that we prefer to call it a postmodern database, as proposed by Dr. Richard Hipp.

The obvious reason why we use this fancy word is that we’re French, which makes us pretty much the pinnacle of snobbery and pedantry. The other reason is that we weren’t able to come up with a better term.

There’s also the fact that we don’t like NoSQL because it’s a negative term that suggests our software is here to destroy Relational Database Management Systems (RDBMS), which are the one true evil.

I think I can safely say that no one is trying to destroy anything.

Where I say good things about RDBMS

I guess the NoSQL term originated from someone who was fed up with yet another perverted use of relational databases.

I’ve seen my share, I can relate to that feeling.

The first thing that comes to my mind is any software that uses an RDBMS to serialize data. That's pretty overkill, isn't it? However, I think my favorite is the “one table to rule them all” scheme. Wait. One table is good, but I’ve heard primary keys are good too, so if we make everything a primary key it must be very good. Oh, and more indexes please, I can’t get enough of them. I know that I will somehow need to search those binary blobs, so index them please. And index the indexes to make the indexes faster.

So yes, it can get very silly.

Let’s not forget one thing though: an incredible amount of intelligence has been poured into RDBMS for more than forty years. People much more intelligent than you, maybe almost as intelligent as us (and that’s saying a lot, see paragraph two), have worked very hard to solve extremely complex problems. And succeeded.

Please, please, please, make sure this sinks in: RDBMS work, and they work reliably, and they can adapt to your business case very well, and they are fast when you account for everything they do. They have their limits - like everything in the universe - but every time you book a flight, order a book or send money they’re proving to you and to the world how dependable they are.

NoSQL engines are for the most part crude, useless and unreliable and as for us, we know we still have a long way to go in terms of flexibility, features and proven reliability.

When you complain that your relational database is too slow, the problem is not the database. The problem is most likely how you use it.

Let’s talk performance

Speed limit, curve ahead

So, am I killing our business? Not really.

RDBMS are fast, but NoSQL postmodern databases can be damn fast. Although you may not need the speed, you may like the fact you need less computing power to handle the same load.

Additionally, to be fast, relational databases have to be used properly. Let’s be realistic for a second: it’s hard to be good at SQL. Non-relational databases are “more obvious” and closer to how the typical programmer thinks, and for simple use cases you are more likely to do the right thing with a NoSQL engine than with a relational one.

Ever tried hammering a reasonably sized RDBMS with one thousand distinct clients? A real one? With atomic, consistent, isolated and durable (ACID) transactions? With each client querying the database like there’s no freaking tomorrow? Did it also end up with a database administrator in the air vent with a crossbow aimed at you? I think I made my point.

RDBMS, in certain contexts, can be slow because one of their best features, ACID transactions, comes with a hefty price. This is not because RDBMS are poorly done, it’s because to truly ensure that your transaction is atomic, consistent, isolated and durable the database needs to do a lot of work.

And while we’re on the topic of ACID transactions, this important feature is also the reason why relational databases don’t scale very well. Distributing ACID transactions is difficult. You can’t just add commodity servers and expect a linear increase in performance.

Did I say difficult? I wanted to say near-impossible.

To scale a modern relational database, you partition data into buckets and spread the load over buckets (this is an over-simplification, but bear with me). This is called partitioning or sharding, depending on how you split the data. The limit with this approach is that it requires carefully planning the partitions as it’s much more difficult to adjust later. This is not unlike partitioning a hard drive when you install your operating system. This is generally not something you tweak later on production systems, even if you can.
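To make the "difficult to adjust later" part concrete, here is a minimal sketch, assuming the simplest possible scheme: hash the key and take it modulo the number of buckets. The function names (`bucket_for`, `moved_fraction`) and the `customer-N` keys are made up for the illustration; real engines use more elaborate strategies.

```python
import hashlib


def bucket_for(key: str, bucket_count: int) -> int:
    """Map a key to a bucket with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys that land in a different bucket after resizing."""
    moved = sum(
        1 for key in keys
        if bucket_for(key, old_count) != bucket_for(key, new_count)
    )
    return moved / len(keys)


# Made-up keys, only there to have something to hash.
keys = [f"customer-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of the keys change bucket when going from 4 to 5 buckets")
```

With this naive scheme, adding a single bucket relocates the large majority of the keys, which is why the initial layout deserves careful planning.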

I know, I know. Some new engines are coming out, claiming they can offer the speed of NoSQL and the reliability of ACID transactions.

Are they lying?

Well, we haven’t benchmarked them (yet), but one thing is certain: if they offer truly ACID transactions, they have a performance tax to pay. They can be clever about it and there’s clearly room for achieving great things, but they will always be disadvantaged.

In other words, state of the art relational databases with ACID transactions will always be an order of magnitude slower than state of the art non-relational databases without ACID transactions.

Let’s talk money

The other big problem is that over time relational databases became bloated with “wtf” features, because you really need to be able to do Java inside SQL, right?

The dark shroud of enterprise software obfuscated the qualities of otherwise fine products. That means you will need someone to shepherd the weak through the valley of darkness: a database administrator.

Do you really need more weird people in your organization? I submit you do not.

Which brings us to a topic top management understands very well: storage is an order of magnitude more expensive on relational databases. You see that terabyte drive you can buy for 50 €? Want to add a terabyte to your RDBMS? That will be 5 000 €, thank you very much (I’m not making this up).

This last piece of information probably helps you understand why there is a strong interest in non-relational databases as we enter the yotta world.

When should you go non-relational?

Architecture

RDBMS are great, but they’re not great at everything and sometimes they truly suck.

That’s why we have non-relational databases, because if you throw away relations and transactions, you can do interesting things in terms of processing, scalability and pure performance.

But do you really want to throw away relations and transactions? Do you need to?

How many gadgets does Batman have? We will agree on a number much greater than one. Since you’re probably an order of magnitude below Batman in terms of awesomeness, you will agree that you need to be at least as well prepared as he is. In other words: tool up.

That’s why, if you start a new project, you should definitely consider non-relational databases and include them in your architecture. I guarantee you that you have non-relational data that will be cheap and efficient to store in a postmodern database, and you might even be able to go fully non-relational! Ask us or our beloved competition if you need help.

For existing projects however, we’ve seen that many performance problems can be solved with database tuning and proper caching.

Nevertheless, once you have done that, you may still have performance issues. The thing to understand is that the transition from a relational to a partially (or fully) non-relational schema can be extremely disruptive. One approach is to locate the “hot” data and either duplicate or relocate it into a postmodern database.
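Here is a rough sketch of the "duplicate the hot data" approach, under loud assumptions: a plain Python dict stands in for the postmodern database, an in-memory SQLite database stands in for the RDBMS, and the `users` table and `HotDataReader` class are invented for the example.

```python
import sqlite3


class HotDataReader:
    """Serve hot reads from a key-value store, fall back to the RDBMS."""

    def __init__(self, connection, fast_store=None):
        self.connection = connection
        # A dict stands in for the non-relational store (assumption).
        self.fast_store = {} if fast_store is None else fast_store

    def get_name(self, user_id):
        # Fast path: the value was already duplicated into the store.
        if user_id in self.fast_store:
            return self.fast_store[user_id]
        # Slow path: ask the relational database, then keep a copy.
        row = self.connection.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            return None
        self.fast_store[user_id] = row[0]
        return row[0]


# Hypothetical schema and data, only for the demonstration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
connection.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")

reader = HotDataReader(connection)
first = reader.get_name(1)   # read from the relational database
second = reader.get_name(1)  # read from the duplicated copy
print(first, second)
```

The relational database stays the source of truth here. The copy can go stale when the row changes, so a real setup also needs to decide when duplicated entries get refreshed or dropped.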

I could write a lot more on this topic actually and there’s much to be said.

What I wanted to show is that NoSQL is more about shifting the balance a little away from the relational side than about killing RDBMS. Maybe the term AltSQL is a better one, as it is a reminder that we’re trying to find new solutions, not trying to demean existing ones.

As for us we will stick to the term postmodern database for now and throw a party for overloading another customer’s network (true story that will be the topic of another post).

January 5th, 2012

For New Year’s Eve, my significant other and I watched Quadrille. An elderly man sat next to me and, instead of patiently waiting for the play to start, took out his smartphone and played a game.

A couple of weeks ago, a friend of mine had a heart attack. As he felt a strong pain in both arms, he called for help. Less than fifteen minutes later firemen arrived and took him to the hospital. As soon as he arrived he was promptly directed to the operating theater. There, the doctors inserted a needle in his arm, reached the heart, removed the thrombus and placed a stent. Two days later he was discharged. Except for his daily medicine intake and a very small scar near the wrist, it’s impossible to know he had a heart attack less than a month ago.

Cars no longer need people to operate them and robots move like humans. If you combine both, it’s not difficult to imagine fully automated delivery services. For example, you could order an item online and have it delivered to you without any human intervention.

And as I type these words, a man-made object is leaving the solar system...

Happy new year 2012. Welcome to the future.

November 5th, 2011
Parrots

Parrots are wonderful creatures of Earth. How many animals can you actually talk to?

Not that many.

One might retort that parrots only stupidly repeat what they hear.

A valid point, except it's completely wrong.

Parrots are amongst the most intelligent birds on Earth. Some species are even able to associate words with meanings. The African Grey Parrot, in particular, has a degree of intelligence comparable to dolphins and great apes. They understand our language to the point they can form sentences and even display a sense of humor.

Owning a parrot is a huge responsibility as it's likely to outlive you. Moreover, it will become depressed if you are away for too long, or if it is bored or hasn’t been raised properly.

Surely, the last thing you want is to have a bird looking at you with sad eyes as she loses her feathers.

On the other hand, if you have the time, dedication and patience, one day you will be able to walk around with your pet sitting on your shoulder.

As she shares witty comments with the audience you will realize one thing: you just made one huge step toward becoming a mighty pirate.

September 26th, 2011

I recently came across a great blog post asserting - based on numerous studies - that size is the best predictor of code quality.

According to my own experience, this is spot on.

I can think of a couple of explanations:

  1. Concise code is generally the work of experienced and skilled developers.
  2. Removing dead code greatly increases code quality.
  3. Refactored code tends to be smaller (Adding features is not refactoring).

Points 2 and 3 result in a smaller code base and are a sign of a continuous code review process ("no code is set in stone" principle). In my experience, re-reading the code regularly is one of the greatest contributors to code quality, much more than unit testing and continuous integration (which are nevertheless required to make frequent refactoring possible).

I know that the best bug fixes I did were generally fixes that reduced the line count. Those bug fixes were generally a case of "Who the hell did something that complex for a problem that simple?! Oh wait, I did...".

In other words, if you put effort into code quality, this will generally result in reduced code size. This, to me, better explains the correlation between size and quality: concision is more an indicator of quality than a cause.

March 28th, 2011
Silence

Few are agreeable in conversation, because each thinks more of what he intends to say than of what others are saying and listens no more when he himself has a chance to speak.
-François de la Rochefoucauld

You’ve certainly experienced meetings where you told yourself "This person isn’t saying anything interesting and this meeting is going nowhere."

In my case it's not really a problem since I'm the boss and I can punch people in the face and get away with it (you probably can't). By the way being allowed to kill people is one of the top reasons why you should start your own startup, provided you keep the death toll within the limits allowed by the law.

But I digress. What can you do to help the meeting go somewhere?

As strange as it may sound, talking less is part of the answer. This is because silence alleviates stress and forces you to actually listen. When you listen, you communicate better. Better communication means better meetings.

An experiment

At Bureau 14 we're pretty awesome scientists. And I'm not talking about all the computer science we do all day (C++ is more black magic than science actually), but more like the one that confirmed that interns can't survive without oxygen.

The one you can do is the following: the next meeting you attend, do not talk at all.

It’s highly likely you will notice the following things:

  1. Every attendee is saying the same thing over and over with a varying degree of eloquence;
  2. The amount and the distribution of information is unchanged by the meeting (unmodified entropy);
  3. What you could have said wasn't that useful, intelligent or unique (ouch!).

Problem? Solution?

You might tell me: "I'm going to enforce some rule during meetings to fix that":

Unless one has a genuine question, needs information or has something valuable and related to share there is absolutely no reason to speak.

That sounds like a pretty obvious rule on which everybody could easily agree. But that would not change a thing.

Let's be realistic for a minute.

We all think that what we say is pretty clever. Actually what happens is that our overinflated ego makes us believe it's ok to rehash what the person next to us just said because, really, it's important to participate and it's less boring to spend one hour talking than listening and by the way it's probably a good opportunity to digress and talk about the latest Family Guy episode.

Newsflash: we're not that intelligent and what we have to say doesn't matter that much. Actually most of the time it's completely off topic.

In other words, we may feel that we are genuinely contributing when we aren't.

Stop creating rules

Stop believing that every problem must be taken in your own (capable?) hands and needs to be addressed. Stop "managing issues". This is something much more complex and inherent to human nature than just a "problem". It needs education.

The education process is a simple matter of forcing yourself to speak as little as possible, as if you were underwater and you had to keep your precious oxygen for mundane things such as staying alive. Every time you want to speak, double check that it is relevant and new. Once you've double checked, check again, and again. Listen more, observe.

After a while, you will talk less, be more relaxed and listen more. You'll be a better participant and people will seek out your presence. This is the moment when you will realize that contributing to the success of your company is more a matter of listening than a matter of talking.