Fixing the Singleton

When you ask people who know almost, but not completely nothing about patterns, about software design patterns, they probably bring up the singleton pattern. It’s extremely easy: Just make sure you have one single instance of a class.

If you ask more experienced developers about the singleton, they will probably tell you its bad and should be avoided.

What is so bad about the singleton that it fell from the heavon of patterns into the hell of anti patterns?

If you check the description of the singleton pattern in the Gang of Four book it talks about two characteristics of the singleton:

  • There can be only  a single instance
  • There is a central access point to that instance

And there really is nothing wrong about the first point. Sometimes you really can’t have more then one instance, but even if multiple instances don’t hurt, if they don’t give you any benefit (i.e. if they are stateless) you might as well reuse a single instance.

The problem is really with the central access point. This makes the singleton look and behave like a global variable. And how should we obtain our dependencies? Correct through Dependency injection.

So the next time you think: “Looks like a singleton to me”. Go ahead, create a singleton. But also create an interface which gets implemented by the singleton and make it very clear that nobody may use this singleton except via its interface and dependency injection through constructor.

Posted in: Softwaredevelopment by Jens Schauder 6 Comments , ,

Tips for Testing Database Code

Almost everybody understands that source code belongs into version control. Many people understand we need to test our code. Quite a few do that automatically.

But everything seems to change when it comes to databases. A lot of stuff in and around databases goes untested. Heck some of the scripts don’t even live in version control. If you can’t believe that, because you haven’t seen it, you are a damn lucky bastard.

At least one reason for this state of affairs is obviously: It hurts. Database tests tend to be slow, interdependent and hard to maintain. Granted. But you know what: If it hurts, you should do it more often. It will teach you what exactly the things are that cause pain when testing and eventually you’ll find approaches to make it less painful. Here are some ideas I found helpful when testing database related code:

  • Give every developer her own database. This forces you to find a way to set up the database fast, easy and reliable. If your application lives in a single user/schema/namespace it is sufficient for each developer to have his own user/schema/namespace in a single database. For this to work though …
  • … the application should be user/schema/namespace agnostic. It makes it much easier to create multiple instances one a single server.
  • Let the application live in a single user/schema/namespace. If you have multiple interdependent namespaces (e.g. for modules) you’ll have a hard time making them agnostic of the names.
  •  Have separate instances for CI, Demos, QA and so on. Actually ideally it should be trivial to create a fresh instance.
  • Stay away from any tool that comes with its own repository. If have seen about a dozen of such tools and although some looked promising in the beginning, they all completely failed to integrate with other tools on the development side of things. Examples of such tools are tools for code generation from UML or ER models and ETL tools.

The previous points where about the setup of your database and infrastructure. Lets have a look at the code:

  •  Favor a proper Language (like Java, C, PHP …) over database specific languages like T-SQL or PL/SQL. If you are wondering why, make a comparison between your favorite DB language and your all purpose language. For which do you get the better IDE, Unit testing support, code analysis, code coverage and so on. Reconsider your main language if it doesn’t win in that comparison.
  •  Have a small layer that does all interaction with the database. Make sure no SQL or other database dependent code leaks out. Inject that layer as a dependency into anybody in need of it. This will allow you to test almost everything without worrying about the database. Only the tests for that integration layer actually needs a database for testing.
  • Favor database independent SQL or a library abstracting away the differences of various databases. Back in the time people claimed they needed that in case they have to switch database vendors, which never happened. Now it does. See below.

The next points will deal with the actual tests:

  • Consider an in-memory-database for testing. Many databases can run in an in-memory-mode. They are great for testing, because you can throw them away after the test, and they are way faster then any database writing to disk. This of course is only possible when you work with a database system that can run as a in-memory-database or if your code is database independent. Hence the previous point.
  • If you can’t use your database as in memory database, consider putting it on a RAM disk. We got a huge performance gain for our tests with this approach. A solid state disk might be the next best thing although I’m not sure how it would react to the heavy load of continuous database tests.
  • Make test failure messages so explicit that you don’t have to look into the database for analyzing test failures.
  • Use code for setting up your test data. Make it nice and concise. If you need a row in a table without special requirements for its values you should be able to create that with a single trivial line of code, no matter how many foreign keys the table contains. I.e. you should have a little DSL for your test data. Doing it with plain code will enable all the refactoring power of your IDE for your tests. For load and performance tests other approaches like loading production data or large amounts of generated data might be suitable.
  • Make sure your tests clean up after them selves. There are two approaches I found usable in most cases:
  1. Recreate the schema for every test. This is slow but really safe.
  2. Do a rollback after each test. This only works when there is no commit inside the test. The mean thing is: If a test tries to rollback, but fails because there was a commit inside the test some completely different test will fail. It can be a really frustrating task to find the bad test in such a case.

We covered the testing of  database related code inside your application. But there is another kind of code we need to deal with: Scripts for deploying your application (or upgrading it)

  • The scripts that change your database schema from one version to the next are source code just like everything else. Therefore they belong under version control and should get tested continuously.
  • The testing process is really simple: Create a database as it looks now. Apply your change scripts and verify it looks as desired.
  • For verifying the resulting schema it is useful to have a script that creates your target database (or at least the schema) from scratch, so you compare it with the result of the test.
  • For performance reasons you might want to test this with an empty database first.
  • But don’t forget to run it on an instance with realistic data as well. Hint: adding a column to huge tables can take loooooong.

Happy testing everybody.

 

The Right Amount of Up-Front Design?

Maybe its just because I’m following Simon Brown on twitter, but I saw a phrase like “just the right amount up-front design” often enough to make me think.

Am I doing the right amount of up-front design?

I never before thought about this question in my software projects. But I also hardly ever had the feeling of “damn we did to much or to little of it back then”. How can this be possible? The reason might be that in some sense my approach to hitting the sweet spot for the amount of design is extremely easy:

Design as much as possible, as early as possible! But not more or earlier!

So some people might imagine me now sitting for weeks in a row working with UML tools and hammering out diagrams and design documents. Nothing could be farther from the truth. The key is in the interpretation of  when design is possible. For me design is all about solving a problem. So in order to do some designing one needs a problem. If you have found a problem do as much design for it as possible. ‘much’ doesn’t get measured in hours or number of pages or diagrams. ‘much’ gets measured in the extend to which the problem gets solved. If you can design a solution that makes a problem go away. Great! If you only find one that makes it a tiny problem: It will have to do until you find a better solution.

But now we are stuck with trying to find a better solution for a not completely solved problem, right? Wrong! In reality we have lots of problems in a project, so we pick the worst one whip up the solution, until the next problem is bigger then the first.

So if you are spending time with a problem that isn’t the most pressing one, you are doing to much up-front design. But everybody would immediately stop doing it once he realizes it, right? If on the other hand you are knowingly ignoring your biggest problem you are doing to little up-front design. But the more common name for this is sabotage.

Of course it does happen that what you thought is a great solution turns out as a bad idea, or that you just can’t find a solution for a especially bad problem, or a problem jumps at you from behind. Congratulations you found the border of your skills. Its an opportunity to learn an to train you problem identification or solving skills. It is not the time to decide, I should do more of it (or less) next time.

It really is a simple process. Yet software design is considered difficult. So where is the difficult part?

There are actually three difficult parts:

  • Seeing the problems as early as possible, so you can solve them while they are small
  • Finding good solutions fast.
  • Having a great team that understands that every software developer is a software designer
Posted in: Softwaredevelopment by Jens Schauder No Comments