JavaScript Software-Architecture 04 Jun 2016

Software Architecture with JavaScript - In which part of your code should you do sanity & validation checks on your data?

When working on an application that is divided into multiple architectural layers, the developers on a team often do not trust the data that their functions in one layer receive from a function in the same or in another layer. For that reason they add null checks (very common in C#) or, in JavaScript’s case, if (variable) {...} checks (which rely on the implicit coercion of the value to a boolean) in order to catch an error before it happens and avoid breaking the execution of the code.

But is it really that bad for a function to fail, or for your application to throw an exception when something did not go as it was supposed to?

Let us consider the following architecture of layers of an application:

Presentation layer (The UI of your application) ->
API layer (The public interface of your backend to the outer world) ->
Business layer (The logic of your application. This could be also done in the API layer) ->
Data layer (The wrapper code which stores data into a permanent medium (SQL, NoSQL database, etc.))

Since this article is focused on JavaScript, we assume that the client uses a JavaScript framework like Angular.js or React and that the backend runs on node.js.

In the functions of which layer should we do sanity checks on the data? By sanity checks I mean not only null-testing but also security validation of incoming data, for example from a web form that a user has just filled in.

Should you check for null (and the other falsy values in JavaScript, such as "", 0, undefined and false) in every function, so that your code is as secure as possible? Or should you designate one layer in your architecture as a virtual safety border and do the validation checks only there?

I am a big fan of the second option. Defensive coding is a great programming strategy, but it should be applied only in the areas where it is really needed. I prefer to define a point in my application where the incoming data is checked and validated. Beyond this border I can safely treat the data as trusted and validated. Here are some architectural tips for you:

Much has already been written about this, so I am not going to discuss it in detail, but the fact is that you have to validate twice: once in the client and once on the server where your application logic runs. You cannot trust your users; client-side validation is only an add-on that improves the user experience. Only in the backend can you do the final validation of the data.
The question now is: where in the backend? I suggest doing the validation (both security and sanity validation) at the very first place where data from a client lands on the server, for example in the functions of your API. I like to call these functions guard-functions. Do it there! Lean management tells us to “fail, but fail fast”, and the same applies here. You do not want to pass invalid or security-threatening data from one function to the next. You want to catch the invalid data as soon as possible and, depending on your application’s strategy, inform the client or just add a new log entry to your database.
- In your API functions, before you do any kind of processing on the received data, validate it in terms of security. If you miss a security validation, your application might be at great risk.
- Also check this data for invalid values, for example null. Take great care here, since the standard if (variable) {...} check rejects data that should be considered valid. An example is the number 0, which could be an ID or the number of children that someone has.
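
The two points above can be sketched as a small guard-function. This is only an illustration under assumptions: the function names (isMissing, guardUpdatePerson), the request fields (id, children) and the validation rules are hypothetical and not taken from any particular framework.

```javascript
// Hypothetical guard-function for an API endpoint that receives
// { id, children } from a client. Names and rules are illustrative only.

// Only null and undefined count as "missing"; 0, "" and false do not.
function isMissing(value) {
  return value === null || value === undefined;
}

function guardUpdatePerson(body) {
  if (isMissing(body) || typeof body !== 'object') {
    throw new TypeError('Request body must be an object');
  }
  if (!Number.isInteger(body.id) || body.id < 0) {
    throw new TypeError('id must be a non-negative integer');
  }
  if (!Number.isInteger(body.children) || body.children < 0) {
    throw new TypeError('children must be a non-negative integer');
  }
  // Return only the validated fields; beyond this point the data is trusted.
  return { id: body.id, children: body.children };
}

// A plain truthiness check would reject this perfectly valid input.
const input = { id: 0, children: 0 };
console.log(Boolean(input.id));        // false, although 0 is a valid ID here
console.log(guardUpdatePerson(input)); // { id: 0, children: 0 }
```

The explicit comparison against null and undefined is what keeps 0 valid; a check like if (body.id) would have thrown it away.
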
Do not fall back to default values when the value of a parameter coming into your function is not the one that was expected. There is no reason to hide the problem by assigning a default value to the parameter. The developer of the next function in the chain is going to thank you for it! A better approach is to throw an exception and leave the rest to the global exception handler (described further below). In JavaScript, when you throw an Error object you also get the stack trace up to the point where the error was created. This can help you greatly when you debug your application.
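
A minimal sketch of this tip, assuming a hypothetical function pageSizeOrThrow with a made-up rule for its parameter:

```javascript
// Hypothetical function; the name and the rule for pageSize are illustrative.
function pageSizeOrThrow(pageSize) {
  // Tempting but problematic: `pageSize = pageSize || 20` would silently
  // hide a bug in the caller. Throwing makes the problem visible instead.
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new Error('pageSize must be a positive integer, got: ' + String(pageSize));
  }
  return pageSize;
}

try {
  pageSizeOrThrow(undefined);
} catch (err) {
  console.log(err.message);
  // The stack property is not part of the language standard, but node.js
  // and current browsers provide it as a string.
  console.log(err.stack);
}
```

In real code you would normally not catch the error right next to the call as done here for demonstration; you would let it travel up to the global exception handler.
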
What about the rest of your server functions that do not belong to the API layer of your application? Since these functions are called after the guard-functions in your API layer, they do not have to repeat the same validations or check for null, undefined, etc. You already checked that! Why would you do it again? Repeating the checks makes your application slower and produces code that is harder to maintain, since every developer on your team will take it for granted that they have to check for everything everywhere.

Nevertheless, if you missed a sanity check in the API functions, a null reference would break the normal execution of your code (which is OK), and execution would pass to a catch-all exception handler. After this error your application should remain available and ready to serve the next requests.

This exception handler is window.onerror in the browser and the uncaughtException event in node.js. In node.js, keep in mind that the official documentation treats uncaughtException as a last resort: the process may be in an inconsistent state afterwards, so the usual practice is to log the error, exit, and let a process manager restart the application rather than carry on. You have to register functions for these events; in the case of an unexpected exception they take over from there and do what has to be done:
- Log the exception in your database and/or notify someone on your team by email
- Send a simplified error message to the client (without any stacktrace information!)
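
A rough sketch of such a catch-all handler follows. The helper names (logException, toClientError) are hypothetical, and writing to a database or sending an email is represented by a plain console call. The node.js branch exits after logging, which is what the node.js documentation recommends for uncaughtException; restarting the process is assumed to be handled by a process manager.

```javascript
// Hypothetical helper: stands in for a database log entry or an email.
function logException(err) {
  console.error('[unexpected]', err && err.stack ? err.stack : err);
}

// Builds the simplified payload for the client. It deliberately ignores
// err.message and err.stack so that no internals leak out.
function toClientError(err) {
  return { status: 500, message: 'An unexpected error occurred.' };
}

if (typeof process !== 'undefined' && typeof process.on === 'function') {
  // node.js: last-resort handler for exceptions that nobody caught.
  process.on('uncaughtException', (err) => {
    logException(err);
    process.exit(1); // assumption: a process manager restarts the app
  });
} else if (typeof window !== 'undefined') {
  // Browser: the fifth argument is the Error object in current browsers.
  window.onerror = (message, source, lineno, colno, error) => {
    logException(error || message);
    return false; // keep the browser's default error reporting
  };
}

// What a client would receive instead of the real error details:
console.log(toClientError(new Error('connection to db-01 refused')));
```

A process-level handler has no response object at hand, so in a server the simplified message is typically sent from the error handling of the HTTP framework you use, with toClientError (or something like it) producing the body.
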
In the data layer, work with parameterized queries: normal functions that take the dynamic values as parameters and hand them to the database separately from the SQL text (or escape them properly) instead of concatenating them into the statement. There are plenty of JavaScript libraries that act as a wrapper between your application and the database. This way you prevent SQL injection attacks on your application.
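
To illustrate the difference, here is a small sketch that only builds the query and does not talk to a database. The function names and the users table are hypothetical. The { text, values } shape with $1 placeholders is the one node-postgres accepts; other drivers use different placeholders (for example ?), so treat the exact shape as an assumption about your driver.

```javascript
// Hypothetical data-layer helper: the SQL text is a constant and the
// user-supplied value travels separately in `values`.
function findUserByNameQuery(name) {
  return {
    text: 'SELECT id, name FROM users WHERE name = $1',
    values: [name],
  };
}

// Unsafe alternative, shown only for contrast: the input becomes part of
// the SQL statement itself.
function findUserByNameUnsafe(name) {
  return "SELECT id, name FROM users WHERE name = '" + name + "'";
}

const malicious = "x'; DROP TABLE users; --";
console.log(findUserByNameQuery(malicious).text); // SQL text is not affected by the input
console.log(findUserByNameUnsafe(malicious));     // input has become part of the SQL
```

With the parameterized version, the database driver receives the statement and the value as two separate things, so the value can never change the structure of the statement.
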

Let me know your thoughts about validating incoming data in a comment.