October 17, 2018

Short-Circuit Evaluation in N4JS

Short-circuit evaluation is a popular feature of many programming languages and also part of N4JS. In this post, we show how the control-flow analysis of the N4JS-IDE deals with short-circuit evaluation, since it can have a substantial effect on the data flow and execution of a program.



Short circuit evaluation is a means to improve runtime performance when evaluating boolean expressions. This improvement is a result of skipping code execution. The example above shows an if-statement whose condition consists of two boolean expressions that combine the values of 1, 2 and 3, and its control flow graph. Note that the number literals are placeholders for more meaningful subexpressions.

First the logical and, then the logical or gets evaluated: (1 && 2) || 3. In case the expression 1 && 2 evaluates to true, the evaluation of the subclause 3 will be skipped and the evaluation of the entire condition results to true. This skipping of nested boolean expressions is called short circuit evaluation.

However, instead of skipping expression 3, expression 2 might be skipped. In case condition 1 does not hold, the control flow will continue with condition 3 right away. This control flow completely takes places within the if-condition, whereas the former short circuit targets the then block.

The reasoning behind short circuit evaluation is that the skipped code does not affect the result of the whole boolean expression. If the left hand side of the logical or expression evaluates to true, the whole or expression also does. Only if the left hand side is false, the right hand side will be evaluated. Complementary, the right hand side of a logical and expression is skipped in case the left hand side evaluates to false.


Side Effects


Risks of short circuit evaluation might arise in case a subexpression has side effects: These side effects will not occur if the subexpression is skipped. However, a program that relies on side effects of expressions inside an if-condition can be called fragile (or adventurous). In any case it is recommended to write side-effect free conditions.


Have a look at the example above. In case variable i has a value of zero, the right hand side expression i++ is executed, otherwise, it is skipped. The side effect here is the post-increment the value of i. If the value of i is other than zero, this value will be printed out. Otherwise, the value will be incremented but not printed. The control flow shows this behavior with the edge starting at i and targeting the symbol console.


Loops


Loop conditions also benefit from short circuit evaluation. This is important to know when reasoning about the all possible control flow paths through the loop: Each short circuit will introduce another path. Combining all of them makes data flow in loops difficult to understand in case of side effects in the subconditions.


Creative use of short circuit evaluation


Misusing short circuit evaluation can mimic if-statements by using expressions but without using the language feature of conditional expressions (i.e. condition() ? then() : else()). This could be used when if-statements should be executed e.g. when passing arguments to method calls, or when computing the update part of for-loops.





The picture above shows the two versions: the first uses an if-statement and the second uses an  expression statement. These two statements call the functions condition, then and end. Depending on the return value of condition, the function then is executed or not. Consequently, the printouts are either "condition then end" or "condition end", depending on the control flow.

The corresponding control flows are depicted on the right: The upper three lines refer to the if-statement, and the lower three lines to the expression statement. They reveal that the expression statement behaves similar to the if-statement. Note that the control flow edge in the last line that skips the nodes end and end() is never traversed since the logical or expression always evaluates to true.

The interested reader would find more details about the N4JS flow graphs and their implementation in the N4JS Design Document, Chapter: Flow Graphs.


by Marcus Mews

September 20, 2018

Null/Undefined Analysis in N4JS

The N4JS IDE integrates validations and analyses that are quite common for IDEs of statically typed languages. However, these analyses are seldom available for dynamically typed languages like N4JS or TypeScript. In this post we present the null/undefined analysis for N4JS source code.




TypeError: Cannot read property of undefined
- Developer's staff of life


The runtime error above occurs pretty often for JavaScript programmers: A quick search on Google returned about 1.2 million for the term TypeError: Cannot read property of undefined. When constraining search results to site:stackoverflow.com the query will still yield 126 thousand results. These numbers are comparable to the somewhat similar error NullPointerException which has about 3 million hits on Google and about 525 thousand when constrained to stackoverflow.com. Some of these results are caused by rather simple mistakes that a null/undefined analysis could detect. As a result, the developer could restructure his code and remove these potential errors even before he runs his first test and hence save time.


Null/Undefined Analysis


The N4JS IDE provides static analyses to indicate problems when it detects a property access on a variable which can be null or undefined. The analysis considers all assignments that occur either through a simple assignment expression or via destructuring. Loops, conditional expressions (e.g. i = c? 1 : 0;) and declaration initializers are respected as well.

The screenshot above shows a first example where a potential TypeError is detected. Since there exists at least one control flow from v.length backwards to all reachable definitions of v,  such that one definition assigns null or undefined to v, a warning is issued telling that v may be undefined.

To make sure that the analysis will produce fast results, it is implemented within some limitations. One is that the analysis is done separately for each body of a function, method, etc. (i.e. intra-procedural analysis). Hence it lacks knowledge of variables that cross the borders of these bodies such as the return value of a nested function call. In addition, property variables (such as v.length) are not analyzed since this would require the analysis to be context sensitive to the receiver object (here v). However, these limitations are common for static analyses of statically typed languages and still allow to detect many problems regarding local variables and parameters.

Usually, the analysis makes optimistic assumptions. For instance it can happen that a local variable receives the value of a method call or another non-local variable. In this situation the analysis assumes this value is neither null nor undefined. The same is true for function parameters. Only when there are distinct indications in the source code for a value of a local variable to be null or undefined, the analysis will issue a warning.


Guards

 

Sometimes the programmer knows that a variable may be null or undefined and hence checks the variable explicitly, for instance using if (v) {...}. As a result this check disables the warning in the then-branch that complies to the execution semantics.



As shown in the screenshot above, neither at the expression w.length < 1 nor at the statement return w.length; a warning is shown. Of course, the else-branch of such a check would consequently always indicate a warning when a property of variable v is accessed. Checks for conditional expressions and binary logical expressions (e.g. v && v.length) are also supported. A reader might think: "In case w is null the expression w.length would fail." True, but in this example the analysis detects the value of w being undefined. In case null might have been assigned to w e.g. in an if-condition before, the analysis will issue a warning of w being null at the two w.length expressions.


Data Flow

There are situations where the value of a variable is null or undefined due to a previous assignment of a variable which may have been null or undefined before, like shown in the example above. Then, the null/undefined dereference problem occurs later when a property is accessed. Since the analysis respects data flow, it can follow the subsequent assignments. Hence a warning is shown at a property access indicating the null or undefined problem. Moreover, the warning also indicates the source of the null or undefined value which would be the variable w in the example above.

The interested reader would find more details about the N4JS flow graphs and their implementation in the N4JS Design Document, Chapter: Flow Graphs.



by Marcus Mews

August 29, 2018

Control flow graphs in N4JS

The N4JS IDE comes with several tools to get insights into the source code such as the AST and the control flow. In this post we present the control flow graph view.



When learning a programming language, we probably start with a small Hello World! program. From there on, we learn not only language keywords and libraries, but also get an implicit understanding in which order statements and expressions are executed. This order is called control flow. For instance, we learn the effects of if and for statements which are also called control statements, since they have a big influence on the execution order. The most difficult statement in this respect is probably the try-finally control statement, which is explained later.


Hello World


function f() {
  console.log("Hello World!");
}

Let's have a look at the Hello World! from a control flow perspective, first. The source code of Hello World! is simple, but it already consists of three important elements:
  • the method call console.log,
  • the nested argument "Hello World!", and
  • the function body of f() which contains the two elements mentioned above.
The function body is also called control flow container.


The image above shows the control flow graph of the Hello World! example. The method call is separated into the receiver and its property log. The next element the control flow goes to is the argument "Hello World!" until it reaches the method call of log as the final element.


Loops


The example above showed the succeeding control flow of some code elements. This control flow gets more interesting in statements that introduce branches such as loop statements do. In the example below, the control flow of a for-loop is shown. To indicate the start and end of the function, and also the body of the loop, the function calls start(), loop() and end() are used.

function f() {
  start();
  for (var i=0; i<2; i++) {
  loop();
  }
  end();
}


The control flow of the loop example shows branches and merges. After the condition i<2, either the body is entered via the edge named LoopEnter, or the loop is exited. In case the body was entered, the control flow first targets the call to loop() and then goes back to the entry of the condition.


Try-Finally


Finally blocks have tricky semantics since they can be entered and exited in two specific ways: normally and abruptly. Abrupt control flow occurs after a return, continue, break or throw statement. These will introduce a jump that targets either the end of the function or the entry of a finally block. In case the control flow jumps to a finally block, this finally block will be executed normally. However, since the block was entered abruptly, it will exit abruptly again. This means, that there will be a second jump from the exit of the finally block to either the end of function or the entry of the next finally block.


The following example shows this behaviour by using some dead code elements, which have a grey background colour in the graph image. These dead code elements are not reachable, since there is no normal control flow path that exits the try block.


function t() {
  "start";
  try {
  2;
  return;
  3;
    } finally {
  "finally";
  }
    "end";
}


Final catch


In case you ever wondered how a thrown exception can be caught without using a catch block, have a look at the final example. It is true that once a finally block was entered abruptly, it can only be exited abruptly. However, the kind of abrupt control flow might be changed. In the example below, it is changed from throwing an exception to breaking the loop. Of course, after the loop was exited due to the break statement, the control flow is normal again and the thrown exception remains without effect. Hence, the last statement "end" is executed which would have been skipped otherwise.



function t() {
  "start";
  do {
    try {
      2;
      throw "exception";
    } finally {
      break;
    }
  } while (4);
  "end";
}













The interested reader would find more details about the N4JS flow graphs and their implementation in the N4JS Design Document, Chapter: Flow Graphs.


by Marcus Mews