Kevin's Tcl Optimization Note #1 Numeric type inference for Tcl [Draft 0: 2012-11-21] Kevin B, Kenny Since the announcement that Novem is open for development, and since Karl Lehenbauer's announcement of bounties for performance improvement of the Tcl language, there has been considerable discussion of working on compilation of Tcl either to a better interpreted representation than today's bytecode or directly to machine code. I am writing this note as what I hope to be one in a series that will explore the sorts of analyses that a hypothetical Tcl optimizer could apply to Tcl programs in order to improve them. CONTEXT: "Presuming that the code follows all safety constraints" This note is probably not the right starting place for a reader who is entirely unfamiliar with the discussion to date. In particular, it depends heavily on the half-formed notion of 'code that follows safety constraints.' Rather than detailing exactly what that means here, I will merely state that it informally means that the code in question has no uncontrolled side effects. In particular, at least the following constraints must apply: There must be no execution traces active. No variable access in the block of code being optimized may trigger a read, write, array, or unset trace. No core command used in the compiled code may be renamed, redefined, or overloaded. No variables may be aliased (except to the extent that a compiler is able to track aliasing and determine that it is safe). No command invoked from the optimized code may be able to alter these constraints. Note that these constraints apply to a large swath of actual Tcl code. Tcl's dynamic nature is, of course, something that we all depend on - but most of the code we write does not depend on it. We use it to get extra power when we need it, and to avoid the language bureacracy that is associated with strong typing. WHAT IS A NUMBER? With the concept of 'code subject to safety constraints' swept under the carpet for now, we can continue the discussion of what it would mean to perform optimization in Tcl. One of the more accessible optimizations is that of numeric type inference: determining when a Tcl value is in fact a number, and specializing numeric code. This specialization would greatly improve the little sublanguage of [expr], [incr], and related commands. The type calculus for numbers in Tcl is fairly simple: String / \ Number \ / \ \ / \ \ Double Entier \ | | Boolean | Int / | | / | Boolean&Int \ / \ / \ / BOTTOM In the figure, we read that Everything Is A String. Some strings are numbers, which in turn are divided into doubles and entiers. Doubles and entiers are mutually exclusive; a double has either a decimal point or an exponent, while an entier has neither. Some entiers are also ints; an int is an entier whose value is small enough to be represented in a native integer [NOTE1]. The distinguished integers '0' and '1' are in turn Boolean values (represented as Boolean&Int in the diagram). Other Boolean values exist; 'Boolean' in the figure designates a value that is 0, 1, true, false, yes, no, on, or off. [NOTE1] It may be advantageous to have a hierarchy of int64 and int32 replace the single 'native integer.' This could be done at the expense of some slight complexity in the rules of inference. I ignore this detail for the purposes of this note. MATHEMATICAL TYPE RULES The various math operations and functions yield specific types presuming that their arguments are of specific types. For instance, there are the general rules that Number + Number => Number (a catch-all rule that says that + always requires numeric operands and returns a number) Double + Number => Double Number + Double => Double Entier + Entier => Entier These rules may be used in forward reasoning - to determine what type a given value will have knowing the operation that created it - and backward reasoning - to determine what type a value must have in order to be usable in a given context. A combination of forward reasoning and backward reasoning would be used in any given piece of code to determine data types in play. APPROXIMATE TYPE RULES In addition to the usual type inference rules, effective optimization will require approximate rules - things that are "almost always true" with fallbacks to less performant code when they are not. An example would be Int + Int =?=> Int; Int + Int => Entier This rule may be read as "The result of adding two native ints is 'almost always' a native int. When this rule fails (as a result of overflow), the code must be re-optimized to the assumption that the result of the addition is an entier." Without this escape hatch, actually deducing that native integers are safe becomes intractable in most contexts, since it requires the compiler to prove that overflow is impossible. (In the most general case, the question is undecidable; even in usual programs, it is inordinately difficult.) WORKING THROUGH AN EXAMPLE With the idea of type inference rules in hand, let's try working through an example program and seeing how much we can find out about it without help from the user. I presume that the context follows all safety constraints. Let's try the (somewhat contrived) procedure 'cos': proc cos {x {n 15}} { set j 0 set s 1.0 set t 1.0 for {set i 1} {$i < $n} {incr i} { set t [expr {-$t * $x * $x / [incr j] / [incr j]}] set s [expr {$s + $t}] } return $s } A compiler will not do the type inference on the source code, of course, but on some suitable intermediate representation. What I use in the sketch below is three-address operations (taking the form, result = operand1 @ operand2), and I presume that data flow analysis has reduced the program to Static Single Assignment form, in which each value appears only once on the left-hand side of any operation. Merge points in the program, where multiple assigned values can flow in, are denoted with the pseudo-function 'phi'. (The Wikipedia article on Static Single Assignment form is a good starting point for those unfamiliar with it.) cos: x0 is parameter 0 n1 is parameter 1 j2 := 0 s3 := 1.0 t4 := 1.0 i5 := 1 goto looptest loop: j6 := phi(j2, j15) s7 := phi(s3, s17) t8 := phi(t4, t16) i9 := phi(i5, i18) a10 := - t8 a11 := a10 * x0 a12 := a11 * x0 j13 := j6 + 1 a14 := a12 / j13 j15 := j13 + 1 t16 := a14 / j15 s17 := s7 + t16 i18 := i9 + 1 looptest: i19 := phi(i5, i18) a20 := i19 < n0 if a20 goto loop exit: s20 = phi(s3, s17) return s20 Let's assume the following axioms about typing: The result of unary - is a number The result of unary - is a double if its argument is. The result of binary * is a number The result of binary * is a double if either operand is a double The result of binary / is a number The result of binary / is a double if either operand is a double The result of binary + is a number The result of binary + is a double if either operand is a double The result of binary + is an entier if both args are entiers The result of binary + is "almost always" an int if both args are ints The result of binary < is a boolean First let's reason about this code forward. What can we determine? j2 is an int (It's a constant) s3 is a double (It's a constant) t4 is a double (It's a constant) i5 is an int (It's a constant) a10 is a number (The result of unary - is a number) a11 is a number (The result of binary * is a number) a12 is a number (The result of binary * is a number) j13 is a number (The result of binary + is a number) a14 is a number (The result of binary / is a number) j15 is a number (The result of binary + is a number) t16 is a number (The result of binary / is a number) s17 is a number (The result of binary + is a number) i18 is a number (The result of binary + is a number) i19 is a number (both i5 and i18 are numbers) a20 is a Boolean (the result of binary < is a boolean) s20 is a number (both s3 and s16 are numbers) the return value of 'cos' is a number j6 is a number (both j2 and j15 are numbers) s7 is a number (both s3 and s17 are numbers) t8 is a number (both t4 and t16 are numbers) i9 is a number (both i5 and i18 are numbers) Not very useful yet... OK, now, let's try to improve on this. One common approach is to take convergence points (phi functions) where different types appear, and try to use structural induction to prove that the more constrained type is correct. Let's do this starting from t8 = phi(t4, t16) t4 is a double, and t16 is a number, What would be needed to prove that t16 is a double? t16 := a14 / j15. Either a14 would have to be a double, or j15 would be. a14 := a12 / j13. Either a12 would have to be a double, or j13 would be. a12 := a11 * x0. Either a11 would have to be a double, or x0 would be. a11 := a10 * x0. Either a10 would have to be a double, or x0 would be. a10 := - t8. t8 would have to be a double. t8 := phi(t4, t16). t4 is known to be a double - it's a constant, and t16 is a double by the induction hypothesis. Hence, t16 is a double. t8 is a double (both t4 and t16 are doubles) a10 is a double (the argument to unary - is a double) a11 is a double (the 'a10' arg to binary * is a double) a12 is a double (the 'a11' arg to binary * is a double) a14 is a double (the 'a12' arg to binary / is a double) t16 is a double (the 'a14' arg to binary / is a double) s17 is a double (the 't16' arg to binary + is a double) s20 is a double (both s3 and s17 are doubles) The return value of 'cos' is a double s7 is a double (both s3 and s17 are doubles) We've now determined unambiguously the type of two of our variables. Let's keep going, and work on 'j6'. Here we have j6 := phi(j2, j15). We know that j2 is an integer, and j15 is a number. We work our way down the type ladder from least to most restrictive. What would it take for j6 to be an entier? j6 = phi(j2, j15) Both j2 and j15 would have to be entier. j2 is known to be an entier. What would it take for j15 to be an entier? j15 := j13 + 1 Both j13 and 1 would have to be entier. (1 is, trivially) What would it take for j13 to be an entier? j13 := j6 + 1 Both j6 and 1 would have to be entier. (1 is, trivially) j6 is an entier by the induction hypothesis. Hence, j6, j13 and j15 are all entier. What would it take for i9 to be entier? i9 = phi(i5, i18) i5 and i18 would both have to be entier. i5 is known to be entier. What would it take for i18 to be entier? i9 and 1 would both have to be entier. 1 is entier, trivially. i9 is entier by the induction hypothesis. Hence, i9 and i18 are entier. Now let's try to tighten things up a little bit more. j6 = phi(j2, j15). j2 is an int. j15 is an entier. What would it take for j6 to be an int? j2 and j15 would both have to be ints. j2 is known to be an int. What would it take for j15 to be an int? j15 := j13 + 1 j13 and 1 would both have to be ints, and overflow must not occur. 1 is an int, trivially. What would it take for j13 to be an int? j13 := j6 + 1 j6 and 1 would both have to be ints, and overflow must not occur. 1 is an int, trivially. j6 is an int by the induction hypothesis. Hence, j6, j13 and j15 may all be represented as ints unless overflow occurs. Overflow may occur at either the j13 or j15 steps, and the fallback is to represent j6, j13 and j15 as entier. i9 = phi(i5, i18). i5 is an int and i18 is an entier. What would it take for i9 to be an int? i5 and i18 would have to be ints. i5 is known to be an int. What would it take for i18 to be an int? i9 and 1 would both have to be ints. 1 is an int trivially, and i9 is an int by the induction hypothesis. Hence, i9 and i18 are both ints unless overflow occurs. Overflow may occur at the i18 step, and the fallback is to assume that i18 is an entier. We have now - without the help of declarations - assigned a native data type to every value in the procedure. A little bit more backward analysis will determine that the 'x' and 'n' parameters must both be numbers, which would then allow [cos number] => double [cos number number] => double to be added to inference rules for further analysis in procedures that call [cos]. FAILOVER There are a few important cases that might cause failover. (1) n is not an int. It might be an entier, or a double, or something other than a number. This case must be handling by recompiling the procedure assuming the actual data type of n, and starting again from the beginning. (2) arithmetic overflow There are three places where integer arithmetic might overflow. In each case the overflow can be detected by the rule that sum=a+b has overflowed if (a^sum) < 0 && (a^b) >= 0. When overflow occurs, the procedure must be recompiled with the assumption that the overflowed value (i or j) is an entier, It can then be restarted at the point it left off. There are only three restart points, and the identity of the restart point plus the values of x, n, s, t, i and j are sufficient to restart it. Restart can be implemented by compiling code that includes a switch to jump to the correct place in the code. A later note will discuss the recompilation process in more detail.