Kevin's Tcl Optimization Note #1
Numeric type inference for Tcl
[Draft 0: 2012-11-21]
Kevin B, Kenny
Since the announcement that Novem is open for development, and since
Karl Lehenbauer's announcement of bounties for performance improvement
of the Tcl language, there has been considerable discussion of working
on compilation of Tcl either to a better interpreted representation
than today's bytecode or directly to machine code. I am writing this
note as what I hope to be one in a series that will explore the sorts
of analyses that a hypothetical Tcl optimizer could apply to Tcl
programs in order to improve them.
CONTEXT: "Presuming that the code follows all safety constraints"
This note is probably not the right starting place for a reader who is
entirely unfamiliar with the discussion to date. In particular, it
depends heavily on the half-formed notion of 'code that follows safety
constraints.' Rather than detailing exactly what that means here, I
will merely state that it informally means that the code in question
has no uncontrolled side effects. In particular, at least the
following constraints must apply: There must be no execution traces
active. No variable access in the block of code being optimized may
trigger a read, write, array, or unset trace. No core command used in
the compiled code may be renamed, redefined, or overloaded. No
variables may be aliased (except to the extent that a compiler is able
to track aliasing and determine that it is safe). No command invoked
from the optimized code may be able to alter these constraints.
Note that these constraints apply to a large swath of actual Tcl code.
Tcl's dynamic nature is, of course, something that we all depend on -
but most of the code we write does not depend on it. We use it to get
extra power when we need it, and to avoid the language bureacracy that
is associated with strong typing.
WHAT IS A NUMBER?
With the concept of 'code subject to safety constraints' swept under
the carpet for now, we can continue the discussion of what it would
mean to perform optimization in Tcl. One of the more accessible
optimizations is that of numeric type inference: determining when a
Tcl value is in fact a number, and specializing numeric code. This
specialization would greatly improve the little sublanguage of [expr],
[incr], and related commands.
The type calculus for numbers in Tcl is fairly simple:
String
/ \
Number \
/ \ \
/ \ \
Double Entier \
| | Boolean
| Int /
| | /
| Boolean&Int
\ /
\ /
\ /
BOTTOM
In the figure, we read that Everything Is A String. Some strings are
numbers, which in turn are divided into doubles and entiers. Doubles
and entiers are mutually exclusive; a double has either a decimal
point or an exponent, while an entier has neither. Some entiers are
also ints; an int is an entier whose value is small enough to be
represented in a native integer [NOTE1]. The distinguished integers
'0' and '1' are in turn Boolean values (represented as Boolean&Int in
the diagram). Other Boolean values exist; 'Boolean' in the figure
designates a value that is 0, 1, true, false, yes, no, on, or off.
[NOTE1] It may be advantageous to have a hierarchy of int64 and int32
replace the single 'native integer.' This could be done at the
expense of some slight complexity in the rules of inference. I ignore
this detail for the purposes of this note.
MATHEMATICAL TYPE RULES
The various math operations and functions yield specific types
presuming that their arguments are of specific types. For instance,
there are the general rules that
Number + Number => Number (a catch-all rule that says
that + always requires numeric
operands and returns a number)
Double + Number => Double
Number + Double => Double
Entier + Entier => Entier
These rules may be used in forward reasoning - to determine what type
a given value will have knowing the operation that created it - and
backward reasoning - to determine what type a value must have in order
to be usable in a given context. A combination of forward reasoning
and backward reasoning would be used in any given piece of code to
determine data types in play.
APPROXIMATE TYPE RULES
In addition to the usual type inference rules, effective optimization
will require approximate rules - things that are "almost always true"
with fallbacks to less performant code when they are not. An example
would be
Int + Int =?=> Int; Int + Int => Entier
This rule may be read as "The result of adding two native ints is
'almost always' a native int. When this rule fails (as a result of
overflow), the code must be re-optimized to the assumption that
the result of the addition is an entier."
Without this escape hatch, actually deducing that native integers are
safe becomes intractable in most contexts, since it requires the
compiler to prove that overflow is impossible. (In the most general
case, the question is undecidable; even in usual programs, it is
inordinately difficult.)
WORKING THROUGH AN EXAMPLE
With the idea of type inference rules in hand, let's try working
through an example program and seeing how much we can find out about
it without help from the user. I presume that the context follows all
safety constraints.
Let's try the (somewhat contrived) procedure 'cos':
proc cos {x {n 15}} {
set j 0
set s 1.0
set t 1.0
for {set i 1} {$i < $n} {incr i} {
set t [expr {-$t * $x * $x / [incr j] / [incr j]}]
set s [expr {$s + $t}]
}
return $s
}
A compiler will not do the type inference on the source code, of
course, but on some suitable intermediate representation. What I use
in the sketch below is three-address operations (taking the form,
result = operand1 @ operand2), and I presume that data flow analysis
has reduced the program to Static Single Assignment form, in which
each value appears only once on the left-hand side of any operation.
Merge points in the program, where multiple assigned values can
flow in, are denoted with the pseudo-function 'phi'. (The Wikipedia
article on Static Single Assignment form is a good starting point
for those unfamiliar with it.)
cos:
x0 is parameter 0
n1 is parameter 1
j2 := 0
s3 := 1.0
t4 := 1.0
i5 := 1
goto looptest
loop:
j6 := phi(j2, j15)
s7 := phi(s3, s17)
t8 := phi(t4, t16)
i9 := phi(i5, i18)
a10 := - t8
a11 := a10 * x0
a12 := a11 * x0
j13 := j6 + 1
a14 := a12 / j13
j15 := j13 + 1
t16 := a14 / j15
s17 := s7 + t16
i18 := i9 + 1
looptest:
i19 := phi(i5, i18)
a20 := i19 < n0
if a20 goto loop
exit:
s20 = phi(s3, s17)
return s20
Let's assume the following axioms about typing:
The result of unary - is a number
The result of unary - is a double if its argument is.
The result of binary * is a number
The result of binary * is a double if either operand is a double
The result of binary / is a number
The result of binary / is a double if either operand is a double
The result of binary + is a number
The result of binary + is a double if either operand is a double
The result of binary + is an entier if both args are entiers
The result of binary + is "almost always" an int if both args are ints
The result of binary < is a boolean
First let's reason about this code forward. What can we determine?
j2 is an int (It's a constant)
s3 is a double (It's a constant)
t4 is a double (It's a constant)
i5 is an int (It's a constant)
a10 is a number (The result of unary - is a number)
a11 is a number (The result of binary * is a number)
a12 is a number (The result of binary * is a number)
j13 is a number (The result of binary + is a number)
a14 is a number (The result of binary / is a number)
j15 is a number (The result of binary + is a number)
t16 is a number (The result of binary / is a number)
s17 is a number (The result of binary + is a number)
i18 is a number (The result of binary + is a number)
i19 is a number (both i5 and i18 are numbers)
a20 is a Boolean (the result of binary < is a boolean)
s20 is a number (both s3 and s16 are numbers)
the return value of 'cos' is a number
j6 is a number (both j2 and j15 are numbers)
s7 is a number (both s3 and s17 are numbers)
t8 is a number (both t4 and t16 are numbers)
i9 is a number (both i5 and i18 are numbers)
Not very useful yet...
OK, now, let's try to improve on this. One common approach is to take
convergence points (phi functions) where different types appear, and
try to use structural induction to prove that the more constrained
type is correct. Let's do this starting from t8 = phi(t4, t16)
t4 is a double, and t16 is a number, What would be needed to prove
that t16 is a double?
t16 := a14 / j15. Either a14 would have to be a double, or j15 would
be.
a14 := a12 / j13. Either a12 would have to be a double, or j13
would be.
a12 := a11 * x0. Either a11 would have to be a double, or x0 would
be.
a11 := a10 * x0. Either a10 would have to be a double, or x0
would be.
a10 := - t8. t8 would have to be a double.
t8 := phi(t4, t16). t4 is known to be a double - it's a
constant, and t16 is a double by the induction hypothesis.
Hence, t16 is a double.
t8 is a double (both t4 and t16 are doubles)
a10 is a double (the argument to unary - is a double)
a11 is a double (the 'a10' arg to binary * is a double)
a12 is a double (the 'a11' arg to binary * is a double)
a14 is a double (the 'a12' arg to binary / is a double)
t16 is a double (the 'a14' arg to binary / is a double)
s17 is a double (the 't16' arg to binary + is a double)
s20 is a double (both s3 and s17 are doubles)
The return value of 'cos' is a double
s7 is a double (both s3 and s17 are doubles)
We've now determined unambiguously the type of two of our variables.
Let's keep going, and work on 'j6'. Here we have j6 := phi(j2, j15).
We know that j2 is an integer, and j15 is a number. We work our way
down the type ladder from least to most restrictive.
What would it take for j6 to be an entier? j6 = phi(j2, j15)
Both j2 and j15 would have to be entier. j2 is known to be
an entier.
What would it take for j15 to be an entier? j15 := j13 + 1
Both j13 and 1 would have to be entier. (1 is, trivially)
What would it take for j13 to be an entier? j13 := j6 + 1
Both j6 and 1 would have to be entier. (1 is, trivially)
j6 is an entier by the induction hypothesis.
Hence, j6, j13 and j15 are all entier.
What would it take for i9 to be entier? i9 = phi(i5, i18)
i5 and i18 would both have to be entier. i5 is known to be entier.
What would it take for i18 to be entier?
i9 and 1 would both have to be entier. 1 is entier, trivially.
i9 is entier by the induction hypothesis.
Hence, i9 and i18 are entier.
Now let's try to tighten things up a little bit more.
j6 = phi(j2, j15). j2 is an int. j15 is an entier. What would it
take for j6 to be an int? j2 and j15 would both have to be ints.
j2 is known to be an int.
What would it take for j15 to be an int? j15 := j13 + 1
j13 and 1 would both have to be ints, and overflow must not occur.
1 is an int, trivially.
What would it take for j13 to be an int? j13 := j6 + 1
j6 and 1 would both have to be ints, and overflow must not occur.
1 is an int, trivially. j6 is an int by the induction hypothesis.
Hence, j6, j13 and j15 may all be represented as ints unless overflow
occurs. Overflow may occur at either the j13 or j15 steps, and
the fallback is to represent j6, j13 and j15 as entier.
i9 = phi(i5, i18). i5 is an int and i18 is an entier. What would it
take for i9 to be an int? i5 and i18 would have to be ints. i5 is
known to be an int.
What would it take for i18 to be an int? i9 and 1 would both have
to be ints. 1 is an int trivially, and i9 is an int by the
induction hypothesis.
Hence, i9 and i18 are both ints unless overflow occurs. Overflow may
occur at the i18 step, and the fallback is to assume that i18 is an
entier.
We have now - without the help of declarations - assigned a native
data type to every value in the procedure. A little bit more backward
analysis will determine that the 'x' and 'n' parameters must both be
numbers, which would then allow
[cos number] => double
[cos number number] => double
to be added to inference rules for further analysis in procedures that
call [cos].
FAILOVER
There are a few important cases that might cause failover.
(1) n is not an int.
It might be an entier, or a double, or something other than a
number.
This case must be handling by recompiling the procedure assuming
the actual data type of n, and starting again from the beginning.
(2) arithmetic overflow
There are three places where integer arithmetic might overflow.
In each case the overflow can be detected by the rule that
sum=a+b has overflowed if (a^sum) < 0 && (a^b) >= 0.
When overflow occurs, the procedure must be recompiled with
the assumption that the overflowed value (i or j) is an entier,
It can then be restarted at the point it left off. There are only
three restart points, and the identity of the restart point plus
the values of x, n, s, t, i and j are sufficient to restart it.
Restart can be implemented by compiling code that includes a
switch to jump to the correct place in the code.
A later note will discuss the recompilation process in more detail.