-
Notifications
You must be signed in to change notification settings - Fork 56
Variable elimination
The rationale behind the Spartan programming's desire to eliminate variables is that every variable places a burden on both the reader and the writer.
- Each variable has a name. The code writer must spend precious mental effort on picking a good name for the variable. The code reader must spend similar effort in figuring out the intention of the writer from the name.
- Variables tend to change. The code writer must invest resources in making sure that the variable is changed only when it is supposed to. The code reader must work out the set of locations that may change the variable, and understand the nature of these changes.
- Variables may be used by the code. The code writer must define a role for that variable which suggests where that variable may be read. The code reader must understand that role, and to properly appreciate that role, he must work out the list of code locations in which the variable may be read.
A variable that is used only once, is a perfect candidate for elimination. Simply replace the use of this variable with the expression that computes it. For example, instead of
double a = p.area();
return a;
you can write
return p.area();
In the following example, drawn from the source code of JFLEX, variable b
is used only once.
public int interpret(int[] in, int[] par) {
boolean b = boolexp.interpret(in,par);
if (b)
return exp1.interpret(in,par);
else
return exp2.interpret(in,par);
}
Eliminating this variable, we obtain
public int interpret(int[] in, int[] par) {
if (boolexp.interpret(in,par))
return exp1.interpret(in,par);
else
return exp2.interpret(in,par);
}
and by simple ternarization we get a single statement function
public int interpret(int[] in, int[] par) {
return boolexp.interpret(in,par) ? exp1.interpret(in,par) : exp2.interpret(in,par);
}
another ternarization steps yields
public int interpret(int[] in, int[] par) {
return return (boolexp.interpret(in,par) ? exp1 : exp2).interpret(in,par);
}
Not all inlining opportunities should be exploited. Consider for example
for (a = 0; a < f(); a += g()) {
...
}
which may have resulted from inlining of a more explicit statement of the iteration boundaries
int f = f(), g = g();
for (a = 0; a < f; a += g) {
...
}
The semantics of these two code snippets is different though. In the former, functions f
and g
are evaluated in every iteration, while in the latter, they are evaluated only once.
In the case that these functions' return value is not the same, the two loops will execute differently, and the variables should not be inlined.
Should we use the inlined version of the code in the case that the two functions are pure, that is return the same value in all calls? If the purity of the functions is not obvious, then inlining may confuse the reader, and should not be used.
But, what if purity is obvious? The inlined version is then at least as clear as the non-inlined, is shorter, and uses fewer variables. This version may also mean, at least with some compilers, slower execution. Hasting to optimize may not be a good idea though. Remember that premature optimization is a prime example of dangerous creativity and originality and that the loop may execute only a handful of times, or may not even be reached in most paths of execution.
Many routines define a variable to store their result. This variable can often be eliminated, by returning the result as soon as it is found.
Consider for example the following simple function, designed to check if a given token is in a list of keywords.
private boolean isKeyword(String token, String[] keywords) {
boolean result = false;
for (int i = 0; i < keywords.length; i++)
if (token.equals(keywords[i]))
result = true;
return result;
}
By returning the result as soon as we find it, we can eliminate variable result
. The resulting function is not only shorter but also a bit more efficient.
private boolean isKeyword(String token, String[] keywords) {
for (int i = 0; i < keywords.length; i++)
if (token.equals(keywords[i]))
return true;
return false;
}
Here is another example, taken from the code of the CLASSYCLE project.
private ResultRenderer getRenderer() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
ResultRenderer renderer = new DefaultResultRenderer();
if (_resultRenderer != null)
renderer = (ResultRenderer) Class.forName(_resultRenderer).newInstance();
return renderer;
}
which can be written simply as
private ResultRenderer getRenderer() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
return _resultRenderer != null ? (ResultRenderer) Class.forName(_resultRenderer).newInstance() : DefaultResultRenderer();
}
A third example taken from the CLASSYCLE project is
public boolean isFulfilled(Vertex vertex) {
boolean result = false;
if (vertex != null) {
Attributes attributes = vertex.getAttributes();
if (attributes instanceof NameAttributes)
result = _pattern.matches(((NameAttributes) attributes).getName());
}
return result;
}
Returning the result as soon it is known also eliminates the first conditional
public boolean isFulfilled(Vertex vertex) {
if (vertex == null)
return false;
Attributes attributes = vertex.getAttributes();
return attributes instanceof NameAttributes ? _pattern.matches(((NameAttributes) attributes).getName()) : false;
}
Here is an even trickier example, drawn again from the CLASSYCLE project.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
boolean result = true;
if (indexInConstantParts < _constantParts.length) {
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
} else {
result = result && (_endsWithAnything || indexInString == string.length());
}
return result;
}
This example is quite tricky. The variable result is used in several places, and the flow of control is a bit confusing. Let us therefore start by removing changing the order of the branches in the conditional statement following the shortest first technique.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
boolean result = true;
if (indexInConstantParts >= _constantParts.length) {
result = result && (_endsWithAnything || indexInString == string.length());
} else {
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
}
return result;
}
It now becomes a bit clearer that we can return the correct value immediately in the first branch.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
boolean result = true;
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
return result;
}
Examining now the inner conditional statement, we see that if the break is executed, then the returned value is false, so we can rewrite the function as follows:
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
boolean result = true;
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0))
return false;
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
return result;
}
Now, we see that if the loop terminates normally, the value of result must be true. A closer inspection reveals that the value of result is computed right before the loop ending condition. So, we can eliminate it altogether by writing
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0))
return false;
indexInString = index + constantPart.length();
} while (!matches(string, indexInString, indexInConstantParts + 1));
return true;
}
A famous idiom for iterating over an array is
for (int i = 0; i < array.length; i++) {
.... //Do something with array[i]
}
But, if the loop's body never uses variable i nor modifies array[i], then the loop can be more concisely written as
for (T a: array) {
.... //Do something with a
}
where T
is the type of elements of array array.
The above example of searching through an array of keywords, can thus be rewritten to eliminate variable i
private boolean isKeyword(String token, String[] keywords) {
for (final String element : keywords)
if (element.equals(token))
return true;
return false;
}
The following represents a common structure of the main function of many file processing applications.
public static void main(String argv[]) {
for (int i = 0; i < argv.length; i++) {
try {
System.out.println("Processing ["+argv[i]+"]");
process(new FileReader(argv[i])));
System.out.println("No errors processing " + argv[i]);
} catch (Exception e) {
e.printStackTrace(System.out);
System.exit(1);
}
}
}
With this technique, it is simplified to the following
public static void main(String argv[]) {
for (final String f : argv) {
try {
System.out.println("Processing ["+f+"]");
process(new FileReader(f)));
System.out.println("No errors processing " + f);
} catch (Exception e) {
e.printStackTrace(System.out);
System.exit(1);
}
}
}
Old Java code iterating over collections uses an instance of class Iterator
to carry out the iteration. This iteration variable can be eliminated.
The following function is drawn from class SecureRandom
of package java.security in the Java runtime library.
private static String getPrngAlgorithm() {
List provs = Providers.getProviderList().providers();
for (Iterator t = provs.iterator(); t.hasNext();) {
Provider p = (Provider) t.next();
for (Iterator u = p.getServices().iterator(); u.hasNext();) {
Service s = (Service) u.next();
if (s.getType().equals("SecureRandom")) {
return s.getAlgorithm();
}
}
}
return null;
}
In this function we have two iteration variables, t and u, both of which can be eliminated. Also, variable, provs is used only once, so it is safe to eliminate as well.
The result is:
private static String getPrngAlgorithm() {
for (Provider p : Providers.getProviderList().providers())
for (Service s : p.getServices())
if (s.getType().equals("SecureRandom"))
return s.getAlgorithm();
return null;
}
The NOT metric is reduced by this simplification by 50% (from 113 to 56).
Sometimes, several steps are required to eliminate a variable. Consider for example the following constructore
protected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
Attribute[] temp = new Attribute[as.size()];
temp = as.toArray(temp);
for (int i = 0; i < temp.length; ++i)
temp[i] = as.get(i);
attributes = Sequence.make(temp);
}
We would like to eliminate both variable temp
and i
. This time, we cannot use an enhanced for loop syntax, since this iteration mechanism does not allow one update the underlying collection.
However, the elimination of i
is rather easy if we remember that the List
interface offers a toArray()
method, so we do not need variable i
to iterate over the array. The tricky part is that for reasons related to the implementation of generics in Java this method requires an array as a parameter. (There is also an overloaded version of this method which takes no parameters. Alas, this version returns an array of Object
s instead of an array of the right type.)
We thus write:
protected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
Attribute[] temp = as.toArray(new Attribute[as.size()]);
attributes = Sequence.make(temp);
}
Now, we can eliminate temp altogether.
protected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
attributes = Sequence.make(as.toArray(new Attribute[as.size()]));
}
Finally, re-examining the specification of toArray, we see that it does not need an array of the correct size, so instead
attributes = Sequence.make(as.toArray(new Attribute[as.size()]));
we can write
attributes = Sequence.make(as.toArray(new Attribute[0]));