The `ad` module implements reverse-mode automatic differentiation. Computations using AD functions implicitly build a graph of all operations. When the computation finishes, the graph can be walked backward to accumulate partial derivatives from the output back to the inputs.
There is a simple interface for interacting with AD functions and AD values:
var ad = require('adnn/ad');
var Tensor = require('adnn/tensor');
// Raw Numbers/Tensors can be used with AD functions
ad.scalar.tanh(1.5); // 0.9051...
ad.tensor.tanh(new Tensor([3]).fill(1.5)); // [0.9051, 0.9051, 0.9051]
// To compute derivatives, we must first turn input Numbers/Tensors into AD graph nodes
// by 'lifting' them
var scalarIn = ad.lift(1.5);
var tensorIn = ad.lift(new Tensor([3]).fill(1.5));
// Feeding these nodes into AD functions results in Node outputs, which can be used to
// initialize backpropagation
var scalarOut = ad.scalar.tanh(scalarIn);
scalarOut.backprop();
// We can then retrieve the values and derivatives of different nodes
ad.value(scalarOut); // 0.9051...
ad.derivative(scalarIn); // 0.1807...
// It's also possible to check whether a value is a lifted AD Node or not
ad.isLifted(scalarIn); // true
ad.isLifted(1.5); // false
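Chained operations compose into a single graph, so one backprop call from the final output accumulates derivatives for every lifted input. For example, a small sketch using only the calls shown above:
// Compose several AD operations into one graph
var a = ad.lift(2);
var b = ad.lift(3);
var out = ad.scalar.add(ad.scalar.mul(a, b), ad.scalar.tanh(a));
out.backprop();
ad.value(out); // 2*3 + tanh(2) = 6.9640...
ad.derivative(a); // b + (1 - tanh(2)^2) = 3.0706...
ad.derivative(b); // a = 2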
adnn comes with a large number of built-in AD primitives (a short usage sketch follows the list):
- Unary operators
  - Defined for both scalars (in `ad.scalar`) and tensors (in `ad.tensor`)
  - floor, ceil, round, sqrt, exp, log, abs, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh, asinh, acosh, atanh, sigmoid
- Binary operators
  - Defined for both scalars (in `ad.scalar`) and tensors (in `ad.tensor`). Tensor operators can accept a scalar as their second argument.
  - add, sub, mul, div, pow, min, max, atan2
- Comparators
  - Currently defined for scalars only (in `ad.scalar`)
  - eq (==), neq (!=), peq (===), pneq (!==), gt (>), lt (<), geq (>=), leq (<=)
- Reductions
  - `ad.scalar.sum(lst)`: Returns the sum of the list of scalars `lst`.
  - `ad.tensor.sumreduce(x)`: Returns the sum of entries of a tensor `x`.
  - `ad.tensor.allreduce(x)`: Returns true if all entries of a tensor `x` are nonzero.
  - `ad.tensor.anyreduce(x)`: Returns true if any entry of a tensor `x` is nonzero.
- Indexing and Reshaping
  - `ad.tensor.get(x, i)`: Extracts the `i`th element of `x` and returns it as a scalar.
  - `ad.tensor.toScalars(x)`: Turns tensor `x` into an array of scalars.
  - `ad.tensor.fromScalars(lst)`: Turns a list of scalars `lst` into a tensor.
  - `ad.tensor.range(x, i, j)`: Returns a tensor constructed from elements `i` through `j` (non-inclusive) of tensor `x`.
  - `ad.tensor.split(x, sizes)`: Splits tensor `x` into `sizes.length` tensors, where the sizes of the output tensors are given by `sizes`.
  - `ad.tensor.concat(lst)`: Concatenates a list of tensors `lst` into one tensor.
- Linear Algebra
  - Requires inputs to be rank 2 tensors (i.e. matrices are NxN, vectors are Nx1).
  - `ad.tensor.transpose(x)`: Returns the transpose of the matrix `x`.
  - `ad.tensor.diagonal(x)`: Returns a diagonal matrix whose diagonal entries are the entries of the vector `x`.
  - `ad.tensor.inverse(x)`: Returns the inverse of the matrix `x`.
  - `ad.tensor.determinant(x)`: Returns the determinant of the matrix `x`.
  - `ad.tensor.dot(x, y)`: Returns the inner product of the matrices `x` and `y`.
  - `ad.tensor.cholesky(x)`: Returns the Cholesky decomposition of the matrix `x`.
- Miscellaneous
  - `ad.tensor.softmax(x)`: Computes the softmax function for a tensor `x`.
For more information, see functions.js.
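As a quick usage sketch, the elementwise tensor operators and reductions above participate in backprop in the same way as the scalar example earlier (reusing the Tensor constructor from the first example):
var ad = require('adnn/ad');
var Tensor = require('adnn/tensor');
// Sum of tanh applied elementwise to a lifted tensor
var v = ad.lift(new Tensor([3]).fill(2));
var s = ad.tensor.sumreduce(ad.tensor.tanh(v));
s.backprop();
ad.value(s); // 3 * tanh(2) = 2.8920...
ad.derivative(v); // [0.0706, 0.0706, 0.0706], i.e. 1 - tanh(2)^2 per entry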
adnn also provides an interface for creating your own AD primitive functions:
var ad = require('adnn/ad');
// Defining unary functions
var newUnaryFn = ad.newUnaryFunction({
  OutputType: // Either Number or Tensor
  name: // The name of the new function
  forward: function(x) {...} // Implements the function
  backward: function(xnode) {...} // Accumulates into derivative(xnode). Output node available as 'this'
});
// Defining binary functions
var newBinaryFn = ad.newBinaryFunction({
  OutputType: // Either Number or Tensor
  name: // The name of the new function
  forward: function(x, y) {...} // Implements the function
  backward1: function(xnode, y) {...} // Accumulates into derivative(xnode). Output node available as 'this'
  backward2: function(x, ynode) {...} // Accumulates into derivative(ynode). Output node available as 'this'
});
// Defining arbitrary functions
var newFunction = ad.newFunction({
  OutputType: // Either Number or Tensor
  name: // The name of the new function
  forward: function(...) {...} // Implements the function
  backward: function(...) {...} // Accumulates into derivatives of all Node inputs. Output node available as 'this'
  getParents: function(...) {...} // Returns a list of inputs which are Nodes.
});
// Can also 'lift' non-differentiable functions to operate on Nodes
var nan = ad.liftUnaryFunction(isNaN);
var eq = ad.liftBinaryFunction(function(x, y) { return x == y; });
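As a concrete, hypothetical example, a differentiable 'square' primitive could look like the sketch below. It assumes Nodes store their value in a `.x` field and their accumulated derivative in a `.dx` field; consult functions.js for the exact accumulation convention used by the built-in primitives.
// Hypothetical custom primitive: square(x) = x * x
// Assumes Nodes expose .x (value) and .dx (derivative); see functions.js for the built-ins' convention
var square = ad.newUnaryFunction({
  OutputType: Number,
  name: 'square',
  forward: function(x) { return x * x; },
  backward: function(x) {
    // d(x^2)/dx = 2x; 'this' is the output node, so this.dx is its incoming derivative
    x.dx += 2 * x.x * this.dx;
  }
});
var a = ad.lift(3);
var sq = square(a);
sq.backprop();
ad.value(sq); // 9
ad.derivative(a); // 6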
For more information, see func.js.
Code which uses scalar math functions can be automatically converted to use scalar AD functions via a Sweet.js macro transform (see macros.sjs). There are several different ways to accomplish this:
When running under Node.js, the simplest way to use the macro transform is via the `ad.macroRequire` function:
// In a file called 'dist.js':
function dist(x1, y1, x2, y2) {
  var xdiff = x1 - x2;
  var ydiff = y1 - y2;
  return Math.sqrt(xdiff*xdiff + ydiff*ydiff);
}
module.exports = dist;
// -------------------------------------
// In a separate file:
var ad = require('adnn/ad');
var dist = ad.macroRequire('./dist.js'); // 'dist' is now an AD function
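The transformed dist can then be used like any other AD function. Continuing the example above:
var x1 = ad.lift(0), y1 = ad.lift(0);
var x2 = ad.lift(3), y2 = ad.lift(4);
var d = dist(x1, y1, x2, y2);
d.backprop();
ad.value(d); // 5
ad.derivative(x1); // (x1 - x2) / dist = -0.6
ad.derivative(y1); // (y1 - y2) / dist = -0.8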
See transform.js to learn more about this function.
Currently, `ad.macroRequire` is not available in the browser, as attempting to load a browserified script which includes Sweet.js will throw an error. For the time being, one workaround is to directly use Sweet.js to macro-transform your code (see their FAQ section on How to run Sweet.js in the browser).
If you wish to include macro transformation as part of a compile / package / minify pipeline for creating a browser script, then check out the sweetify transform plugin for browserify.