Twin brother of Alike. They look quite alike. Except this brother is more heavy-weight and likes playing in trees.
A simple-yet-powerful KD-tree library for NodeJS, with support for lightning-fast k-Nearest Neighbour queries. Supports normalization, weights, key and filter parameters.
But if Look-Alike is more powerful than Alike, why use Alike at all? Well, TANSTAAFL
Alike is more nimble and functional and does not require state. You can call it with thousands of rows and it will still be fast (n log n
). It is also better when the rows change often (no need to rebuild the tree).
Look-Alike builds the tree in advance (which is the only way to beat n log n
) and holds it in memory. It supports kNN queries that can be as fast as k log n
, so is the prefered brother when you have more than a few thousand rows. The tree does not change -- it would need to be rebuild (at n log2 n
) before the query when the rows change.
To install and add it to your package.json
$ npm install look-alike --save
Now you can load up the module and use it like so:
LA = require('look-alike');
tree = new LA([rows]);
top3 = tree.query(subject, {k:3});
Given [rows]
is an array of gazillion objects with x
numerical attributes, top3
now holds an array of the 3 closest objects to subject
. The second-easiest Recommender System you will ever see (aside from Alike).
Look-Alike exports a single class with a constructor and a single instance method.
The constructor expects an array of objects, and optionally an object of options:
objects[Array]
- The rows from which we build our treeoptions[Object] (optional)
- which may include:attributes[Array] (optional)
- An array of strings which list attributes/dimensions to use in the KD-tree. By default, will use all keys of the first object inobjects
. Specifying which keys to use for the tree is useful if you want to include additional information in the objects (such as an id, or a label, or more).key[Function] (optional)
- Akey
function, which is used to map overobjects
to get to theattributes
. Useful when the attributes are nested in the parent object.
The instance method is called query
and expects the following parameters:
subject[Object]
- The reference point that we want to find the Nearest Neighbors ofoptions[Object] (optional)
- which may include:k[Int] (default = 1)
- The number of objects to return. The query complexity isk log n
, so the higher this number, the longer the algorithm takes (on average).normalize[Bool] (default = true)
- When true, will normalize the attributes when calculating distances (recommended if attributes are not on the same scale). Note: turn off for small number of rows to avoid unexpected results! In those cases you should be using Alike anyway!weights[Object] (optional)
- Define weights per attribute (e.g.{x:0.3, y:0.7}
would weight attributey
at 70% andx
at 30%. Defaults to equal weights)filter[Function] (optional)
- When provided, only consider objects in the tree that pass the filter function (i.e. return true)
And returns an array of objects in sorted order. You may look at the test-cases if you don't believe me.
Look-Alike is written in Literate CoffeeScript in the coffee/
folder. Browse around on Github for the best annotated source experience!
Unit tests are in the test/
folder. You can run the tests with npm test
or if you are developing, you may use make watch-test
to watch while you TDD. :)
Look-Alike is licensed under the terms of the GNU Lesser General Public License, known as the LGPL.