List of aggregators

Pedro B. Arruda edited this page Aug 30, 2020 · 3 revisions

Aggregators are functions that summarize a collection of JSON values into a single JSON value. Since a selection normally contains more than one element, extractors alone cannot specify what should be done with the data as a whole; aggregators fill that role. Aggregators and how they fit into the bigger picture are described in more detail elsewhere in this documentation. This page is only a reference list.

  • count: counts the number of elements in the selection.
  • count(extractor): counts the elements in the selection for which the extractor evaluates to true.
  • first(extractor): retrieves only the first element in the selection and applies the extractor to it. If the selection is empty, this aggregator evaluates to null.
  • collect(extractor): applies the extractor to every element in the selection and puts the resulting values in an array.
  • distinct(extractor): applies the extractor to every element in the selection and puts the resulting values in an array, removing duplicates.
  • sum(extractor): applies the extractor to every element in the selection and sums the results.
  • group(extractor, aggregator): partitions the selection into groups of elements for which the extractor evaluates to the same result, then applies the aggregator to each group separately. The result is returned as a map where the extracted values are the keys and the aggregated results are the values. For example, the rule below returns a map from element name to the number of times that element appears in the whole webpage:
select * {
   element-type-frequency: group(name, count);
}
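
To make the semantics in the list above concrete, here is a minimal Python sketch that models each aggregator as a function over a list of elements. It is not the tool's implementation: the names `Extractor`, `Aggregator`, `sum_` and the sample `page` data are hypothetical, the strict treatment of "true" in `count`, the first-occurrence ordering in `distinct`, and the requirement that `group` keys be strings are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Hypothetical modelling types: an extractor maps one element to a JSON value,
# an aggregator maps a whole selection (a list of elements) to one JSON value.
Extractor = Callable[[Any], Any]
Aggregator = Callable[[List[Any]], Any]


def count(extractor: Optional[Extractor] = None) -> Aggregator:
    """Count all elements, or only those for which the extractor is true."""
    if extractor is None:
        return lambda selection: len(selection)
    # Assumption: only the JSON value `true` counts, not merely truthy values.
    return lambda selection: sum(1 for el in selection if extractor(el) is True)


def first(extractor: Extractor) -> Aggregator:
    """Apply the extractor to the first element; null (None) if empty."""
    return lambda selection: extractor(selection[0]) if selection else None


def collect(extractor: Extractor) -> Aggregator:
    """Apply the extractor to every element and gather the values in an array."""
    return lambda selection: [extractor(el) for el in selection]


def distinct(extractor: Extractor) -> Aggregator:
    """Like collect, but with duplicates removed."""
    def aggregate(selection: List[Any]) -> List[Any]:
        seen = set()
        result = []
        for el in selection:
            value = extractor(el)
            # JSON arrays and objects are not hashable in Python, so compare
            # values by their canonical serialization instead.
            key = json.dumps(value, sort_keys=True)
            if key not in seen:
                seen.add(key)
                result.append(value)  # assumption: keep first-occurrence order
        return result
    return aggregate


def sum_(extractor: Extractor) -> Aggregator:
    """Apply the extractor to every element and sum the results."""
    return lambda selection: sum(extractor(el) for el in selection)


def group(extractor: Extractor, aggregator: Aggregator) -> Aggregator:
    """Partition by extracted value, then aggregate each group separately."""
    def aggregate(selection: List[Any]) -> Dict[str, Any]:
        groups: Dict[str, List[Any]] = {}
        for el in selection:
            # Assumption: extracted keys are strings, as in a JSON object.
            groups.setdefault(extractor(el), []).append(el)
        return {key: aggregator(members) for key, members in groups.items()}
    return aggregate


# Hypothetical stand-in for the elements matched by `select *`.
page = [
    {"name": "div", "length": 3},
    {"name": "p", "length": 10},
    {"name": "div", "length": 5},
]
name = lambda el: el["name"]

# Python counterpart of `group(name, count)` from the rule above.
element_type_frequency = group(name, count())(page)
```

In this sketch `element_type_frequency` maps each element name to the number of elements carrying it, which mirrors the `element-type-frequency` rule. Note that `count` has to be called (`count()`) to obtain an aggregator here, whereas the rule language accepts the bare name.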