This repository has been archived by the owner on Apr 25, 2024. It is now read-only.

Add RESTful end point to access continuous aggregate group queries #143

Draft: wants to merge 31 commits into master


camallen
Contributor

This PR adds continuous aggregates (CA) that sum a group's daily classification contributions, using Timescale's continuous aggregates feature (https://legacy-docs.timescale.com/v1.7/api#continuous-aggregates), and exposes a RESTful API to access these counts.
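To illustrate the kind of view described above, here is a minimal sketch that builds the SQL for a daily per-group continuous aggregate using the TimescaleDB 1.x `CREATE VIEW ... WITH (timescaledb.continuous)` form. The view name `group_events_daily`, the time column `event_time`, and the output column names are assumptions for illustration and may not match the names used in this PR; only the `events` table and its `group_id` attribute come from the description.

```ruby
# Builds the SQL for a daily per-group continuous aggregate (TimescaleDB 1.x syntax).
# The view name, time column and output column names are illustrative assumptions.
def group_daily_count_view_sql(view: 'group_events_daily', table: 'events', time_column: 'event_time')
  <<~SQL
    CREATE VIEW #{view}
    WITH (timescaledb.continuous) AS
    SELECT
      time_bucket(INTERVAL '1 day', #{time_column}) AS period,
      group_id,
      count(*) AS classification_count
    FROM #{table}
    GROUP BY period, group_id;
  SQL
end

puts group_daily_count_view_sql
```

In a Rails app, a statement like this could be run from a rake task or migration via `ActiveRecord::Base.connection.execute`, which is roughly what a "rake task to enable the CA" might do.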

Specifically, this PR:

  1. upgrades the timescale extension to 1.7.4 (the latest version Azure currently allows)
  2. adds rake tasks to enable the CA
  3. ensures the CA are set up in the test env
  4. creates a CA on the group_id attribute in the events table
  5. adds a read-only AR model to query the materialized CA view
  6. adds a RESTful group count route and controller action
  7. adds a simple JSON events serializer that applies simple AR scope limits and ordering
  8. adds API usage docs, including the returned JSON schema format
  9. removes unused rails components from loading (to align with an API-style rails service)
  10. updates / removes dev & test gems (maintenance to get the setup working well)
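As a rough illustration of items 5 and 7, the sketch below shows a plain Ruby (PORO) serializer that caps the number of rows, orders them by period, and emits JSON. The class name `EventCountSerializer`, the limit values, and the row fields (`period`, `group_id`, `count`) are hypothetical and not taken from this PR. To keep the example self-contained it operates on an array of hashes; with a real AR scope, the sorting and truncation would more likely be `scope.order(period: order).limit(limit)`.

```ruby
require 'json'

# Hypothetical PORO serializer: caps the page size, orders by period, emits JSON.
class EventCountSerializer
  DEFAULT_LIMIT = 10   # assumed default page size
  MAX_LIMIT = 100      # assumed upper bound to protect the API

  def initialize(scope, limit: DEFAULT_LIMIT, order: :desc)
    @scope = scope
    @limit = limit.to_i.clamp(1, MAX_LIMIT)
    # Anything other than 'asc' falls back to descending order.
    @order = order.to_s == 'asc' ? :asc : :desc
  end

  def as_json
    { events: rows.map { |row| { period: row[:period], group_id: row[:group_id], count: row[:count] } } }
  end

  def to_json(*_args)
    JSON.generate(as_json)
  end

  private

  # Stand-in for `scope.order(period: @order).limit(@limit)` on an AR scope.
  def rows
    sorted = @scope.sort_by { |row| row[:period] }
    sorted.reverse! if @order == :desc
    sorted.first(@limit)
  end
end

rows = [
  { period: '2021-06-22', group_id: 1, count: 5 },
  { period: '2021-06-24', group_id: 1, count: 2 },
  { period: '2021-06-23', group_id: 1, count: 9 }
]
puts EventCountSerializer.new(rows, limit: 2).to_json
```

A read-only AR model backing the view could, under the same assumptions, override `readonly?` to return `true` so that accidental writes against the materialized view raise an error.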

This PR is intended for use by the FACTSet team to build the group query dashboard functionality.

Longer term, the Zooniverse team will use this PR to expand the current timescale Stats API, using continuous aggregates to expose improved API query types.

Some items of additional work could be to:

  1. add per-(user|project|workflow) continuous aggregates in (hour|day|month|year) time buckets
  2. expose the above metrics via RESTful API end points
  3. add a serializer decorator object to the AR scopes to build backwards-compatible API end points for clients that consume the https://github.com/zooniverse/zoo-event-stats/ RESTful API
  4. look at adding compression (https://legacy-docs.timescale.com/v1.7/api#compression) and drop-chunks data management policies (https://legacy-docs.timescale.com/v1.7/api#add_drop_chunks_policy) to ensure we only keep the relevant data in the DB
  5. launch this independently of the current timescale DB to avoid upgrading old data and to start with a blank slate.

camallen added 30 commits June 24, 2021 12:26
simplify the creation of development databases with continuous aggregates views
used in the groups continuous aggregates view
avoid serializing all the tmp / log cruft to the docker image build context
simple PORO serializer format JSON and apply limits and order to the AR scopes