should step size conversions result in a warning? #356

brharrington · 2016-05-06T15:16:01Z

Currently if a user enters an invalid step size it will get silently converted to the next valid step. Should this result in a warning?

The side question here is usage of step in general. It is generally considered deprecated for direct use by the user.

briangann · 2016-11-23T14:31:55Z

Being able to specify the step size is useful when the metrics being stored are at different intervals than the step-size of Atlas. It's useful for Grafana integration also :)

For example: Running a synthetic transaction every 2 minutes results in metrics being ingested every 2 minutes. When you look at the data at the 1 second interval, it looks like "holes", but at 2 minutes the results are correct. Any holes are "true" misses.
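A minimal sketch of that effect, in Python and without using Atlas at all. The `bucketize` helper and the sample timestamps are assumptions made up for illustration, and a 60 second grid stands in for the finer step to keep the output short. Samples reported every 120 seconds leave every other slot empty on the finer grid, while on a 120 second grid the only gap is the run that really was missed.

```python
from typing import Dict, List, Optional


def bucketize(samples: Dict[int, float], start: int, end: int, step: int) -> List[Optional[float]]:
    """Place samples (keyed by epoch seconds) on a grid of `step` seconds.

    Each slot holds the last sample whose timestamp falls in [t, t + step),
    or None when nothing was reported in that window.
    """
    grid: List[Optional[float]] = []
    for t in range(start, end, step):
        in_window = [samples[ts] for ts in sorted(samples) if t <= ts < t + step]
        grid.append(in_window[-1] if in_window else None)
    return grid


# Hypothetical synthetic transaction reporting every 120 s; the run at t=240 timed out.
samples = {0: 1.0, 120: 1.2, 360: 0.9}

fine = bucketize(samples, 0, 480, 60)     # step smaller than the reporting interval
coarse = bucketize(samples, 0, 480, 120)  # step equal to the reporting interval

print(fine)
print(coarse)
```

On the 120 second grid the single `None` corresponds to the timed-out run; on the 60 second grid it is indistinguishable from the slots that are empty only because of the reporting interval.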

I was surprised when my queries to Atlas auto-jumped from 1s to 5s when I specified a step-size of 2s, and ended up modifying the code to allow 1s,2s,3s,4s,5s (and a few more)

Thanks!

Brian

brharrington · 2016-12-14T14:22:12Z

Thanks for the comment. How does Grafana use that setting?

> Being able to specify the step size is useful when the metrics being stored are at different intervals than the step-size of Atlas.

We typically avoid doing that. One example is the CloudWatch S3 metrics we import, which are updated once a day. We report them into Atlas at minute-level resolution, which has a number of benefits:

Visually, it is easier for a user to see what is going on: the stair-step pattern shows up clearly.
The user doesn't need to worry about the step size of the data. The step size has to be understood to correctly compare with other signals, aggregate, etc., and in the past this was a source of a lot of confusion and mistakes.
We want gaps to mean that no data was available, not to be a side effect of the reporting interval. The difference between a measured value and nothing being reported is often quite important.

There is a bit of overhead with this, but for us it hasn't come up much, and the data gets compressed to a constant block in storage, so the overhead isn't that high. For use cases where we do need different step sizes, we run those as separate stacks; we don't mix them in the same instance.

> I was surprised when my queries to Atlas auto-jumped from 1s to 5s when I specified a step-size of 2s, and ended up modifying the code to allow 1s,2s,3s,4s,5s (and a few more)

I haven't looked at the auto-selection in a while. In general, the step sizes were selected to divide evenly into common time units. For example, we wouldn't want 7 because, with 1m blocks, a consolidated data point would cross block boundaries. We also reduced the number of available options to improve caching behavior.
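The selection rule described here can be sketched roughly as follows. The list of allowed steps is an assumption for illustration, not the actual Atlas list, and both function names are hypothetical. The idea is that every allowed value divides a larger time unit evenly, and a requested step is rounded up to the next allowed value.

```python
from typing import List

# Hypothetical allowed steps in seconds; NOT the actual Atlas configuration.
ALLOWED_STEPS: List[int] = [1, 5, 10, 30, 60, 300, 600, 1800, 3600]


def divides_evenly(step: int, block: int) -> bool:
    """True when a block of `block` seconds splits into whole `step`-second points."""
    return block % step == 0


def next_valid_step(requested: int, allowed: List[int] = ALLOWED_STEPS) -> int:
    """Round a requested step up to the smallest allowed step that is >= it.

    Requests larger than every allowed step fall back to the largest one.
    """
    for step in sorted(allowed):
        if step >= requested:
            return step
    return max(allowed)


print(next_valid_step(2))     # a 2 s request lands on the next allowed step
print(divides_evenly(7, 60))  # 7 s points would straddle 1 m block boundaries
print(divides_evenly(5, 60))
```

With this assumed list a 2 second request becomes 5 seconds, which matches the jump reported earlier in the thread. Making the list configurable would amount to passing a different `allowed` argument.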

We could probably make it configurable.

briangann · 2017-01-06T06:21:05Z

> How does Grafana use that setting?

When you expand or decrease the time range of the metric you are viewing, the Grafana datasource plugin for Atlas adjusts the step size. It doesn't "have to" do that, but that's how it was implemented.

What I see for the step-size question is this scenario:

I have a metric collection script (a synthetic transaction really) that can take more than a minute to execute, but always less than 2 minutes. This script is scheduled in Sensu to run every two minutes; any timeouts are "nulls" in Atlas.

I then have a check that queries Atlas (via Sensu) for the metric value, with a step size of 2 minutes, and alerts if there is a null, or if the metric exceeds a threshold value. This check is also run every 2 minutes.

The CloudWatch example I can understand - that's a bulk load of historical "minute stepped" data, but in my case it's always "2 minute stepped" data. It's nice to be able to set my step size to 2 minutes and get back a clean series of data, and if there are nulls, they are always timeouts.
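As a rough illustration of the check logic, here is a Python sketch that works on a series already fetched at a 2 minute step. It does not call Atlas or Sensu; `evaluate_check` and the sample values are hypothetical. With the step matching the reporting interval, every `None` can be treated as a timeout.

```python
from typing import List, Optional


def evaluate_check(values: List[Optional[float]], threshold: float) -> List[str]:
    """Return one alert message per null or over-threshold point.

    `values` is assumed to be a series queried at the same step as the
    reporting interval, so a None means the run produced no data.
    """
    alerts: List[str] = []
    for i, v in enumerate(values):
        if v is None:
            alerts.append(f"point {i}: no data (timeout)")
        elif v > threshold:
            alerts.append(f"point {i}: {v} exceeds {threshold}")
    return alerts


# Hypothetical series: one timeout and one slow run.
for message in evaluate_check([1.0, None, 3.5], threshold=3.0):
    print(message)
```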

Thanks for all the hard work on Atlas, it's working great for us :)

svachalek · 2017-05-30T22:40:25Z

Seems sensible to warn about, if it's specified explicitly. AFAIK our UIs don't add step= unless a user specifies it, which is pretty rare. Most of the time, the user is trying to get a step size smaller than the minimum dictated by the time interval, so it's probably best to be straightforward that it isn't going to work. The UI could also warn more directly, but currently I don't think there's a sound way for the UI to know the minimum step size for a given interval, and there are still plenty of queries produced manually.
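One way the proposal could look, sketched in Python rather than in the Atlas codebase: warn only when a step was given explicitly and had to be changed. The `resolve_step` function, the allowed list, and the message wording are all assumptions for illustration.

```python
import warnings
from typing import List, Optional

# Hypothetical allowed steps in seconds; not taken from Atlas.
ALLOWED_STEPS: List[int] = [1, 5, 10, 30, 60]


def resolve_step(requested: Optional[int], default: int,
                 allowed: List[int] = ALLOWED_STEPS) -> int:
    """Pick the step to use, warning only if an explicit request was adjusted."""
    if requested is None:
        # No step= was supplied, so nothing was converted and nothing to warn about.
        return default
    chosen = next((s for s in sorted(allowed) if s >= requested), max(allowed))
    if chosen != requested:
        warnings.warn(
            f"step={requested}s is not a valid step; using {chosen}s instead",
            UserWarning,
        )
    return chosen


print(resolve_step(2, default=60))     # explicit and invalid: adjusted, with a warning
print(resolve_step(None, default=60))  # not specified: default, no warning
```

In a service the warning would more likely be returned alongside the query result than raised through Python's `warnings` module; that choice is left open here.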

brharrington added this to the 1.6.0 milestone May 6, 2016

brharrington added the discussion label May 27, 2017

brharrington modified the milestones: 1.6.0, 1.7.0 Jun 21, 2018

brharrington added the enhancement label Feb 28, 2019

brharrington added the usability label Mar 28, 2019

brharrington modified the milestones: 1.7.0, 1.8.0 Mar 2, 2023
