Finding unhealthy data sources #19

Open
amotl opened this issue Dec 10, 2021 · 7 comments
Labels
pitch A pitch for a new feature

Comments

@amotl
Contributor

amotl commented Dec 10, 2021

Hi there,

within our conversation at [1], @chenlujjj asked for another feature:

Another feature request is to find invalid data sources.
By invalid, I mean a data source that fails the test when pressing the “Save & Test” button on its page.
I have written a tiny Go script to do this. Maybe it can be added to grafana-wtf.

With kind regards,
Andreas.

[1] https://community.grafana.com/t/how-to-find-out-unused-datasources/56920/7

@amotl added the "pitch" label (A pitch for a new feature) on Dec 10, 2021
@amotl
Contributor Author

amotl commented Dec 10, 2021

Hi @chenlujjj,

thank you for suggesting that feature. Sure, it can well become an additional feature of grafana-wtf, gradually making it a more complete Swiss Army knife. That is very much appreciated.

I will be happy to take a look at your Go program if you can share it. If you don't want to create a dedicated repository for it, maybe upload it as a gist?

With kind regards,
Andreas.

@chenlujjj

Sure, I will upload the script when I am back at the office next Monday.

@jangaraj

I would say a better word for invalid here is unhealthy. The Save & Test button executes a simple test query, which depends on the data source type. That is IMHO not easy to implement here: it would need to support all current and future built-in and third-party data source types. Even simple TCP connectivity can be a problem, because grafana-wtf may be running on a different host than Grafana.

@amotl changed the title from "Finding invalid data sources" to "Finding unhealthy data sources" on Dec 11, 2021
@amotl
Contributor Author

amotl commented Dec 11, 2021

Hi,

thanks for your guidance, Jan. In order to shed some more light on this topic, I wanted to reference [1] here:

Test your data source

testDatasource implements a health check for your data source. For example, Grafana calls this method whenever the user clicks the Save & Test button, after changing the connection settings.

async testDatasource()

To pick some arbitrary examples, [2-5] are the corresponding health check implementations for PostgreSQL, InfluxDB, Tempo, and Prometheus. We can clearly see that those implementations differ significantly, just as @jangaraj described.

To be a sensible feature for grafana-wtf, it would surely need to support any kind of data source, so I share @jangaraj's concerns. This applies in particular because the health check logic is apparently implemented in TypeScript, i.e. it runs in the browser, and could probably only be reused by means of browser automation, instead of just calling an HTTP health check endpoint.

So, I will be excited to see how @chenlujjj might have solved it.

With kind regards,
Andreas.

[1] https://grafana.com/tutorials/build-a-data-source-plugin/#test-your-data-source
[2] https://github.com/grafana/grafana/blob/v8.3.2/public/app/plugins/datasource/postgres/datasource.ts#L178-L186
[3] https://github.com/grafana/grafana/blob/v8.3.2/public/app/plugins/datasource/influxdb/datasource.ts#L437-L487
[4] https://github.com/grafana/grafana/blob/v8.3.2/public/app/plugins/datasource/tempo/datasource.ts#L185-L196
[5] https://github.com/grafana/grafana/blob/v8.3.2/public/app/plugins/datasource/prometheus/datasource.ts#L791-L820

@amotl
Contributor Author

amotl commented Dec 11, 2021

That said, the actual outcome of the testDatasource() routines seems to be relatively "simple". In the case of InfluxDB, clicking the Save & Test button just makes an HTTP request like

GET /api/datasources/proxy/1/query?db=ldi_v2&q=SHOW%20RETENTION%20POLICIES%20on%20%22ldi_v2%22&epoch=ms HTTP/1.1
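
As a rough illustration, a request like the one above could be assembled outside the browser. This is only a sketch: the function name `influxProbeURL` and the Grafana base URL are made up for this example, and it assumes the data source is reachable through the `/api/datasources/proxy/<id>` route shown in the request.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// influxProbeURL (hypothetical helper) builds the proxied InfluxDB request
// that was observed when clicking Save & Test: a SHOW RETENTION POLICIES
// query against the configured database.
func influxProbeURL(grafanaURL string, datasourceID int, database string) (string, error) {
	u, err := url.Parse(grafanaURL)
	if err != nil {
		return "", err
	}
	// Keep a possible sub-path of the Grafana URL and append the proxy route.
	u.Path = strings.TrimSuffix(u.Path, "/") +
		fmt.Sprintf("/api/datasources/proxy/%d/query", datasourceID)

	q := url.Values{}
	q.Set("db", database)
	q.Set("q", fmt.Sprintf("SHOW RETENTION POLICIES on %q", database))
	q.Set("epoch", "ms")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	probe, err := influxProbeURL("http://localhost:3000", 1, "ldi_v2")
	if err != nil {
		panic(err)
	}
	fmt.Println(probe)
}
```

Note that `url.Values.Encode` sorts the parameters by name and encodes spaces as `+` rather than `%20`, so the string differs from the browser request in form but carries the same query.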

@chenlujjj

chenlujjj commented Dec 13, 2021

I have to admit that I haven't considered non-Prometheus data source types.

The code to check whether a Prometheus data source is healthy (unrelated parts omitted):

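import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"

	"github.com/grafana-tools/sdk"
)

// Env holds the Grafana URL and API token.
type Env struct {
	url, token string
}

// Client wraps the Grafana SDK client.
type Client struct {
	env Env
	c   *sdk.Client
}

// validateDatasource runs the query "1+1" against a Prometheus data source
// and returns an error if the request fails or the response cannot be decoded.
func (c *Client) validateDatasource(ds sdk.Datasource) error {
	client := http.Client{Timeout: 10 * time.Second}
	// Server ("proxy") access: go through the Grafana data source proxy.
	queryURL := fmt.Sprintf("%s/api/datasources/proxy/%d/api/v1/query?query=%s", c.env.url, ds.ID, "1%2B1") // query=1+1
	if ds.Access == "direct" {
		// Browser ("direct") access: query the data source URL itself.
		queryURL = fmt.Sprintf("%s/api/v1/query?query=%s", ds.URL, "1%2B1")
	}
	req, err := http.NewRequest("GET", queryURL, nil)
	if err != nil {
		return err
	}
	// Only touch the request after the error check; req is nil on failure.
	req.Header.Set("Authorization", "Bearer "+c.env.token)
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	// Close the body on every return path, including non-200 responses.
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("StatusCode is %d", resp.StatusCode)
	}
	var response Response // the Response type is among the omitted parts
	return json.NewDecoder(resp.Body).Decode(&response)
}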

Note that the HTTP request Grafana makes when clicking the Save & Test button depends on the access mode of the data source.

Access mode controls how requests to the data source will be handled. Server should be the preferred way if nothing else is stated.

Server access mode (Default):
All requests will be made from the browser to Grafana backend/server which in turn will forward the requests to the data source and by that circumvent possible Cross-Origin Resource Sharing (CORS) requirements. The URL needs to be accessible from the grafana backend/server if you select this access mode.

Browser access mode:
All requests will be made from the browser directly to the data source and may be subject to Cross-Origin Resource Sharing (CORS) requirements. The URL needs to be accessible from the browser if you select this access mode.

So maybe the TCP connectivity issue mentioned by @jangaraj is no longer a problem, right?
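
One way the check above might be extended beyond Prometheus is a lookup table of per-type probe requests, combined with the access mode to pick the base URL. The following is only a sketch under assumptions: the `DataSource` struct, the `probes` table, and `probeURL` are hypothetical names, and the two probes are the ones mentioned in this thread (`1+1` for Prometheus, `SHOW RETENTION POLICIES` for InfluxDB). Any other type is reported as unsupported rather than guessed.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// DataSource holds the few fields this sketch needs (hypothetical type).
type DataSource struct {
	ID       int
	Type     string // Grafana plugin id, e.g. "prometheus" or "influxdb"
	Access   string // "proxy" (server mode) or "direct" (browser mode)
	URL      string // URL of the data source itself
	Database string // only used by the InfluxDB probe
}

// probes maps a data source type to the path and query of a cheap test request.
var probes = map[string]func(ds DataSource) string{
	"prometheus": func(ds DataSource) string {
		return "/api/v1/query?" + url.Values{"query": {"1+1"}}.Encode()
	},
	"influxdb": func(ds DataSource) string {
		q := url.Values{}
		q.Set("db", ds.Database)
		q.Set("q", fmt.Sprintf("SHOW RETENTION POLICIES on %q", ds.Database))
		return "/query?" + q.Encode()
	},
}

// probeURL returns the URL to request for a health probe, or an error
// when no probe is known for the data source type.
func probeURL(grafanaURL string, ds DataSource) (string, error) {
	probe, ok := probes[ds.Type]
	if !ok {
		return "", fmt.Errorf("no health probe known for data source type %q", ds.Type)
	}
	// Server mode: go through Grafana's data source proxy.
	base := fmt.Sprintf("%s/api/datasources/proxy/%d", strings.TrimSuffix(grafanaURL, "/"), ds.ID)
	if ds.Access == "direct" {
		// Browser mode: talk to the data source URL directly.
		base = strings.TrimSuffix(ds.URL, "/")
	}
	return base + probe(ds), nil
}

func main() {
	ds := DataSource{ID: 7, Type: "prometheus", Access: "proxy"}
	u, err := probeURL("http://grafana:3000", ds)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

In server mode the request only needs to reach Grafana, so the host running the check does not need network access to the data source itself. In browser mode it does.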

@amotl
Contributor Author

amotl commented Jun 20, 2022

Hi again,

we will build the foundation for this feature within grafana-client. I made a start with grafana-toolbox/grafana-client#19 and added an introduction and call for participation at grafana-toolbox/grafana-client#20, where the details of this matter can be discussed further.

As soon as the new feature is ready over there, we will return here to use it within grafana-wtf. It will be a very sweet improvement.

With kind regards,
Andreas.
