Query and error logs #12
Current format for args: queryTime[TAB]queryText[TAB]ipAddress[TAB]url[NEWLINE] But if we want to extend this to several services, we should rather use some key-value-based format to allow for special cases that some services have and others don't. We then need a common vocabulary for the usual fields. Plausible would be one JSON object per line (as suggested by Michael). We can first discuss here, but should then find a place for the result, probably in the web services notes: https://webis.de/facilities.html?q=service What could we use as a starting point? Also pinging @mam10eks in case he wants to chime in.
The basic idea is that each log file uses the JSON Lines format. This gives us the ability to log structured data and maximum flexibility, but it also means that we should standardize the JSON values to some extent (e.g. requiring timestamps). As for starting points: Log4J JSON (Java), Bunyan (JS)
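To make the JSON Lines idea concrete, here is a minimal sketch of a writer in Python. The field names (`timestamp`, `query`, `ip`, `url`) are placeholders taken from the current tab-separated format, not a settled vocabulary:

```python
import json
from datetime import datetime, timezone

def log_record(logfile, query, ip_address, url):
    """Append one log record as a single JSON object per line (JSON Lines).
    Field names are placeholders for discussion, not a final schema."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601
        "query": query,
        "ip": ip_address,
        "url": url,
    }
    logfile.write(json.dumps(record) + "\n")  # one JSON object per line
```

Because each line is a self-contained JSON object, services can add extra fields without breaking consumers that only read the common ones.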
I am indeed interested in that topic ;) But I guess that for error logs or similar we may use services that we already have running in our infrastructure: I think ChatNoir (maybe others too?) uses Loki for that. Maybe @phoerious also has an opinion on that?
So which fields should we standardize?
It is also good to have a "query" field with the query text. This will also allow us to log other requests into the same log, where the queries can then be identified easily and one does not need to parse the URL to work with them.
I have had the same thoughts for ChatNoir already. My conclusion was that logging to Elasticsearch would be the best thing. For long-term storage, the Elasticsearch bulk format can be used. I would not log directly to CephFS, since that only creates problems when logging from multiple instances at the same time and for any meaningful analytics you'd need to index it to Elasticsearch anyway. If you don't want to log to Elasticsearch directly, you can also write to a local log and then use Logstash to write batches to Elasticsearch.
That's an ISO datetime and the standard format in both Python and Elasticsearch. HTTP dates are notoriously hard to work with.
How easy is it to export the Elasticsearch logs to a text file? It is a requirement for args to have easy access to the logs in a text format. But if there is a button somewhere that you can click (accessible over VPN), I think this would be fine. We solved logging from multiple instances for our purposes by using the hostname in the filename. Logging to the filesystem is nice when I start a server locally for testing, but it would be ok-ish for me to have different possibilities here (for testing/production).
Let's please use the ISO format. Virtually every language has native support for this and I don't want to use a third-party library just for parsing timestamps.
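As a quick illustration of the "native support" point: in Python, ISO 8601 timestamps round-trip through the standard library alone, no third-party parser needed.

```python
from datetime import datetime, timezone

# Serialize a UTC timestamp as ISO 8601 and parse it back,
# using only the standard library.
stamp = datetime(2021, 3, 1, 12, 30, 0, tzinfo=timezone.utc).isoformat()
parsed = datetime.fromisoformat(stamp)
```

`stamp` is `"2021-03-01T12:30:00+00:00"`, and `fromisoformat` recovers a timezone-aware `datetime` from it (Python 3.7+).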
gRPC/gRPC-web services don't use URL params. They use Protobuf messages in the request body, so we have to log those as well (Protobuf has native support for converting messages to JSON). So gRPC services need to log a `message` field.

To summarize: Every log entry should look like this so far:

```
{
    "timestamp": string, // ISO date time
    "user": string,      // identifier for the current user
    "url"?: string,      // (optional) the full request URL
    "message"?: unknown, // the request Protobuf message (only for gRPC services)
    "query"?: string,    // (optional) the query text of the request
}
```

Services are allowed to add more custom fields.

Edit: Made the
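A small sketch of how a service could build entries that follow the proposed schema, in Python. The helper name `make_log_entry` and the `**custom` mechanism for extra fields are illustrative assumptions, not part of the proposal:

```python
import json
from datetime import datetime, timezone

# Mandatory fields per the proposed schema above.
REQUIRED_FIELDS = {"timestamp", "user"}

def make_log_entry(user, url=None, message=None, query=None, **custom):
    """Build one log entry as a JSON string; extra keyword
    arguments become service-specific custom fields."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO date time
        "user": user,
    }
    for key, value in (("url", url), ("message", message), ("query", query)):
        if value is not None:  # optional fields are simply omitted
            entry[key] = value
    entry.update(custom)
    assert REQUIRED_FIELDS <= entry.keys()
    return json.dumps(entry)
```

Optional fields are omitted rather than set to `null`, so consumers can simply test for key presence.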
Looks good! Not sure if we should then make the "url" mandatory. Should we create a dedicated Java repo for creating/writing/parsing log records? Or just have one document at some place that describes the fields? Like in the FAQ or the services howto?
I don't see why one wouldn't want to log it. Could you give an example?
We should most definitely write this down instead of having just a reference implementation. Maybe in the FAQ under "How to do logging"? I don't know if we need a dedicated Java project for that for now. How many services will use this kind of logging?
If a service has just one URL, it does not make much sense to log it. At the moment I'd indeed favor a FAQ entry, but I have to think a bit more about this. However, in order to get things done, it might be good to just start the FAQ entry and then move it somewhere else if necessary (but keep a link). I agree that this has priority over a reference implementation.
I also agree that Elasticsearch makes sense, and from the application view it should not add any overhead, since probably all popular logging libraries have a Logstash plugin. @johanneskiesel: I think the best way to transform the Elasticsearch logs into plain text files like JSONL is by using the scroll API. (Janek recommended this in the wstud-stustu-kolleg channel.) @RunDevelopment Since this is a C/C++ project, do you know good logging libraries that may have out-of-the-box support for Logstash? I just googled a bit and found that log4cplus, for example, does not seem to have out-of-the-box support (e.g. here).
Ok, I will have a look at Logstash. But Janek suggested that it is possible to first write to disk and then use Logstash to write batches to Elasticsearch. I think I like this option a lot. Then the log files are already accessible as text, but we still get the uniform place and the analytical power of Elasticsearch. I started the entry here: https://git.webis.de/code-generic/code-webis-faq/-/blob/master/README.md#how-to-code-logging Feel free to extend/improve it.
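For the disk-first variant, a minimal Logstash pipeline sketch could look like the following. The paths, host address, and index pattern are hypothetical and would need to be adapted to the actual deployment:

```
input {
  file {
    path  => "/var/log/myservice/*.jsonl"    # hypothetical JSON Lines log directory
    codec => "json"                          # one JSON object per line
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]   # hypothetical cluster address
    index => "service-logs-%{+YYYY.MM.dd}"   # daily indices for easy retention
  }
}
```

This keeps the plain-text files on disk as the primary record while Logstash ships batches to Elasticsearch in the background.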
I thought about using
Nice! But even if it is only a few lines, this is (maybe) complex:
Of course logging does not have to be perfect, but I would hope that available implementations/plugins would apply some reasonable best practices for such stuff.
Still, logging to the file system can also fail (e.g., temporary problems with CephFS). So I guess it would still be ok to log directly to Elasticsearch, but I would feel safer if we added a Prometheus alert (e.g., no submitted queries within the last 48 hours) so that we recognize potential problems within a few days.
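The suggested "no queries in 48 hours" check could be sketched as a Prometheus alerting rule like the one below. The metric name `service_queries_total` is a hypothetical counter the service would have to expose:

```yaml
groups:
  - name: query-logging
    rules:
      - alert: NoQueriesLogged
        # `service_queries_total` is a hypothetical counter of logged queries
        expr: increase(service_queries_total[48h]) == 0
        labels:
          severity: warning
        annotations:
          summary: "No queries logged in the last 48 hours"
```

An alert like this catches a silently broken logging pipeline regardless of whether the sink is the file system or Elasticsearch.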
Previously, a Tomcat Java server handled the logging, but after its removal, this application has to do it.
This issue should be used to discuss the log format, how logging will be enabled, and where logs are saved (= the responsibilities of this application).