From 077679ebe735207b7c0a79104c9a1ee45ea5a165 Mon Sep 17 00:00:00 2001
From: Kamesh Akella
Date: Wed, 19 Jun 2024 06:14:31 -0400
Subject: [PATCH] Initial docs for JVM tuning and observability

Closes #837

Signed-off-by: Kamesh Akella
Signed-off-by: Alexander Schwartz
Co-authored-by: Alexander Schwartz
---
 doc/kubernetes/modules/ROOT/nav.adoc          |  8 +++
 .../modules/ROOT/pages/running/index.adoc     | 14 ++++-
 .../ROOT/pages/running/jvm/jvm_metrics.adoc   | 23 +++++++
 .../ROOT/pages/running/jvm/jvm_options.adoc   | 61 +++++++++++++++++++
 4 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_metrics.adoc
 create mode 100644 doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_options.adoc

diff --git a/doc/kubernetes/modules/ROOT/nav.adoc b/doc/kubernetes/modules/ROOT/nav.adoc
index 09135d310..c6ccb1dda 100644
--- a/doc/kubernetes/modules/ROOT/nav.adoc
+++ b/doc/kubernetes/modules/ROOT/nav.adoc
@@ -13,6 +13,14 @@ include::partial$subnav-openshift.adoc[]
 
 * xref:testing/index.adoc[]
 * xref:running/index.adoc[]
+** xref:running/infinispan-deployment.adoc[]
+** xref:running/loadbalancing.adoc[]
+** xref:running/split-brain-stonith.adoc[]
+** xref:running/timeout_tunning.adoc[]
+** xref:running/take-active-site-offline.adoc[]
+** xref:running/bring-active-site-online.adoc[]
+** xref:running/jvm/jvm_metrics.adoc[]
+** xref:running/jvm/jvm_options.adoc[]
 * xref:customizing-deployment.adoc[]
 * xref:storage-configurations.adoc[]
 ** xref:storage/postgres.adoc[]
diff --git a/doc/kubernetes/modules/ROOT/pages/running/index.adoc b/doc/kubernetes/modules/ROOT/pages/running/index.adoc
index b732aea36..395a34aec 100644
--- a/doc/kubernetes/modules/ROOT/pages/running/index.adoc
+++ b/doc/kubernetes/modules/ROOT/pages/running/index.adoc
@@ -8,16 +8,26 @@ It summarizes the logic which is condensed in the Helm charts and scripts in thi
 IMPORTANT: Most of the guides are now available as the High availability guides on https://www.keycloak.org/high-availability/introduction[Keycloak's main website].
 Once they had been published as part of the Keycloak 23 release, they have been removed from this site.
 
+These guides will eventually be published on Keycloak's main web page.
+
 [#building-blocks]
-== Building blocks not yet published on keycloak.org
+== Building blocks
 
 * xref:running/infinispan-deployment.adoc[]
 * xref:running/loadbalancing.adoc[]
 * xref:running/split-brain-stonith.adoc[]
 * xref:running/timeout_tunning.adoc[]
+* xref:running/jvm/jvm_metrics.adoc[]
+* xref:running/jvm/jvm_options.adoc[]
 
 [#operational-procedures]
-== Operational procedures not yet published on keycloak.org
+== Operational procedures
 
 * xref:running/take-active-site-offline.adoc[]
 * xref:running/bring-active-site-online.adoc[]
+
+[#jvm-tuning]
+== JVM tuning guides
+
+* xref:running/jvm/jvm_metrics.adoc[]
+* xref:running/jvm/jvm_options.adoc[]
diff --git a/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_metrics.adoc b/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_metrics.adoc
new file mode 100644
index 000000000..fb469b832
--- /dev/null
+++ b/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_metrics.adoc
@@ -0,0 +1,23 @@
+= {project_name} JVM Metrics
+:description: This guide describes the key JVM metrics for observing the performance of {project_name}.
+
+{description}
+
+== JVM info
+
+jvm_info_total:: Provides information about the JVM, such as version, runtime, and vendor.
+
+== Heap Memory Usage
+jvm_memory_committed_bytes:: Measures the amount of memory, in bytes, that is committed for the JVM to use, that is, the portion of the allocated memory that is guaranteed to be available to the JVM.
+jvm_memory_used_bytes:: Measures the amount of memory, in bytes, currently used by the JVM, reflecting the actual memory consumption of the application and JVM internals.
+
+== Garbage Collection Metrics
+jvm_gc_pause_seconds_max:: The maximum duration, in seconds, of garbage collection pauses experienced by the JVM for a particular cause, which helps you quickly differentiate between types of GC pauses (minor, major).
+
+jvm_gc_pause_seconds_sum:: The cumulative time, in seconds, spent in garbage collection pauses, indicating the impact of GC pauses on application performance in the JVM.
+
+jvm_gc_pause_seconds_count:: The total number of garbage collection pause events, which helps to assess the frequency of GC pauses in the JVM.
+
+jvm_gc_overhead_percent:: The percentage of CPU time spent on garbage collection, as opposed to running application code or performing other tasks. This metric helps determine how much overhead GC introduces and how it affects the overall performance of the {project_name} JVM.
+
+Additional information on the {project_name} `metrics` endpoint can be found https://www.keycloak.org/server/configuration-metrics[here].
diff --git a/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_options.adoc b/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_options.adoc
new file mode 100644
index 000000000..b9af1c98e
--- /dev/null
+++ b/doc/kubernetes/modules/ROOT/pages/running/jvm/jvm_options.adoc
@@ -0,0 +1,61 @@
+= {project_name} JVM Options
+:description: This guide describes the JVM options for tuning the performance of {project_name}.
+
+{description}
+
+== Why JVM Heap tuning is relevant to performance of {project_name}
+
+{project_name}, being a Java-based application, relies on the JVM for memory management. Proper heap sizing ensures that the application has enough memory to handle its operations without encountering memory-related issues. Efficient garbage collection (GC) is a significant factor in this process. If the heap is too small, the GC will run frequently, increasing CPU usage and potentially causing pauses. Conversely, an overly large heap can result in longer GC pauses. Tuning the heap size appropriately minimizes the time spent in garbage collection and improves overall application throughput.
+
+Furthermore, adequate heap tuning helps prevent out-of-memory (OOM) errors,
+contributing to the stability and reliability of {project_name}.
+It also improves latency and response times, which are crucial for authentication and authorization tasks.
+Proper memory management enables the application to scale effectively,
+handling increased loads without performance degradation.
+Additionally, optimized heap settings ensure efficient resource utilization,
+preventing both underutilization and overconsumption of system resources.
+
+=== Set up the JVM Options
+Set the JVM options in the deployment by overriding variables such as JVM_OPTS/JAVA_OPTS_KC_HEAP and enabling the required flags. If you have multiple containers or servers, make sure the configuration is applied consistently to all the Keycloak JVMs.
+
+To verify that the configuration has been applied, run the following command on the Keycloak server node. It prints the VM flags that are in effect for the JVM with process ID 1.
+
+[source,bash]
+----
+jcmd 1 VM.flags
+----
+
+=== Standard JVM Options
+
+-XX:MetaspaceSize:: Set the initial metaspace size.
+-XX:MaxMetaspaceSize:: Set the maximum metaspace size.
+
+=== JAVA_OPTS_KC_HEAP
+==== Container-specific workload JVM Heap Options
+
+-XX:MaxRAMPercentage:: Set the maximum heap size as a percentage of the memory available to the JVM (in a container, the container's memory limit).
+-XX:MinRAMPercentage:: Set the maximum heap size as a percentage of the available memory when only a small amount of memory is available. Despite its name, this option does not set a minimum heap size.
+-XX:InitialRAMPercentage:: Set the initial heap size as a percentage of the memory available to the JVM.
+
+==== Non-Container specific workload JVM Heap Options
+-Xms:: Set the initial heap size for the JVM.
+-Xmx:: Set the maximum heap size for the JVM.
+
+==== Garbage Collection Tuning Options
+-XX:+UseG1GC:: Enable the G1 garbage collector.
+-Xlog:gc:file="path/to/file":: Write GC logs to the given file so that they can be collected for GC log analysis.
+-XX:MaxGCPauseMillis:: Set the target for maximum GC pause time.
+
+==== Performance Tuning Options
+-XX:MinHeapFreeRatio:: Set the minimum percentage of free heap space to maintain before expanding the heap.
+-XX:MaxHeapFreeRatio:: Set the maximum percentage of free heap space to maintain before shrinking the heap.
+-XX:GCTimeRatio:: Set the desired ratio of garbage collection time to application time.
+-XX:AdaptiveSizePolicyWeight:: Adjust the weight of the adaptive size policy decisions in the JVM.
+-XX:ConcGCThreads:: Specify the number of threads used for concurrent garbage collection.
+-XX:CICompilerCount:: Set the number of compiler threads for just-in-time (JIT) compilation.
+
+==== Additional JVM Options for analysis
+-XX:+ExitOnOutOfMemoryError:: Exit the JVM when an OutOfMemoryError occurs.
+-XX:FlightRecorderOptions=stackdepth=512:: Set the stack depth for Java Flight Recorder (JFR) recordings to 512, which is useful when a recording is captured for heap analysis.
+
+