You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
There appears to be a race condition causing silent failures when the EventRepository object is first initialized. This prevents the app from sending events to Rudderstack.
We are working on a React Native project using the Rudderstack React Native SDK. We noticed that we were not receiving any messages from Android devices and, through a long and tedious process, uncovered that this Android SDK has an uncaught race condition which is silently causing the SDK Initialization process to hang and never complete.
In the example, if the ExecutorService performs the brokenThread() logic before normalThread() then the call to lock.unlock() is never performed and normalThread() cannot proceed as it can never obtain the lock.
More specifically, the scenario being played out in our testing is as follows:
The call to RudderServerConfigManager.getConfig() is implicitly dependent on the logic from RudderServerConfigManager.fetchConfig() having been completed, which is not guaranteed
If the logic submitted by RudderServerConfigManager.fetchConfig() has not completed yet (ie. Slow download, device thread prioritization), then initiateSDK() will continue and enter its retry-loop
In the case where the Data Plane URL is not available, ReportManager.enableStatsCollection() throws an uncaught NullPointerException
Keep in mind that the following example is from manually surrounding logic with a Try-Catch and injecting custom log messages so the line references will not match the mainline branch
2024-12-09 13:26:53.315 5514-5596 System.err W java.lang.NullPointerException: Attempt to invoke virtual method 'com.rudderstack.android.sdk.core.SourceConfiguration$Metrics com.rudderstack.android.sdk.core.SourceConfiguration$StatsCollection.getMetrics()' on a null object reference
2024-12-09 13:26:53.315 5514-5596 System.err W at com.rudderstack.android.sdk.core.ReportManager.enableStatsCollection(ReportManager.java:166)
2024-12-09 13:26:53.315 5514-5596 System.err W at com.rudderstack.android.sdk.core.EventRepository.$r8$lambda$YGTjZjmH_we7SltuiZkqwISvAUw(Unknown Source:0)
2024-12-09 13:26:53.315 5514-5596 System.err W at com.rudderstack.android.sdk.core.EventRepository.lambda$initiateRudderReporterFromPrefetchedConfig$0(EventRepository.java:169)
2024-12-09 13:26:53.315 5514-5596 System.err W at com.rudderstack.android.sdk.core.EventRepository$$ExternalSyntheticLambda2.onConfigFetched(D8$$SyntheticClass:0)
2024-12-09 13:26:53.316 5514-5596 System.err W at com.rudderstack.android.sdk.core.RudderServerConfigManager.lambda$getFetchedConfig$1(RudderServerConfigManager.java:98)
2024-12-09 13:26:53.316 5514-5596 System.err W at com.rudderstack.android.sdk.core.RudderServerConfigManager.$r8$lambda$D1rwBTLDndjjqf3X7BU51T5tnRk(Unknown Source:0)
2024-12-09 13:26:53.316 5514-5596 System.err W at com.rudderstack.android.sdk.core.RudderServerConfigManager$$ExternalSyntheticLambda1.run(D8$$SyntheticClass:0)
2024-12-09 13:26:53.316 5514-5596 System.err W at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:487)
2024-12-09 13:26:53.316 5514-5596 System.err W at java.util.concurrent.FutureTask.run(FutureTask.java:264)
2024-12-09 13:26:53.316 5514-5596 System.err W at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
2024-12-09 13:26:53.316 5514-5596 System.err W at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:644)
2024-12-09 13:26:53.316 5514-5596 System.err W at java.lang.Thread.run(Thread.java:1012)
When the uncaught NPE is thrown, it kills the Thread it's running within.
Since the use of the ReentrantLock in RudderServerConfigManager is not using the following suggested structure, the Thread that previously claimed the lock will never release it and all other Threads waiting for it will be stuck.
// The "Correct" way according to the Javadocslock.lock();
try {
// Critical Section
} finally {
lock.unlock();
}
What interested us the most is that this does not appear to be breaking the app that Rudderstack is being used within, but simply prevents Rudderstack from doing anything
For example, calls to flush() will throw an error (if log output is explicitly configured), but otherwise not kill the app
Also worth noting; The act of catching the NPE prevents this issue from occurring as the Thread is not killed, but rather carries on following the printing of the Stacktrace
To Reproduce
Steps to reproduce the behaviour:
We have a simple recreation in a running React Native project if you need to see it in action.
The Zip is 300MB so please download from Google Drive as GitHub won't allow me to upload it
Download the example app and launch it (virtual or real device; doesn't matter)
The example app comes with a README which describes the process of launching and recreating the issue
Expected Behavior
The event should be fired off and no errors should be observed
Screenshots
Custom log capture painting a picture of the timeline of events in the app:
---------------------------- PROCESS STARTED (6097) for package [REDACTED] ----------------------------
2024-12-09 13:44:00.332 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) EventRepository constructor entered
2024-12-09 13:44:00.373 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) RudderServerConfigManager constructor entered
2024-12-09 13:44:00.373 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) fetchConfig() entered
2024-12-09 13:44:00.373 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) fetchConfig() exited
2024-12-09 13:44:00.374 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) RudderServerConfigManager constructor exited
2024-12-09 13:44:00.379 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) initiateSDK() entered
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) SDK Not Initialized. Retry 0
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) Calling configManager.getConfig()
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getConfig() entered
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) claiming lock
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) successfully claimed lock
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) releasing lock
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) successfully released lock
2024-12-09 13:44:00.380 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getConfig() exited
2024-12-09 13:44:00.380 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) initiateRudderReporterFromPrefetchedConfig() entered
2024-12-09 13:44:00.381 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) getFetchedConfig() entered
2024-12-09 13:44:00.381 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) getFetchedConfig() exited
2024-12-09 13:44:00.381 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) initiateRudderReporterFromPrefetchedConfig() exited
2024-12-09 13:44:00.381 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) DataPlaneUrl at end of EventRepository constructor: null
2024-12-09 13:44:00.381 6097-6152 RudderSDK D Debug: [TEST] (Thread ID 204) Exiting EventRepository constructor: null
2024-12-09 13:44:00.381 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) Executing fetchConfig() logic
2024-12-09 13:44:00.381 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) downloadConfig() entered
2024-12-09 13:44:00.386 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) RudderServerConfigManager: downloadConfig: configUrl: [REDACTED]
2024-12-09 13:44:00.391 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getFetchedConfig() logic entered
2024-12-09 13:44:00.391 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) claiming lock
2024-12-09 13:44:00.391 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) successfully claimed lock
2024-12-09 13:44:00.391 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getFetchedConfig() logic critical section entered
2024-12-09 13:44:00.391 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getFetchedRudderServerConfig() entered
2024-12-09 13:44:00.396 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getFetchedRudderServerConfig() exited
2024-12-09 13:44:00.396 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) EventRepository: constructor: Prefetched source serverConfig is available
2024-12-09 13:44:00.396 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getDataPlaneUrl() entered
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getDataResidencyUrl() entered
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) Data Residency URL is using Default Region
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getDataResidencyUrl() exited
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) Data Residency URL was empty. Returning Config Value: [REDACTED]
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getDataPlaneUrl() exited
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) enableStatsCollection() entered
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) isStatsReporterAvailable() entered
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) isStatsReporterAvailable() exited
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) releasing lock
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) successfully released lock
2024-12-09 13:44:00.397 6097-6176 RudderSDK D Debug: [TEST] (Thread ID 220) getFetchedConfig() logic exited
2024-12-09 13:44:00.667 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) RudderServerConfigManager: downloadConfig: configJson: {
"source": {
"config": {},
"id": "[REDACTED]",
"name": "sandbox_mobile_reactnative",
"writeKey": "[REDACTED]",
"enabled": true,
"sourceDefinitionId": "[REDACTED]",
"deleted": false,
"createdAt": "Thu May 04 2023 15:47:00 GMT-0400 (Eastern Daylight Time)",
"updatedAt": "Thu May 04 2023 15:47:00 GMT-0400 (Eastern Daylight Time)",
"sourceDefinition": {
"id": "[REDACTED]",
"name": "ReactNative",
"options": null,
"displayName": "React Native",
"category": null,
"createdAt": "2020-04-16T11:22:37.660Z",
"updatedAt": "2020-06-18T11:54:09.089Z"
},
"destinations": []
},
"metadata": {
"version": "1.0.2"
}
}
2024-12-09 13:44:00.668 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) saveRudderServerConfig() entered
2024-12-09 13:44:00.670 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) saveRudderServerConfig() exited
2024-12-09 13:44:00.670 6097-6174 RudderSDK I Info: [TEST] (Thread ID 218) RudderServerConfigManager: downloadConfig: server config download successful
2024-12-09 13:44:00.670 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) downloadConfig() exited
2024-12-09 13:44:00.670 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) claiming lock
2024-12-09 13:44:00.670 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) successfully claimed lock
2024-12-09 13:44:00.670 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) getFetchedRudderServerConfig() entered
2024-12-09 13:44:00.671 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) getFetchedRudderServerConfig() exited
2024-12-09 13:44:00.672 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) releasing lock
2024-12-09 13:44:00.672 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) successfully released lock
2024-12-09 13:44:00.672 6097-6174 RudderSDK D Debug: [TEST] (Thread ID 218) Finished fetchConfig() logic
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) SDK Not Initialized. Retry 1
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) Calling configManager.getConfig()
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getConfig() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) claiming lock
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) successfully claimed lock
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) releasing lock
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) successfully released lock
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getConfig() exited
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) serverConfig is not null
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) SDK is enabled
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) DataPlaneUrl is: null
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataResidencyUrl() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) Data Residency URL is using Default Region
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataResidencyUrl() exited
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) DataResidencyUrls is set to: null
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataPlaneUrl() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataResidencyUrl() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) Data Residency URL is using Default Region
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataResidencyUrl() exited
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) Data Residency URL was empty. Returning Config Value: [REDACTED]
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) getDataPlaneUrl() exited
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) enableStatsCollection() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) isStatsReporterAvailable() entered
2024-12-09 13:44:02.381 6097-6175 RudderSDK D Debug: [TEST] (Thread ID 219) isStatsReporterAvailable() exited
{{No further [TEST] logs | App continues to "operate"}}
Version of the Android SDK
Android 14 (API Level 34)
Version of the Rudderstack Android SDK
1.25.1
Mobile device mode integration
Rudderstack React Native SDK 1.14.0 (Issue present up to 1.15.0)
The example app is relying on the Rudderstack React Native SDK to perform it's tasks so I don't believe that qualifies as calling on the "native SDK" directly
SDK initialisation snippet
See Rudderstack React Native SDK source code for this initialization logic
Check for Correct Usage of writeKey and dataPlaneUrl
Confirm that the correct writeKey and dataPlaneUrl are utilized during SDK initialization.
This issue prevents dataPlaneUrl from being correctly initialized
The text was updated successfully, but these errors were encountered:
Thanks for raising this issue. It seems you're using the self-hosted control plane URL and the metrics config is missing. I would request you to add the following config:
Or, try using the default control plane and try again:
await rudderClient.setup(rudderConfig.writeKey, {
dataPlaneUrl: rudderConfig.dataPlaneUrl,
// Just remove the controlPlaneUrl field and SDK will automatically use the default controlPlaneUrl.
trackAppLifecycleEvents: true,
recordScreenViews: false,
logLevel: RUDDER_LOG_LEVEL.VERBOSE,
});
Let us know if you are still facing any issues after performing the above steps.
Describe the bug
There appears to be a race condition causing silent failures when the
EventRepository
object is first initialized. This prevents the app from sending events to Rudderstack.We are working on a React Native project using the Rudderstack React Native SDK. We noticed that we were not receiving any messages from Android devices and, through a long and tedious process, uncovered that this Android SDK has an uncaught race condition which is silently causing the SDK Initialization process to hang and never complete.
The issue stems from how the RudderServerConfigManager class is using
ReentrantLock
.brokenThread()
logic beforenormalThread()
then the call tolock.unlock()
is never performed andnormalThread()
cannot proceed as it can never obtain the lock.More specifically, the scenario being played out in our testing is as follows:
EventRepository
is createdRudderServerConfigManager
constructor calls on RudderServerConfigManager.fetchConfig() which submits a task to be performed asynchronouslyfetchConfig()
method is exited regardless of whether or not the async logic is performedEventRepository
constructor then eventually calls EventRepository.initiateSDK()Server Config
existsRudderServerConfigManager.fetchConfig()
having been completed, which is not guaranteedRudderServerConfigManager.fetchConfig()
has not completed yet (ie. Slow download, device thread prioritization), theninitiateSDK()
will continue and enter its retry-loopinitiateSDK()
spawned a new Thread, it returns almost immediately andEventRepository
works its way down to EventRepository.initiateRudderReporterFromPrefetchedConfig()RudderServerConfigManager.fetchConfig()
having completedData Plane URL
Data Plane URL
is not available,ReportManager.enableStatsCollection()
throws an uncaught NullPointerExceptionReentrantLock
inRudderServerConfigManager
is not using the following suggested structure, the Thread that previously claimed the lock will never release it and all other Threads waiting for it will be stuck.flush()
will throw an error (if log output is explicitly configured), but otherwise not kill the appTo Reproduce
Steps to reproduce the behaviour:
Expected Behavior
The event should be fired off and no errors should be observed
Screenshots
Version of the Android SDK
Version of the Rudderstack Android SDK
Mobile device mode integration
SDK initialisation snippet
Check for Correct Usage of writeKey and dataPlaneUrl
Confirm that the correct
writeKey
anddataPlaneUrl
are utilized during SDK initialization.dataPlaneUrl
from being correctly initializedThe text was updated successfully, but these errors were encountered: