Lead-up:
Google Play introduced a new mandatory target SDK version requirement
We released the compliant version 1.64.xx in early Jan and observed normal operations for 3 weeks, across several minor release iterations
Incident:
On Feb 1 we started observing early signs of deteriorating detection on Android for a very small subset of users and began an investigation
On Feb 4 the share of impacted users approached 10%, with a disproportionate number of them on Samsung devices running Android 12 and 13
On Feb 6 the share of impacted users escalated to ~15%, and through our investigation we isolated the issue to the transport layer of the app
On Feb 7 we started testing a potential fix and created a backend workaround to facilitate drive capture
On Feb 8 we started seeing the mitigation measures take effect and rolled out an app fix; a portion of impacted users started seeing recovery at that time
On Feb 9 we created an additional app update and rolled it out to 100%, effectively resolving the issue
Learnings:
We will prioritize transport layer robustness as part of our ongoing product modernization
We were under-monitored in some areas and will focus on additional telemetry that would have sped up detection and remediation
We will allow more time to soak test mandatory Play Store target SDK updates