Recent Issues with Crash Catch (Registration and Flutter)

Chris Board

Jul 21, 2022 · 7 min read

I recently identified some issues with Crash Catch and want to let you know how they were identified and what I’m doing to ensure they don’t happen again.

How Were Issues Identified

As you may be aware, Crash Catch is still relatively new and is currently running in beta, so some issues are to be expected. However, the issues that I recently discovered should not have been there regardless of the beta status, and I am embarrassed that they went unnoticed.

Over the last month, I’ve had a few signups to Crash Catch. Most of them registered, created a project and went no further, with one or two registering and not even getting as far as creating a project.

Before we get into the issues, let me break down some of the monitoring and error handling that is done by Crash Catch.

First of all there are multiple components as listed below:

  • Engine - This receives and processes the crashes submitted by the Crash Catch libraries, plus runs some automated tasks such as sending emails when trials are about to end, housekeeping old data etc.
  • Crash Catch API - This is the main API used by the Crash Catch website. This provides an API for the web pages such as authentication, retrieving account information, retrieving crash information etc.
  • Main Web App - This is the main website where you log in and view your projects and crashes.

Should any error occur within the engine or the API, an alarm is raised and I receive an email, so I am aware of any fault in real time and can investigate.

When I noticed people signing up and not going any further than creating a project, I hadn’t received any alarms. I looked through the logs and couldn’t see any errors there either (I thought maybe something was failing but I had forgotten to add an alarm submission).

I then suspected there was something wrong with the registration process so I tried signing up myself and creating a project and everything worked as expected.

I emailed each of the signups but never got a reply with any feedback or details of any issues.

A month later there were another few signups, and the same thing happened: they signed up and then made no further progress.

I decided to retest the registration, and this time I got an error. When you log in, the app saves a global state, and this is updated as part of the registration when the registration response is returned from the API.

It turns out the reason it worked the first time I tried it was that I had previously logged in, and therefore already had a version of the global state saved. The second time I tried, I had cleared my browser cache (by luck, as I was having a home network issue), so I no longer had a global state object. When I registered, the app tried to update a setting in the global state object, which failed because the object hadn’t been created, and a JavaScript error was thrown. Ironically, I don’t have the Crash Catch library added to the main app, so this went unnoticed.
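As a rough illustration of this failure mode and one way to guard against it, here is a minimal TypeScript sketch. It is not the actual Crash Catch code: the `GlobalState` fields, the `crashcatch_state` storage key, and the `loadState` and `applyRegistration` functions are all hypothetical names chosen for the example, and `MemoryStore` simply stands in for browser storage so the sketch runs anywhere.

```typescript
// Hypothetical shape of the saved global state; the real fields are not shown in this post.
interface GlobalState {
  loggedIn: boolean;
  accountEmail: string | null;
}

// Minimal key-value interface, modelled on the parts of browser storage used here.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STATE_KEY = "crashcatch_state"; // hypothetical storage key

function defaultState(): GlobalState {
  return { loggedIn: false, accountEmail: null };
}

// Load the saved state, falling back to a fresh object when nothing usable is stored,
// for example after the browser cache has been cleared.
function loadState(store: KeyValueStore): GlobalState {
  const raw = store.getItem(STATE_KEY);
  if (raw === null) {
    return defaultState();
  }
  try {
    return { ...defaultState(), ...(JSON.parse(raw) as Partial<GlobalState>) };
  } catch {
    return defaultState(); // stored value was not valid JSON
  }
}

// Update the state from a registration response without assuming it already exists.
function applyRegistration(store: KeyValueStore, email: string): GlobalState {
  const state = loadState(store);
  state.loggedIn = true;
  state.accountEmail = email;
  store.setItem(STATE_KEY, JSON.stringify(state));
  return state;
}

// In-memory stand-in for browser storage.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// A brand new browser has nothing saved, which is the case that previously threw.
const freshBrowser = new MemoryStore();
console.log(applyRegistration(freshBrowser, "new.user@example.com").loggedIn);
```

The key point is that the registration path builds a default state when none is saved, rather than relying on an earlier login having created it.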

The second issue was around Flutter projects. The majority of people who signed up created a Flutter project and then took no further action; no crash was submitted.

I looked through the logs for the project ID and found that the new user had registered the project, successfully added the library to a Flutter project and sent an initialisation request, which was successful, but no further action was taken on that project.

I decided to retest registering a new Flutter project to see what happens. I created a project that initialised and sent a crash straight away (similar to the example in the docs) and found that although I successfully initialised the project with the library, no crash was sent to the Crash Catch backend.

I found that this was an odd timing issue: on certain devices it appeared to work and on others it failed. The crash submission was attempted before another part of the library had completed a task, which meant the crash was never actually sent.

This then uncovered another issue. Even if the crash was able to be submitted, there was a problem on the engine side. This took some time to identify, as I initially dismissed the crash processing being at fault because no alarm was raised.

This was unfortunately the result of a decision that I made during development. I had a concern, maybe not a relevant one, that if invalid data was sent to the engine API I could potentially receive a lot of alarms if someone, or something, was trying to send invalid requests, so I chose not to raise an alarm in that case. This meant that I didn’t get alerted to there being a problem with the library submitting data.
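One common way to avoid this kind of race is to hold any crash reported before setup has finished and flush it once the library is ready. The sketch below shows the idea in TypeScript rather than Dart, and it is not the Flutter library's actual fix: `CrashReporter`, `markInitialised`, `report` and the `SendFn` type are hypothetical names used only for illustration.

```typescript
type Crash = { message: string };

// Hypothetical transport function; in a real library this would call the backend.
type SendFn = (crash: Crash) => Promise<void>;

class CrashReporter {
  private ready = false;
  private pending: Crash[] = [];
  private send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  // Call once the asynchronous setup work has completed.
  async markInitialised(): Promise<void> {
    this.ready = true;
    const queued = this.pending;
    this.pending = [];
    for (const crash of queued) {
      await this.send(crash); // flush crashes that arrived too early
    }
  }

  // Safe to call at any time, including before setup has finished.
  async report(crash: Crash): Promise<void> {
    if (!this.ready) {
      this.pending.push(crash); // hold the crash instead of silently dropping it
      return;
    }
    await this.send(crash);
  }
}
```

With this shape, an app that initialises and reports a crash straight away behaves the same on fast and slow devices, because the early crash waits in the queue until `markInitialised` runs.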

What Am I Doing To Resolve

The issues identified above have already been fixed, so you will now be able to register and send Flutter crashes (make sure you are using the latest version tagged on GitHub).

If you have been following me on Twitter, you may be aware that I am in the process of redesigning Crash Catch. The web interface hasn’t changed all that much since the initial release and, using it myself, I have found certain aspects to be a little clunky and overly complicated in places, so I am currently going through a redesign.

As part of this project I am going to do further testing and make some improvements to the backend. These are detailed below:

  • I am going to further test each individual project library from start to finish. This will mean registering a new project, confirming the instructions for setting up the project are accurate and easy to follow, setting up a new project and sending a test crash. This will ensure that the official library and backend engine work as expected, and that the main web app displays the crash information correctly in the new redesign.
  • I am also going to add further alarms to the backend engine. If the API endpoint is for an official library and the request is invalid, then an alarm will be raised so I am aware. This won’t happen for custom community-built SDKs, as there is the potential that someone could miss a parameter during their own development, and I wouldn’t want to receive an alarm for that every time as there’s nothing I can do. The engine does return a response saying that the request was invalid, so this would be picked up by community developers as they work on their library.
  • Investigate improvements to Datadog monitoring. I don’t believe errors such as bad requests are currently recorded. For anything that fails within the engine outside of bad requests, such as server errors, project not found, not authenticated etc., counts are sent to Datadog so I can see if there’s an unexpected increase in certain HTTP status codes. However, API bad requests currently throw an exception outside of the code where I send responses, so I want to investigate the ability to catch these errors and then report them to Datadog.
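The alarm policy described in the list above can be sketched roughly as follows. This is an illustration under several assumptions, not the engine's implementation: the `client` field (how the engine would tell an official library from a custom SDK is not covered here), the choice to alarm on every 5xx response, and the `StatusCounter` class are all hypothetical. The counter is a plain in-memory stand-in for a metrics backend and does not use any real Datadog API.

```typescript
type ClientKind = "official" | "custom";

interface RequestOutcome {
  client: ClientKind; // hypothetical: which kind of library sent the request
  statusCode: number; // HTTP status code the engine responded with
}

// Decide whether a response should trigger an alarm email.
function shouldRaiseAlarm(outcome: RequestOutcome): boolean {
  if (outcome.statusCode >= 500) {
    return true; // assumption: server errors always raise an alarm
  }
  if (outcome.statusCode === 400) {
    // An invalid request from an official library points at a bug in that library.
    // Custom SDKs are skipped; their developers see the error in the response.
    return outcome.client === "official";
  }
  return false;
}

// In-memory stand-in for per-status-code counts sent to a metrics service.
class StatusCounter {
  private counts = new Map<number, number>();

  record(statusCode: number): void {
    this.counts.set(statusCode, this.get(statusCode) + 1);
  }

  get(statusCode: number): number {
    const value = this.counts.get(statusCode);
    return value === undefined ? 0 : value;
  }
}

// Count every outcome, bad requests included, then apply the alarm policy.
function handleOutcome(
  outcome: RequestOutcome,
  counter: StatusCounter,
  raiseAlarm: (outcome: RequestOutcome) => void
): void {
  counter.record(outcome.statusCode);
  if (shouldRaiseAlarm(outcome)) {
    raiseAlarm(outcome);
  }
}
```

Counting every status code separately from the alarm decision means a spike in bad requests from custom SDKs would still show up in the metrics without generating an email each time.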

Conclusion

I want to apologise again that these issues were present and went unnoticed for so long. I strive to make software, web apps etc. that are reliable and trustworthy, and unfortunately that has not been the case here.

I have contacted everyone who signed up in the last month to let them know these issues have been identified and resolved, but if you signed up earlier and were also affected by these issues, then please send me an email at [email protected] with the email you used to sign up and I can reactivate your trial.

As for the redesign, I am in the final stages of testing and hope to have a release in the not too distant future, but you can follow me on Twitter as I’m providing updates there.
