Five worst-case scenarios could affect the service that OnINBOX provides. This article explains each one, its impact on users, and the measures we have in place to mitigate the risk.
1. An unexpected error occurs while scanning an individual email
If OnINBOX experiences an unrecoverable error while scanning an email, it will simply move the email out of the processing folder and into the user's inbox unscanned, without adding the usual security banner.
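The fail-open behavior described above can be sketched as follows. This is a minimal illustration, not OnINBOX's actual implementation: the `Mailbox` class, the `deliver_with_fail_open` function, and the dictionary representation of an email are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """In-memory stand-in for a user's mailbox (hypothetical, illustration only)."""
    processing: list = field(default_factory=list)
    inbox: list = field(default_factory=list)


def deliver_with_fail_open(mailbox, email, scan):
    """Move one email from the processing folder to the inbox.

    `scan` is any callable that returns banner text, or raises on failure.
    """
    try:
        banner = scan(email)
        delivered = {**email, "banner": banner}
    except Exception:
        # Fail open: the email is still delivered, unscanned and without a banner.
        delivered = {**email, "banner": None}
    mailbox.processing.remove(email)
    mailbox.inbox.append(delivered)
    return delivered
```

The key design choice is that a scanning failure never blocks delivery: the email reaches the inbox either way, and only the banner is lost.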
2. OnINBOX doesn't receive a new email notification from Office 365 / Google
Generally, OnINBOX scans emails in real time, receiving a push notification from the email provider whenever a new email arrives. If the email provider experiences an issue, it may fail to send us a notification for a particular email, or stop sending notifications altogether.
To protect our customers from this scenario, we have a secondary method: we frequently poll the processing folder to see whether any emails are waiting to be processed. With this system in place, any email for which we do not receive a push notification is processed within approximately a minute. We also have monitoring in place to tell us when this secondary method is picking up emails.
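The combination of push notifications and a polling fallback can be sketched roughly as below. Everything here is an assumption made for illustration: the `Processor` class, its method names, and the `fallback_pickups` counter (standing in for the monitoring mentioned above) are hypothetical and do not describe OnINBOX's internals.

```python
class Processor:
    """Processes each email once, whether it arrives by push or is found by polling."""

    def __init__(self, list_processing_folder, process):
        self._list = list_processing_folder  # callable returning email ids in the folder
        self._process = process              # callable that scans a single email id
        self._done = set()                   # ids already handled
        self.fallback_pickups = 0            # how often polling had to step in

    def on_push(self, email_id):
        """Primary path: called when the provider sends a notification."""
        self._handle(email_id)

    def poll_once(self):
        """Secondary path: sweep the processing folder for anything missed."""
        for email_id in self._list():
            if email_id not in self._done:
                self.fallback_pickups += 1
                self._handle(email_id)

    def _handle(self, email_id):
        if email_id in self._done:
            return  # already processed, so a late push does not rescan it
        self._process(email_id)
        self._done.add(email_id)
```

In a real deployment, `poll_once` would run on a timer. A rising `fallback_pickups` count would be a signal that push notifications are not arriving as expected.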
3. Something behaves unexpectedly with an updated version of the product
We use a staggered release process: any change is first tested in our development environment, then on Red Sift's own accounts in production, and finally rolled out gradually to all of our customers. All new product releases are closely monitored in case something unexpected happens, and we are able to roll back any change very quickly.
Generally, we release new features and changes when they are ready, without announcing them in advance, and the marketing team promotes them afterward. The exceptions are changes that significantly alter the user experience or that carry some risk; for these, we will provide at least several days' notice where possible.
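As a rough sketch, a staggered release with a rollback path might look like the following. The stage names are taken from the description above, but the `staged_release` function and its `deploy`, `healthy`, and `rollback` callables are hypothetical, and rolling back every completed stage on failure is an assumption rather than a documented policy.

```python
STAGES = ["development", "internal production accounts", "gradual customer rollout"]


def staged_release(version, deploy, healthy, rollback, stages=STAGES):
    """Deploy stage by stage; stop and roll back if any stage looks unhealthy.

    Returns True if every stage succeeded, False if the release was rolled back.
    """
    completed = []
    for stage in stages:
        deploy(version, stage)
        completed.append(stage)
        if not healthy(stage):
            # Undo the stages in reverse order, most recent first.
            for done in reversed(completed):
                rollback(version, done)
            return False
    return True
```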
4. OnINBOX experiences downtime
We have a large number of monitors in place to detect whether OnINBOX is likely to experience a degradation in service or any downtime. A Software Engineer is always assigned to respond quickly to alerts, including outside regular office hours, with a clear escalation process. If we are unable to recover the service immediately, we will contact our customers via email and activate our emergency protocol.
Our emergency protocol involves activating an isolated, standalone service that temporarily disables OnINBOX on all of the affected users' email accounts. The service deletes the rule that moves messages to the processing folder and moves any emails already in the processing folder into the user's inbox, so that email deliverability is unaffected by the outage. The emergency protocol takes only a couple of minutes to complete, and once the issue has been resolved, we can reverse it just as quickly.
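The two steps of the emergency protocol, and its reversal, can be illustrated with a small in-memory model. This is a sketch under stated assumptions only: the `Account` class, the `routing_rule_enabled` flag, and both function names are hypothetical, and the real service operates through the email provider rather than on local objects.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """In-memory stand-in for one user's email account (hypothetical)."""
    routing_rule_enabled: bool = True  # rule that sends new mail to the processing folder
    processing: list = field(default_factory=list)
    inbox: list = field(default_factory=list)


def activate_emergency_protocol(accounts):
    for account in accounts:
        # Step 1: remove the rule so new mail lands directly in the inbox.
        account.routing_rule_enabled = False
        # Step 2: flush anything already waiting in the processing folder.
        account.inbox.extend(account.processing)
        account.processing.clear()


def reverse_emergency_protocol(accounts):
    for account in accounts:
        # Restore the rule so new mail is scanned again.
        account.routing_rule_enabled = True
```

Removing the rule before flushing the folder means no email that arrives mid-protocol is left behind in the processing folder.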
5. Google API / Office 365 API experiences downtime
The Google / Office 365 API serves as the 'middle man' between OnINBOX and your email accounts. As in the previous scenario, we have a number of monitors in place to detect any degradation or downtime, and we would contact you if this scenario were to happen. Depending on the type of downtime or degradation, we may be able to activate our emergency protocol to ensure full email deliverability. There is a chance, however, that the type of downtime may affect our ability to carry out our emergency protocol effectively.
To mitigate this risk, unlike an email gateway, we never stop or delay emails from arriving in a user's email account. Instead, we move emails into a processing folder until OnINBOX has scanned them. In this scenario, users will therefore still receive all of their emails, but the emails will sit in the processing folder until the downtime has eased enough for the emergency protocol to complete, or until the email provider's service is restored (whichever happens first).
Fortunately, because Google and Office 365 are two of the largest global services, this is also the least likely of the worst-case scenarios. If it were to occur, we would expect a small army to be working on solving the issue very quickly. This scenario would affect not only OnINBOX but also any other product using the API.