Disaster Recovery Plan

This document outlines the SchoolsPLP team's response to major service disruptions and disasters. SchoolsPLP operates as a distributed company, which inherently reduces the risk of a single point of failure tied to one facility or area. We know that no disaster plan can ever be truly complete, so we follow a process of continuous improvement: we identify key next steps in preparedness, with the goal of steadily expanding contingency planning and redundant processes. SchoolsPLP leadership reviews and updates this plan every six months.

  1. Small-Scale Data Corruption
    1. Whether the corruption is accidental or deliberate, the process for dealing with small-scale data corruption is identical.
    2. Within SchoolsPLP databases, we create daily snapshots of data that are kept for 30 days. One of these snapshots can be restored to a temporary database in order to recover lost data. An additional option to restore functionality could be to re-sync recent data from Agilix DLAP databases.
    3. In case of small-scale data corruption in Agilix DLAP databases, all entity, item, and resource data is versioned and can be restored to a previous version after a change. If corrupt student activity records were stored, we would work with Agilix engineers to recover them using their backups.
  2. Loss of Database Servers and/or Large-Scale Data Corruption
    1. This risk is reduced by our use of Amazon Web Services’ “serverless” database technology. There is no single physical server upon which we rely, which reduces the risk of a point-source failure.
    2. Within SchoolsPLP databases, we create daily snapshots of data that are kept for 30 days. One of these snapshots can be restored onto a new database cluster within a few hours.
  3. Loss of Application Servers
    1. This risk is reduced by SchoolsPLP’s use of Amazon Web Services’ virtual servers, so there is no single physical server upon which we rely.
    2. All application server configuration and deployment data is kept in an independent version control system, which allows us to deploy new application servers in a very short time.
  4. Loss of AWS Availability Zone
    1. Continued operation during a short-term AWS failure, while possible, would be unjustifiably expensive. These events are rare enough that our plan is to weather short-term events and to be able to recover in case of a long-term event.
    2. Recovery in this situation is identical to recovery from a simultaneous loss of database and application servers, and would follow those plans.
  5. Loss of AWS Region
    1. While an event such as this has never taken place, our intention is to implement an automated process of duplicating all backup data into an alternate AWS Region in case of unprecedented infrastructure loss.
  6. Loss of Key Workstations/Local Data
    1. Mitigation strategies: (1) staff laptops use whole-disk encryption; (2) policy dictates that company data is stored in the cloud.
    2. In case of workstation theft, all user credentials will be changed immediately.
  7. Agilix Cloud Server Instance Downtime or Loss
    1. SchoolsPLP relies on Agilix’s DLAP service. Agilix’s DRP is incorporated herein by reference.
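
As an illustration of the snapshot-based recovery described in items 1 and 2 above, the following minimal sketch shows how a restore point might be chosen from daily snapshots kept for 30 days. It is not SchoolsPLP's actual tooling: the function name `pick_restore_snapshot`, its parameters, and the rule of choosing the newest snapshot taken before the corruption are assumptions made for this example. Only the 30-day retention figure comes from this plan.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

# From the plan: daily snapshots are kept for 30 days.
RETENTION_DAYS = 30


def pick_restore_snapshot(
    snapshot_times: Iterable[datetime],
    corruption_time: datetime,
    now: datetime,
) -> Optional[datetime]:
    """Return the newest retained snapshot taken before the corruption.

    Hypothetical helper: returns None when no retained snapshot predates
    the corruption, in which case another recovery source would be needed.
    """
    # Snapshots older than the retention window are assumed to be gone.
    oldest_retained = now - timedelta(days=RETENTION_DAYS)
    candidates = [
        t for t in snapshot_times
        if oldest_retained <= t < corruption_time
    ]
    # The newest clean snapshot minimizes the amount of data to re-create.
    return max(candidates, default=None)
```

Under these assumptions, if snapshots are taken daily at 02:00 and corruption is detected at 09:30 on a given day, the function returns that day's 02:00 snapshot, provided it is still within the retention window. A `None` result corresponds to the cases in this plan where another source is needed, such as re-syncing recent data from Agilix DLAP databases.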