NARA needed to validate archival data from President Barack Obama’s administration prior to transferring it to its new cloud-based home, the Executive Office of the President (EOP) 44. Softek answered this critical need by providing detailed validation of all elements and submission ingest packages (SIP) using a gateway, the Data Transport and Control (DATR) environment.
The Challenge – Ensuring a Smooth Transition of EOP 44 to a Digital Platform
NARA needed to transition the Obama Administration’s paper and digital archives to a newly modernized, more secure cloud-based platform. The Obama Library worked diligently to digitize the paper textual records (DTR), which amounted to roughly 30 million pages. Each DTR had to be formatted into submission ingest packages (SIP) for compliance with NARA’s governance model. NARA needed to:
- Help define a SIP specification,
- Coordinate transfer and acceptance of SIPs,
- Validate SIPs against the SIP specification, and
- Make any required changes to ensure each DTR is accessible in the EOP.
These activities take place within the DATR environment – a safe space for data validation, inventory, and virus control prior to entering Dev/Test. Ensuring transmission and ingestion with 100 percent integrity was essential.
Softek designed and deployed the DATR dev/test environment to closely mirror the EOP 44 cloud dev/test environment and to allow for isolation and validation of incoming SIPs. Validation included:
- Initial compliance
- Anti-virus screening
- Fixity and inventory
- Archival arrangement
- Data format and structure validation, and
- Image attribute validation.
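As an illustration of the fixity and inventory step, the sketch below compares the files in a SIP directory against a manifest of expected checksums. It is a minimal example under stated assumptions, not Softek's actual implementation: the function names, the SHA-256 algorithm, and the manifest shape (relative path mapped to hex digest, supplied separately from the SIP directory) are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large image files do not load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity_and_inventory(sip_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare a SIP directory with a manifest (hypothetical format: path -> SHA-256 hex)."""
    # Inventory: every file actually present, as POSIX-style relative paths.
    present = {
        p.relative_to(sip_dir).as_posix()
        for p in sip_dir.rglob("*")
        if p.is_file()
    }
    expected = set(manifest)
    # Fixity: recompute the checksum of each file that is both listed and present.
    mismatched = sorted(
        name
        for name in expected & present
        if sha256_of(sip_dir / name) != manifest[name].lower()
    )
    return {
        "missing": sorted(expected - present),
        "unexpected": sorted(present - expected),
        "checksum_mismatch": mismatched,
    }
```

A SIP passes this gate only when all three lists in the returned report are empty. Any non-empty list names the exact files that need to be resent or corrected.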
Softek placed DATR in the optimized AWS GovCloud between NARA and the Obama Foundation. DATR’s capabilities included development, test, tools, scripts, and services for conducting automated transfer, acceptance, and quality controls. We worked with the Obama Foundation library to create the best SIP design, including content and controls, and to meet NARA EOP SIP specifications.
Softek automated specification checks in AWS Lambda to create triggers to move SIPs through each validation gate while generating real-time progress metrics. Metrics were visualized with dashboards and provided information such as the number of SIPs expected, received, and validated. When invalid SIPs were identified (for example, SIPs that failed fixity, inventory, or image quality checks), an issue summary was automatically generated to guide the fix. Softek also:
- Optimized access management activities with AWS IAM
- Provided data encryption at rest and in transit with AWS KMS and AWS ACM, and
- Enabled auditing with AWS CloudWatch.
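The gate-by-gate flow and progress metrics described above can be sketched in plain Python. This is an illustrative outline only: in the project the checks ran as AWS Lambda triggers, whereas here ordinary functions stand in for them. The gate names, the SIP record fields (`id`, `checksum_ok`, `dpi`), and the 300 dpi threshold are hypothetical and do not come from the NARA SIP specification.

```python
from dataclasses import dataclass, field
from typing import Callable

# A gate inspects one SIP record and returns a list of problems (empty means pass).
Gate = Callable[[dict], list[str]]

MIN_DPI = 300  # hypothetical image-quality threshold, for illustration only


def fixity_gate(sip: dict) -> list[str]:
    return [] if sip.get("checksum_ok") else ["checksum does not match manifest"]


def image_gate(sip: dict) -> list[str]:
    dpi = sip.get("dpi", 0)
    return [] if dpi >= MIN_DPI else [f"resolution {dpi} dpi is below {MIN_DPI}"]


@dataclass
class Metrics:
    expected: int
    received: int = 0
    validated: int = 0
    issues: dict[str, list[str]] = field(default_factory=dict)


def run_gates(sip: dict, gates: list[tuple[str, Gate]]) -> list[str]:
    """Run gates in order; stop at the first failing gate and return its issue summary."""
    for name, gate in gates:
        problems = gate(sip)
        if problems:
            return [f"{name}: {problem}" for problem in problems]
    return []


def process_batch(sips: list[dict], gates: list[tuple[str, Gate]], expected: int) -> Metrics:
    """Push each received SIP through the gates and tally dashboard-style counts."""
    metrics = Metrics(expected=expected)
    for sip in sips:
        metrics.received += 1
        issues = run_gates(sip, gates)
        if issues:
            metrics.issues[sip["id"]] = issues
        else:
            metrics.validated += 1
    return metrics
```

Stopping at the first failing gate keeps each issue summary short and tied to a single stage, so the sender knows which check to address before resubmitting.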
Our Impact – Fully Validated SIPs, Zero Data Loss
Softek leveraged AWS GovCloud by taking advantage of its capacity-based pricing model and flexibility. This created cost savings because NARA’s ingest loads were sporadic but high-capacity, and the service remained zero-cost when not in use.
Our expertise in the cloud allowed us to create secure systems that ensured data was transferred and received with no chance of a virus intrusion or security breach.
We also provided outstanding validation of SIPs and ingestion, resulting in zero data loss and no loss of integrity in the transferred data.