Huntington Bank: Redacting sensitive data from 400M+ documents with AWS
Huntington Bank used Amazon Textract, Amazon SageMaker, AWS Step Functions, and AWS Lambda to build a scalable redaction workflow, cutting the timeline for processing 400 million documents from years to months. The solution encrypted data at rest and in transit, met strict access requirements, and achieved redaction accuracy of 95% or higher, which let Huntington process the documents at scale while staying compliant with PCI DSS requirements. The same pattern is relevant to engineers building AI systems that need large-scale document processing and redaction.
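To get a feel for the scale, the sustained throughput a deadline implies can be estimated with simple arithmetic. This is a back-of-the-envelope sketch: only the 400 million figure comes from the article, while the 180-day deadline and the function name are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_docs_per_second(total_docs, days):
    """Average documents per second needed to finish total_docs within days."""
    return total_docs / (days * SECONDS_PER_DAY)


# 400 million documents is from the article; 180 days is an assumed deadline.
rate = required_docs_per_second(400_000_000, 180)
print(f"Sustained rate needed: {rate:.1f} documents per second")
```

Halving the deadline doubles the required rate, which is why a pipeline that runs many documents in parallel matters at this volume.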
⚡ Key Takeaways
- Amazon Textract was used to detect sensitive data such as Social Security numbers, account numbers, and personal addresses in documents.
- AWS Step Functions and AWS Lambda were used to orchestrate the document processing pipeline and improve accuracy in detecting sensitive information.
- AWS DataSync and AWS Direct Connect were used to securely transfer over 400 million documents from an on-premises file share to an Amazon S3 bucket.
- The solution achieved redaction accuracy of 95% or higher, meeting compliance requirements.
- AWS Key Management Service (AWS KMS) was among the services used to keep data encrypted at rest and in transit.
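As an illustration of the detection step, the sketch below scans a Textract-style response for Social Security numbers and returns the regions to redact. It is a minimal sketch, not Huntington's method: the regex matching, the function name `find_ssn_regions`, and the hand-written sample response are all assumptions. Only the response shape (`Blocks`, `BlockType`, `Text`, `Geometry.BoundingBox`) follows what Textract returns.

```python
import re

# Assumed pattern: US Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_regions(textract_response):
    """Return bounding boxes of WORD blocks whose text looks like an SSN."""
    regions = []
    for block in textract_response.get("Blocks", []):
        # WORD blocks give tighter boxes than LINE blocks.
        if block.get("BlockType") != "WORD":
            continue
        if SSN_PATTERN.search(block.get("Text", "")):
            regions.append(block["Geometry"]["BoundingBox"])
    return regions


# Hand-written sample in the shape of a Textract text-detection response.
sample_response = {
    "Blocks": [
        {
            "BlockType": "LINE",
            "Text": "SSN: 123-45-6789",
            "Geometry": {"BoundingBox": {"Left": 0.10, "Top": 0.20, "Width": 0.30, "Height": 0.02}},
        },
        {
            "BlockType": "WORD",
            "Text": "SSN:",
            "Geometry": {"BoundingBox": {"Left": 0.10, "Top": 0.20, "Width": 0.06, "Height": 0.02}},
        },
        {
            "BlockType": "WORD",
            "Text": "123-45-6789",
            "Geometry": {"BoundingBox": {"Left": 0.18, "Top": 0.20, "Width": 0.22, "Height": 0.02}},
        },
    ]
}

regions = find_ssn_regions(sample_response)
```

Bounding box values are ratios of the page size, so a redaction step would multiply them by the page's pixel dimensions before drawing over the region. A pattern match like this only catches one formatting of one data type; account numbers and addresses would need their own rules or a trained model.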
The Huntington Bank case study shows how AWS services can be combined into a scalable and secure document processing pipeline, which matters for organizations that must process large volumes of sensitive documents while staying compliant with regulatory requirements. The approach can be applied to other industries that require large-scale document processing and redaction.
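The "secure" half of such a pipeline can be expressed as configuration. The sketch below assumes SSE-KMS for encryption at rest and a bucket policy that rejects non-TLS requests for encryption in transit. The article does not describe Huntington's actual settings, and the function names, bucket name, and key ID are hypothetical.

```python
import json


def build_put_object_args(bucket, key, body, kms_key_id):
    """Keyword arguments for an S3 put_object call that requests SSE-KMS."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # encrypt at rest with a KMS key
        "SSEKMSKeyId": kms_key_id,
    }


def build_tls_only_policy(bucket):
    """Bucket policy that denies any request not sent over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# Hypothetical bucket name and key ID, used only to show the output shape.
put_args = build_put_object_args("example-docs", "in/doc-1.pdf", b"%PDF", "example-key-id")
policy_json = json.dumps(build_tls_only_policy("example-docs"), indent=2)
```

With boto3 these would be passed as `s3.put_object(**put_args)` and `s3.put_bucket_policy(Bucket="example-docs", Policy=policy_json)`. Building the arguments in plain functions keeps them easy to review and test without AWS credentials.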
✅ Practical Steps
- Use Amazon Textract to detect sensitive data in documents and improve redaction accuracy.
- Design a scalable document processing pipeline using AWS Step Functions and AWS Lambda.
- Use AWS DataSync and AWS Direct Connect to securely transfer large volumes of documents into Amazon S3.
- Implement data encryption at rest and in transit using AWS Key Management Service (AWS KMS).
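The orchestration step above could be sketched as a Step Functions state machine definition written in Amazon States Language. This is a minimal sketch under assumptions: the three-stage split (extract, detect, redact), the state names, the retry settings, and the Lambda ARNs are all hypothetical, since the article does not publish Huntington's actual workflow.

```python
import json


def build_redaction_state_machine(extract_arn, detect_arn, redact_arn):
    """Build an Amazon States Language definition with three Lambda tasks."""
    # Assumed retry settings: up to 3 attempts with exponential backoff.
    retry = [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ]
    return {
        "Comment": "Hypothetical document redaction pipeline",
        "StartAt": "ExtractText",
        "States": {
            "ExtractText": {
                "Type": "Task",
                "Resource": extract_arn,
                "Retry": retry,
                "Next": "DetectSensitiveData",
            },
            "DetectSensitiveData": {
                "Type": "Task",
                "Resource": detect_arn,
                "Retry": retry,
                "Next": "RedactDocument",
            },
            "RedactDocument": {
                "Type": "Task",
                "Resource": redact_arn,
                "Retry": retry,
                "End": True,
            },
        },
    }


# Placeholder ARNs; real ones would point at deployed Lambda functions.
definition = build_redaction_state_machine(
    "arn:aws:lambda:us-east-1:111122223333:function:extract-text",
    "arn:aws:lambda:us-east-1:111122223333:function:detect-sensitive",
    "arn:aws:lambda:us-east-1:111122223333:function:redact-document",
)
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON string is the kind of value passed as `definition` when creating a state machine. Keeping each stage in its own Lambda function lets a failed stage be retried without repeating the earlier ones.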
Source: the original article on the AWS ML Blog.