AWS ML Blog

Huntington Bank: Redacting sensitive data from 400M+ documents with AWS

7 min read
#deployment #enterprise #amazon
Level: Intermediate
For: AI Engineers

TL;DR

Huntington Bank used Amazon Textract, Amazon SageMaker, AWS Step Functions, and AWS Lambda to build a scalable redaction workflow, cutting the timeline for processing 400 million documents from years to months. The solution encrypted data at rest and in transit, met strict access requirements, and achieved redaction accuracy of 95% or higher, so Huntington could process the documents efficiently while complying with PCI DSS requirements. The approach is relevant to engineers building AI systems that need large-scale document processing and redaction.

⚡ Key Takeaways

  • Amazon Textract was used to detect sensitive data such as Social Security numbers, account numbers, and personal addresses in documents.
  • AWS Step Functions and AWS Lambda were used to orchestrate the document processing pipeline and improve accuracy in detecting sensitive information.
  • AWS DataSync and AWS Direct Connect were used to securely transfer more than 400 million documents from an on-premises file share to an Amazon S3 bucket.
  • The solution achieved redaction accuracy of 95% or higher, meeting compliance requirements.
  • Data was encrypted at rest and in transit, with AWS Key Management Service (AWS KMS) among the services used.

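To make the first takeaway more concrete, here is a minimal Python sketch of flagging sensitive values in extracted text. It assumes the input is a dictionary shaped like an Amazon Textract text-detection response (a `Blocks` list whose entries carry `BlockType`, `Text`, and a ratio-based `Geometry.BoundingBox`). The Social Security number regex and the helper names `find_redaction_boxes` and `to_pixels` are hypothetical; the summary does not describe the detection logic Huntington actually used, so treat this as an illustration only.

```python
import re

# Assumption: SSNs appear in the common 123-45-6789 form.
# A production detector would need far broader coverage.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_redaction_boxes(textract_response):
    """Return bounding boxes of WORD blocks whose text looks like an SSN."""
    boxes = []
    for block in textract_response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue
        if SSN_PATTERN.search(block.get("Text", "")):
            boxes.append(block["Geometry"]["BoundingBox"])
    return boxes


def to_pixels(box, page_width, page_height):
    """Convert a ratio-based box to (left, top, right, bottom) in pixels."""
    left = round(box["Left"] * page_width)
    top = round(box["Top"] * page_height)
    right = round((box["Left"] + box["Width"]) * page_width)
    bottom = round((box["Top"] + box["Height"]) * page_height)
    return left, top, right, bottom


# Hand-written sample in the Textract response shape (not real data).
sample_response = {
    "Blocks": [
        {
            "BlockType": "WORD",
            "Text": "SSN:",
            "Geometry": {"BoundingBox": {"Left": 0.25, "Top": 0.25,
                                         "Width": 0.125, "Height": 0.125}},
        },
        {
            "BlockType": "WORD",
            "Text": "123-45-6789",
            "Geometry": {"BoundingBox": {"Left": 0.5, "Top": 0.25,
                                         "Width": 0.25, "Height": 0.125}},
        },
    ]
}

boxes = find_redaction_boxes(sample_response)
pixel_boxes = [to_pixels(box, 1000, 800) for box in boxes]
```

The pixel rectangles could then be passed to whatever imaging library draws the redaction marks. Matching on `WORD` blocks keeps each box tight around the sensitive value rather than covering a whole line.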
💡 Why It Matters

The Huntington Bank case study shows how AWS services can be combined into a scalable, secure document processing pipeline, which matters for organizations that must process large volumes of sensitive documents while staying compliant with regulatory requirements. The approach can be applied to other industries that require large-scale document processing and redaction.

✅ Practical Steps

  1. Use Amazon Textract to detect sensitive data in documents and improve redaction accuracy.
  2. Design a scalable document processing pipeline using AWS Step Functions and AWS Lambda.
  3. Use AWS DataSync and AWS Direct Connect to securely transfer large volumes of documents to Amazon S3.
  4. Encrypt data at rest and in transit, using AWS Key Management Service (AWS KMS) to manage encryption keys.
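
Steps 2 and 4 above could be sketched roughly as follows. This is a minimal Python example under stated assumptions, not Huntington's actual design: the three-stage layout, the state names, the retry settings, the placeholder Lambda ARNs, and the helper names `build_redaction_state_machine` and `encrypted_put_kwargs` are all hypothetical. The first helper builds an Amazon States Language definition for AWS Step Functions; the second builds the arguments for an S3 `put_object` call that requests server-side encryption with a KMS key.

```python
import json


def build_redaction_state_machine(extract_arn, detect_arn, redact_arn):
    """Build an Amazon States Language definition chaining three Lambda tasks."""
    # Assumed retry policy: retry failed tasks with exponential backoff.
    retry = [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 2,
        "MaxAttempts": 3,
        "BackoffRate": 2.0,
    }]
    return {
        "Comment": "Hypothetical redaction pipeline sketch",
        "StartAt": "ExtractText",
        "States": {
            "ExtractText": {
                "Type": "Task",
                "Resource": extract_arn,
                "Retry": retry,
                "Next": "DetectSensitiveData",
            },
            "DetectSensitiveData": {
                "Type": "Task",
                "Resource": detect_arn,
                "Retry": retry,
                "Next": "RedactDocument",
            },
            "RedactDocument": {
                "Type": "Task",
                "Resource": redact_arn,
                "Retry": retry,
                "End": True,
            },
        },
    }


def encrypted_put_kwargs(bucket, key, body, kms_key_id):
    """Arguments for an S3 put_object call using SSE-KMS encryption at rest."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


# Placeholder ARNs and names for illustration only.
definition = build_redaction_state_machine(
    "arn:aws:lambda:us-east-1:111122223333:function:extract-text",
    "arn:aws:lambda:us-east-1:111122223333:function:detect-sensitive",
    "arn:aws:lambda:us-east-1:111122223333:function:redact-document",
)
definition_json = json.dumps(definition, indent=2)

put_kwargs = encrypted_put_kwargs(
    "example-redacted-docs", "batch-0001/doc-42.pdf", b"%PDF-", "alias/example-key"
)
```

With boto3 installed and credentials configured, `definition_json` could be supplied as the `definition` argument when creating a state machine, and `put_kwargs` could be unpacked into an S3 client's `put_object` call. Keeping both as plain dictionaries makes them easy to review and test before anything touches an AWS account.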
