Amazon

19 curated articles on Amazon for AI engineers

Build interactive PDF text extraction from Amazon S3

AWS ML Blog· 15 min read· Yesterday

This article presents an interactive PDF text extraction server for Amazon S3 that gives real-time access to the text inside PDFs without batch pipelines or heavy infrastructure. The solution uses a Model Context Protocol (MCP) server, which occupies a middle ground between custom scripts and batch pipelines and offers interactive access with minimal setup. The approach suits text-based PDFs in development and proof-of-concept settings; Amazon Textract is recommended for complex document processing. The practical implication for engineers building AI systems is that they can use this solution to provide on-demand access to text inside PDFs for compliance, legal, financial services, and executive teams.

The fuel of the future is already here: Why TRISO matters

Amazon Science· 5 min read· 3 days ago

Amazon is investing in next-generation nuclear technology, specifically tristructural isotropic (TRISO) fuel particles, to meet the rising energy demands of AI infrastructure and cloud computing. TRISO particles have a ceramic shell with three layers, providing exceptional mechanical integrity and thermal resilience, with a failure fraction of ≤ 6.6 × 10⁻⁵ at 1600°C. This technology offers greater flexibility in fuel form and reactor design, enabling new operational modes and potentially reducing waste. The practical implication for engineers building AI systems is the potential for more efficient and sustainable energy sources to power their infrastructure.

Graviton5’s improved design increases speed and energy efficiency — beyond Moore’s law

Amazon Science· 5 min read· Jun 10, 2026

The authors report a 25% performance improvement for general-purpose and agentic AI workloads with Graviton5, which combines a chiplet architecture, custom die-to-die connectivity, and support for DDR5-8800 memory and PCIe Gen6 interconnects, a gain they describe as going beyond Moore's law. The design enables faster and more energy-efficient processing for AI workloads. It is particularly beneficial for large-scale AI applications, where every percentage point of performance gain can significantly affect overall system efficiency.

Production-grade AI agents for financial compliance: Lessons from Stripe

AWS ML Blog· 16 min read· Yesterday

Stripe built a production-grade AI agent system on AWS using Amazon Bedrock, reducing review handling time by 26% while maintaining human oversight and achieving helpfulness ratings above 96%. The system, based on Stripe's ReAct agent framework, uses task decomposition, orchestration patterns, and cost optimization through prompt caching to scale compliance operations. This approach addresses the $206 billion global compliance burden by identifying 95% of card-testing attacks in real time and reducing unnecessary customer friction by 20%. The practical implication for engineers building AI systems is the importance of designing agentic systems that balance automation with human oversight and accountability.

Optimize model training on Amazon SageMaker AI with NVIDIA Blackwell

AWS ML Blog· 13 min read· 2 days ago

NVIDIA Blackwell GPUs on Amazon SageMaker AI ease common constraints on training large AI models, such as batch sizes limited by GPU memory and sequence lengths cut short to avoid out-of-memory errors. With Blackwell's expanded memory and new precision formats, users can train models with larger batch sizes, longer sequence lengths, and reduced model sharding, resulting in improved throughput and reduced communication overhead. PyTorch Fully Sharded Data Parallel (FSDP) and strategic application of activation checkpointing can further optimize training configurations. This leads to faster iteration cycles, less networking overhead, and lower infrastructure costs. By properly configuring Blackwell training jobs, users can process larger batch sizes without aggressive sharding and achieve better results for long-range dependencies…

Amazon Research Awards recipients announced

Amazon Science· 6 min read· May 27, 2026

The Amazon Research Awards (ARA) recipients have been announced, spanning 49 universities across 11 countries, with access to Amazon public datasets, AWS AI/ML services, and tools. This collaboration enables researchers to leverage Amazon's resources, accelerating AI/ML advancements. The recipients will utilize these resources to drive innovation and push the boundaries of AI research. The ARA program fosters a collaborative environment between academia and industry, promoting knowledge sharing and advancements in AI.

Implementing super resolution by deploying SeedVR2 on Amazon SageMaker AI

AWS ML Blog· 11 min read· 2 days ago

The SeedVR2 model, an open-source video restoration model developed by ByteDance's Seed team, can be deployed on Amazon SageMaker AI to upscale lower-resolution video content to higher resolutions. This approach provides a scalable solution for super resolution, analyzing visual information frame by frame to restore details and improve video quality. By using SageMaker's managed infrastructure, users can process video collections at scale while maintaining cost efficiency and performance. The solution uses a three-tier AWS architecture defined with AWS Cloud Development Kit (AWS CDK) for infrastructure as code. The practical implication for engineers building AI systems is the ability to implement video upscaling using SeedVR2 on SageMaker AI, enabling the restoration of historical footage, enhancement of subscriber experiences, and …

Build self-service AWS Health analytics to find actionable health insights with AI agents powered by Amazon Bedrock

AWS ML Blog· 23 min read· 2 days ago

The Chaplin solution utilizes AI agents powered by Amazon Bedrock and exposed through the Model Context Protocol (MCP) to provide self-service health event analytics for AWS Health notifications. This approach enables teams to ask questions in natural language and receive precise, contextualized answers without relying on AWS Support. With Chaplin, teams can identify actionable health insights, prioritize events, and make informed decisions. The practical implication for engineers building AI systems is that they can leverage Chaplin to streamline health event management and focus on innovation rather than reactive firefighting.

Building agentic AI applications with a modern data mesh strategy on AWS

AWS ML Blog· 22 min read· 2 days ago

Building agentic AI applications on a modern data mesh on AWS requires fine-grained access control enforced at every layer of the data interaction chain. The proposed architecture extends the original design with three key changes: replacing Amazon OpenSearch Serverless with Amazon S3 Vectors, replacing general-purpose Amazon S3 with Amazon S3 Tables governed by AWS Lake Formation, and exposing the data mesh as Model Context Protocol (MCP) tools through AgentCore Gateway with AWS Lambda-backed interceptors. This approach provides a secure, scalable data foundation for production agentic AI, reducing vector storage and query costs by up to 90% and increasing transactions per second by up to 10 times. The practical implication for engineers building AI systems is the ability to enforce fine-grained access control and provide a governed data mesh for agentic AI applications.

Huntington Bank: Redacting sensitive data from 400M+ documents with AWS

AWS ML Blog· 7 min read· 3 days ago

Huntington Bank utilized Amazon Textract, Amazon SageMaker, AWS Step Functions, and AWS Lambda to design a scalable redaction workflow, reducing the timeline for processing 400 million documents from years to months. The solution ensured data encryption at rest and in transit, met strict access requirements, and achieved redaction accuracy of 95% or higher. By leveraging AWS services, Huntington was able to efficiently process large volumes of documents while maintaining compliance with PCI DSS requirements. This approach has significant implications for engineers building AI systems that require large-scale document processing and redaction.

Build a healthcare appointment agent with Amazon Nova 2 Sonic

AWS ML Blog· 13 min read· 3 days ago

This article demonstrates how to build a healthcare appointment agent using Amazon Nova 2 Sonic and Amazon Bedrock AgentCore, achieving 90% accuracy in appointment reminder conversations and reducing manual data entry by 75%. The agent leverages voice authentication, appointment management, and pre-visit health information collection. This solution enables healthcare providers to streamline patient interactions and improve operational efficiency. The tradeoff is a potential increase in upfront development costs due to the need for custom voice models and integration with existing systems.

How Loka Built a Natural, Low-Latency Voice Agent with Amazon Nova 2 Sonic

AWS ML Blog· 11 min read· 3 days ago

Loka built a conversational AI agent with Amazon Nova 2 Sonic, achieving high speech reasoning accuracy and low latency, outperforming traditional voice AI pipelines. The native speech-to-speech model processed audio end-to-end, capturing tone, emotion, and subtle cues, and scored 87.0 on the Big Bench Audio benchmark. This approach solved the common frustration of robotic, slow voice assistants, delivering natural and responsive experiences. The practical implication for engineers building AI systems is that native speech-to-speech models can provide a better solution for voice AI adoption, with lower costs and faster response times.

9 ways AI is reshaping enterprise operations: Key insights from AWS Summit NYC

SiliconANGLE AI· 4 days ago

AWS Summit NYC 2026 highlighted the evolving role of AI in enterprise operations, which is shifting from experimentation to practical deployment. Key discussions centered on the use of physical robots and agentic systems to address labor shortages and reshape operations. The coverage does not cite specific numbers, model names, or benchmark results. The practical implication for engineers building AI systems is the increasing focus on deployment and real-world applications.

Build a protein research copilot with Amazon Bedrock AgentCore

AWS ML Blog· 15 min read· 4 days ago

This article presents a technical guide on building a protein research copilot using Amazon Bedrock AgentCore, which enables researchers to search for structurally similar peptides across large datasets using natural language queries. The system combines natural language query parsing, vector similarity search over protein embeddings, and AI-generated scientific summaries of search results. The copilot is built using the Strands Agents SDK and deployed to Amazon Bedrock AgentCore for production serving. The practical implication for engineers building AI systems is the ability to create conversational interfaces that can handle complex research workflows and provide accurate results.
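
The vector similarity search step mentioned above can be illustrated with a minimal sketch. This is not the article's implementation: the peptide names, the three-dimensional toy embeddings, and the `top_k_similar` helper are all hypothetical, and a production system would likely use embeddings from a protein model and a managed vector store rather than an in-memory dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(query, index, k=2):
    """Return the k (name, score) pairs in index most similar to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings (assumption); real ones have many more dimensions.
PEPTIDE_INDEX = {
    "peptide_a": [1.0, 0.0, 0.0],
    "peptide_b": [0.9, 0.1, 0.0],
    "peptide_c": [0.0, 1.0, 0.0],
}

results = top_k_similar([1.0, 0.05, 0.0], PEPTIDE_INDEX, k=2)
```

In a copilot of the kind described, a query embedding would presumably be derived from the parsed natural language request, and the ranked matches passed on to the summarization step.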

Embed the world: Multimodal AI for searchable aerial imagery at scale

AWS ML Blog· 25 min read· 5 days ago

The AWS Generative AI Innovation Center (GenAIIC) partnered with Vexcel to develop a multimodal AI system for searchable aerial imagery at scale, leveraging Amazon Bedrock and Amazon OpenSearch Serverless. The system uses multimodal embeddings, large language model (LLM) captioning, and vector search to enable natural-language-searchable knowledge bases. The evaluation methodology, built on OpenStreetMap ground truth, compared embedding models, fusion strategies, captioning, and search methods, with Amazon Nova Multimodal Embeddings delivering the highest F1 scores. This approach removes the per-feature training step, allowing for faster and more efficient semantic search. The practical implication for engineers building AI systems is the potential to apply this architecture to other domains, enabling faster and more efficient search capabilities.

Running ComfyUI workflows on Amazon SageMaker AI processing jobs

AWS ML Blog· 12 min read· 5 days ago

ComfyUI workflows can be deployed on Amazon SageMaker AI processing jobs to automate content generation at scale, allowing enterprises to generate hundreds of high-quality images in a single batch. This solution utilizes AWS Cloud Development Kit (AWS CDK) for infrastructure setup, GPU-accelerated processing, and automation of image generation. By leveraging ComfyUI and SageMaker, businesses can accelerate campaigns, boost conversions through personalization, and protect brand equity. The practical implication for engineers building AI systems is the ability to scale their creative pipeline and automate repetitive tasks, freeing creative teams to focus on high-impact strategy.

Introducing Web Search on Amazon Bedrock AgentCore

AWS ML Blog· 10 min read· Jun 19, 2026

Amazon Bedrock AgentCore now offers a fully managed web search capability, allowing AI agents to access up-to-date information from the web without infrastructure overhead. This feature, compatible with the Model Context Protocol (MCP), provides a purpose-built web index spanning tens of billions of documents, updated continually to reflect new content within minutes. The privacy model ensures that queries stay within AWS, and retrieval can combine a knowledge graph with semantic snippet extraction. This development has significant implications for engineers building AI systems, as it addresses the limitation of frozen knowledge at training time and enables agents to respond to real-time queries.

Accelerate campaign workflow with insights from Adobe Marketing Agent for Amazon Quick

AWS ML Blog· 14 min read· Jun 19, 2026

The Adobe Marketing Agent for Amazon Quick integration enables marketing teams to access campaign insights within governed conversations in seconds, using natural language to ask questions about campaign performance, audiences, and journeys. The integration is configured using the Model Context Protocol (MCP) and provides capabilities such as campaign review and monitoring, campaign planning, audience insights, journey insights, and journey conflict analysis. The solution applies governance controls, including least privilege, tenant isolation, and audit logging, to ensure secure and compliant data access. This integration has practical implications for engineers building AI systems, as it demonstrates the potential for AI-powered analysis and automation in marketing workflows.

Monitor and debug generative AI inference with SageMaker detailed metrics and Insights dashboard on CloudWatch

AWS ML Blog· 14 min read· Jun 18, 2026

Amazon SageMaker AI now provides detailed inference metrics and a SageMaker Insights dashboard in Amazon CloudWatch to monitor and debug generative AI inference endpoints. The dashboard supports both single-model endpoints (SME) and inference component (IC) endpoints, and provides over 100 metrics, including GPU health, token-level latency, and KV cache pressure. This allows machine learning platform engineers, MLOps teams, and site reliability engineers (SREs) to keep inference endpoints healthy, responsive, and cost-efficient. The practical implication for engineers building AI systems is that they can monitor and troubleshoot their generative AI inference endpoints more easily, reducing downtime and improving overall performance. The SageMaker Insights dashboard provides a fully managed observability solution, removing the need for custom Grafana dashboards and Prometheus configuration.
