Amazon

16 curated articles on Amazon for AI engineers

Introducing Gemma 4 models on Amazon Bedrock
AWS ML Blog· 22 min read· Today

The Gemma 4 family of open-weight models is now available on Amazon Bedrock in instruction-tuned variants with dense and mixture-of-experts architectures. Built by Google DeepMind, the models provide built-in reasoning, native function calling, and multimodal text and image input, with a focus on intelligence per parameter. Amazon Bedrock lets organizations use open-weight foundation models without compromising on data protection, regulatory alignment, or operational control. The family includes three variants (Gemma 4 31B, Gemma 4 26B-A4B, and Gemma 4 E2B), which can be used to build multimodal agents, lightweight applications, and document understanding pipelines.

Graviton5’s improved design increases speed and energy efficiency — beyond Moore’s law
Amazon Science· 5 min read· 5 days ago

The authors report a 25% performance improvement for general-purpose and agentic AI workloads on Graviton5, which they attribute to its chiplet architecture, custom die-to-die connectivity, and support for DDR5-8800 memory and PCIe Gen6 interconnects. They describe the gain as going beyond Moore's law. The design improves both speed and energy efficiency, which matters most for large-scale AI applications, where each percentage point of performance affects overall system efficiency.

Building Supercharger: How Rocket Close optimized title operations with agentic AI
AWS ML Blog· 10 min read· 3 days ago

Rocket Close built Supercharger, an agentic AI solution that combines title and closing knowledge to guide teams through the order processing workflow. It uses Strands Agents, large language models (LLMs), Amazon Bedrock, Amazon Bedrock Knowledge Bases, and Model Context Protocol (MCP) tools to centralize knowledge and automate research-heavy tasks. The result is less time spent searching for information, more efficient operations, and a better client experience. The architecture is designed with security in mind: Amazon Bedrock Guardrails and row-level data entitlements prevent accidental access to customer-sensitive data. For engineers building AI systems, the solution shows how agentic AI can streamline complex workflows.

Amazon Research Awards recipients announced
Amazon Science· 6 min read· May 27, 2026

The Amazon Research Awards (ARA) recipients have been announced, spanning 49 universities across 11 countries. Recipients get access to Amazon public datasets, AWS AI/ML services, and tools to support their research. The ARA program is a collaboration between academia and industry intended to promote knowledge sharing and advance AI/ML research.

Build a meeting prep and follow-up assistant with Amazon Quick and Cisco Webex MCP servers
AWS ML Blog· 15 min read· 3 days ago

This article shows how to integrate Amazon Quick with Cisco Webex MCP servers to build a custom meeting prep and follow-up assistant. From a single prompt, the assistant gathers information from prior meeting summaries, transcripts, and Vidcast highlights and produces a review of upcoming meetings. Integrating multiple services may, however, increase development time and introduce compatibility issues.

From PDFs to insights: Architecting an intelligent document processing pipeline with AWS generative AI services
AWS ML Blog· 14 min read· 3 days ago

This article presents a cost-effective, scalable intelligent document processing pipeline on AWS that uses Amazon Bedrock and Amazon Bedrock Data Automation (BDA) to automate insight extraction from documents. The pipeline is shown extracting key information from PDFs with a reported 95% accuracy rate. Its scalability and cost profile make it a good fit for organizations that need to process large document collections to support operations and decision-making.

Extract Data with On-demand and Batch Pipelines Dynamically
AWS ML Blog· 13 min read· 4 days ago

This article presents an intelligent document processing pipeline that uses both the on-demand and batch inference options on Amazon Bedrock, so that processing time and cost can be traded off per workload. Large language models and prompts can be specified dynamically at the document level, which allows data extraction from multiple types of documents. The on-demand pipeline processes documents one by one and returns results within seconds, while the batch pipeline processes multiple documents asynchronously. The pipeline uses Amazon SQS FIFO queues, AWS Lambda functions, and Amazon Bedrock Prompt Management to manage prompts and extract data. For engineers building AI systems, the takeaway is a design for flexible, cost-effective document processing that can handle large volumes of documents.
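The routing idea in this summary can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's implementation: the `DocumentJob` fields, the `needs_result_now` flag, the lane names, and the model and prompt identifiers are all hypothetical, and no AWS calls are made.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DocumentJob:
    """Hypothetical per-document request: each document names its own model and prompt."""
    document_id: str
    model_id: str           # assumed: model chosen at the document level
    prompt_id: str          # assumed: identifier of a managed prompt
    needs_result_now: bool  # assumed flag: True means the caller is waiting for the result


def route_jobs(jobs: List[DocumentJob]) -> Dict[str, List[DocumentJob]]:
    """Split jobs into an on-demand lane and a batch lane, preserving input order."""
    routed: Dict[str, List[DocumentJob]] = {"on_demand": [], "batch": []}
    for job in jobs:
        lane = "on_demand" if job.needs_result_now else "batch"
        routed[lane].append(job)
    return routed


jobs = [
    DocumentJob("invoice-001", "hypothetical-model-a", "invoice-prompt", True),
    DocumentJob("contract-002", "hypothetical-model-b", "contract-prompt", False),
    DocumentJob("invoice-003", "hypothetical-model-a", "invoice-prompt", False),
]
routed = route_jobs(jobs)
print([job.document_id for job in routed["on_demand"]])
print([job.document_id for job in routed["batch"]])
```

In a real deployment each lane would presumably feed a queue and a function that calls the corresponding Bedrock inference option; the sketch only shows the per-document decision.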

Optimize blueprint extraction accuracy in Amazon Bedrock Data Automation
AWS ML Blog· 15 min read· 4 days ago

Amazon Bedrock Data Automation's blueprint instruction optimization feature refines extraction instructions from a 10-example document input, improving blueprint extraction accuracy in minutes rather than weeks. Engineers can use it to make their data extraction pipelines more accurate and reliable, which is most useful for large-scale processing tasks where accuracy is critical.

How frontier teams are reinventing AI-native development
AWS ML Blog· 8 min read· 5 days ago

Frontier teams are treating AI as the foundation of how software is built, and report productivity gains of 4.5x to 10x. Amazon has identified three paths to AI-native development: a pathfinder initiative, a structured sprint, and an in-situ experiment, each of which led to significant increases in developer productivity and code quality. The pathfinder initiative, for example, achieved a 20x increase in individual developer productivity and delivered in 76 days a project originally estimated to take 30 developers 12 to 18 months. For engineers, the approach shifts the focus from discrete tasks to high-level goals and outcomes.

Stop hand-tuning kernels: How Neuron Agentic Development accelerates AWS Trainium optimizations
AWS ML Blog· 12 min read· 5 days ago

AWS has introduced Neuron Agentic Development, a collection of AI agents and skills that accelerates kernel development for AWS Trainium and AWS Inferentia and reduces the need for manual kernel tuning. With AI-driven optimization handling kernel tuning, developers can focus on higher-level tasks such as model development and deployment. The capabilities are designed to work with existing AWS Trainium and AWS Inferentia infrastructure.

Build an AI-Powered Equipment Repair Assistant Using Amazon Bedrock AgentCore
AWS ML Blog· 13 min read· 5 days ago

The authors demonstrate an AI-powered equipment repair assistant built with Amazon Bedrock AgentCore. The assistant uses natural language processing (NLP) to diagnose equipment issues, identify required parts, and provide manufacturer-approved repair procedures. It runs on AgentCore Runtime, a cloud-based service that integrates with Amazon SageMaker and other AWS services. The assistant processes user queries and generates relevant responses, reducing the time and effort required for equipment maintenance. The article presents this as an example of AI-powered tools improving agricultural productivity and efficiency.

Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
AWS ML Blog· 24 min read· 6 days ago

Running NVIDIA Isaac Lab on Amazon SageMaker AI scales robot reinforcement learning by providing managed infrastructure for distributed training and inference. Robotics teams can iterate quickly during research and run production-grade training jobs without the operational burden of maintaining compute clusters. Amazon SageMaker HyperPod provides cluster resiliency and control, while SageMaker Training Jobs offer a flexible compute option for shorter, iterative experiments. For engineers, this means more time spent developing robot policies and less spent managing infrastructure.

Hands-free first notice of loss: Using Strands Agents and Amazon Bedrock AgentCore Browser Tool for intelligent claims intake
AWS ML Blog· 22 min read· 6 days ago

The article presents a hands-free first notice of loss (FNOL) intake system that combines Strands Agents with the Amazon Bedrock AgentCore Browser Tool. The system uses domain reasoning and live portal interaction to automate repetitive tasks, leaving human experts free for work that needs their judgment. It reports a 30% reduction in manual data entry time and a 25% increase in accuracy. The approach applies to industries such as insurance and healthcare, where FNOL is a critical step in the claims process.

Build an agentic incident triage assistant with Amazon Quick and New Relic
AWS ML Blog· 10 min read· 6 days ago

Engineers can build an agentic incident triage assistant with Amazon Quick and New Relic, using New Relic's Model Context Protocol (MCP) Server to orchestrate the response. The assistant integrates with existing incident triage workflows and is reported to reduce mean time to detect (MTTD) and mean time to resolve (MTTR) by 30%. By drawing on historical data through the MCP Server, it can adapt to new patterns and triage incidents more accurately.

Unlocking AI flexibility in Europe: A guide to cross-region inference for EU data processing and model access
AWS ML Blog· 11 min read· Jun 8, 2026

Amazon Bedrock's Cross-Region Inference (CRIS) capability automatically routes model inference requests across multiple AWS Regions within predefined geographic boundaries, which makes generative AI applications more resilient. CRIS offers system-defined inference profiles with global or geographic scopes and optimizes model throughput with low latency overhead. For EU customers, geographic profiles help meet local data protection and processing requirements, including GDPR. Customers gain access to model availability and capacity across multiple Regions while keeping security and privacy controls in place.
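As a small sketch of how geographic scoping might surface in application code, the helpers below build and inspect a geography-prefixed inference profile identifier. The prefix convention (`eu.`, `us.`, `apac.` in front of a model ID) is treated here as an assumption, the model ID is hypothetical, and nothing in the block calls AWS.

```python
from typing import Optional

# Assumed set of geographic prefixes; check the Bedrock documentation for the current list.
KNOWN_GEO_PREFIXES = ("eu", "us", "apac")


def geo_profile_id(model_id: str, geo: str) -> str:
    """Return a geography-scoped profile ID such as 'eu.<model_id>' (assumed convention)."""
    if geo not in KNOWN_GEO_PREFIXES:
        raise ValueError(f"unknown geography prefix: {geo!r}")
    return f"{geo}.{model_id}"


def profile_geo(profile_id: str) -> Optional[str]:
    """Return the geography prefix of a profile ID, or None if it has no known prefix."""
    prefix, separator, _ = profile_id.partition(".")
    if separator and prefix in KNOWN_GEO_PREFIXES:
        return prefix
    return None


# Hypothetical model ID, used only for illustration.
model_id = "example-provider.example-model-v1:0"
eu_profile = geo_profile_id(model_id, "eu")
print(eu_profile)
print(profile_geo(eu_profile))
print(profile_geo(model_id))
```

A check like `profile_geo(...) == "eu"` could act as a guard in configuration validation, so that a workload meant to stay within the EU is never pointed at a profile with a different scope.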

It’s safe to close your laptop now: Hosting coding agents on Amazon Bedrock AgentCore
AWS ML Blog· 24 min read· Jun 8, 2026

Amazon Bedrock AgentCore Runtime runs multiple AI coding agents, such as Claude Code, Codex, Kiro, and Cursor, concurrently in isolated microVMs with persistent workspaces and secure tool access, so developers can close their laptops without interrupting the work. The solution provides built-in observability and removes the need to share secrets, ports, or filesystems. The result is a more efficient and secure way to run coding agents in parallel, at the cost of some additional overhead in resource allocation and management. To integrate it, developers can use the Amazon Bedrock AgentCore API and Gateway services.
