article thumbnail

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases

AWS Machine Learning

Solution overview Our solution implements a verified semantic cache using the Amazon Bedrock Knowledge Bases Retrieve API to reduce hallucinations in LLM responses while simultaneously improving latency and reducing costs. Lets assume that the question What date will AWS re:invent 2024 occur? is within the verified semantic cache.

article thumbnail

Unlock cost savings with the new scale down to zero feature in SageMaker Inference

AWS Machine Learning

Today at AWS re:Invent 2024, we are excited to announce a new feature for Amazon SageMaker inference endpoints: the ability to scale SageMaker inference endpoints to zero instances. You can retrieve the number of copies of an inference component at any time by making the DescribeInferenceComponent API call and checking the CurrentCopyCount.

APIs 110
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Your guide to generative AI and ML at AWS re:Invent 2024

AWS Machine Learning

Seamlessly bring your fine-tuned models into a fully managed, serverless environment, and use the Amazon Bedrock standardized API and features like Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases to accelerate generative AI application development.

APIs 110
article thumbnail

How AWS Sales uses generative AI to streamline account planning

AWS Machine Learning

In 2024, this activity took an account manager (AM) up to 40 hours per customer. To help improve this process, in October 2024 we launched an AI-powered account planning draft assistant for our sales teams, building on the success of Field Advisor , an internal sales assistant tool.

article thumbnail

Benchmarking Amazon Nova and GPT-4o models with FloTorch

AWS Machine Learning

OpenAI launched GPT-4o in May 2024, and Amazon introduced Amazon Nova models at AWS re:Invent in December 2024. Amazon Bedrock APIs make it straightforward to use Amazon Titan Text Embeddings V2 for embedding data. The growing need for cost-effective AI models The landscape of generative AI is rapidly evolving.

Benchmark 114
article thumbnail

GraphStorm 0.3: Scalable, multi-task learning on graphs with user-friendly APIs

AWS Machine Learning

adds new APIs to customize GraphStorm pipelines: you now only need 12 lines of code to implement a custom node classification training loop. To help you get started with the new API, we have published two Jupyter notebook examples: one for node classification, and one for a link prediction task. Specifically, GraphStorm 0.3

APIs 120
article thumbnail

From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 2

AWS Machine Learning

An alternative approach to routing is to use the native tool use capability (also known as function calling) available within the Bedrock Converse API. In this scenario, each category or data source would be defined as a ‘tool’ within the API, enabling the model to select and use these tools as needed.

APIs 120