2024, Accountability and Benchmark - Customer Contact Central

Benchmarking Amazon Nova and GPT-4o models with FloTorch

AWS Machine Learning

MARCH 11, 2025

Using its enterprise software, FloTorch conducted an extensive comparison between Amazon Nova models and OpenAI's GPT-4o models with the Comprehensive Retrieval Augmented Generation (CRAG) benchmark dataset. OpenAI launched GPT-4o in May 2024, and Amazon introduced Amazon Nova models at AWS re:Invent in December 2024.

Benchmark

Benchmark APIs Enterprise Scripts

Your guide to generative AI and ML at AWS re:Invent 2024

AWS Machine Learning

NOVEMBER 19, 2024

Learn how they created specialized agents for different tasks like account management, repos, pipeline management, and more to help their developers go faster. Explore the significant productivity gains and efficiency improvements achieved across the organization.

APIs

APIs Enterprise Best practices Government

Anthropic Claude 3.5 Sonnet ranks number 1 for business and finance in S&P AI Benchmarks by Kensho

AWS Machine Learning

JULY 9, 2024

Anthropic Claude 3.5 Sonnet currently ranks at the top of S&P AI Benchmarks by Kensho, which assesses large language models (LLMs) for finance and business. For example, there could be leakage of benchmark datasets’ questions and answers into training data. Kensho is the AI Innovation Hub for S&P Global.

Finance

Finance Benchmark industry standards Accountability

Totango recognized as CS leader in G2 Winter 2024 reports

Totango

DECEMBER 20, 2023

We are delighted to share that Totango was recognized in the G2 Winter 2024 reports for its exemplary performance in the customer success software category. Totango’s high score from G2 users not only underscores our efficient implementation but also sets a best-in-class benchmark for other software in the customer success category.

Customer centricity

Customer centricity Enterprise Benchmark Upselling

From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 2

AWS Machine Learning

NOVEMBER 15, 2024

The analyst may ask questions such as “Show me all wells that produced oil on June 1st 2024,” “What well produced the most oil in June 2024?”, or “Plot the monthly oil production for well XZY for 2024.” … has 92% accuracy on the HumanEval code benchmark. Each question requires different treatment, with varying complexity.

APIs

APIs Engineering Chatbots Construction

Expansion is a Team Sport: Strategies for Aligning CS and Sales in 2024

Totango

MAY 7, 2024

By our math, that means nine out of 10 revenue-accountable leaders need a hand getting on the same page as their fellow go-to-market (GTM) counterparts. That way, both teams can use those outcomes as a benchmark of success throughout the customer journey. This doesn’t just provide a useful shared goal.

Sales

Sales Upselling Metrics Accountability

Secure AccountantAI Chatbot: Lili’s journey with Amazon Bedrock

AWS Machine Learning

JULY 18, 2024

Small business proprietors tend to prioritize the operational aspects of their enterprises over administrative tasks, such as maintaining financial records and accounting. While hiring a professional accountant can provide valuable guidance and expertise, it can be cost-prohibitive for many small businesses.

Chatbots

Chatbots APIs Accountability Finance

Four key insights for SaaS and CS leaders from the 2024 B2B SaaS Benchmarking Survey

ChurnZero

MAY 2, 2024

The 2024 B2B SaaS Benchmarking Survey by SaaS Capital is the most comprehensive and up-to-date source of its kind for SaaS and customer success leaders who want to know where they stand compared to peers and competitors. We’re operating under the concept of rule of 60, with growth accounting for 20% and margins for 40%.

SaaS

SaaS Benchmark B2B Surveys

GraphStorm 0.3: Scalable, multi-task learning on graphs with user-friendly APIs

AWS Machine Learning

AUGUST 2, 2024

With GraphStorm, you can build solutions that directly take into account the structure of relationships or interactions between billions of entities, which are inherently embedded in most real-world data, including fraud detection scenarios, recommendations, community detection, and search/retrieval problems.

APIs

APIs Benchmark Construction Enterprise

Get started with Amazon Titan Text Embeddings V2: A new state-of-the-art embeddings model on Amazon Bedrock

AWS Machine Learning

MAY 2, 2024

We published a follow-up post on January 31, 2024, and provided code examples using AWS SDKs and LangChain, showcasing a Streamlit semantic search app. A common way to select an embedding model (or any model) is to look at public benchmarks; an accepted benchmark for measuring embedding quality is the MTEB leaderboard.

Benchmark

Benchmark Metrics Enterprise APIs

Mistral-Small-24B-Instruct-2501 is now available on SageMaker Jumpstart and Amazon Bedrock Marketplace

AWS Machine Learning

FEBRUARY 24, 2025

The 2501 version follows previous iterations (Mistral-Small-2409 and Mistral-Small-2402) released in 2024, incorporating improvements in instruction-following and reliability. The model is deployed in a secure AWS environment and under your VPC controls, helping to support data security for enterprise security needs.

APIs

APIs Enterprise Benchmark Feedback

Improve Amazon Nova migration performance with data-aware prompt optimization

AWS Machine Learning

APRIL 29, 2025

Since its launch in 2024, generative AI practitioners, including teams at Amazon, have started transitioning their workloads from existing FMs and adopting Amazon Nova models. It includes evaluating and benchmarking the old and new models, optimizing prompts on the new model, and testing and deploying the new models in production.

Metrics

Metrics Engineering Best practices Benchmark

Five essential truths for customer success leaders in mid-2024

ChurnZero

JUNE 12, 2024

Take a mid-2024 dive into these hot CS topics with ChurnZero CEO You Mon Tsang, who recently joined ESG’s Michael Harnum and Melissa Langworthy for an in-depth discussion on how to retain and grow customers this year and beyond.

Benchmark

Benchmark SaaS Management Accountability

What Do Customers Want From A Typical Customer Service Situation?

Ansafone

MAY 12, 2025

Zendesk’s 2024 CX Trends Report states that 70% of customers expect brands to collaborate internally so they don’t need to repeat themselves. Businesses should establish benchmarks and compare the performance across departments, regions, and service channels. Training also fosters a culture of growth and accountability.

Customer Service

Customer Service Interactive Voice Response Brand ambassadors Feedback

AWS empowers sales teams using generative AI solution built on Amazon Bedrock

AWS Machine Learning

AUGUST 26, 2024

Our field organization includes customer-facing teams (account managers, solutions architects, specialists) and internal support functions (sales operations). Personalized content will be generated at every step, and collaboration within account teams will be seamless with a complete, up-to-date view of the customer.

Sales

Sales Accountability Feedback Metrics

How Mixbook used generative AI to offer personalized photo book experiences

AWS Machine Learning

JULY 15, 2024

The buffer was implemented after benchmarking the captioning model’s performance. The benchmarking revealed that the model performed optimally when processing batches of images, but underperformed when analyzing individual images. AWS enables us to scale the innovations our customers love most.

Personalization

Personalization Engineering Benchmark Analytics

Data Governance in the Age of AI: A Competitive Edge for Business Leaders

COPC

OCTOBER 8, 2024

In 2024 alone, 11x more AI models were put into production than last year, showing a clear shift from experimentation to real-world application. These five pillars will help you maximize your return on investment (ROI): Quality Standards: Set high benchmarks for accuracy and consistency.

Government

Government Healthcare Benchmark Technology

Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI

AWS Machine Learning

FEBRUARY 12, 2025

Prerequisites: To build the solution yourself, you need an AWS account with an AWS Identity and Access Management (IAM) role that has permissions to manage resources created as part of the solution (for example, AmazonSageMakerFullAccess and AmazonS3FullAccess).

Scripts

Scripts Metrics Engineering Accountability

Best practices for building robust generative AI applications with Amazon Bedrock Agents – Part 1

AWS Machine Learning

OCTOBER 2, 2024

Laying the groundwork: Collecting ground truth data. The foundation of any successful agent is high-quality ground truth data, the accurate, real-world observations used as reference for benchmarks and evaluating the performance of a model, algorithm, or system.

Best practices

Best practices APIs Metrics Accountability

Mixtral 8x22B is now available in Amazon SageMaker JumpStart

AWS Machine Learning

MAY 17, 2024

What is Mixtral 8x22B? Mixtral 8x22B is Mistral AI’s latest open-weights model and sets a new standard for performance and efficiency of available foundation models, as measured by Mistral AI across standard industry benchmarks. … making the model available for exploring, testing, and deploying.

Benchmark

Benchmark APIs Personalization Enterprise

A review of purpose-built accelerators for financial services

AWS Machine Learning

SEPTEMBER 11, 2024

In March 2024, AWS announced it will offer the new NVIDIA Blackwell platform, featuring the new GB200 Grace Blackwell chip. Both Inferentia2 and Trainium use the same basic components, but with differing layouts, accounting for the different workloads they are designed to support.

Benchmark

Benchmark Banking Analytics Big data

The executive’s guide to generative AI for sustainability

AWS Machine Learning

APRIL 22, 2024

Figure 1: Examples of generative AI for sustainability use cases across the value chain. According to KPMG’s 2024 ESG Organization Survey, investment in ESG capabilities is another top priority for executives as organizations face increasing regulatory pressure to disclose information about ESG impacts, risks, and opportunities.

Best practices

Best practices Benchmark Transportation Engineering

Evaluate conversational AI agents with Amazon Bedrock

AWS Machine Learning

JULY 25, 2024

Although existing large language model (LLM) benchmarks like MT-bench evaluate model capabilities, they lack the ability to validate the application layers. Begin by creating and testing the agent in your development account. You want to make sure the agent provides correct and reliable information about existing claims to end-users.

APIs

APIs Engineering Best practices Virtual Agent

Build a RAG-based QnA application using Llama3 models from SageMaker JumpStart

AWS Machine Learning

SEPTEMBER 12, 2024

On Hugging Face, the Massive Text Embedding Benchmark (MTEB) is provided as a leaderboard for diverse text embedding tasks. It currently provides 129 benchmarking datasets across 8 different tasks on 113 languages. She helps key enterprise customer accounts on their data, generative AI and AI/ML journeys.

APIs

APIs Benchmark Enterprise Construction

How to Solve the 2025 Customer Retention Problem

The Petrova Experience

JANUARY 14, 2025

In 2024 alone, hospitality saw a 20% decrease in already troubling customer retention rates. But larger trends like these do not account for all customer attrition. Think about the impact on customer retention of benchmarking performance, processes, procedures, and even product and experience design and development.

Customer retention

Customer retention Feedback B2C Metrics

How Clearwater Analytics is revolutionizing investment management with generative AI and Amazon SageMaker JumpStart

AWS Machine Learning

DECEMBER 13, 2024

trillion in assets across thousands of accounts worldwide. As of September 2024, the AI solution supports three core applications: Clearwater Intelligent Console (CWIC): Clearwater’s customer-facing AI application. Multiple investment policies can be applied to individual accounts and aggregates of accounts.

Analytics

Analytics Management Accountability Engineering

Advancing Cybersecurity for Fraud Detection With AI

24-7 InTouch

MARCH 6, 2024

In 2024, the fintech industry finds itself at the forefront of technological evolution, driven by the unprecedented surge in digital transactions. Transparency in AI algorithms and governance mechanisms ensures accountability, promoting trust among your users.

Government

Government Benchmark Banking Accountability

Philips accelerates development of AI-enabled healthcare solutions with an MLOps platform built on Amazon SageMaker

AWS Machine Learning

NOVEMBER 16, 2023

Enable a data science team to manage a family of classic ML models for benchmarking statistics across multiple medical units. These capabilities are essential for demonstrating compliance with regulatory standards and ensuring transparency and accountability in AI/ML workflows.

Healthcare

Healthcare Government Engineering APIs

Rocketlane Raises $24M in Series B to Create a Post-sales CRM for Professional Services Teams

CSM Magazine

JUNE 25, 2024

Rocketlane helps SaaS providers deliver an accelerated onboarding journey that holds customers accountable, shortens time-to-value, and helps begin expansion conversations earlier (which increases Net Revenue Retention), right from the first touchpoint. This growth follows another significant milestone for the company.

Sales

Sales CRM SaaS Enterprise

Medallia vs. Qualtrics vs. Lumoa: A Buyer’s Guide

Lumoa

APRIL 28, 2024

More than 80% of business leaders see customer experience as a growing priority in 2024. Every interaction with your business has the power to build or damage the relationship with your customer – and can directly impact your revenue. 78% of customers have backed out of a purchase due to a poor customer experience (CX).

Feedback

Feedback Consulting Consulting Enterprise

The Pros and Cons of Standardized Testing in Higher Education

Comm100

AUGUST 21, 2023

According to FairTest’s website, there are currently 1,900 colleges in the US that offer test-optional or test-blind policies for students seeking to enroll in Fall 2024 or beyond. Objective benchmark: Standardized tests provide schools with a consistent measure that can be compared across all students.

Education

Education Benchmark Analytics Surveys

Experts Reveal: The Art of Managing Key Customers

SmartKarrot

JANUARY 30, 2024

Using the OGSM framework, QBRs focus on quantifiable benchmarks, indicating whether the client is on the right track to achieving their goals. Customer success managers will handle a broader range of accounts with a focus on scaled activities. Predictive analytics will aid in understanding usage patterns and goal alignment.

Management

Management Enterprise industry standards Analytics

Enable data sharing through federated learning: A policy approach for chief digital officers

AWS Machine Learning

MARCH 15, 2024

Policies and regulations like the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and the California Consumer Privacy Act (CCPA) put guardrails on sharing data from the medical domain, especially patient data.

Healthcare

Healthcare Government Best practices Engineering

Pixtral 12B is now available on Amazon SageMaker JumpStart

AWS Machine Learning

DECEMBER 10, 2024

Pixtral 12B overview: Pixtral 12B represents Mistral’s first VLM and demonstrates strong performance across various benchmarks, outperforming other open models and matching larger models, according to Mistral. Also, make sure that you have the account-level service limit for using ml.p4d.24xlarge or ml.p4de.24xlarge

Benchmark

Benchmark Sales Accountability Engineering

How Cato Networks uses Amazon Bedrock to transform free text search into structured GraphQL queries

AWS Machine Learning

JANUARY 22, 2025

On our SASE management console, the central events page provides a comprehensive view of the events occurring on a specific account. 2024-10-{01/00:00:00--02/00:00:00}. To validate this approach, we’ve created a benchmark with hundreds of text queries and their corresponding expected JSON outputs.

APIs

APIs Benchmark Engineering Enterprise

Amazon Bedrock Custom Model Import now generally available

AWS Machine Learning

OCTOBER 21, 2024

models demonstrate state-of-the-art performance on a wide range of industry benchmarks and introduce features to help you build a new generation of AI experiences. The maximum concurrency that you can expect for each model will be 16 per account. The default import quota for each account is three models.

APIs

APIs Scripts Finance Real estate

Llama 4 family of models from Meta are now available in SageMaker JumpStart

AWS Machine Learning

APRIL 7, 2025

Prerequisites: To try the Llama 4 models in SageMaker JumpStart, you need the following prerequisites: an AWS account that will contain all your AWS resources.

Engineering

Engineering Marketing Personalization Metrics

Reducing Call Abandonment Rate by 78% Using Purpose-Built Technology

CSM Magazine

APRIL 17, 2025

In a matter of months, the call abandonment rate dropped to consistently less than 3%, two times better than the industry benchmark. Good People, Held Back by Inefficiencies: Caduceus Health’s call center department handles more than 30,000 calls per month related to patient scheduling, account inquiries, and billing.

Abandon rate

Abandon rate Technology Employee engagement Call Center

Improve factual consistency with LLM Debates

AWS Machine Learning

NOVEMBER 22, 2024

This post and the subsequent code implementation were inspired by one of the International Conference on Machine Learning (ICML) 2024 best papers on LLM debates, “Debating with More Persuasive LLMs Leads to More Truthful Answers.” It uses a different dataset, TofuEval. Details on the exact dataset can be found in the GitHub repository.

Consulting

Consulting Consulting APIs Calibration

Evaluate models or RAG systems using Amazon Bedrock Evaluations – Now generally available

AWS Machine Learning

APRIL 4, 2025

When we launched LLM-as-a-judge (LLMaJ) and Retrieval Augmented Generation (RAG) evaluation capabilities in public preview at AWS re:Invent 2024, customers used them to assess their foundation models (FMs) and generative AI applications, but asked for more flexibility beyond Amazon Bedrock models and knowledge bases.

Metrics

Metrics APIs Engineering Benchmark
