
From finance to healthcare to consumer tech, enterprises are racing to move beyond pilots and bring Generative AI into production at scale. But doing so raises hard questions: How do you keep systems responsive under massive loads? How do you balance latency, cost, and compliance? And how do you turn experimentation into reliable, business-ready outcomes?
These were the issues that shaped an exclusive closed-door roundtable hosted by YourStory in partnership with Redis and AWS at Conrad Bengaluru. Titled “The Responsiveness Mandate: Scaling GenAI with Redis and AWS,” the session combined a guided coffee cupping exercise with a candid peer-to-peer discussion among 13 senior technology leaders.
In the room, representing some of India’s most innovative enterprises, were Sumeet Singhal (Visa), Prabhash Mishra (Tata Digital), Debayan Bose (Redbus), Charu Venkatraman (Automation Anywhere), Karthik Jaya (Signify), Ankit Shrivastava (Porter), Gaurav Konar (Glance), Anshul Sharma (Scaler), Karthikeyan Ramasamy (Freshworks), Chaitanya Bharadwaj (Apollo 24|7), Narayan Babu (Zeta), Nihal Singh (Redis), Jitender Singh Gahlot (Redis) and Avinash Venkatagiri (AWS). The discussion was steered by Shivani Muthanna, Director of Strategic Partnerships and Content at YourStory Media, who moderated the dialogue across five interlinked themes of GenAI adoption — trust, infrastructure, economics, responsiveness, and personalization.
Brewing parallels: Coffee and code
The evening began with a guided coffee cupping session, where participants sampled light, medium, and dark roasts while reflecting on attributes like freshness, latency, and complexity. The exercise provided a metaphor that would frame the rest of the discussion. Just as a coffee’s character depends on the right blend of beans, roast, and brewing method, scaling GenAI requires balancing infrastructure, performance, cost, and compliance.
From pilots to production
When asked where they were in their AI journey, not a single leader described themselves as still experimenting. GenAI had already moved from sandbox to production across the board: clinical AI in healthcare, copilots for logistics, real-time personalization in consumer apps, and workflow automation in SaaS.
But enthusiasm was tempered by pragmatism. Panelists agreed that while foundational models grab the headlines, the real work lies in operationalizing AI: handling data governance, scaling infrastructure, managing costs, and integrating GenAI into business-critical workflows. The consensus was clear — experimentation is easy, but production demands a different mindset.
Building trust into adoption
Despite the momentum, adoption is not frictionless. Voices around the table pointed to two recurring barriers: employee usability and customer trust. Business teams often struggle with prompt engineering, while end users demand transparency and guardrails before they rely on AI-driven recommendations.
The group agreed that trust cannot be retrofitted. It must be baked into system design through explainability, careful rollouts, and strict safeguards, especially in regulated industries like finance and healthcare. Education and change management were highlighted as equally important levers.
Responsiveness as a design principle
Responsiveness — speed, accuracy, and context-awareness — emerged as the central theme of the evening. In consumer environments like e-commerce or ticketing, even a few milliseconds of latency can break conversion. In healthcare, on the other hand, panelists noted that accuracy often outweighs speed. The challenge is designing systems that adapt to these different thresholds of responsiveness.
What became clear is that responsiveness is not just a front-end concern but an architectural principle. It depends on embeddings, vector search, caching, state management, and orchestration of multiple models. This is where Redis Enterprise and AWS were seen as enablers, providing the low-latency data access and scalable infrastructure required to deliver consistent, real-time AI experiences.
Engineering responsiveness into scale
Examples from across industries highlighted how responsiveness is being embedded into enterprise systems. Leaders discussed agentic AI, where systems can troubleshoot, diagnose, and remediate issues in seconds instead of hours. But most agreed that pure LLMs are not ready for unfettered use. Instead, hybrid architectures — combining deterministic models with generative reasoning — are proving more reliable.
Security and compliance added another layer of complexity. Fintech and healthcare leaders in particular voiced concerns about sensitive data persisting unintentionally. The shared view: AI must be context-aware without becoming intrusive.
The economics of GenAI
Ultimately, scaling GenAI hinges on economics. B2C leaders spoke about the pressure of running GPU-intensive workloads for millions of users, where cost and latency are directly tied to business viability. The challenge is to deliver experiences at a fraction of today’s cost without compromising performance.
B2B companies, meanwhile, face the complexity of global deployments: data residency requirements, local regulations, and compliance frameworks. Even when technology is ready, these factors slow down enterprise-scale adoption.
The discussion also surfaced a notable shift: users are increasingly willing to pay for AI-driven services. Subscription models priced at $20–$30 a month are gaining traction, replacing the once-dominant “free tier” culture. Leaders noted that this willingness to pay could fundamentally change product strategies, pushing companies to focus on high-value, monetizable use cases rather than scattered experimentation.
Mission-critical workflows and the edge
In sectors like logistics, automation, and ride-hailing, responsiveness is mission-critical. Here, panelists emphasized that not every problem needs a large model. Often, cached decisions, optimized smaller models, or rule-based systems deliver results in milliseconds. The skill lies in knowing when to invoke heavyweight LLMs and when to lean on leaner approaches.
On personalization, leaders converged on a pragmatic lens: AI should be deployed where it creates measurable value. Edtech, healthcare, and consumer internet companies shared examples of AI-driven interviews, adaptive learning, and layered CX personalization that users perceive as worth paying for. The message was clear — personalization must be purposeful, not performative.
Blending the perfect cup
The evening closed with a collaborative activity where participants designed their own coffee blends. Much like the cupping session that opened the evening, the exercise underscored the metaphor that ran through the discussion: great outcomes come from the right blend. For GenAI, that means balancing responsiveness, cost, compliance, and user value in equal measure.
The overarching consensus was that the future of GenAI will be defined less by the models themselves and more by the responsiveness of the systems that deliver them. Enterprises that succeed will be those that engineer responsiveness as a principle — from architecture to operations — and align it with sustainable business models.
Redis and AWS, with their combined strengths in real-time data, low-latency infrastructure, and scalable cloud platforms, are positioning themselves as key partners in that journey. But as the voices in the room agreed, the real differentiator lies in how carefully enterprises blend these capabilities into systems and workflows where every millisecond, every decision, and every dollar matters.

