As AI technologies mature, organizations are moving beyond simple chat interfaces to build more sophisticated, context-aware products. Hilary Mason's insights reveal the challenges and opportunities in this next generation of AI applications.
AI products have evolved from early chat interfaces toward more complex, context-aware systems. According to Hilary Mason, co-founder and CEO of Hidden Door, we're in a moment of chaos: everyone is talking about AI, but few have a clear understanding of what makes an AI product effective.
The Shift in AI Product Development
Mason traces her journey from academia to building AI products at scale, highlighting the fundamental shift from a discrete engineering mindset to a probabilistic one. This transition requires technologists to embrace uncertainty and build systems that can handle the unpredictability of AI output.
"The technology, while it can be understood mathematically and technically, is not well understood in the market," Mason explains. "You have a lot of people learning about it in ways that are not giving them robust mental models for making good decisions."
Understanding LLMs: Beyond the Hype
Contrary to much of the marketing hype, Mason presents a more grounded perspective on Large Language Models (LLMs). "LLMs are aspirationally an engine for producing generally mid content," she states. "That is what they are designed for at best. They take everything in the data and they make something that looks like what is most common in that data."
This perspective challenges the common narrative that positions LLMs as infallible sources of truth. Instead, Mason frames them as probabilistic systems that require careful management and appropriate expectations.
The Challenge of Context Management
One of the most significant challenges in building effective AI products is context management. This involves determining what information to use for specific tasks, which models to employ, and how to verify and integrate responses.
"The challenge right at this moment for people building these systems is context management," Mason emphasizes. "What I mean by that is really, what information are you using for what task and what purpose? What model are you hitting with it? Where does the response come out? What format is it? How is that verified?"
This complexity helps explain why many AI products today have poor user experiences. Mason illustrates the point with her own experience using Google's Gemini: even sophisticated AI systems can produce confusing interactions when context management is poorly implemented.
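One way to make these questions concrete is to write them down as a data structure that every model call must fill in. The following is a minimal sketch, not Hidden Door's implementation: `ContextPlan`, `run`, `fake_model`, `is_json_object`, and the model name `"small-model"` are all hypothetical names introduced for illustration, and the model client is a stub.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContextPlan:
    """One model call, described by the questions Mason lists (hypothetical structure)."""
    task: str                      # what task and purpose?
    information: list[str]         # what information is being used?
    model: str                     # which model is being hit? (illustrative name)
    output_format: str             # what format should the response have?
    verify: Callable[[str], bool]  # how is the response verified?


def is_json_object(text: str) -> bool:
    """Verification step: accept only responses that parse as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def run(plan: ContextPlan, call_model: Callable[[str, str], str]) -> str:
    """Build the prompt from the plan, call the model, and reject unverified output."""
    prompt = f"Task: {plan.task}\nFormat: {plan.output_format}\n" + "\n".join(plan.information)
    response = call_model(plan.model, prompt)
    if not plan.verify(response):
        raise ValueError(f"response failed verification for task {plan.task!r}")
    return response


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real model client; always returns a fixed JSON string."""
    return '{"summary": "ok"}'


plan = ContextPlan(
    task="summarize the scene",
    information=["The player entered the library."],
    model="small-model",
    output_format="JSON object with a 'summary' key",
    verify=is_json_object,
)
print(run(plan, fake_model))
```

Forcing each call through a structure like this keeps the information, model choice, output format, and verification step explicit instead of scattered across prompt strings.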
Beyond Chat: The Search for Better Interfaces
Mason argues that chat interfaces, despite their popularity, are a poor fit for many AI applications.
"Chat is a terrible interface for software," Mason asserts. "I am not sure it is a good interface for interpersonal communication either. We have ended up in a moment where because of the wild success of ChatGPT, which the model existed for some time before the product was built around it, everyone who is figuring out an AI product is starting with chat."
Her company, Hidden Door, has taken a different approach in its story-based games, offering users example actions rather than relying solely on text input. The design shows that different applications may need fundamentally different interfaces.
Evaluating AI Products: New Metrics Needed
Traditional metrics for software evaluation often don't translate well to AI products. Mason questions whether metrics like "number of chat messages" are meaningful indicators of success.
"Everything we have learned in machine learning, precision, recall, accuracy in thinking about product analytics like daily active users and how many interactions does someone have? We have to question all of these metrics for evaluation as well right now," she explains.
Mason's account of her son using ChatGPT at the Met Museum illustrates the point. The AI translated the hieroglyphics incorrectly, but it still served its purpose in that moment by sparking curiosity and learning. "Accuracy didn't matter. Directional accuracy mattered because it gave us something to learn and get excited about," Mason notes.
Cost Considerations and Architectural Approaches
Many organizations struggle with the operational costs of AI systems, particularly when using expensive proprietary models. Mason suggests several approaches to mitigate these costs:
- Turning generation problems into ranking problems: "Can you pre-generate almost everything you need and then use embeddings to say, this is the thing that is most relevant to the thing we're looking for?"
- Component-based architectures: Hidden Door's system was originally built to run without LLMs and keeps them removable, which allows flexibility and cost control. "Originally, our whole system could run without LLMs. We do use them in the critical flow at this point, but we could still yank them out if we wanted to."
- Simple, targeted solutions: The company filters inappropriate content with a database of 40,000 English-language words and phrases with metadata, an approach that is both effective and inexpensive.
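The first of these approaches, turning generation into ranking, can be sketched in a few lines. This is an illustration under stated assumptions rather than Hidden Door's actual system: the candidate texts are invented, and `embed` is a toy word-count vector standing in for a learned embedding model. Only the overall shape (pre-generate, embed once, rank by similarity at request time) comes from Mason's description.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Generated ahead of time, so nothing is generated per request (invented examples).
CANDIDATES = [
    "You open the creaking door and step into the library.",
    "You draw your sword and face the dragon.",
    "You ask the innkeeper about the missing map.",
]
# Embedded once, offline.
CANDIDATE_VECTORS = [embed(c) for c in CANDIDATES]


def most_relevant(query: str) -> str:
    """Rank the pre-generated candidates against the query and return the best match."""
    q = embed(query)
    scores = [cosine(q, v) for v in CANDIDATE_VECTORS]
    return CANDIDATES[scores.index(max(scores))]


print(most_relevant("fight the dragon with a sword"))
```

At request time the only work is one embedding and a similarity comparison, which is typically far cheaper than a call to a large generative model. The trade-off is that responses are limited to what was pre-generated.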
The Evolving Role of Engineers
Mason identifies an "existential crisis" in engineering as AI tools change what it means to be a valuable technologist. "Good judgment, systems thinking and design has never been more important than in a world where you can have 12 things implemented and they can all be crap," she argues.
The key differentiator for engineers moving forward will be the ability to:
- Form good questions
- Recognize quality solutions
- Understand systems in context
- Make appropriate judgment calls
This represents a fundamental shift from syntax knowledge to systems thinking and design acumen.
Business Implications and New Opportunities
The evolution of AI products is creating new business models and opportunities. Hidden Door is one example: it offers a platform where users can role-play in fictional worlds created by authors, with revenue shared back to the creators.
"It has never been a better time to build something yourself," Mason suggests. "If you need to hear it, you can hear it, because anyone who can think in those complex systems ways right now has a really big opportunity space. There's so much unexplored surface area, new business models, new product experiences."
Conclusion: Navigating the AI Product Landscape
The next generation of AI products requires organizations to move beyond the hype and develop more nuanced approaches. This involves:
- Understanding the limitations of current AI technologies
- Developing better interfaces than simple chat
- Implementing robust context management systems
- Creating appropriate evaluation metrics for AI products
- Designing flexible architectures that can adapt to changing technologies
- Cultivating engineering talent focused on systems thinking and judgment
As Mason concludes, "It is still super cool. The idea is saying we need to be able to give people trust, autonomy, that they can handle that complexity, and we need to be really rigorous in our process for how we manage our machines."
For organizations developing next-generation AI products, success will come not from simply adopting the latest AI technologies, but from thoughtfully integrating them into broader systems that serve human needs and create genuine value.
