-
Beyond Automation — The Case for AI Augmentation
The really transformative interfaces won't be the ones that make us more productive; they'll be the ones that make us more thoughtful, more creative, more aware of our own cognitive patterns. Like mirrors for our minds, showing us our blind spots and suggesting perspectives we habitually miss.
-
Rethinking Generation & Reasoning Evaluation in Dialogue AI Systems
As we rely more on (and reap the benefits of) LLMs’ reasoning abilities in AI systems and products, how can we still get a sense of how LLMs “think”? Where steerability is concerned, users or developers may want to add custom handling logic and instructions; how can we ensure that these models continue to follow and reason from those instructions towards a desirable output?
-
Concepts for Reliability of LLMs in Production
In replacing traditional NLP models with LLM APIs, we trade controllability for flexibility, generalizability, and ease of use. How might we de-risk our ML systems and safeguard GenAI-enabled features in production?
-
Designing Human-in-the-Loop ML Systems
As machine learning practitioners, we constantly strive to produce the highest-performing models to achieve the best business outcomes. But model development is only the tip of the iceberg; how well an ML solution performs has to be continuously evaluated on live predictions. When using trained models, we subtly invoke an assumption: that the training data distribution sufficiently approximates the distribution of the unseen data. Unfortunately, this does not always hold.
-
Learning Bayesian Hierarchical Modeling from 8 Schools
A walkthrough of a classic Bayesian problem.