Scaling AI Operations
Architecting a high-efficiency internal annotation ecosystem to train advanced Object Detection and Vision-Language Models for complex UI pattern recognition.
* Disclaimer *
Due to the confidentiality of this project, the extent of work presented on this page has been limited in accordance with a non-disclosure agreement. All information in this case study reflects my contributions and does not necessarily reflect the views of the organization.
MY ROLE
User Experience Designer, AI Trainer
TEAM
AI Data Labelling Team, Machine Learning Engineers
SCOPE
Internal Labelling Tool deployment, AI training strategy refinement, and labeller workflow management.
TIME
2025-2026
OVERVIEW
Mobbin is a curated, web-based library of real-world mobile and web app screenshots designed for UI/UX designers to find design inspiration, analyze user flows, and identify industry best practices. To process, categorize, and tag thousands of complex mobile and web interfaces at scale, the platform relies heavily on advanced AI models. However, an AI is only as intelligent as the data it is trained on.
This project focused on the complete operational and experiential overhaul of Mobbin’s internal AI Labelling Tool and the strategic management of the human labelling team. By serving as the critical bridge between human annotators and AI engineers, I redesigned the internal tooling ecosystem to drastically improve the speed, accuracy, and contextual depth of our machine learning training pipeline.
OBJECTIVE
To optimize the internal AI training pipeline by resolving severe operational friction within the human labelling team, overhauling the labelling tool interface for maximum annotation throughput, and enhancing the contextual accuracy of the underlying AI models (transitioning from basic Object Detection to Vision-Language Models).
CHALLENGE
The human-in-the-loop (HITL) training ecosystem was fracturing under the weight of scaling operations, manifesting in three critical areas.
1. Operational Burnout & Labeller Friction
The human annotation team was experiencing severe operational bottlenecks and plummeting morale. This was driven by an opaque quota tracking system, confusing and often punitive rejection metrics, and deeply fragmented, hard-to-access documentation.
2. Technical Interface Bottlenecks
The legacy labelling interface was clunky and plagued with technical bugs. The UI actively slowed down the intricate work of drawing bounding boxes and applying metadata to complex screen architectures.
3. AI Model Accuracy Limits
The existing Object Detection Model (ODM) had hit a ceiling in contextual understanding. It consistently failed to differentiate visually similar but functionally distinct elements—for example, struggling to accurately distinguish a promotional "Banner" from a system "Badge" without understanding the surrounding context.
THE APPROACH
To build a smarter AI, we first had to build a flawless operational environment for the humans training it. I structured this transformation into three strategic phases.
1. Establishing the Baseline
Operational & UX Auditing
Before touching the interface, I conducted deep-dive qualitative and quantitative research directly with the remote labelling team to uncover the root causes of their operational slowdowns.
Workflow Deconstruction
I shadowed annotators to map their exact end-to-end journey. I uncovered that labellers were spending excessive time context-switching away from the tool to reference massive, disconnected documentation sheets whenever they encountered ambiguous UI patterns.
Quota & Rejection Analysis
I audited the backend operational metrics, identifying that the current system for tracking daily quotas and managing task rejections was fundamentally broken. It lacked transparency, leading to extreme frustration and a high rate of defensive, low-quality labelling just to meet targets.
2. Architecting the Solution
Redesigning the Labelling Tool
I completely overhauled the AI Labelling Tool to prioritize high-speed, high-accuracy human annotations while removing systemic operational blockers.
In-Context Documentation Integration
I designed a dynamic, intelligent workspace that embedded contextual guidelines directly into the new labelling tool interface. When a labeller hovered over or selected a complex tag (like "Banner"), the tool instantly surfaced visual examples and exact definitions, entirely eliminating the need for context-switching and drastically reducing human error.
Transparent Feedback Loops
I redesigned the quota tracking and QA rejection mechanisms. I shifted the system from a punitive model to an educational one—providing labellers with transparent dashboards, clear pathways to dispute or learn from rejections, and realistic, easily trackable daily quotas.
3. Enhancing AI Context
The VLM Integration
With the human operational pipeline stabilized, I shifted focus to the technical output, acting as the strategic liaison between the labelling operations and the AI engineering team.