Our Guidelines Were Never Written for Classifiers
Trust and Safety teams are building classifiers against policy documents that were never written for machine inference. The gap between written guidelines and classifier behavior is now one of the primary sources of liability. Enforcement teams routinely flag behavior that violates the spirit of policy but does not trigger any rule. That negative space between policy intent and classifier threshold is where your exposure accumulates.

Why this matters to you: Your policy team writes guidelines for human moderators. Your applied science team converts those guidelines into classifier targets. The translation is lossy. Nuance drops out. Developmental context disappears. When enforcement reviews a flagged interaction and finds behavior that "perhaps should" have triggered a policy but didn't, that is not a classifier bug. That is a policy primitive gap.

CCHAC provides harm definitions written for computational implementation, not human interpretation.

For: Safeguards Policy Design Managers. Trust and Safety Staff Engineers. Red Team Program Managers. Operations & Strategy Managers at AI companies (Scale AI-type roles).
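To make the "policy primitive gap" concrete, here is a minimal sketch of what a harm definition written for computational implementation might look like, as opposed to prose written for human moderators. This is a hypothetical schema for illustration only: the `HarmPrimitive` class, its fields, and the `evaluate` function are assumptions, not CCHAC's actual format. The key point it illustrates is the third return value, `"gap"`: the interaction matches the policy's intent signals, but the classifier score never crosses the flagging threshold.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class HarmPrimitive:
    """A harm definition as explicit, machine-checkable fields rather than
    prose for human moderators. Hypothetical schema, not CCHAC's format."""
    harm_id: str
    description: str                 # human-readable statement of intent
    required_signals: frozenset      # behavioral signals that must all be present
    context_modifiers: dict = field(default_factory=dict)  # e.g. {"minor_present": 0.2}
    threshold: float = 0.8           # classifier score needed to flag

def evaluate(primitive: HarmPrimitive, signals: set,
             scores: dict, context: dict) -> str:
    """Return 'flag', 'gap', or 'pass'.

    'gap' is the negative space the article describes: policy intent is
    matched (all required signals present), but the score stays below
    threshold, so no rule triggers. That is a policy primitive gap,
    not a classifier bug."""
    intent_matched = primitive.required_signals <= signals
    score = max((scores.get(s, 0.0) for s in primitive.required_signals),
                default=0.0)
    score += sum(adj for key, adj in primitive.context_modifiers.items()
                 if context.get(key))
    if intent_matched and score >= primitive.threshold:
        return "flag"
    if intent_matched:
        return "gap"
    return "pass"
```

A prose guideline leaves the moderator to weigh context; this representation forces every contextual adjustment and threshold into an explicit field, so a "gap" verdict can be audited back to a specific missing primitive rather than blamed on classifier tuning.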



