Trusted AI Begins And Ends With Alignment
What do discrimination, harmful content, unexpected outcomes, and catastrophic business failure have in common? They are all potential consequences of misaligned AI. They are also largely avoidable. Misaligned AI erodes trust among employees, customers, government, and other stakeholders, and trusted AI holds the key to enterprise AI success. Without trust in AI, enterprises will not be able to reap its full benefits and transformational impact. Understanding how to reduce these risks and continue to build trust is the subject of brand-new research from Brian Hopkins, Enza Iannopollo, and me: Align By Design (Or Risk Decline).
What Is AI Misalignment?
Today’s AI systems are shackled inside Plato’s cave, experiencing a mere shadow representation of the real world in their training data. The result is misalignment between the intended and actual outcomes of the system. Sometimes, misalignment can be amusing, as when Google’s AI Overviews tool suggested that users put glue on pizza and eat rocks. Other times, misalignment can be harmful, as when the National Eating Disorders Association’s chatbot recommended counting calories. The dangers that misalignment poses are only going to grow as AI continues to advance in ability and agency.
Align By Design Or Risk Decline
It’s time for organizations to take AI alignment seriously, right from the beginning of AI development and not after an “oops” moment when things go awry. In the report, we propose an “align by design” approach, which we define as:
A proactive approach to developing AI systems, ensuring that they meet business goals while adhering to company values, standards, and guidelines throughout the AI development lifecycle.
Aligning by design requires effort across all three sides of the golden triangle — people, technology, and processes:
- People: Organizational alignment enables AI alignment. Misalignment is not just an AI problem; it is also an organizational problem that plagues companies today as disparate lines of business possess myriad incentives and optimize for divergent KPIs. Companies (and, especially, technology teams) must align internally on objectives, standards, principles, and values to ensure AI alignment.
- Technology: Balance helpfulness and harmlessness with alignment techniques. Overloading models with guardrails and tuning can diminish their effectiveness, while insufficient alignment may lead to harmful outputs or unintended actions. Achieve the right balance by combining alignment techniques such as fine-tuning, prompt enrichment, and controlled generation so that systems pursue intended objectives rather than objectionable ones (see the sketch after this list).
- Processes: Plan for remediation. Our research uncovered the unsettling fact that, despite best efforts, AI misalignment is inevitable. Companies therefore need to be prepared to respond to unintended outcomes and to mitigate their negative impact.
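To make the technology point concrete, here is a minimal sketch of two of the techniques named above, prompt enrichment and controlled generation. The guidelines, blocked patterns, and call_model stub are all hypothetical placeholders, not any particular vendor's API; a real deployment would swap in its provider's SDK and its own policies.

```python
# Minimal sketch: prompt enrichment + controlled generation.
# COMPANY_GUIDELINES, BLOCKED_PATTERNS, and call_model are hypothetical
# placeholders -- replace call_model with your provider's SDK call.

COMPANY_GUIDELINES = (
    "You are a customer-facing assistant. Follow company policy: "
    "do not give medical, legal, or financial advice, and decline "
    "requests that conflict with these rules."
)

# Illustrative only; real systems typically use trained classifiers,
# not keyword lists, to screen outputs.
BLOCKED_PATTERNS = ("count calories", "medical diagnosis")


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "Sure -- here is a canned response for demonstration."


def enrich_prompt(user_input: str) -> str:
    """Prompt enrichment: every request carries the company guidelines."""
    return f"{COMPANY_GUIDELINES}\n\nUser request: {user_input}"


def generate_aligned(user_input: str) -> str:
    """Controlled generation: screen output before it reaches the user."""
    output = call_model(enrich_prompt(user_input))
    if any(pattern in output.lower() for pattern in BLOCKED_PATTERNS):
        # Remediation path: never ship a flagged response.
        return "I'm sorry, I can't help with that request."
    return output


if __name__ == "__main__":
    print(generate_aligned("Help me plan my meals for the week."))
```

The sketch illustrates the balance the technology bullet describes: the enrichment step steers the model toward intended behavior, while the output check catches what slips through, without piling on so many guardrails that the system stops being useful.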
I am honored to be delivering a keynote on the topic of AI alignment at Forrester’s Technology & Innovation Summit in Austin, Texas, September 9–12. If you’re a technology, data, or analytics leader grappling with AI risks, opportunities, and ROI, please consider joining us there.
In the meantime, please read the report and reach out to Brian Hopkins, Enza Iannopollo, or me with questions.