Current alignment methods handle straightforward safety standards but break down in complex scenarios where an AI system must navigate competing values with no single correct answer.
We develop frameworks that acknowledge multiple valid perspectives, enabling AI systems to operate ethically in pluralistic environments.
We seek researchers and engineers who are passionate about AI alignment approaches that respect diverse value perspectives.
Lead technical research on AI safety and alignment, developing novel approaches to value learning, preference modeling, and robustness in AI systems.
Build and maintain experimental infrastructure for AI safety research, focusing on scalable systems for testing alignment theories and value learning approaches.
Investigate foundational questions in AI alignment and safety, combining theoretical analysis with empirical evidence to develop new frameworks for value learning and preference modeling.