Aithos
We are a non-profit foundation focused on AI alignment research. Our aim is to defend autonomy and pluralism as we enter an age of increasingly powerful AI systems.
We develop frameworks and tools to ensure AI systems remain transparent, contestable, and compatible with human autonomy and the diversity of human values.
We believe the values that shape AI should remain open to disagreement, and we work at the intersection of research, governance, and industry to make this a reality. We publish our tools and standards openly so they can inform how AI is actually built, deployed, and regulated.
Based in Amsterdam, Aithos is a Dutch Public Benefit Organisation (ANBI) with a global perspective, committed to opening up the AI conversation to all.
AI systems increasingly shape our digital world, our economy, and even our social lives. These systems carry implicit values that are rarely disclosed, poorly understood, and difficult to contest. Aithos exists to change that.
Value Diversity
We consider the variation in human values not a challenge to alignment but its very foundation. The complexity of the ethical landscape is richness to be represented, not noise to be filtered out. Multiple value systems can and should coexist.
Human Agency
We hold that people have a fundamental right to influence AI systems that affect their lives. Different stakeholders have different needs, and forcing consensus often silences legitimate perspectives. AI should safeguard and enable human autonomy.
Systemic Alignment
We see AI alignment as the society-wide challenge of engineering technical and social systems that accommodate diverse and conflicting values while promoting individual and social wellbeing, rather than as convergence on a static ideal.
Procedural Legitimacy
We believe that choices about the AI systems in our lives and societies are political decisions that belong in public discourse, not hidden behind technical complexity or corporate secrecy. How choices are made matters, independent of their outcomes.
News

We published a comprehensive policy plan outlining our strategic objectives and focus for the coming years.

We will share our latest research on AI moral alignment and accountability at the Paris Conference on AI & Digital Ethics.
Blog

AI models show dramatically different ethical behavior at different temperature settings.

These findings fundamentally challenge how we evaluate AI systems.

The case for keeping safety evaluation prompts private to maintain their effectiveness.

Public safety prompts create systematic blind spots in evaluation frameworks by enabling targeted evasion.

Claude Opus 4.6 rarely verbalizes alignment faking in its reasoning.