Netflix routinely uses online A/B experiments to inform strategy and operations discussions, as well as decisions about whether to launch particular product changes. Over time these discussions grew increasingly specialized, generating demand for more and richer metrics powered by extensible statistical methodologies capable of answering diverse causal-effect questions. To support these ever-growing use cases, Netflix made a strategic bet to make its experimentation science-centric; that is, to place a heavy emphasis on enabling arbitrary data analysis methods for causal inference developed in different fields of science. To implement this science-centric vision, Netflix’s experimentation platform, Netflix XP, was reimagined around three key tenets: trustworthiness, scalability, and inclusivity. In this extended abstract, we first report on the architecture of this platform, with a special emphasis on its novel aspect: how it supports science-centric end-to-end workflows without compromising important engineering requirements. Second, we briefly describe its approach to causal inference, which leverages the potential outcomes framework to provide a unified abstraction layer for arbitrary statistical models and methodologies.
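
To make the idea of a unified abstraction layer over the potential outcomes framework concrete, the following is a minimal sketch. It is not the Netflix XP API: the names `OutcomeModel`, `GroupMeanModel`, and `average_treatment_effect` are hypothetical and chosen only for illustration. It assumes any statistical model can be wrapped behind a `fit`/`predict` interface, so that the average treatment effect is computed the same way regardless of the model: predict each unit's outcome under treatment and under control, then average the difference.

```python
from statistics import mean
from typing import Protocol, Sequence


class OutcomeModel(Protocol):
    """Hypothetical interface: any model that can predict a potential outcome."""

    def fit(self, treated: Sequence[int], covariates: Sequence[float],
            outcomes: Sequence[float]) -> None: ...

    def predict(self, treated: int, covariate: float) -> float: ...


class GroupMeanModel:
    """Simplest illustrative model: the mean outcome per arm, ignoring covariates."""

    def __init__(self) -> None:
        self._means: dict[int, float] = {}

    def fit(self, treated, covariates, outcomes) -> None:
        for arm in (0, 1):
            self._means[arm] = mean(
                y for t, y in zip(treated, outcomes) if t == arm
            )

    def predict(self, treated: int, covariate: float) -> float:
        return self._means[treated]


def average_treatment_effect(model: OutcomeModel,
                             treated: Sequence[int],
                             covariates: Sequence[float],
                             outcomes: Sequence[float]) -> float:
    """Model-agnostic estimate: average of predicted Y(1) - Y(0) over all units."""
    model.fit(treated, covariates, outcomes)
    differences = [
        model.predict(1, x) - model.predict(0, x)  # both potential outcomes
        for x in covariates
    ]
    return mean(differences)


# Toy data, invented for this example only.
treated = [0, 0, 1, 1]
covariates = [0.1, 0.4, 0.2, 0.3]
outcomes = [1.0, 3.0, 4.0, 6.0]
ate = average_treatment_effect(GroupMeanModel(), treated, covariates, outcomes)
```

Under this sketch, swapping in a richer model (for example, a regression that uses the covariate) would require no change to `average_treatment_effect`, which is the sense in which the potential outcomes view can serve as a common layer across methodologies.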