14 Section 4: Learnings for practitioners
14.1 To improve personalization, designs need to be diverse and take risks
Any design team, regardless of domain, goes through a prioritization process: we winnow down design ideas based on a set of criteria, usually some combination of feasibility, prospective effectiveness, and cost. At the pilot phase, however, it can pay to deviate from this approach. Rather than selecting only the designs likely to impact the most people at the lowest cost, designers can field a more diverse set of designs that take calculated risks. The risk-reward tradeoff looks different once we can assess impact for a segmented population and personalize designs: instead of a risky design washing out to no average effect, we can isolate the segments where it makes a real difference in the lives of users and clients.
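To make the segment-level readout concrete, here is a minimal sketch, assuming a tidy experiment log with hypothetical columns `segment`, `treated` (0/1), and `outcome`; it is an illustration of the idea, not the analysis pipeline used in our work.

```python
import numpy as np
import pandas as pd

def segment_effects(df: pd.DataFrame) -> pd.DataFrame:
    """Per-segment difference in mean outcomes (treated - control),
    with a normal-approximation 95% confidence interval."""
    rows = []
    for segment, g in df.groupby("segment"):
        t = g.loc[g["treated"] == 1, "outcome"]
        c = g.loc[g["treated"] == 0, "outcome"]
        effect = t.mean() - c.mean()
        se = np.sqrt(t.var(ddof=1) / len(t) + c.var(ddof=1) / len(c))
        rows.append({"segment": segment, "effect": effect,
                     "ci_low": effect - 1.96 * se,
                     "ci_high": effect + 1.96 * se,
                     "n": len(g)})
    return pd.DataFrame(rows)

# A risky design can show a near-zero effect overall while one segment
# shows a clear positive effect -- exactly the case personalization targets.
```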
14.2 When you see no effect, you may simply have no effect. That’s okay.
When we first started this work, we did not rule out experiments that showed little-to-no average treatment effect. Our hypothesis was that a small, marginally significant treatment effect might actually mask a large treatment effect among a subset of the population. After estimating a few models, it was clear that the absence of an average treatment effect was not due to heterogeneity; the treatment simply had a small impact. This is also a useful outcome: in those cases, we learned that the intervention offers only a marginal improvement over business-as-usual.
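One simple way to run this check is a T-learner: fit separate outcome models for treated and control units and look at the spread of the implied individual-level effects. The sketch below assumes a feature matrix `X`, a binary treatment indicator `t`, and an outcome `y` (all names hypothetical); it illustrates the check, not the specific estimators we used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def cate_estimates(X: np.ndarray, t: np.ndarray, y: np.ndarray) -> np.ndarray:
    """T-learner: fit one outcome model on treated units and one on
    controls, then estimate each unit's treatment effect as the
    difference in the two models' predictions."""
    model_treated = GradientBoostingRegressor().fit(X[t == 1], y[t == 1])
    model_control = GradientBoostingRegressor().fit(X[t == 0], y[t == 0])
    return model_treated.predict(X) - model_control.predict(X)

# If the estimated effects cluster tightly around zero, the null average
# effect is not masking heterogeneity -- the impact is simply small.
# A wide or bimodal spread would instead suggest subgroups worth
# validating (e.g., with sample splitting to guard against overfitting).
```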
14.3 Some systems are not ready for this. Start low-fi and build from there.
As interest in data science and machine learning grows, many systems will reveal opportunities for improvement. Where you are not collecting data, you can start, and an audit of existing data sources is often an important first step toward using data to improve outcomes. High-quality data collection and organization are what make machine learning methods a useful tool in your toolkit.
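A "low-fi" start can be as small as one tidy record per user per experiment. The sketch below shows one such minimal schema; every field name is illustrative, not a prescribed standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ExperimentRecord:
    """One row per user per experiment: roughly the minimum needed
    to estimate treatment effects later."""
    user_id: str
    experiment_id: str
    assigned_arm: str            # e.g., "control" or "treatment_a"
    assigned_at: datetime        # when the user was randomized
    outcome: Optional[float]     # filled in once the outcome is observed

record = ExperimentRecord(
    user_id="u_123",
    experiment_id="reminder_pilot",
    assigned_arm="treatment_a",
    assigned_at=datetime.now(timezone.utc),
    outcome=None,
)
```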
14.4 Next steps
ideas42 and its research partners are changing the way we approach human decision-making by testing machine learning methods to improve field experimentation. We urge others to join in and do the same.