Acquiring fear responses to predictors of aversive outcomes is crucial for survival. At the same time, it is important to be able to modify such associations when they are maladaptive, for instance in treating anxiety and trauma-related disorders. Standard extinction procedures can reduce fear temporarily, but with sufficient delay or with reminders of the aversive experience, fear often returns. The latent-cause inference framework explains the return of fear by presuming that animals learn a rich model of the environment, in which the standard extinction procedure triggers the inference of a new latent cause, preventing the unlearning of the original aversive associations. This computational framework had previously inspired an alternative extinction paradigm – gradual extinction – which indeed was shown to be more effective in reducing the return of fear. However, the original framework was not sufficient to explain the pattern of results seen in the experiments. Here, we propose a formal model to explain the effectiveness of gradual extinction in reducing spontaneous recovery and reinstatement effects, in contrast to the ineffectiveness of standard extinction and a gradual reverse control procedure. We demonstrate through quantitative simulation that our model can explain qualitative behavioral differences across different extinction procedures as seen in the empirical study. We verify the necessity of several key assumptions added to the latent-cause framework, which suggest potential general principles of animal learning and provide novel predictions for future experiments.