Using a newly developed model to upgrade a legacy model is common practice in machine learning applications. After the upgrade, the new model is expected to outperform the legacy model in the regions of interest. However, it is often observed that the new model makes incorrect decisions on instances where the legacy model still performs well. For a binary classification model (e.g., a click-through rate (CTR) prediction model), such undesirable behavior can occur even in the low false positive region of the receiver operating characteristic (ROC) curve. Understanding the reasons behind this phenomenon can help business partners in an organization gain confidence in adopting the new model and help modelers improve the new model in future releases. In this paper, we present the "Learning from Disagreement" framework for understanding and improving the performance of a predictive model. In the setting of a binary classification task, the proposed approach focuses on instances that lead to contradictory decisions between a pair of models at a given operating point. We perform feature importance analysis exclusively on these instances, gain insights into the pair of models without knowledge of their inner workings, and offer actionable feedback for model improvement. We demonstrate the usefulness of this framework on two real-world event detection datasets.
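As a concrete illustration of the disagreement-based analysis described above, the following is a minimal sketch, not the paper's implementation: it assumes two fitted scikit-learn binary classifiers standing in for the legacy and new models, sets each model's decision threshold at a common target false positive rate on a validation set, collects the instances on which their decisions contradict, and fits a surrogate random forest on those instances alone to rank the features that separate the cases the new model gets wrong from the cases it gets right. The synthetic data, variable names, and the choice of surrogate model are all assumptions made for illustration.

# Minimal sketch (assumptions only): disagreement-based feature importance
# analysis for a legacy/new model pair at a fixed-FPR operating point.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_curve

# Stand-in data and models; in practice these would be the real legacy/new models.
X, y = make_classification(n_samples=5000, n_features=10, n_informative=5, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.4, random_state=0)
legacy = LogisticRegression(max_iter=1000).fit(X_train, y_train)
new = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

def threshold_at_fpr(model, X, y, target_fpr=0.05):
    # Pick the score threshold whose validation FPR is closest to target_fpr.
    fpr, _, thresholds = roc_curve(y, model.predict_proba(X)[:, 1])
    return thresholds[np.argmin(np.abs(fpr - target_fpr))]

# Binary decisions for both models at the same operating point (target FPR).
legacy_dec = legacy.predict_proba(X_val)[:, 1] >= threshold_at_fpr(legacy, X_val, y_val)
new_dec = new.predict_proba(X_val)[:, 1] >= threshold_at_fpr(new, X_val, y_val)

# Instances where the two models make contradictory decisions.
disagree = legacy_dec != new_dec
X_dis, y_dis = X_val[disagree], y_val[disagree]

# On a disagreement instance exactly one model is correct, so this label marks
# regressions: cases the new model gets wrong while the legacy model is right.
new_model_wrong = (new_dec[disagree] != y_dis).astype(int)

# Surrogate fitted only on disagreement instances; its feature importances
# indicate which features drive the contradictory decisions.
surrogate = RandomForestClassifier(n_estimators=200, random_state=0)
surrogate.fit(X_dis, new_model_wrong)
for idx in np.argsort(surrogate.feature_importances_)[::-1][:5]:
    print(f"feature {idx}: importance {surrogate.feature_importances_[idx]:.3f}")

Because each disagreement instance is resolved correctly by exactly one of the two models at the chosen operating point, the surrogate's label is well defined without any access to either model's internals, which mirrors the black-box nature of the analysis described in the abstract.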