Max’s tip for debugging models

Create 3 dataframes:

  • One with random data, completely uncorrelated with the target
    • Should get accuracy no better than random guessing
  • One that is exactly the same as the regular data, except with the target added as an extra feature column
    • Should get 100% accuracy
  • One where the target is overwritten to be a very simple function of the existing data
    • Should get close to 100% accuracy

Run every operation on all 3 dataframes, so that a score outside these expectations tells you something has gone wrong
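
A rough sketch of the tip in pandas and scikit-learn, not a definitive recipe. Everything named here is an assumption made for illustration: the synthetic `real` dataframe stands in for your regular data, `make_debug_frames`, `fit_and_score` and the `target_copy` column are hypothetical names, and the shallow decision tree is just a placeholder for whatever model and pipeline you are actually debugging.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for the regular data: 3 numeric features, binary target.
n = 2000
real = pd.DataFrame(rng.normal(size=(n, 3)), columns=["f0", "f1", "f2"])
real["target"] = (real["f0"] + rng.normal(size=n) > 0).astype(int)


def make_debug_frames(df, target="target"):
    """Build the 3 sanity-check dataframes from the regular dataframe."""
    features = df.columns.drop(target)

    # 1. Random features drawn independently of the target.
    noise = pd.DataFrame(
        rng.normal(size=(len(df), len(features))),
        columns=features,
        index=df.index,
    )
    noise[target] = df[target]

    # 2. Regular data plus the target copied in as an extra feature column.
    leaked = df.copy()
    leaked["target_copy"] = df[target]

    # 3. Target overwritten by a very simple function of one existing column.
    simple = df.copy()
    first = features[0]
    simple[target] = (df[first] > df[first].median()).astype(int)

    return {"noise": noise, "leaked": leaked, "simple": simple}


def fit_and_score(df, target="target"):
    """Placeholder pipeline: split, fit a shallow tree, return test accuracy."""
    X, y = df.drop(columns=target), df[target]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
    return model.score(X_te, y_te)


# Run the same operations on all 3 dataframes and compare with expectations.
scores = {
    name: fit_and_score(frame)
    for name, frame in make_debug_frames(real).items()
}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

If "noise" scores well above chance, something is probably leaking the target; if "leaked" or "simple" falls short, the pipeline is probably losing or scrambling information somewhere.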