Max’s tip for debugging models
Create 3 dataframes:
- One with random data, completely uncorrelated with the target
  - Should get accuracy no better than random guessing
- One that is exactly the same as the regular data, except with the target added as an extra column
  - Should get 100% accuracy
- One where the target is overwritten with a very simple function of the existing data
  - Should get close to 100% accuracy
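The three dataframes could be built roughly like this. It is only a sketch: it assumes pandas and NumPy, numeric feature columns, and a binary target in a column called `target`. The helper name `make_debug_frames`, the `target_copy` column, and the "above the median" rule are all made up for illustration.

```python
import numpy as np
import pandas as pd


def make_debug_frames(df, target="target", seed=0):
    """Return the three sanity-check dataframes (hypothetical helper).

    Assumes `df` has numeric feature columns and a binary `target` column.
    """
    rng = np.random.default_rng(seed)
    features = [c for c in df.columns if c != target]

    # 1. Random features with the original target: nothing to learn.
    random_df = pd.DataFrame(
        rng.normal(size=(len(df), len(features))),
        columns=features,
        index=df.index,
    )
    random_df[target] = df[target]

    # 2. Regular data plus a copy of the target as an extra feature column.
    leaked_df = df.copy()
    leaked_df[target + "_copy"] = df[target]

    # 3. Target overwritten with a simple function of one existing feature
    #    (illustrative rule: is the first feature above its median?).
    simple_df = df.copy()
    first = features[0]
    simple_df[target] = (df[first] > df[first].median()).astype(int)

    return {"random": random_df, "leaked": leaked_df, "simple": simple_df}


# Toy data standing in for the real dataset (invented for this example).
rng = np.random.default_rng(42)
base = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
base["target"] = rng.integers(0, 2, size=200)

frames = make_debug_frames(base)
```

Keeping the original target in the random frame preserves the class balance, so "random guessing" means the same thing as on the real data.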
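Run every operation on all 3 dataframes, so that a result outside the expected range tells you something has gone wrong. For example, better-than-chance accuracy on the random data usually points to target leakage or an evaluation bug.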
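One way to run the same operations on all three dataframes and compare against the expected results is sketched below. It assumes scikit-learn is available; `fit_and_score` is a hypothetical stand-in for whatever the real pipeline does, and the toy frames and accuracy thresholds are illustrative guesses, not recommended values.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_and_score(df, target="target", seed=0):
    """Stand-in for the real pipeline: returns held-out accuracy."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = DecisionTreeClassifier(max_depth=3, random_state=seed)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Tiny invented versions of the three dataframes.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=(n, 2))
y = rng.integers(0, 2, size=n)

frames = {
    "random": pd.DataFrame({"x1": x[:, 0], "x2": x[:, 1], "target": y}),
    "leaked": pd.DataFrame(
        {"x1": x[:, 0], "x2": x[:, 1], "target_copy": y, "target": y}
    ),
    "simple": pd.DataFrame(
        {"x1": x[:, 0], "x2": x[:, 1], "target": (x[:, 0] > 0).astype(int)}
    ),
}

# Acceptable accuracy range per frame (illustrative thresholds).
expected = {"random": (0.0, 0.65), "leaked": (1.0, 1.0), "simple": (0.95, 1.0)}

results = {}
for name, df in frames.items():
    acc = fit_and_score(df)
    low, high = expected[name]
    results[name] = (acc, low <= acc <= high)
    print(f"{name}: accuracy={acc:.3f} ok={results[name][1]}")
```

If any frame reports `ok=False`, the bug is in the shared pipeline code rather than in the real data, which narrows down where to look.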