Linghan
Active Member
"But the question mentioned data sources are unstructured - metadata and various other sources from app or other channels, so it should be unsupervised learning"

Clustering identifies outliers that do not have strong connections with the rest of the data. You can't predict PD with clustering alone.
For example, you can use clustering to divide the applicants into 10 classes. But without PD information about at least some applicants in each class, you will not know the PD of each class.
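
To make the point concrete, here is a rough sketch in Python. The function name, the toy cluster assignments, and the default outcomes are all made up for illustration; the cluster IDs could come from any clustering method. It only shows that cluster membership alone gives no PD, while a simple observed default rate becomes available once some applicants in a class have known outcomes.

```python
from collections import defaultdict

def cluster_default_rates(cluster_ids, defaulted):
    """Estimate PD per cluster as the observed default rate among labeled applicants.

    cluster_ids: cluster label per applicant (from any clustering method).
    defaulted:   True/False if the outcome is known, None if it is unknown.
    Returns {cluster: PD or None}; None means no labeled applicants in that cluster.
    """
    counts = defaultdict(lambda: [0, 0])  # cluster -> [defaults, labeled applicants]
    for cluster, outcome in zip(cluster_ids, defaulted):
        counts[cluster]  # touch the key so clusters without labels still appear
        if outcome is not None:
            counts[cluster][0] += int(outcome)
            counts[cluster][1] += 1
    return {c: (d / n if n else None) for c, (d, n) in counts.items()}

# Hypothetical toy data: 8 applicants already assigned to 3 clusters.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]

# Case 1: clustering only, no default outcomes known for anyone.
no_labels = [None] * 8

# Case 2: outcomes known for some applicants (True = defaulted).
some_labels = [True, False, False, True, True, None, None, None]

pd_without = cluster_default_rates(clusters, no_labels)
pd_with = cluster_default_rates(clusters, some_labels)

print(pd_without)
print(pd_with)
```

In the first case every class comes back with no PD at all. In the second case, the classes that contain labeled applicants get an estimate, and the class with no labeled applicants still has none.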