Conversation
Thanks @caglayantuna ! This is super useful / interesting. In regular CP I noticed that after normalization the value of some weights was very large, which can be problematic, especially if we use lower-precision dtypes (e.g. float32, float16).
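To make the precision concern concrete, here is a minimal illustration (the numbers are made up for illustration, not taken from the notebooks): after normalization, a component's scale is the product of its per-mode column norms, which can exceed the float16 range even for moderate norms.

```python
# Illustrative only: a rank-1 component of a 3-way tensor with moderate
# per-mode column norms (50) gets weight 50**3 = 1.25e5 after
# normalization, which overflows float16 (max finite value ~6.55e4).
import numpy as np

norms = np.array([50.0, 50.0, 50.0], dtype=np.float16)  # per-mode norms
weight = np.prod(norms, dtype=np.float16)               # product -> weight
print(weight)  # inf: the normalized weight is not representable in float16
```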
Let's make a few more tests including random tensors, then chat about which method to keep in tensorly, and then try to change all relevant functions to have the same normalization style. Personally I don't think the way we normalize will impact the results' stability so much (at least I have not observed it in practical applications). On the other hand, normalization has very small computational complexity, so it cannot hurt to overdo it a little. I think we could keep solution 1 (normalize all factors after each outer loop), always using the CPTensor.normalize routine (in Tensorly we sometimes use dedicated code; this PR also fixes that). The key point is that we propose to keep the weights stored in the CPTensor.
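As a reference point, a minimal sketch of solution 1, assuming tensorly's public `random_cp` and `cp_normalize` API (the loop body is a placeholder, not the actual ALS update):

```python
# Sketch of solution 1: after every outer ALS loop, renormalize all
# factors with cp_normalize so each factor column has unit l2 norm and
# the component scales accumulate in cp.weights (kept in the CPTensor).
from tensorly.cp_tensor import cp_normalize
from tensorly.random import random_cp

cp = random_cp((10, 10, 10), rank=5)   # CPTensor: (weights, factors)

for iteration in range(100):
    # ... update each factor in cp.factors (ALS step, omitted) ...
    cp = cp_normalize(cp)              # scales moved into cp.weights
```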
I have added new notebooks with differently magnified random tensors. Besides, I have increased the number of experiments to see whether there is any difference in the variance of the error or the processing time in the other notebooks. According to all these experiments, we don't see any significant difference between the normalization methods. I will wait for you to select one of the methods, and then update the PR.
Comparison of possible normalization methods
This PR is related to issue #264. After our discussion, I ran some experiments to compare possible normalization methods for tensor decomposition, and I wanted to share the notebooks so we can settle on the normalization method for tensorly.
In our PR #281, we suggested normalizing the factors after each inner iteration using the cp_normalize function in non_negative_parafac, non_negative_parafac_hals, and parafac. As we discussed, there are different options for normalizing the factors, and the literature does not prescribe a single standard way.
To normalize the factors during the inner iteration, we suggest normalizing the last factor after the error computation, since we have already used the weights to compute the MTTKRP and removed them from the iprod computation (this variant is sketched below). The candidate normalization methods are implemented on audio data and on a hyperspectral satellite image.
In addition, the parafac experiments include that function's own built-in normalization.
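A minimal sketch of the inner-iteration variant mentioned above; `normalize_last_factor` is a hypothetical helper written against the tensorly backend, not code from this PR:

```python
# Hypothetical helper: after the error computation, rescale only the
# last-updated factor and return its column norms as the new weights.
import tensorly as tl

def normalize_last_factor(factors):
    scales = tl.norm(factors[-1], order=2, axis=0)  # column norms
    # guard against zero columns before dividing
    scales = tl.where(scales == 0,
                      tl.ones(tl.shape(scales), **tl.context(factors[-1])),
                      scales)
    factors[-1] = factors[-1] / scales
    return scales, factors                          # (weights, factors)
```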
Conclusion
In the experiments, we show the weights and the average magnitude of the factors to assess numerical stability. We also report the processing time and RMSE for each experiment. Briefly, there is no significant difference between the methods; we can select one of them to update our PR and tensorly.
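For concreteness, a hedged sketch of these diagnostics: `report` is a hypothetical helper, and the RMSE here is the plain Frobenius-norm definition, which may differ in detail from the notebooks.

```python
# Hypothetical diagnostics: weight magnitude, average factor magnitude
# per mode, and RMSE between the data tensor and its CP reconstruction.
import numpy as np
import tensorly as tl
from tensorly.cp_tensor import cp_to_tensor

def report(tensor, cp):
    weights, factors = cp
    n_elements = np.prod(tl.shape(tensor))
    rmse = tl.norm(tensor - cp_to_tensor(cp)) / np.sqrt(n_elements)
    print("max |weight|        :", float(tl.max(tl.abs(weights))))
    print("mean |factor| / mode:", [float(tl.mean(tl.abs(f))) for f in factors])
    print("RMSE                :", float(rmse))
```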