Factor Analysis and Uniform distributions
27 Mar 2018
This month I worked on a psychometrics project. Interestingly, I had never heard of this field before. Psychometrics is basically statistics applied to psychology, much like econometrics in economics or biostatistics in biology. The great thing about my economics background, though, is its rigorous mathematical and statistical training: it helped me learn the subject and contribute fairly quickly.
The task was to design an algorithm that randomly generates correlation matrices satisfying a given factor analysis model.
I finished the great textbook “Modern Factor Analysis” by Harry H. Harman and learned a lot. In a nutshell, factor analysis is a method for finding the latent common factors that underlie a given set of variables; once found, these factors can be used as regressors in a model, reducing the number of variables.
Psychologists use this to evaluate test results: a 100-question test taken by 1,000 people yields a large dataset. However, if the test questions could be reduced to their underlying hidden factors, such as introversion, risk aversion or optimism, we could considerably simplify the analysis. After all, measuring those traits (and not the test questions) is THE purpose of psychology.
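To make the factor model concrete, here is a minimal sketch of how a loading matrix implies a correlation matrix under the standard orthogonal factor model (common part plus unique variances). It is an illustration only, not the algorithm from the project; the function name and the loading values are hypothetical, and NumPy is assumed to be installed.

```python
import numpy as np

def factor_model_correlation(loadings):
    """Correlation matrix implied by an orthogonal factor model.

    loadings: (n_variables, n_factors) array. The sum of squared loadings
    in each row (the communality) must not exceed 1.
    """
    loadings = np.asarray(loadings, dtype=float)
    communalities = np.sum(loadings ** 2, axis=1)
    if np.any(communalities > 1.0):
        raise ValueError("each communality must be <= 1")
    # Common part plus unique variances, so every diagonal entry equals 1.
    return loadings @ loadings.T + np.diag(1.0 - communalities)

# Hypothetical loadings: 4 variables on 2 factors (illustrative values only).
example_loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])
R = factor_model_correlation(example_loadings)
print(np.round(R, 3))
```

Any loading matrix whose communalities stay at or below 1 gives a valid correlation matrix this way. The sketch says nothing about how to draw the loadings so that the result is uniformly distributed, which is the hard part.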
Generating correlation matrices was easy, but generating them uniformly was much harder. Indeed, the product of two uniform random variables is not uniform. I discovered a great combination of free parameters for the beta distribution that deals with this problem:
First of all, let’s recall the density plot of a uniform distribution:
Unfortunately, the product of uniform random variables is not uniform; it is skewed toward the origin. So I decided to look at the square root of a uniform random variable. After a long search I found a great proof that the square root of a Uniform(0, 1) variable follows a Beta(2,1) distribution.
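Both statements can be checked with a short derivation. The notation is my own assumption: U, U_1 and U_2 are independent Uniform(0, 1) variables, and B(2, 1) = 1/2 is the beta function.

```latex
\begin{aligned}
P(U_1 U_2 \le z) &= \int_0^z 1 \, du + \int_z^1 \frac{z}{u} \, du = z - z \ln z,
  & f_{U_1 U_2}(z) &= -\ln z, \quad 0 < z < 1, \\
P(\sqrt{U} \le x) &= P(U \le x^2) = x^2,
  & f_{\sqrt{U}}(x) &= 2x = \frac{x^{2-1} (1 - x)^{1-1}}{B(2, 1)}, \quad 0 \le x \le 1.
\end{aligned}
```

The first density, -ln z, blows up near zero, which is the skew toward the origin. The second is exactly the Beta(2,1) density.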
Unfortunately, the product of two i.i.d. Beta(2,1)-distributed variables was not uniform either, so the above property wasn’t useful for the project. But I was inspired now. I had discovered the beta distribution and wasn’t going to let it go, so I googled “the product of beta distributions” and found a great deal of related literature (please look it up if you are interested).
I discovered that the formula for the product of two beta-distributed variables was really complex, and I decided that rigor could be traded for intuition in this case; after all, this was statistics and not real analysis.
After trying to solve for the free parameters and combining this with trial and error, I finally found the answer: the distribution was Beta(1.2, 0.5)!
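One way to sanity-check a candidate without the full product formula is to compare raw moments. For independent X and Y the k-th moment of XY is the square of the k-th moment of X, and for Uniform(0, 1) it is 1/(k + 1). The sketch below is my own check, not the search procedure used in the project; the function names are hypothetical.

```python
def beta_raw_moment(a, b, k):
    """k-th raw moment of Beta(a, b): product of (a + r) / (a + b + r), r = 0..k-1."""
    moment = 1.0
    for r in range(k):
        moment *= (a + r) / (a + b + r)
    return moment

def product_moment_gaps(a, b, max_order=4):
    """Moments of X*Y (X, Y i.i.d. Beta(a, b)) minus the Uniform(0, 1) moments."""
    gaps = []
    for k in range(1, max_order + 1):
        product_moment = beta_raw_moment(a, b, k) ** 2  # uses independence
        uniform_moment = 1.0 / (k + 1)
        gaps.append(product_moment - uniform_moment)
    return gaps

gaps_candidate = product_moment_gaps(1.2, 0.5)     # the Beta(1.2, 0.5) candidate
gaps_sqrt_uniform = product_moment_gaps(2.0, 1.0)  # Beta(2, 1), for contrast

for k, (g1, g2) in enumerate(zip(gaps_candidate, gaps_sqrt_uniform), start=1):
    print(f"order {k}: Beta(1.2, 0.5) gap {g1:+.4f}, Beta(2, 1) gap {g2:+.4f}")
```

For Beta(1.2, 0.5) the first four gaps are all below 0.003 in absolute value, while for Beta(2,1) they exceed 0.05. Matching moments is only a necessary condition, so this supports "approximately uniform" rather than proving exact uniformity.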
The density plot for Beta(1.2, 0.5) looks as follows:
The product of two i.i.d. variables from this distribution is approximately uniformly distributed, as can be seen below:
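The claim can also be checked with a quick simulation. The sketch below uses only the Python standard library; the function name, sample size, bin count and seed are my own assumptions, chosen for illustration.

```python
import random

def simulate_product_histogram(a, b, n_samples=100_000, n_bins=10, seed=0):
    """Bin fractions of X*Y on [0, 1] for X, Y i.i.d. Beta(a, b)."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_samples):
        z = rng.betavariate(a, b) * rng.betavariate(a, b)
        # z lies in [0, 1]; clamp the index so z == 1.0 lands in the last bin.
        counts[min(int(z * n_bins), n_bins - 1)] += 1
    return [c / n_samples for c in counts]

fractions = simulate_product_histogram(1.2, 0.5)
n_bins = len(fractions)
for i, frac in enumerate(fractions):
    print(f"[{i / n_bins:.1f}, {(i + 1) / n_bins:.1f}): {frac:.3f}")
```

A perfectly uniform product would put a fraction of 0.100 in every bin, so the printed values show how close the approximation gets. Calling the same function with (2, 1) shows how far the Beta(2,1) product is from uniform.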
I find this discovery extremely enlightening and useful. Hope you will too. To read about my other projects please see my other blog entries.