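Latent Dirichlet Allocation (LDA) is a topic model for grouping documents by theme: each document is modeled as a mixture of topics, and each topic as a probability distribution over words.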
Write a Latent Dirichlet Allocation implementation for PHP, most probably using Gibbs sampling. I do not know of any existing PHP implementation, but implementations exist for Python (https://github.com/shuyo/iir/blob/master/lda/lda.py), C (http://gibbslda.sourceforge.net/), and other languages.
The input will be an array of strings (each string = a document).
The output will be the LDA clusters (topics, word probabilities within each …
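As a rough illustration of the requested behavior, here is a minimal sketch of collapsed Gibbs sampling for LDA. It is written in Python only because the linked reference implementation is; a PHP port would presumably follow the same structure with arrays in place of lists. The function name `lda_gibbs`, the parameters `num_topics`, `alpha`, `beta`, `iterations`, and `seed`, the whitespace tokenization, and the exact shape of the return value are all assumptions made for illustration, not part of the task description.

```python
import random


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, hypothetical API)."""
    rng = random.Random(seed)
    # Assumption: naive lowercase/whitespace tokenization, no stop-word removal.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in tokenized]   # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                      # total tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(tokenized):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(tokenized):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed point estimates taken from the final sample.
    phi = [
        {vocab[w]: (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
         for w in range(V)}
        for k in range(num_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(tokenized)
    ]
    return phi, theta
```

In this sketch, `phi[k]` maps each word to its estimated probability under topic `k`, and `theta[d]` lists the estimated topic proportions for document `d`. A call such as `phi, theta = lda_gibbs(["apple banana fruit", "car engine wheel"], num_topics=2)` takes an array of strings as described above. A real implementation would likely add proper tokenization, stop-word filtering, and averaging over several samples after burn-in.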