Released

Poster

PAC-Bayesian Bounds for Discrete Density Estimation and Co-clustering Analysis

MPS-Authors

Seldin, Y.
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Seldin, Y. (2010). PAC-Bayesian Bounds for Discrete Density Estimation and Co-clustering Analysis.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-C10E-1
Abstract
We applied the PAC-Bayesian framework to derive generalization bounds for co-clustering. The analysis yielded regularization terms that were absent from preceding formulations of this task. The bounds suggested that co-clustering should optimize a trade-off between its empirical performance and the mutual information that the cluster variables preserve on row and column indices. Proper regularization enabled us to achieve state-of-the-art results in predicting missing ratings in the MovieLens collaborative filtering dataset. In addition, a PAC-Bayesian bound for discrete density estimation was derived. We showed that the PAC-Bayesian bound for classification is a special case of the PAC-Bayesian bound for discrete density estimation. We further introduced combinatorial priors to PAC-Bayesian analysis. Combinatorial priors are more appropriate for discrete domains, whereas Gaussian priors are suited to continuous domains. It was shown that combinatorial priors lead to regularization terms in the form of mutual information.
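
To make the trade-off described in the abstract concrete, the following is a minimal Python sketch, not the poster's actual bound or algorithm. It computes the mutual information between an index variable and its cluster variable for a soft assignment q(c|x), assuming the indices are uniformly distributed, and combines it with an empirical loss. The function names and the weight `beta` are hypothetical; the coefficients that the PAC-Bayesian bound assigns to these terms are not reproduced here.

```python
import numpy as np

def mutual_information(q_c_given_x):
    """I(X; C) in nats for a soft assignment q(c|x).

    Assumption: X (the row or column index) is uniform over its values.
    q_c_given_x has shape (n_indices, n_clusters); each row sums to 1.
    """
    q = np.asarray(q_c_given_x, dtype=float)
    n_indices = q.shape[0]
    q_c = q.mean(axis=0)  # marginal q(c) under uniform p(x)
    # Use ratio 1 (log 1 = 0) wherever q(c|x) = 0, so that 0 * log 0 = 0.
    mask = q > 0
    ratio = np.ones_like(q)
    ratio[mask] = q[mask] / np.broadcast_to(q_c, q.shape)[mask]
    return float(np.sum(q * np.log(ratio)) / n_indices)

def regularized_objective(empirical_loss, q_rows, q_cols, beta):
    """Hypothetical trade-off: empirical loss plus weighted mutual information.

    beta is an illustrative weight, not the coefficient from the bound.
    """
    penalty = mutual_information(q_rows) + mutual_information(q_cols)
    return empirical_loss + beta * penalty

# Hard assignment of four rows to two equally sized clusters.
hard = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
# Uninformative assignment: every row is split evenly between clusters.
uniform = np.full((4, 2), 0.5)

mi_hard = mutual_information(hard)
mi_uniform = mutual_information(uniform)
objective = regularized_objective(0.3, hard, uniform, beta=0.1)
```

In this sketch the hard assignment preserves log 2 nats of information about the row index, while the uninformative assignment preserves none, so a larger `beta` pushes the objective toward coarser, less index-specific clusterings.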