Dr. Pablo Piantanida
An Information-Theoretic Approach to Selected Topics in Machine Learning and Statistics
In the first part of this talk, we investigate the problem of distributed biclustering of memoryless sources.
This scenario consists of a set of distributed stationary memoryless sources, where the goal is to find rate-limited
representations such that the mutual information between two selected subsets of descriptions
(each generated by a distinct encoder function) is maximized. This formulation is fundamentally
different from conventional information-theoretic problems since here redundancy among descriptions
should actually be maximally preserved. Furthermore, necessary and sufficient conditions are derived
for the special case of two arbitrarily correlated Rademacher random variables and Boolean encoders.
Interestingly, these results resolve a long-standing open conjecture in the affirmative.
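To make the Rademacher/Boolean-encoder setting concrete, here is a small numerical sketch (illustrative only, not code from the talk; function names are our own) that exactly computes the mutual information I(f(X^n); g(Y^n)) for Boolean encoders f, g applied to n i.i.d. pairs of rho-correlated Rademacher (±1) variables, comparing a dictator function against a majority vote:

```python
import itertools
from math import log2

def boolean_mi(n, rho, f, g):
    """Exact I(f(X^n); g(Y^n)) in bits for n i.i.d. pairs of
    rho-correlated Rademacher (+/-1) variables: P(X_i = Y_i) = (1 + rho)/2."""
    p_eq = (1 + rho) / 2
    joint = {(a, b): 0.0 for a in (-1, 1) for b in (-1, 1)}
    for x in itertools.product((-1, 1), repeat=n):
        for y in itertools.product((-1, 1), repeat=n):
            p = 2.0 ** -n  # X^n is uniform over {-1, +1}^n
            for xi, yi in zip(x, y):
                p *= p_eq if xi == yi else 1 - p_eq
            joint[(f(x), g(y))] += p
    # Marginals of the two one-bit descriptions
    pa = {a: joint[(a, -1)] + joint[(a, 1)] for a in (-1, 1)}
    pb = {b: joint[(-1, b)] + joint[(1, b)] for b in (-1, 1)}
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

dictator = lambda x: x[0]                     # output the first coordinate
majority = lambda x: 1 if sum(x) > 0 else -1  # majority vote (n odd)

mi_dict = boolean_mi(3, 0.9, dictator, dictator)
mi_maj = boolean_mi(3, 0.9, majority, majority)
print(f"dictator: {mi_dict:.4f} bits, majority: {mi_maj:.4f} bits")
```

For this symmetric case the dictator output sees a binary symmetric channel with crossover (1 - rho)/2, so its mutual information is 1 - h_b((1 - rho)/2), and at rho = 0.9 it beats the majority vote.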
In the second part of the talk, we study the problem of collaborative distributed hypothesis testing. Two
statisticians are required to declare the correct probability measure of two jointly distributed memoryless
processes out of two possible probability measures. The marginal samples are assumed to be
available at different locations, and the statisticians are allowed to exchange a limited amount of data over
multiple rounds of interaction. A new achievable error exponent is derived based on the use of non-asymptotic
binning, which improves the quality of the communicated descriptions. Optimal achievable error
exponents are characterized for the special cases of testing against independence and of zero-rate
communication (where the amount of exchanged data grows sub-exponentially with the blocklength n).
Application examples to binary symmetric
sources are provided as well.
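For context on the testing-against-independence case, a minimal sketch (illustrative, not code from the talk): by Stein's lemma, when the full sample pair is available to a single tester, the best type-II error exponent for distinguishing a correlated binary symmetric source from an independent one is D(P_XY || P_X P_Y) = I(X; Y) = 1 - h_b(eps). Rate-limited exchanges can only reduce this benchmark:

```python
from math import log2

def kl(p, q):
    """KL divergence D(p || q) in bits between finite distributions (dicts)."""
    return sum(p[a] * log2(p[a] / q[a]) for a in p if p[a] > 0)

def bss_joint(eps):
    """Binary symmetric source: X ~ Bernoulli(1/2), Y = X flipped w.p. eps."""
    return {(x, y): 0.5 * (eps if x != y else 1 - eps)
            for x in (0, 1) for y in (0, 1)}

eps = 0.1
p_xy = bss_joint(eps)
p_x_p_y = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}  # both marginals uniform
exponent = kl(p_xy, p_x_p_y)  # = I(X;Y) = 1 - h_b(eps)
print(f"Stein exponent for eps = {eps}: {exponent:.4f} bits per sample")
```

Characterizing how much of this exponent survives under limited, multi-round communication between the two statisticians is exactly the question addressed in this part of the talk.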
Joint work with Georg Pichler (TU Wien, Austria), Prof. Gerald Matz (TU Wien, Austria), Gil Katz
(CentraleSupélec, France) and Prof. Merouane Debbah (Huawei Technologies Co., France)
Wednesday 4/5, 11 a.m. Seminar room on the first floor