Arama Sonuçları

Listeleniyor 1 - 2 / 2
  • Yayın
    Evaluating the efficiency of latent spaces via the coupling-matrix
    (Cornell Univ, 2025-09-08) Yavuz, Mehmet Can; Yanıkoğlu, Berrin
    A central challenge in representation learning is constructing latent embeddings that are both expressive and efficient. In practice, deep networks often produce redundant latent spaces where multiple coordinates encode overlapping information, reducing effective capacity and hindering generalization. Standard metrics such as accuracy or reconstruction loss provide only indirect evidence of such redundancy and cannot isolate it as a failure mode. We introduce a redundancy index, denoted ρ(C), that directly quantifies inter-dimensional dependencies by analyzing coupling matrices derived from latent representations and comparing their off-diagonal statistics against a normal distribution via energy distance. The result is a compact, interpretable, and statistically grounded measure of representational quality. We validate ρ(C) across discriminative and generative settings on MNIST variants, Fashion-MNIST, CIFAR-10, and CIFAR-100, spanning multiple architectures and hyperparameter optimization strategies. Empirically, low ρ(C) reliably predicts high classification accuracy or low reconstruction error, while elevated redundancy is associated with performance collapse. Estimator reliability grows with latent dimension, yielding natural lower bounds for reliable analysis. We further show that Treestructured Parzen Estimators (TPE) preferentially explore lowρ regions, suggesting that ρ(C) can guide neural architecture search and serve as a redundancy-aware regularization target. By exposing redundancy as a universal bottleneck across models and tasks, ρ(C) offers both a theoretical lens and a practical tool for evaluating and improving the efficiency of learned representations.
  • Yayın
    Variational self-supervised learning
    (Cornell Univ, 2025-04-06) Yavuz, Mehmet Can; Yanıkoğlu, Berrin
    We present Variational Self-Supervised Learning (VSSL), a novel framework that combines variational inference with self-supervised learning to enable efficient, decoder-free representation learning. Unlike traditional VAEs that rely on input reconstruction via a decoder, VSSL symmetrically couples two encoders with Gaussian outputs. A momentum-updated teacher network defines a dynamic, data-dependent prior, while the student encoder produces an approximate posterior from augmented views. The reconstruction term in the ELBO is replaced with a cross-view denoising objective, preserving the analytical tractability of Gaussian KL divergence. We further introduce cosine-based formulations of KL and log-likelihood terms to enhance semantic alignment in high-dimensional latent spaces. Experiments on CIFAR-10, CIFAR-100, and ImageNet-100 show that VSSL achieves competitive or superior performance to leading self-supervised methods, including BYOL and MoCo V3. VSSL offers a scalable, probabilistically grounded approach to learning transferable representations without generative reconstruction, bridging the gap between variational modeling and modern self-supervised techniques.