Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Understanding the Schatten norm #1

Open
peekxc opened this issue Aug 26, 2023 · 1 comment
Open

Understanding the Schatten norm #1

peekxc opened this issue Aug 26, 2023 · 1 comment

Comments

@peekxc
Copy link

peekxc commented Aug 26, 2023

Hey! Out of curiosity, what is the limitation that prevents imate from working with products of rectangular matrices?

For any real-valued matrix $A \in \mathbb{R}^{n \times m}$ and positive semi-definite matrix $P \in \mathbb{R}^{n \times n}$ given by $P = A A^T$ (or $P = A^T A)$, I would like to estimate various Schatten norms of both $A$ and $P$, e.g.

from math import prod
import numpy as np 
from scipy.sparse import  random
from scipy.sparse.linalg import svds
from imate import schatten, trace

## Generate random semi-sparse matrix 
n = 500
A = random(n, 40, density=0.050)
P = A @ A.T

## Ground truth
sv = np.sort(np.abs(svds(A, k = min(A.shape)-1)[1]))
ev = np.sort(np.linalg.eigh(P.todense())[0])

print(f"Density of P: {P.nnz/prod(P.shape)}")
print(f"(A) Schatten-1 / nuclear norm: {np.sum(sv[sv > 1e-16])}")
print(f"(A) Schatten-2 / frobenius norm: {np.sum(sv[sv > 1e-16]**2)**(1/2)}")
print(f"(P) Schatten-1 / nuclear norm: {np.sum(ev[ev > 1e-16])}")
print(f"(P) Schatten-2 / frobenius norm: {np.sum(ev[ev > 1e-16])**(1/2)}")
# Density of P: 0.096224
# (A) Schatten-1 / nuclear norm: 110.72781488837865
# (A) Schatten-2 / frobenius norm: 18.074377305479523
# (P) Schatten-1 / nuclear norm: 330.2851529166273
# (P) Schatten-2 / frobenius norm: 56.81765769772352

I can get the Schatten-norms out of $P$ with imate:

schatten(P, p=1.0, method="exact")*P.shape[0]
schatten(P, p=2.0, method="exact")*P.shape[0]**(1/2)
# 330.28515291662717
# 56.81765769772352s

But I also want the norms for $A$. Initially I thought to use that fact that the singular values of $A$ are the square roots of the eigenvalues of $P$. This works well enough using numpy:

print(f"(A) Schatten-1 / nuclear norm, derived from P: {np.sum(np.sqrt(ev[ev > 1e-16]))}")
print(f"(A) Schatten-2 / frobenius norm, derived from P: {np.sum(ev[ev > 1e-16])**(1/2)}")
# (A) Schatten-1 / nuclear norm, derived from P: 112.62572173011456
# (A) Schatten-2 / frobenius norm, derived from P: 18.173749005547183

But when I try this for A via the gramian approach I get an error:

schatten(A, gram=True, p=1.0, method="slq")
# ValueError: Input matrix should be a square matrix.

EDIT: Actually the Schatten-p norm of a general matrix expressed as a trace quantity is given by 3rd equation in section 2.3 of Ubaru's paper:

$$ \lVert X \rVert_p = \left(\sum\limits_{i=1}^r \sigma_i^p \right)^{1/p} = \left(\sum\limits_{i=1}^r \lambda_i^{p/2} \right)^{1/p}$$

This doesn't seem capturable by imate's definition of the Schatten norm. Am I missing something? Or is this particular matrix function just yet to be implemented in imate?

@ameli
Copy link
Owner

ameli commented Sep 14, 2023

Thank you for getting in touch and bringing this issue to my attention. I have resolved the ValueError: Input matrix should be a square matrix error associated with the square matrix requirement when the Gram matrix is enabled.

Regarding your inquiry concerning the interpretation of the Schatten norm, both the mentioned paper and the package are aligned in their definitions for non-negative definite matrices. However, there is one distinction: within the package, we incorporate a coefficient of $n^{-\frac{1}{p}}$. To adhere to the mentioned paper's definition precisely, you can simply divide the result by this coefficient. For a more comprehensive understanding, you may refer to equations 3 to 5 in this article or consult this section within the package documentation.

Regarding the SLQ method, it is designed with the prerequisite that the input matrix is symmetric. This requirement aligns well with many practical applications, particularly in machine learning, where calculations involving trace functions are often required. These matrices in ML applications, such as covariance matrices, are typically non-negative definite.

However, if your interest extends to working with symmetric matrices that may not strictly adhere to the non-negative definite condition, please feel free to inform me. Adapting the implementation to handle such symmetric but indefinite matrices is a straightforward process and involves introducing an absolute value operation to the eigenvalues, effectively replacing $A$ with $\vert A \vert = \sqrt{A^{\ast} A}$.

Please don't hesitate to share further details or pose additional questions if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants