# Reminisce about a Dimensionality Reduction Technique

Vincent Li

I decided to share something that I’ve been working on in the field of dimensionality reduction.

In 1987, Peña and Box published a simple dimensionality reduction technique that, in my opinion, has been vastly underrated. I'll assume some minimal knowledge of time series analysis and present the beautiful results of this antique.

$Z_{t}^{k \times 1} = P^{k \times r} Y_{t}^{r \times 1} + \epsilon_{t}, \; \epsilon_{t} \sim (0, \Sigma_{\epsilon})$
$Z_{t}$ is the observable data, and $Y_{t}$ is a vector of underlying factors of lower dimension than $Z_{t}$ (that is, $r < k$). We believe that the observed data is a linear combination of the factors plus noise, and the goal is to recover the underlying factors.

We assume $Y_{t} \sim ARMA(p_{y}, q_{y})$.

We assume the components of the factor vector $Y_{t}$ are independent. An important corollary is that the $\Phi$ and $\Theta$ matrices of the ARMA model for $Y_{t}$ are all diagonal.

We assume $P'P = I_{r}$ to eliminate indeterminacy. This does not alter the time series structure of the problem and is simply a rescaling of $Y_{t}$. The proof follows from the SVD.
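To make the setup concrete, here is a minimal simulation sketch of the model under the assumptions above. The specifics are my own illustrative choices, not from the paper: the factors are independent AR(1) processes (the simplest diagonal ARMA case), the noise is Gaussian with covariance `noise_sd**2 * I`, and the function name `simulate_factor_model` is hypothetical.

```python
import numpy as np

def simulate_factor_model(k=5, phi=(0.9, -0.5), n=2000, noise_sd=0.5, seed=0):
    """Simulate Z_t = P Y_t + eps_t with independent AR(1) factors (illustrative)."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    r = phi.size                      # number of factors = number of AR coefficients

    # Orthonormal loadings: the Q factor of a random k x r matrix satisfies P'P = I_r.
    P, _ = np.linalg.qr(rng.standard_normal((k, r)))

    # Diagonal AR structure: each factor depends only on its own past.
    Y = np.zeros((n, r))
    shocks = rng.standard_normal((n, r))
    for t in range(1, n):
        Y[t] = phi * Y[t - 1] + shocks[t]

    eps = noise_sd * rng.standard_normal((n, k))
    Z = Y @ P.T + eps                 # row t holds Z_t'
    return Z, Y, P, eps

Z, Y, P, eps = simulate_factor_model()
print(Z.shape, np.allclose(P.T @ P, np.eye(P.shape[1])))
```

Rows are time points here, so the column-vector equation $Z_{t} = PY_{t} + \epsilon_{t}$ becomes `Z = Y @ P.T + eps`.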

Theorem 1: Representation of $Z_{t}$

$Z_{t}^{k \times 1} \sim ARMA(p_{y}, max(p_{y}, q_{y}))$
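
To see where the moving-average order comes from, consider the simplest scalar case (my own illustration, not taken from the paper): a single AR(1) factor $y_{t} = \phi y_{t-1} + a_{t}$ observed with white noise, $z_{t} = y_{t} + \epsilon_{t}$, where $a_{t}$ and $\epsilon_{t}$ are uncorrelated. Applying $(1 - \phi B)$, with $B$ the backshift operator, gives

$(1 - \phi B) z_{t} = a_{t} + \epsilon_{t} - \phi \epsilon_{t-1}$

The right-hand side has zero autocovariance beyond lag 1, so it can be written as an MA(1) process, and $z_{t} \sim ARMA(1, 1)$. This agrees with the theorem for $p_{y} = 1$, $q_{y} = 0$, since $max(p_{y}, q_{y}) = 1$.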

Theorem 2: Autocovariance of $Z_{t}$
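The lagged autocovariance matrices of $Z_{t}$ have rank $r$. Writing $h$ for the lag, to avoid a clash with the dimension $k$:
$\Gamma_{z}(h) = P \Gamma_{y}(h) P', \; h \geq 1$

$rank\,\Gamma_{z}(h) = r$

The restriction to $h \geq 1$ matters: the noise $\epsilon_{t}$, being white, contributes $\Sigma_{\epsilon}$ only at lag 0.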
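
The rank result can be checked numerically. The sketch below builds the population autocovariances for an assumed example (two independent AR(1) factors with unit shock variance, a random orthonormal $P$, and $\Sigma_{\epsilon} = 0.25 I$; all of these values are illustrative assumptions) and compares the rank at lag 0 with the rank at lag 1.

```python
import numpy as np

k, r = 5, 2
phi = np.array([0.9, -0.5])           # assumed AR(1) coefficients, one per factor
rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.standard_normal((k, r)))   # orthonormal columns, P'P = I_r

def gamma_y(lag):
    # Autocovariance of independent AR(1) factors with unit shock variance:
    # diagonal, with entries phi_i**lag / (1 - phi_i**2).
    return np.diag(phi**lag / (1.0 - phi**2))

sigma_eps = 0.25 * np.eye(k)          # assumed white-noise covariance

gamma_z0 = P @ gamma_y(0) @ P.T + sigma_eps   # lag 0: the noise adds Sigma_eps
gamma_z1 = P @ gamma_y(1) @ P.T               # lag 1: the noise term drops out

rank0 = np.linalg.matrix_rank(gamma_z0, tol=1e-8)
rank1 = np.linalg.matrix_rank(gamma_z1, tol=1e-8)
print("rank at lag 0:", rank0)
print("rank at lag 1:", rank1)
```

With real data one would replace the population matrices by sample autocovariances and look for eigenvalues that are close to zero rather than exactly zero.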

Theorem 3: Canonical Transformation
Suppose we are given the matrix $P$. Then we can transform $Z_{t}$ into $Y_{t}$ (up to noise) by the following procedure.

Define the transformation matrix
$M_{k\times k} = [P^{-'}, B^{'}]' , \; BP = 0$
where $P^{-}$ is an $r \times k$ generalized inverse of $P$ (so $P^{-}P = I_{r}$; under $P'P = I_{r}$ one valid choice is $P^{-} = P'$) and $B$ is $(k-r) \times k$.

Then applying $M$ to $Z_{t}$, $X_{t} = MZ_{t}$, we have
$X_{t} = [(Y_{t} + P^{-}\epsilon_{t})', (B\epsilon_{t})']' = [X_{1t}^{'}, X_{2t}^{'} ]'$
$X_{1t}$ is our holy grail – the underlying factors plus some noise. $X_{2t}$ is the non-informational dimension of $Z_{t}$ and should be discarded.
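
As a sanity check on the algebra, here is a small sketch that builds $M$ from a known $P$ and verifies the two blocks of $X_{t}$. It assumes $P'P = I_{r}$ so that $P'$ can serve as $P^{-}$, takes the rows of $B$ from the SVD of $P$ (one convenient choice among many), and uses toy white-noise factors because only the linear algebra is being tested.

```python
import numpy as np

rng = np.random.default_rng(2)
k, r, n = 5, 2, 500
P, _ = np.linalg.qr(rng.standard_normal((k, r)))   # P'P = I_r

# With orthonormal columns, P' is a left (generalized) inverse of P.
P_minus = P.T

# The last k - r left singular vectors of P are orthogonal to its columns,
# so stacking them as rows gives a B with B P = 0.
U, _, _ = np.linalg.svd(P, full_matrices=True)
B = U[:, r:].T

M = np.vstack([P_minus, B])          # k x k transformation matrix

# Toy data (hypothetical): any factors and noise will do for checking the algebra.
Y = rng.standard_normal((n, r))
eps = 0.3 * rng.standard_normal((n, k))
Z = Y @ P.T + eps                    # row t holds Z_t'

X = Z @ M.T                          # row t holds X_t' = (M Z_t)'
X1, X2 = X[:, :r], X[:, r:]

print(np.allclose(B @ P, 0))                   # B annihilates the loadings
print(np.allclose(X1, Y + eps @ P_minus.T))    # X_1t = Y_t + P^- eps_t
print(np.allclose(X2, eps @ B.T))              # X_2t = B eps_t, pure noise
```

In practice $P$ is unknown and has to be estimated first; this sketch only covers the step described in the theorem, where $P$ is given.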

Theorem 4: Non-Uniqueness of Representation of $Z_{t}$

The representation of $Z_{t}$ is not unique. The model
$\Phi^{*}(B)Z_{t} = \Theta^{*}(B)w_{t}$
preserves the time series structure for any matrix $A$ with $rank(A) = k - r$. Here $B$ denotes the backshift operator (not the matrix $B$ of Theorem 3), and the ARMA matrices are:
$\Phi^{*}_{l} = \Phi_{l} + A, \quad \Theta^{*}_{l} = \Theta_{l} + A$