scikit-pns documentation
Principal nested spheres (PNS) analysis [1] for scikit-learn.
The main API classes are IntrinsicPNS and ExtrinsicPNS.
Low-level functions are available in skpns.pns.
[1] Jung, Sungkyu, Ian L. Dryden, and James Stephen Marron. “Analysis of principal nested spheres.” Biometrika 99.3 (2012): 551-568.
Installation
scikit-pns can be installed using pip:
pip install scikit-pns
Quickstart
scikit-pns is imported as skpns.
from skpns import IntrinsicPNS
from pns.util import circular_data
X = circular_data([0, -1, 0])
X_new = IntrinsicPNS().fit_transform(X)
ONNX support
Transformers can be converted to ONNX models.
Note
To use this feature, install scikit-pns with the [onnx] optional dependency:
pip install scikit-pns[onnx]
import numpy as np
from skpns import ExtrinsicPNS, IntrinsicPNS
from pns.util import circular_data
from skl2onnx import to_onnx
import matplotlib.pyplot as plt
# Train and save model
X = circular_data([0, -1, 0]).astype(np.float32) # Must be float32
int_pns = IntrinsicPNS(2).fit(X)
with open("int_pns.onnx", "wb") as f:
    f.write(to_onnx(int_pns, X[:1]).SerializeToString())
ext_pns = ExtrinsicPNS(2).fit(X)
with open("ext_pns.onnx", "wb") as f:
    f.write(to_onnx(ext_pns, X[:1]).SerializeToString())
# Load model
import onnxruntime as rt
ext_sess = rt.InferenceSession("ext_pns.onnx", providers=["CPUExecutionProvider"])
ext_onnx = ext_sess.run([ext_sess.get_outputs()[0].name], {ext_sess.get_inputs()[0].name: X})[0]
int_sess = rt.InferenceSession("int_pns.onnx", providers=["CPUExecutionProvider"])
int_onnx = int_sess.run([int_sess.get_outputs()[0].name], {int_sess.get_inputs()[0].name: X})[0]
fig = plt.figure()
ax1 = fig.add_subplot(121)
ax1.plot(*int_pns.transform(X).T, "o", label="Python runtime")
ax1.plot(*int_onnx.T, "x", label="ONNX runtime")
ax1.set_xlim(-np.pi, np.pi)
ax1.set_ylim(-np.pi / 2, np.pi / 2)
ax1.legend()
ax1.set_title("IntrinsicPNS")
ax2 = fig.add_subplot(122)
ax2.plot(*ext_pns.transform(X).T, "o", label="Python runtime")
ax2.plot(*ext_onnx.T, "x", label="ONNX runtime")
ax2.set_aspect("equal")
ax2.legend()
ax2.set_title("ExtrinsicPNS")
fig.show()
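The plots compare the two runtimes visually. A numerical check could look like the sketch below. Here max_abs_diff is a hypothetical helper, not part of scikit-pns, and the sample values are made up to stand in for int_pns.transform(X) and int_onnx. Because the ONNX model runs in float32, the sketch assumes a small tolerance rather than exact equality.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two 2D sequences."""
    if len(a) != len(b):
        raise ValueError("row counts differ")
    worst = 0.0
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            raise ValueError("column counts differ")
        for x, y in zip(row_a, row_b):
            worst = max(worst, abs(float(x) - float(y)))
    return worst


# Made-up stand-ins for the Python-runtime and ONNX-runtime outputs.
python_out = [[0.10, -0.20], [1.50, 0.30]]
onnx_out = [[0.1000001, -0.2000002], [1.4999999, 0.3000001]]

# Assumed tolerance; choose one that suits the data scale.
agree = max_abs_diff(python_out, onnx_out) < 1e-5
print(agree)
```

The helper only iterates over rows and elements, so it should also accept 2D NumPy arrays such as the ones produced in the example above.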
Converting inverse transformers
An ONNX model converted from a transformer supports only the forward transformation. To build a model for the inverse transformation, use the dedicated wrapper class instead.
from skpns import InverseExtrinsicPNS
from pns.util import unit_sphere
X_transform = ext_pns.transform(X).astype(np.float32)
invext_pns = InverseExtrinsicPNS(ext_pns)
with open("invext_pns.onnx", "wb") as f:
    f.write(to_onnx(invext_pns, X_transform[:1]).SerializeToString())
invext_sess = rt.InferenceSession("invext_pns.onnx", providers=["CPUExecutionProvider"])
invext_onnx = invext_sess.run(
    [invext_sess.get_outputs()[0].name], {invext_sess.get_inputs()[0].name: X_transform}
)[0]
fig = plt.figure()
ax = fig.add_subplot(projection='3d', computed_zorder=False)
ax.plot_surface(*unit_sphere(), color='skyblue', edgecolor='gray')
ax.plot(*ext_pns.inverse_transform(X_transform).T, "o", label="Python runtime")
ax.plot(*invext_onnx.T, "x", label="ONNX runtime")
ax.legend()
fig.show()
Module reference
High-level API
- class skpns.IntrinsicPNS(n_components=None, tol=0.001, maxiter=None, lm_kwargs=None)
Principal nested spheres (PNS) analysis with intrinsic coordinates.
Reduces the dimensionality of data on a high-dimensional hypersphere while preserving its spherical geometry.
The resulting data are intrinsic Euclidean coordinates, which are the scaled residuals in each dimension. For example, n_components=2 represents data on the surface of a 3D sphere.
- Parameters:
- n_components : int, default=None
Number of components to keep. Data are transformed onto a Euclidean space of this dimension, representing the surface of a hypersphere of the same dimension. If None, all components are kept, i.e., extrinsic coordinates are converted to intrinsic coordinates without losing dimensionality.
- tol : float, default=1e-3
Optimization tolerance.
- maxiter : int, optional
Maximum number of iterations for the optimization. If None, the number of iterations is not checked.
- lm_kwargs : dict, optional
Additional keyword arguments passed to the Levenberg-Marquardt optimization. Follows the signature of scipy.optimize.least_squares().
- Attributes:
- embedding_ : ndarray of shape (n_samples, d)
The embedding vectors, \(\Xi(0), \Xi(1), \ldots, \Xi(d-1)\), where the input data lie on a d-sphere.
- v_ : list of arrays
Principal directions of nested spheres, \(\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_d\).
- r_ : ndarray
Principal radii of nested spheres, \(\hat{r}_1, \hat{r}_2, \ldots, \hat{r}_d\).
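Each principal direction and radius describes a subsphere, and a residual is the signed geodesic distance from a data point to that subsphere. The sketch below applies the projection formula from Jung et al. [1] to a single point. It is an illustration only, not the code scikit-pns runs internally, and the names dot and project_to_subsphere are hypothetical.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def project_to_subsphere(x, v, r):
    """Project unit vector x onto the subsphere {y : angle(y, v) = r}.

    Returns the projected point and the signed residual angle(x, v) - r.
    """
    # Geodesic distance between x and the direction v, clamped for safety.
    rho = math.acos(max(-1.0, min(1.0, dot(x, v))))
    if math.sin(rho) == 0.0:
        raise ValueError("projection is undefined when x is parallel to v")
    scale_x = math.sin(r) / math.sin(rho)
    scale_v = math.sin(rho - r) / math.sin(rho)
    projected = [scale_x * xi + scale_v * vi for xi, vi in zip(x, v)]
    return projected, rho - r


v = [0.0, 0.0, 1.0]   # direction (axis) of the subsphere
r = math.pi / 3       # radius of the subsphere, as an angle from v
x = [1.0, 0.0, 0.0]   # a point on the equator

projected, residual = project_to_subsphere(x, v, r)
print(projected, residual)
```

The projected point stays on the unit sphere and sits at angle r from v, which is what makes the residual a distance measured along the sphere rather than through it.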
Notes
The resulting data is the transpose of the matrix
\[\begin{split}\hat{X}_\mathrm{PNS} = \begin{bmatrix} \Xi(0) \\ \Xi(1) \\ \vdots \\ \Xi(n-1) \end{bmatrix},\end{split}\]
with the notation of the original paper [1], where \(n\) is n_components. The coordinates lie in \([-\pi, \pi] \times [-\pi/2, \pi/2]^{n-1}\), i.e., the azimuthal angle is the first coordinate.
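For n = 2 the coordinate ranges match the familiar azimuth and elevation angles of a point on a sphere. The sketch below shows that correspondence under a simplifying assumption: the pole is fixed along the z-axis, whereas PNS fits its directions to the data. The functions to_angles and from_angles are hypothetical names and do not reproduce the library's parameterization.

```python
import math


def to_angles(x, y, z):
    """Return (azimuth, elevation) of a unit vector in 3D."""
    azimuth = math.atan2(y, x)                      # in [-pi, pi]
    elevation = math.asin(max(-1.0, min(1.0, z)))   # in [-pi/2, pi/2]
    return azimuth, elevation


def from_angles(azimuth, elevation):
    """Inverse of to_angles: return the unit vector for the given angles."""
    return (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )


point = (0.0, -1.0, 0.0)
angles = to_angles(*point)
recovered = from_angles(*angles)
print(angles, recovered)
```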
Examples
>>> import numpy as np
>>> from skpns import IntrinsicPNS
>>> from pns.util import circular_data, unit_sphere
>>> X = circular_data([0, -1, 0])
>>> pns = IntrinsicPNS()
>>> Xi = pns.fit_transform(X)
>>> import matplotlib.pyplot as plt
... fig = plt.figure()
... ax1 = fig.add_subplot(121, projection='3d', computed_zorder=False)
... ax1.plot_surface(*unit_sphere(), color='skyblue', edgecolor='gray')
... ax1.scatter(*X.T, c=Xi[:, 0])
... ax2 = fig.add_subplot(122)
... ax2.scatter(*Xi.T, c=Xi[:, 0])
... ax2.set_xlim(-np.pi, np.pi)
... ax2.set_ylim(-np.pi/2, np.pi/2)
- fit(X, y=None)
Find principal nested spheres for the data X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Data on an (n_features - 1)-dimensional hypersphere.
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- self : object
Returns a fitted instance of self.
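The input rows are expected to lie on a hypersphere. If raw vectors have arbitrary length, a preprocessing step along the following lines could be applied before fitting. This sketch assumes a unit hypersphere, as in the examples that use unit_sphere(); normalize_rows is a hypothetical helper, not part of scikit-pns.

```python
import math


def normalize_rows(rows):
    """Scale every row to unit Euclidean norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        if norm == 0.0:
            raise ValueError("cannot place the zero vector on a sphere")
        out.append([v / norm for v in row])
    return out


raw = [[3.0, 0.0, 4.0], [0.0, -2.0, 0.0]]
X_unit = normalize_rows(raw)
print(X_unit)
```

Whether discarding the vector lengths is appropriate depends on the data; the sketch keeps only directions.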
- fit_transform(X, y=None)
Fit the model with data in X and transform X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Data on an (n_features - 1)-dimensional hypersphere.
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- X_new : array-like of shape (n_samples, n_components)
X transformed in the new space.
- transform(X, y=None)
Transform X onto the fitted subsphere.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Data on an (n_features - 1)-dimensional hypersphere.
- Returns:
- X_new : array-like of shape (n_samples, n_components)
X transformed in the new space.
- inverse_transform(Xi)
Transform the low-dimensional data back to the original hypersphere.
- Parameters:
- Xi : array-like of shape (n_samples, n_components)
Low-dimensional data.
- Returns:
- X_new : array-like of shape (n_samples, n_features)
Data on the original hypersphere.
Examples
>>> from skpns import IntrinsicPNS
>>> from pns.util import circular_data, unit_sphere
>>> X = circular_data([0, -1, 0])
>>> pns = IntrinsicPNS(1)
>>> Xi = pns.fit_transform(X)
>>> X_inv = pns.inverse_transform(Xi)
>>> import matplotlib.pyplot as plt
... ax = plt.figure().add_subplot(projection='3d', computed_zorder=False)
... ax.plot_surface(*unit_sphere(), color='skyblue', edgecolor='gray')
... ax.scatter(*X.T)
... ax.scatter(*X_inv.T)
- class skpns.ExtrinsicPNS(n_components=2, tol=0.001, maxiter=None, lm_kwargs=None)
Principal nested spheres (PNS) analysis with extrinsic coordinates.
Reduces the dimensionality of data on a high-dimensional hypersphere while preserving its spherical geometry.
The resulting data are represented by extrinsic coordinates. For example, n_components=2 transforms data onto a 2D unit circle, represented by x and y coordinates.
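To see how a circle on a sphere becomes a unit circle in extrinsic coordinates, consider the special case where the subsphere's direction is the pole (0, 0, 1). Following the construction in [1], a point on the subsphere of radius r is mapped down by dropping the last coordinate and dividing by sin(r). The sketch below covers only this special case; a general direction would first need a rotation that moves it to the pole. The function name small_circle_to_unit_circle is hypothetical, and this is not the library's implementation.

```python
import math


def small_circle_to_unit_circle(point, r):
    """Map a point at angle r from the pole (0, 0, 1) onto the unit circle."""
    x, y, _ = point
    return (x / math.sin(r), y / math.sin(r))


r = math.pi / 4   # radius of the small circle, as an angle from the pole
t = 2.0           # position along the circle

# A point on the small circle, in 3D extrinsic coordinates.
point = (math.sin(r) * math.cos(t), math.sin(r) * math.sin(t), math.cos(r))

u = small_circle_to_unit_circle(point, r)
print(u)
```

The result has unit length and keeps the position t along the circle, so the circular structure of the data survives the reduction.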
- Parameters:
- n_components : int, default=2
Number of components to keep. Data are transformed onto a unit hypersphere embedded in a space of this dimension.
- tol : float, default=1e-3
Optimization tolerance.
- maxiter : int, optional
Maximum number of iterations for the optimization. If None, the number of iterations is not checked.
- lm_kwargs : dict, optional
Additional keyword arguments passed to the Levenberg-Marquardt optimization. Follows the signature of scipy.optimize.least_squares().
- Attributes:
- embedding_ : ndarray of shape (n_samples, n_components)
Stores the embedding vectors.
- v_ : list of (n_features - 1) arrays
Principal directions of nested spheres.
- r_ : ndarray of shape (n_features - 1,)
Principal radii of nested spheres.
Examples
>>> from skpns import ExtrinsicPNS
>>> from pns.util import circular_data, unit_sphere
>>> X = circular_data([0, -1, 0])
>>> pns = ExtrinsicPNS(n_components=2)
>>> X_reduced = pns.fit_transform(X)
>>> X_inv = pns.inverse_transform(X_reduced)
>>> import matplotlib.pyplot as plt
... fig = plt.figure()
... ax1 = fig.add_subplot(121, projection='3d', computed_zorder=False)
... ax1.plot_surface(*unit_sphere(), color='skyblue', alpha=0.6, edgecolor='gray')
... ax1.scatter(*X_inv.T, zorder=10)
... ax1.scatter(*X.T)
... ax2 = fig.add_subplot(122)
... ax2.scatter(*X_reduced.T)
... ax2.set_aspect('equal')
- fit(X, y=None)
Find principal nested spheres for the data X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Data on an (n_features - 1)-dimensional hypersphere.
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- self : object
Returns a fitted instance of self.
- fit_transform(X, y=None)
Fit the model with data in X and transform X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Data on an (n_features - 1)-dimensional hypersphere.
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- X_new : array-like of shape (n_samples, n_components)
X transformed in the new space.
Inverse transformers
Note
These classes exist to support converting inverse transformation subroutines to an ONNX graph.
In the Python runtime, use the inverse_transform() method of the transformers instead of these classes.
- class skpns.InverseExtrinsicPNS(extrinsic_pns)
Inverse converter of ExtrinsicPNS.
This class is for building an ONNX graph and is not intended to be used directly. Use ExtrinsicPNS.inverse_transform() instead in the Python runtime.
- Parameters:
- extrinsic_pns : ExtrinsicPNS
Fitted ExtrinsicPNS instance.
Examples
>>> from skpns import ExtrinsicPNS, InverseExtrinsicPNS
>>> from pns.util import circular_data
>>> from skl2onnx import to_onnx
>>> X = circular_data().astype('float32')
>>> pns = ExtrinsicPNS(n_components=2).fit(X)
>>> X_reduced = pns.transform(X).astype('float32')  # the inverse model takes reduced data as input
>>> onnx = to_onnx(InverseExtrinsicPNS(pns), X_reduced[:1])