whiten
- scipy.cluster.vq.whiten(obs, check_finite=None)
Normalize a group of observations on a per feature basis.

Before running k-means, it is beneficial to rescale each feature dimension of the observation set by its standard deviation (i.e. “whiten” it, as in “white noise” where each frequency has equal power). Each feature is divided by its standard deviation across all observations to give it unit variance.

- Parameters:
- obs : ndarray
- Each row of the array is an observation. The columns are the features seen during each observation:

      #         f0    f1    f2
      obs = [[  1.,   1.,   1.],  # o0
             [  2.,   2.,   2.],  # o1
             [  3.,   3.,   3.],  # o2
             [  4.,   4.,   4.]]  # o3
- check_finite : bool, optional
- Whether to check that the input matrices contain only finite numbers. Disabling may give a performance gain, but may result in problems (crashes, non-termination) if the inputs do contain infinities or NaNs. Default: True for eager backends and False for lazy ones. 
 
- Returns:
- result : ndarray
- Contains the values in obs scaled by the standard deviation of each column. 
 
- Notes

whiten has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting the environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

    Library    CPU    GPU
    NumPy      ✅     n/a
    CuPy       n/a    ✅
    PyTorch    ✅     ✅
    JAX        ✅     ✅
    Dask       ✅     n/a

See Support for the array API standard for more information.

- Examples

>>> import numpy as np
>>> from scipy.cluster.vq import whiten
>>> features = np.array([[1.9, 2.3, 1.7],
...                      [1.5, 2.5, 2.2],
...                      [0.8, 0.6, 1.7]])
>>> whiten(features)
array([[ 4.17944278,  2.69811351,  7.21248917],
       [ 3.29956009,  2.93273208,  9.33380951],
       [ 1.75976538,  0.7038557 ,  7.21248917]])
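
The scaling can also be reproduced by hand. A minimal sketch, assuming the per-column population standard deviation (ddof=0), which matches the output shown above:

>>> import numpy as np
>>> from scipy.cluster.vq import whiten
>>> features = np.array([[1.9, 2.3, 1.7],
...                      [1.5, 2.5, 2.2],
...                      [0.8, 0.6, 1.7]])
>>> # Divide each column by its standard deviation across observations.
>>> manual = features / features.std(axis=0)
>>> np.allclose(whiten(features), manual)
True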
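
A minimal sketch of the array API usage described in the Notes, assuming a SciPy build with the experimental array API support, an installed PyTorch, and that SCIPY_ARRAY_API takes effect when set before SciPy is imported:

>>> import os
>>> os.environ["SCIPY_ARRAY_API"] = "1"  # assumption: must be set before importing SciPy
>>> import torch
>>> from scipy.cluster.vq import whiten
>>> features = torch.tensor([[1.9, 2.3, 1.7],
...                          [1.5, 2.5, 2.2],
...                          [0.8, 0.6, 1.7]])
>>> result = whiten(features)  # returns a torch.Tensor with each column scaled by its standard deviation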