KDTree
class scipy.spatial.KDTree(data, leafsize=10, compact_nodes=True, copy_data=False, balanced_tree=True, boxsize=None)
kd-tree for quick nearest-neighbor lookup.

This class provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point.

Parameters:
data : array_like, shape (n, m)
    The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles, so modifying this data will result in bogus results. The data are also copied if the kd-tree is built with copy_data=True.
leafsize : positive int, optional
    The number of points at which the algorithm switches over to brute-force. Default: 10.
compact_nodes : bool, optional
    If True, the kd-tree is built to shrink the hyperrectangles to the actual data range. This usually gives a more compact tree that is robust against degenerated input data and gives faster queries at the expense of longer build time. Default: True.
copy_data : bool, optional
    If True, the data is always copied to protect the kd-tree against data corruption. Default: False.
balanced_tree : bool, optional
    If True, the median is used to split the hyperrectangles instead of the midpoint. This usually gives a more compact tree and faster queries at the expense of longer build time. Default: True.
boxsize : array_like or scalar, optional
    Apply an m-d toroidal topology to the KDTree. The topology is generated by \(x_i + n_i L_i\) where \(n_i\) are integers and \(L_i\) is the boxsize along the i-th dimension. The input data shall be wrapped into \([0, L_i)\). A ValueError is raised if any of the data is outside of this bound.
 
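As a rough illustration of construction (a minimal sketch; the random data, number of points, and box size below are arbitrary choices, not part of this reference), passing boxsize makes the tree periodic, and data outside \([0, L_i)\) is rejected:

>>> import numpy as np
>>> from scipy.spatial import KDTree
>>> rng = np.random.default_rng(0)
>>> pts = rng.random((100, 3))           # 100 points in 3 dimensions, all coordinates in [0, 1)
>>> tree = KDTree(pts)                   # ordinary tree with the default construction options
>>> periodic = KDTree(pts, boxsize=1.0)  # toroidal topology with L_i = 1 along every dimension
>>> # KDTree(pts + 1.0, boxsize=1.0)     # points outside [0, 1) would raise ValueError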
Attributes:
data : ndarray, shape (n, m)
    The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles. The data are also copied if the kd-tree is built with copy_data=True.
leafsize : positive int
    The number of points at which the algorithm switches over to brute-force.
m : int
    The dimension of a single data-point.
n : int
    The number of data points.
maxes : ndarray, shape (m,)
    The maximum value in each dimension of the n data points.
mins : ndarray, shape (m,)
    The minimum value in each dimension of the n data points.
size : int
    The number of nodes in the tree.
 
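For illustration only, a short sketch of inspecting these attributes (it assumes the tree and pts variables from the construction sketch above):

>>> tree.n, tree.m                       # number of points and their dimension
(100, 3)
>>> tree.maxes.shape, tree.mins.shape    # per-dimension maxima and minima of the data
((3,), (3,))
>>> tree.size >= 1                       # the tree always contains at least the root node
True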
Methods

count_neighbors(other, r[, p, weights, ...])
    Count how many nearby pairs can be formed.
query(x[, k, eps, p, distance_upper_bound, ...])
    Query the kd-tree for nearest neighbors.
query_ball_point(x, r[, p, eps, workers, ...])
    Find all points within distance r of point(s) x.
query_ball_tree(other, r[, p, eps])
    Find all pairs of points between self and other whose distance is at most r.
query_pairs(r[, p, eps, output_type])
    Find all pairs of points in self whose distance is at most r.
sparse_distance_matrix(other, max_distance)
    Compute a sparse distance matrix.
innernode
leafnode
node

Notes

The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kd-tree is a binary tree, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.

During construction, the axis and splitting point are chosen by the “sliding midpoint” rule, which ensures that the cells do not all become long and thin.

The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.

For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science.
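To make the query modes mentioned in the notes concrete, here is a hedged sketch (again reusing tree and rng from the construction sketch above; the values of k, eps, r, and the distance upper bound are arbitrary):

>>> x = rng.random(3)                                            # an arbitrary query point
>>> d, i = tree.query(x, k=5)                                    # distances and indices of the 5 exact nearest neighbors
>>> d_cut, i_cut = tree.query(x, k=5, distance_upper_bound=0.2)  # neighbors farther than 0.2 are reported with infinite distance
>>> d_approx, i_approx = tree.query(x, k=5, eps=0.1)             # approximate search: the k-th distance is within (1 + 0.1) of the true one
>>> nearby = tree.query_ball_point(x, r=0.2)                     # indices of all points within distance 0.2 of x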