Table of Contents

Module: kNN Bio/Tools/Classification/kNN.py

kNN.py

This module provides code for doing k-nearest-neighbors classification.

k Nearest Neighbors is a supervised learning algorithm that classifies a new observation based on the classes in its surrounding neighborhood.

Glossary:
distance   The distance between two points in the feature space.
weight     The importance given to each point for classification.

Classes:
kNN   Holds information for a nearest neighbors classifier.

Functions:
train       Train a new kNN classifier.
calculate   Calculate the probabilities of each class, given an observation.
classify    Classify an observation into a class.

Distance Functions:
euclidean_dist   The euclidean distance between two points.

Weighting Functions:
equal_weight   Every example is given a weight of 1.

Functions   
calculate
classify
equal_weight
euclidean_dist
euclidean_dist_py
train
  calculate 
calculate (
        knn,
        x,
        weight_fn=equal_weight,
        distance_fn=euclidean_dist,
        )

calculate(knn, x[, weight_fn][, distance_fn]) -> weight dict

Calculate the probability for each class. knn is a kNN object, and x is the observed data. weight_fn is an optional function that takes x and a training example and returns a weight (default: equal_weight). distance_fn is an optional function that takes two points and returns the distance between them (default: euclidean_dist). Returns a dictionary that maps each class to the weight given to that class.
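The weighting step described above can be sketched in plain Python. This is a minimal illustration, not the module's implementation: the name calculate_sketch, the attribute names on the stand-in knn object (xs, ys, classes, k), and the use of math.dist in place of euclidean_dist are all assumptions made for the example.

```python
import math
from types import SimpleNamespace


def equal_weight(x, y):
    # Mirrors the documented equal_weight(x, y) -> 1.
    return 1


def calculate_sketch(knn, x, weight_fn=equal_weight, distance_fn=math.dist):
    """Return a dict mapping each class to the summed weight of its neighbors.

    Assumes knn exposes xs, ys, classes and k (hypothetical attribute names).
    """
    # Rank training examples by distance to x. sorted() is stable, so ties
    # keep their training-set order.
    order = sorted(range(len(knn.xs)), key=lambda i: distance_fn(x, knn.xs[i]))

    # Every known class starts at zero so it always appears in the result.
    weights = {c: 0.0 for c in knn.classes}
    for i in order[:knn.k]:
        weights[knn.ys[i]] += weight_fn(x, knn.xs[i])
    return weights


# Stand-in for a trained classifier (assumed layout).
knn = SimpleNamespace(
    xs=[[0, 0], [0, 1], [5, 5], [6, 5]],
    ys=["a", "a", "b", "b"],
    classes={"a", "b"},
    k=3,
)
weights = calculate_sketch(knn, [1, 0])
print(weights)
```

With k=3, the three closest training points to [1, 0] are the two "a" points and one "b" point, so "a" receives a weight of 2 and "b" a weight of 1.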

  classify 
classify (
        knn,
        x,
        weight_fn=equal_weight,
        distance_fn=euclidean_dist,
        )

classify(knn, x[, weight_fn][, distance_fn]) -> class

Classify an observation into a class. If not specified, weight_fn will give all neighbors equal weight and distance_fn will be the euclidean distance.
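One plausible reading is that classification picks the class with the largest weight in the dictionary that calculate returns. The source does not spell out this step or how ties are broken, so the sketch below is an assumption, and classify_from_weights is a hypothetical helper name.

```python
def classify_from_weights(weights):
    """Return the class with the largest weight.

    weights is a dict of class -> weight, in the shape calculate is
    documented to return. On a tie, max() keeps the first class it
    encounters in the dict's insertion order.
    """
    if not weights:
        raise ValueError("no classes to choose from")
    return max(weights, key=weights.get)


label = classify_from_weights({"a": 2.0, "b": 1.0})
print(label)
```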

  equal_weight 
equal_weight ( x,  y )

equal_weight(x, y) -> 1

  euclidean_dist 
euclidean_dist ( x,  y )

euclidean_dist(x, y) -> euclidean distance between x and y

Exceptions   
ValueError, "vectors must be same length"
  euclidean_dist_py 
euclidean_dist_py ( x,  y )

euclidean_dist_py(x, y) -> euclidean distance between x and y

Exceptions   
ValueError, "vectors must be same length"
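Both distance functions are documented with the same contract: return the Euclidean distance, and raise ValueError when the vectors differ in length. A minimal pure-Python sketch of that contract follows; euclidean_dist_sketch is a hypothetical name, and the code is not taken from the module.

```python
import math


def euclidean_dist_sketch(x, y):
    """Euclidean distance between two equal-length sequences of numbers."""
    if len(x) != len(y):
        # Same message as the documented exception.
        raise ValueError("vectors must be same length")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


d = euclidean_dist_sketch([0, 0], [3, 4])
print(d)
```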
  train 
train (
        xs,
        ys,
        k,
        typecode=None,
        )

train(xs, ys, k[, typecode]) -> kNN

Train a k nearest neighbors classifier on a training set. xs is a list of observations and ys is a list of the class assignments. Thus, xs and ys should contain the same number of elements. k is the number of neighbors that should be examined when doing the classification.
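Nearest-neighbor training typically just stores the training set, since the real work happens at classification time. The sketch below illustrates that idea under stated assumptions: KNNSketch and train_sketch are hypothetical names, the attribute layout is invented for the example, and the typecode argument is left out because the source does not describe it.

```python
class KNNSketch:
    """Hypothetical stand-in for the kNN class: holds the training data."""

    def __init__(self):
        self.classes = set()
        self.xs = []
        self.ys = []
        self.k = None


def train_sketch(xs, ys, k):
    """Store the observations, their class assignments, and k."""
    # xs and ys are parallel lists, so their lengths must match.
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of elements")
    knn = KNNSketch()
    knn.classes = set(ys)
    knn.xs = [list(x) for x in xs]  # copy so later edits to xs do not leak in
    knn.ys = list(ys)
    knn.k = k
    return knn


model = train_sketch([[0, 0], [0, 1], [5, 5]], ["a", "a", "b"], k=3)
print(sorted(model.classes), model.k)
```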

Classes   
kNN

Holds information necessary to do nearest neighbors classification.


This document was automatically generated on Sat Jul 7 09:50:16 2001 by HappyDoc version r1_5