A data point is assigned to the majority class of its k nearest neighbors in the feature space using the K-Nearest Neighbor (KNN) classification technique. It’s an easy-to-understand and direct approach. Instead of employing explicit model training, it relies on storing the entire dataset during the prediction process.
Difference between K-Nearest Neighbor Classification and K-Means Clustering

The K-Nearest Neighbors classification in Python can be created using the sklearn.neighbors package in the manner described below:
Include the required libraries:
import numpy as np
import matplotlib.pyplot as plt
#for dataset classification import the following library
from sklearn.datasets import make_classification
#to split the model into train and test set
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
#for result comparison
from sklearn.metrics import accuracy_score, classification_report
Make a dataset or import some test information:
We’ll use make_classification to generate fake data for this example:
# Generate synthetic data
X, y = make_classification(n_samples=100, n_features=2, n_classes=2, n_clusters_per_class=1, n_redundant=0, random_state=42)
From the data, create training and test sets:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Build and train the K-Nearest Neighbors classifier:
# Create a KNeighborsClassifier object with the desired number of neighbors (k)
knn_classifier = KNeighborsClassifier(n_neighbors=3)
# Train the classifier on the training data
knn_classifier.fit(X_train, y_train)
Predict the test set using the following:
# Use the trained classifier to make predictions on the test data
y_pred = knn_classifier.predict(X_test)
Examine the classifier’s output:
# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
# Print the classification report
print(classification_report(y_test, y_pred))
The classification report, which can be used to evaluate the K-Nearest Neighbors classifier’s performance on the test set, contains the accuracy, recall, F1-score, and support for each class.
Please be advised that the choice you make regarding the number of neighbors (k) may have an impact on the KNN classifier’s performance. A small k could result in overfitting, whereas a large k could cause underfitting and over-smoothing. To find the optimal value for k, you can apply techniques like grid search and cross-validation, which are similar to what we previously discussed for hyperparameter tuning.