Monday, 9 March 2026

Day 48: Beeswarm Plot in Python 🐝📊

 

A Beeswarm Plot (also called a Swarm Plot) is a categorical scatter plot that displays every data point in each category. Unlike a plain strip plot, where points with similar values are drawn on top of each other, a beeswarm plot shifts points sideways so they don’t overlap, making it easier to see how data is spread within each category.

In this example, we use the Iris dataset to visualize how petal length varies across different flower species.


🔹 Why Use a Beeswarm Plot?

Beeswarm plots are useful when you want to:

  • Show individual data points

  • Understand the distribution of values

  • Compare multiple categories

  • Avoid the overlapping points you get in a regular categorical scatter plot

They are commonly used in data analysis, exploratory data science, and statistical visualization.
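To make the overlap problem concrete, here is a minimal sketch that draws the same data twice: once as a strip plot with jitter turned off, and once as a swarm plot. It assumes seaborn, matplotlib, pandas, and scikit-learn are installed; the output file name is a placeholder chosen for this example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

# Build the same DataFrame used in this post
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["Species"] = iris.target_names[iris.target]

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Left: strip plot without jitter, so equal values are drawn on top of each other
sns.stripplot(data=df, x="Species", y="petal length (cm)", jitter=False, ax=axes[0])
axes[0].set_title("Strip plot (points overlap)")

# Right: swarm plot, which shifts points sideways to separate them
sns.swarmplot(data=df, x="Species", y="petal length (cm)", ax=axes[1])
axes[1].set_title("Swarm plot (points spread out)")

fig.tight_layout()
fig.savefig("strip_vs_swarm.png")  # placeholder file name; use plt.show() interactively
```

In the left panel many of the 50 points per species are hidden behind each other, while the right panel shows all of them.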


📊 Dataset Used

We are using the Iris dataset, one of the most popular datasets in machine learning and statistics.

The dataset contains measurements of iris flowers including:

  • Sepal Length

  • Sepal Width

  • Petal Length

  • Petal Width

  • Species

The three species are:

  • Setosa

  • Versicolor

  • Virginica

In this visualization, we compare petal length across these species.


🧠 Python Code

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import pandas as pd

# Load dataset
iris = load_iris()

# Create dataframe
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["Species"] = iris.target_names[iris.target]

# Create Beeswarm Plot
plt.figure(figsize=(8,5))
sns.swarmplot(data=df, x="Species", y="petal length (cm)")

# Title
plt.title("Beeswarm Plot: Petal Length by Species")

plt.tight_layout()
plt.show()

🔍 Code Explanation

1️⃣ Import Libraries

We import the required libraries:

  • Seaborn → for statistical visualizations

  • Matplotlib → for plotting

  • Scikit-learn → to load the Iris dataset

  • Pandas → for data manipulation


2️⃣ Load the Dataset

iris = load_iris()

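This loads the Iris dataset from scikit-learn. The returned object exposes the measurements as iris.data, the column names as iris.feature_names, the integer class labels as iris.target, and the species names as iris.target_names.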


3️⃣ Create a DataFrame

df = pd.DataFrame(iris.data, columns=iris.feature_names)

We convert the dataset into a pandas DataFrame for easier handling.

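Then we add the species column. iris.target holds an integer code (0, 1, or 2) for each row, and indexing iris.target_names with it maps every code to its species name: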

df["Species"] = iris.target_names[iris.target]
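Before plotting, it can help to confirm that the DataFrame looks as expected. The short check below is a sketch that rebuilds the DataFrame and prints its shape, the number of rows per species, and the first few rows. Note that scikit-learn stores the species names in lowercase (setosa, versicolor, virginica).

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["Species"] = iris.target_names[iris.target]

print(df.shape)                      # (rows, columns)
print(df["Species"].value_counts())  # observations per species
print(df.head())                     # first five rows
```

The dataset has 150 rows, 50 per species, and the DataFrame has five columns once Species is added.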

4️⃣ Create the Beeswarm Plot

sns.swarmplot(data=df, x="Species", y="petal length (cm)")

This line creates the beeswarm plot where:

  • x-axis → flower species

  • y-axis → petal length

  • Each dot represents one observation

The swarm algorithm spreads points horizontally, along the categorical axis, so that they do not overlap, while each point keeps its exact y-value.
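To give a feel for how such a layout can be computed, here is a deliberately simplified sketch in plain Python. It is not seaborn's implementation: the function names simple_swarm and _is_free and the diameter parameter are made up for this illustration. Each point is treated as a circle, points are placed in order of increasing value, and each one takes the offset closest to the center line that does not collide with a point already placed.

```python
import math


def _is_free(x, y, placed, diameter):
    """True if a circle at (x, y) does not overlap any already placed circle."""
    tolerance = 1e-9  # guards against floating-point rounding
    return all(math.hypot(x - px, y - py) >= diameter - tolerance
               for px, py in placed)


def simple_swarm(values, diameter=0.1):
    """Return one horizontal offset per value (hypothetical helper, not seaborn's)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []                    # (x, y) positions already assigned
    offsets = [0.0] * len(values)
    for i in order:
        y = values[i]
        step = 0
        while True:
            # Try the center first, then +d, -d, +2d, -2d, ...
            candidates = [0.0] if step == 0 else [step * diameter, -step * diameter]
            free = [x for x in candidates if _is_free(x, y, placed, diameter)]
            if free:
                offsets[i] = free[0]
                placed.append((free[0], y))
                break
            step += 1
    return offsets


print(simple_swarm([1.0, 1.0, 1.0, 2.0], diameter=0.1))
# → [0.0, 0.1, -0.1, 0.0]
```

The three identical values are pushed apart sideways, while the isolated value stays on the center line. A real implementation also has to account for marker size in display units, which this sketch ignores.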


5️⃣ Add Title and Display

plt.title("Beeswarm Plot: Petal Length by Species")
plt.show()

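This adds a chart title and displays the plot. In the full script, plt.tight_layout() is called just before plt.show() so that the title and axis labels are not clipped.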
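If you want more control over the result, one option is to draw on an explicit Axes object and set the labels yourself. The sketch below is one possible variant, not the only way to do it: the marker size of 4 and the output file name are arbitrary choices for this example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["Species"] = iris.target_names[iris.target]

fig, ax = plt.subplots(figsize=(8, 5))

# size controls the marker diameter in points; smaller markers need less sideways spread
sns.swarmplot(data=df, x="Species", y="petal length (cm)", size=4, ax=ax)

ax.set_title("Beeswarm Plot: Petal Length by Species")
ax.set_xlabel("Species")
ax.set_ylabel("Petal length (cm)")

fig.tight_layout()
fig.savefig("beeswarm_petal_length.png")  # placeholder file name; use plt.show() interactively
```

Working with ax directly also makes it easier to place several plots in one figure later.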


📈 What Insights Can We See?

From the beeswarm plot:

  • Setosa flowers have small petal lengths

  • Versicolor has medium petal lengths

  • Virginica generally has larger petals

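The plot shows that Setosa forms a cluster that is completely separated from the other two species, while Versicolor and Virginica overlap slightly where their ranges meet (roughly 4.5 to 5.1 cm). This mix of easy and harder separation is one reason the Iris dataset is often used for classification problems in machine learning.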
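The visual impression can be checked with a few summary numbers. This is a small sketch using pandas groupby on the same DataFrame; the variable name summary is just a label for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["Species"] = iris.target_names[iris.target]

# Minimum, median, and maximum petal length for each species
summary = df.groupby("Species")["petal length (cm)"].agg(["min", "median", "max"])
print(summary)
```

If the largest Setosa value is below the smallest Versicolor value, the two groups do not overlap at all. Comparing the Versicolor maximum with the Virginica minimum shows how much those two groups overlap.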


🚀 When Should You Use Beeswarm Plots?

Use beeswarm plots when you want to:

  • Show raw data points

  • Compare distributions across categories

  • Avoid overlapping points

  • Perform exploratory data analysis

They are especially useful in data science, biology, statistics, and machine learning.


🎯 Conclusion

The Beeswarm Plot is a simple yet powerful way to visualize categorical data distributions while preserving individual data points. Using Seaborn in Python, creating this plot becomes quick and effective for exploring patterns within your dataset.

In just a few lines of code, we were able to visualize petal length differences across iris species, revealing clear distinctions between the groups.
