Seaborn

  • Seaborn is a Python data visualization library built on top of matplotlib.

  • It provides a high-level interface for drawing attractive and informative statistical graphics.

  • The conventional alias for seaborn is sns.

# Import required libraries including seaborn

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# List seaborn's built-in example datasets (fetched online, so an internet connection is required)

sns.get_dataset_names()
Output:
['anagrams',
 'anscombe',
 'attention',
 'brain_networks',
 'car_crashes',
 'diamonds',
 'dots',
 'dowjones',
 'exercise',
 'flights',
 'fmri',
 'geyser',
 'glue',
 'healthexp',
 'iris',
 'mpg',
 'penguins',
 'planets',
 'seaice',
 'taxis',
 'tips',
 'titanic']
# Load the 'iris' dataset

ds = sns.load_dataset('iris')
ds.head()
Output:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa
ds.keys()
Output:
Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width',
       'species'],
      dtype='object')

Let’s start plotting with Seaborn.

Line plot

sns.lineplot(x='sepal_length',y='sepal_width',data=ds,hue='species')

Scatter Plot

  • The ‘x’ parameter names the feature plotted on the x-axis.
  • The ‘y’ parameter names the feature plotted on the y-axis.
  • The ‘data’ parameter specifies the dataset the columns are drawn from.
  • The ‘hue’ parameter specifies the feature used to color the points.
  • The ‘palette’ parameter specifies the colors used for the hue levels.
  • The ‘markers’ parameter determines the shapes of the points.
  • The ‘style’ parameter names the feature whose classes are mapped to the markers.
  • The ‘s’ parameter specifies the size of the points.
  • The ‘alpha’ parameter controls the opacity of the data points.
  • The ‘legend’ parameter controls whether a legend is drawn; set it to False to hide the legend.
sns.scatterplot(x='sepal_length',y='sepal_width',data=ds,hue='species',markers = [',', '^', 'P'],style = 'species',s = 100)
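
The call above leaves 'palette', 'alpha', and 'legend' at their defaults. The following is a minimal sketch of a call that sets all of the listed parameters. It assumes a small synthetic DataFrame named demo (hypothetical, not the real iris data) so that it runs without downloading anything, and it selects the non-interactive Agg backend so that it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window is opened
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the iris columns (synthetic values).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "sepal_length": rng.normal(5.8, 0.8, 60),
    "sepal_width": rng.normal(3.0, 0.4, 60),
    "species": np.repeat(["setosa", "versicolor", "virginica"], 20),
})

ax = sns.scatterplot(
    x="sepal_length", y="sepal_width", data=demo,
    hue="species",            # color by class
    palette="deep",           # named seaborn palette
    style="species",          # marker shape by class
    markers=["o", "^", "P"],  # one filled marker per class
    s=100,                    # point size
    alpha=0.6,                # partial transparency
    legend=False,             # hide the legend
)
```

Because legend=False is passed, the returned Axes has no legend attached.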

Use set() function

  • The sns.set() function sets the aesthetic parameters for all plots created afterwards. It lets you choose the theme, color palette, font scale, and other parameters that control the overall look of your plots. In recent seaborn versions it is an alias of sns.set_theme().
  • If you do not call sns.set(), your plots use the default Matplotlib settings rather than the Seaborn theme, so they may look less polished and less consistent with one another.
sns.set()

sns.scatterplot(x='sepal_length',y='sepal_width',data=ds,hue='species',markers = [',', '^', 'P'],style = 'species',s = 100)
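
As a rough illustration of what sns.set() changes, the sketch below passes a style, a palette, and a font scale, then reads one of the Matplotlib settings the style touches. The specific values ("whitegrid", "muted", 1.2) are arbitrary choices for the example, and the Agg backend is assumed only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a seaborn theme; the settings are stored in Matplotlib's rcParams.
sns.set(style="whitegrid", palette="muted", font_scale=1.2)

# The "whitegrid" style switches the axes grid on.
grid_enabled = plt.rcParams["axes.grid"]
print(grid_enabled)

# Restore the settings that were active before seaborn changed them.
sns.reset_orig()
```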

Bar plot

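  • A bar plot needs categorical data on one axis (here the x-axis) and numerical data on the other (here the y-axis).

  • By default, the height of each bar is the mean of the numerical column for that category, and the error bar shows the uncertainty around that estimate.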

sns.barplot(x='species',y='petal_length',data=ds)
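
To make the "mean per category" idea concrete, here is a small sketch in plain pandas that computes the values a default bar plot would draw as bar heights. The DataFrame demo is a hypothetical six-row stand-in, not the real iris data.

```python
import pandas as pd

# Hypothetical miniature dataset with the same column names as iris.
demo = pd.DataFrame({
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
    "petal_length": [1.4, 1.6, 4.0, 4.6, 5.5, 6.1],
})

# One mean per category: these are the default bar heights.
bar_heights = demo.groupby("species")["petal_length"].mean()
print(bar_heights)
```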

Count Plot

  • It counts the observations in each category and shows the counts as bars.
sns.countplot(x='species',data=ds)

Categorical plot

  • catplot() is a figure-level function for drawing categorical plots.

  • The ‘kind’ parameter selects the plot type; the default is ‘strip’.

  • Use ‘point’, ‘bar’, or ‘count’ for categorical estimates.

  • Use ‘box’, ‘violin’, or ‘boxen’ for categorical distributions.

  • Use ‘strip’ or ‘swarm’ for categorical scatterplots.

sns.catplot(x='species',y='petal_length',data=ds)

sns.catplot(x='species',y='petal_length',data=ds,kind='bar')

sns.catplot(x='species',y='petal_length',data=ds,kind='box') 
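
Because catplot() works at the figure level, it returns a FacetGrid rather than a single Axes and can split the data into subplots. The sketch below assumes a synthetic DataFrame demo with an extra, hypothetical 'site' column that does not exist in iris; it is there only to show the 'col' parameter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 3 species x 2 sites, 10 rows per combination.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "species": np.repeat(["setosa", "versicolor", "virginica"], 20),
    "site": np.tile(["A", "B"], 30),
    "petal_length": rng.normal(4.0, 1.0, 60),
})

# One box plot per site, laid out as columns of a single figure.
g = sns.catplot(x="species", y="petal_length", data=demo,
                kind="box", col="site")
print(type(g).__name__, g.axes.shape)
```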

Box Plot

  • It is also known as a box-and-whisker plot.

  • It shows the distribution of quantitative data in a way that makes it easy to compare groups or variables.

  • The box shows the quartiles of the data, and the whiskers extend to the rest of the distribution, except for points classified as outliers, which are drawn as individual dots.

# Vertical Box Plot

sns.boxplot(x='species',y='sepal_width',data=ds)

# Horizontal box plot ( Switching x & y)

sns.boxplot(y='species',x='sepal_width',data=ds)
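
The quantities a box plot draws can be computed by hand. The sketch below uses NumPy on a made-up list of values and assumes the common 1.5 x IQR rule for flagging outliers, which matches the default whis=1.5 in sns.boxplot(). The whiskers themselves end at the most extreme data points that lie inside these fences.

```python
import numpy as np

# Made-up values with one obvious outlier.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50], dtype=float)

# Box: first quartile, median, third quartile.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Fences: points beyond them are drawn as individual dots.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3)
print(lower_fence, upper_fence, outliers)
```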

Violin plot

  • It is similar to a box plot, but it also uses kernel density estimation to show the full shape of the data distribution, which gives a richer description than the quartiles alone.

# Vertical plot

sns.violinplot(x='species',y='sepal_width',data=ds)

# Horizontal plot

sns.violinplot(y='species',x='sepal_width',data=ds)

Strip plot

  • It draws a scatter plot in which one variable is categorical, so the points are grouped into one strip per category.
sns.stripplot(x='species',y='sepal_width',data=ds)

Swarm plot

  • The swarmplot() function is similar to stripplot(), but it adjusts the position of each point along the categorical axis so that the points do not overlap.
sns.swarmplot(x='species',y='sepal_width',data=ds)

Point Plots

  • Point plots show the same information as bar plots in a different style: instead of a full bar, each estimate is drawn as a point at the corresponding height on the value axis.
sns.pointplot(x='species',y='sepal_width',data=ds)

Histogram

  • It groups numerical data into intervals (bins) and shows how many observations fall into each bin.

  • It is a graphical representation of the distribution of numerical data.

sns.histplot(x='petal_width',data=ds,hue='species',kde=True)  # kde=kernel density estimate
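
The binning step behind a histogram can be sketched with NumPy alone. The values and bin edges below are made up for illustration; histplot() chooses its own bins by default.

```python
import numpy as np

# Made-up measurements and explicit bin edges.
values = np.array([0.1, 0.2, 0.2, 0.4, 1.0, 1.3, 1.5, 1.8, 2.0, 2.5])
edges = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Each bin is [left, right), except the last one, which includes its right edge.
counts, used_edges = np.histogram(values, bins=edges)
print(counts)
```

Each count is the height of one bar in the histogram.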

KDE Plot

  • A Kernel Density Estimation (KDE) plot shows a smooth, non-parametric estimate of the probability density function of a continuous variable. It can be drawn for a single variable or for two variables together.

  • The kdeplot() function accepts options such as ‘hue’ to draw one density curve per category.

sns.kdeplot(x='sepal_length',data=ds,hue='species')
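
For intuition, a one-dimensional Gaussian KDE can be written in a few lines of NumPy: place a small Gaussian bump on every observation and average them. This is a sketch on synthetic data, not seaborn's implementation; the bandwidth uses Scott's rule, which is assumed here to mirror the default bw_method='scott' of kdeplot().

```python
import numpy as np

# Synthetic stand-in for a continuous column such as sepal_length.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.8, scale=0.8, size=200)

n = x.size
h = x.std(ddof=1) * n ** (-1 / 5)  # Scott's rule bandwidth for 1-D data

# Evaluate the average of n Gaussian kernels on a grid.
grid = np.linspace(x.min() - 5 * h, x.max() + 5 * h, 2000)
z = (grid[:, None] - x[None, :]) / h
density = np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

# A density should integrate to about 1 (simple Riemann sum).
dx = grid[1] - grid[0]
area = density.sum() * dx
print(round(area, 3))
```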

Distribution plot

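  • Used for univariate analysis.
  • It visualizes the distribution of a single variable as a histogram, so one particular column must be chosen.
# distplot() is deprecated; histplot() (or the figure-level displot()) replaces it
sns.histplot(ds['petal_width'], kde=True, stat='density')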

Joint plot

  • The jointplot() function is used to analyze the bivariate distribution of two variables.

  • It creates a multi-panel figure that shows the relationship between the two variables in the central panel and the univariate distribution of each variable on separate marginal axes.

sns.jointplot(x='petal_length',y='petal_width',data=ds)

Pairplot

  • It represents a pairwise relation across the entire data frame & supports an additional argument called hue for categorical separation.
sns.pairplot(data=ds,hue='species')

Heat Map

  • It is a graphical representation of a matrix in which each value is shown as a color.
  • With the default colormap, higher values are drawn in lighter colors.

  • Lower values are drawn in darker colors.

  • Here it is used to show the correlation between the numerical features.

# corr() needs numeric columns only; numeric_only=True skips the 'species' column
sns.heatmap(ds.corr(numeric_only=True),annot=True)
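
The matrix passed to heatmap() is an ordinary pandas DataFrame of pairwise correlations. The sketch below builds one from a tiny, made-up DataFrame demo and drops the text column with select_dtypes(), which is one portable way to keep only numeric columns.

```python
import pandas as pd

# Made-up data: b rises with a, c falls with a, label is text.
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
    "label": ["x", "y", "x", "y"],
})

# Keep numeric columns only, then compute pairwise Pearson correlations.
corr = demo.select_dtypes("number").corr()
print(corr)
```

A value near 1 means two columns rise together, a value near -1 means one falls as the other rises, and the diagonal is always 1.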

I am an enthusiastic advocate for the transformative power of data in the fashion realm. Armed with a strong background in data science, I am committed to revolutionizing the industry by unlocking valuable insights, optimizing processes, and fostering a data-centric culture that propels fashion businesses into a successful and forward-thinking future. - Masud Rana, Certified Data Scientist, IABAC

© Data4Fashion 2023-2024
