Data Exploration and Visualization

Data Visualization using Distribution Plot (Seaborn Library)


Let's visualize our data with the Distribution Plot from the Seaborn library. By default, the Distribution Plot draws a histogram with a KDE (kernel density estimate) curve on top of it. We can set the number of histogram bins as required. Note that the Distribution Plot is univariate: it shows the distribution of a single variable.
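To make the KDE part less of a black box, here is a minimal NumPy sketch of a Gaussian kernel density estimate. The helper name gaussian_kde_manual is hypothetical, and the bandwidth uses Scott's rule of thumb as an assumption; Seaborn's own bandwidth choice may differ, so the curve will only roughly match what the plot shows.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Scott's rule of thumb for 1-D data (an assumption, not Seaborn's exact setting)
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # One standard-normal kernel per observation, evaluated at every grid point
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale so the result is a density
    return kernels.sum(axis=1) / (n * bandwidth)

rng = np.random.default_rng(0)
num = rng.standard_normal(150)

# Extend the grid past the data so the tails of the kernels are covered
grid = np.linspace(num.min() - 3, num.max() + 3, 400)
density = gaussian_kde_manual(num, grid)

# A density should integrate to about 1 (simple Riemann sum)
area = float(np.sum(density) * (grid[1] - grid[0]))
print(round(area, 3))
```

Each observation contributes one small bell curve, and the KDE is their average. That is why the curve is smooth even though the histogram is blocky.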


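We can pass various parameters to distplot, such as bins, hist, kde, rug, vertical, and color. Note that distplot is deprecated as of Seaborn 0.11; newer releases provide histplot, kdeplot, and displot instead, so the calls in this post assume a release that still includes distplot.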


Let's explore the Distribution Plot by generating 150 random numbers drawn from a standard normal distribution.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Generate 150 random numbers


num = np.random.randn(150)
num

Step 3: Explore data using Distribution Plot


sns.distplot(num)


Specify number of bins

sns.distplot(num, bins=20)
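As a rough illustration of what the bins argument controls, the sketch below bins the same kind of data with np.histogram. It assumes equal-width bins spanning the data range, which is how NumPy treats an integer bins value; the seeded generator is only there to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
num = rng.standard_normal(150)

# bins=20 splits the range [num.min(), num.max()] into 20 equal-width intervals
counts, edges = np.histogram(num, bins=20)

print(len(counts))   # one count per bin
print(len(edges))    # one more edge than there are bins
print(counts.sum())  # every observation falls into exactly one bin
```

More bins show finer detail but make each bar noisier, since 150 points are spread over more intervals.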


Remove histogram from distribution plot


sns.distplot(num, hist=False)


Remove KDE from distribution plot


sns.distplot(num, kde=False)
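With the KDE switched off, the y-axis of the histogram shows raw counts, whereas the default plot with a KDE shows the histogram as a density so that both share one scale. The NumPy sketch below shows the difference between the two normalizations; it illustrates the idea and is not Seaborn's internal code.

```python
import numpy as np

rng = np.random.default_rng(0)
num = rng.standard_normal(150)

# Raw counts: bar heights add up to the number of observations
counts, edges = np.histogram(num, bins=20)

# Density: bar heights are rescaled so that the total bar area is 1
density, _ = np.histogram(num, bins=20, density=True)
widths = np.diff(edges)
total_area = float(np.sum(density * widths))

print(counts.sum())
print(round(total_area, 6))
```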


Add rug parameter to distribution plot


sns.distplot(num, hist=False, rug=True)


Add label to distribution plot


label_dist = pd.Series(num, name="variable x")
sns.distplot(label_dist)


Change orientation of distribution plot


sns.distplot(label_dist, vertical=True)


Add cosmetic parameter: color

sns.distplot(label_dist, color='red')


You can download my Jupyter notebook from here. I also recommend trying the code above with the Tips and Iris datasets.

