Commit b9a5873

Merge pull request animator#1276 from RajKhanke/raj
Violin plots using matplotlib
2 parents 0ab0668 + 3d25e5a commit b9a5873

10 files changed

+279
-1
lines changed

contrib/plotting-visualization/index.md

Lines changed: 2 additions & 1 deletion
@@ -10,4 +10,5 @@
 - [Seaborn Plotting Functions](seaborn-plotting.md)
 - [Getting started with Seaborn](seaborn-basics.md)
 - [Bar Plots in Plotly](plotly-bar-plots.md)
-- [Pie Charts in Plotly](plotly-pie-charts.md)
+- [Pie Charts in Plotly](plotly-pie-charts.md)
+- [Violin Plots in Matplotlib](matplotlib-violin_plots.md)
Lines changed: 277 additions & 0 deletions
@@ -0,0 +1,277 @@
# Violin Plots in Matplotlib

A violin plot is a method of plotting numeric data together with its estimated probability density. It combines a box plot with a kernel density plot, giving a richer view of the distribution of the data. Each group of data is represented by a kernel density estimate that is mirrored about a central axis, forming a symmetrical shape that resembles a violin, hence the name.

Violin plots are particularly useful when comparing distributions across different categories or groups. They show the shape, spread, and central tendency of the data, which gives a more complete picture than a traditional box plot.

Compared to box plots, violin plots combine summary statistics with a kernel density estimate, so they show the full shape of each distribution, handle unequal sample sizes effectively, allow easy comparison across groups, and make multiple modes visible.

![Violin plot 1](images/violen-plots1.webp)
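
To make the "mirrored kernel density" idea concrete, here is a minimal NumPy-only sketch of how a violin outline could be built by hand. It is an illustration under stated assumptions, not Matplotlib's actual implementation: the helper `gaussian_kde_1d`, the Scott's-rule bandwidth, and the names `position` and `half_width` are all chosen for this example.

```python
import numpy as np

def gaussian_kde_1d(sample, grid):
    """Gaussian KDE of a 1-D sample evaluated on grid (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Assumption: Scott's rule for one-dimensional data
    bandwidth = sample.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
sample = rng.normal(0, 1, 200)
grid = np.linspace(sample.min() - 3, sample.max() + 3, 400)
density = gaussian_kde_1d(sample, grid)

# Mirror the density about a centre position to obtain the violin outline
position, half_width = 1.0, 0.25
scaled = half_width * density / density.max()
left_edge = position - scaled
right_edge = position + scaled
```

Plotting `left_edge` and `right_edge` against `grid` would trace the two sides of a single violin centred at `position`.
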
## Prerequisites

Before creating violin plots in Matplotlib, make sure that Python and Matplotlib are installed on your system. The examples also use NumPy to generate sample data.

## Creating a simple Violin Plot with the `violinplot()` method

A basic violin plot can be created with the `violinplot()` function in `matplotlib.pyplot`.

```Python
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
data = [np.random.normal(0, std, 100) for std in range(1, 5)]

# Creating Plot
plt.violinplot(data)

# Show plot
plt.show()
```
33+
34+
When executed, this would show the following pie chart:
35+
36+
37+
![Basic violin plot](images/violinplotnocolor.png)
38+
39+
40+
The `Violinplot` function in matplotlib.pyplot creates a violin plot, which is a graphical representation of the distribution of data across different levels of a categorical variable. Here's a breakdown of its usage:
41+
```Python
plt.violinplot(data, showmeans=False, showextrema=False)
```

- `data`: The dataset used to create the violin plot, passed as the first argument (named `dataset` in the Matplotlib signature). It can be a single array or a sequence of arrays, in which case one violin is drawn per array.

- `showmeans`: This optional parameter, if set to True, marks the mean of each dataset with a line on the violin. Default is False.

- `showextrema`: This optional parameter, if set to True, marks the minimum and maximum values with lines on the violin. Default is True.

Additional parameters control which statistics are drawn, the orientation and width of the violins, and the bandwidth of the kernel density estimate. For instance:

```Python
plt.violinplot(data, showmedians=True, showmeans=True, showextrema=True, vert=False, widths=0.9, bw_method=0.5)
```

- `showmedians`: Setting this parameter to True displays the median value as a line on the violin plot. Default is False.

- `vert`: This parameter determines the orientation of the violin plot. Setting it to False creates a horizontal violin plot. Default is True. Newer Matplotlib releases also provide an `orientation` parameter for the same purpose.

- `widths`: This parameter sets the maximum width of each violin, either as a single number or as one value per violin. Default is 0.5.

- `bw_method`: This parameter determines the method used to calculate the bandwidth for the kernel density estimation. It accepts `'scott'`, `'silverman'`, a scalar, or a callable. Default is None, in which case Scott's rule is used.

Using these parameters, you can customize the violin plot according to your requirements, enhancing its readability and visual appeal.
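
The effect of `bw_method` is easiest to judge visually. The following sketch, which assumes a synthetic bimodal sample and three arbitrarily chosen bandwidth settings, draws the same data with a small scalar, the default, and a large scalar so that the amount of smoothing can be compared.

```python
import matplotlib.pyplot as plt
import numpy as np

# Bimodal sample so that the effect of smoothing is easy to see
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

# Bandwidth settings to compare (None falls back to the default rule)
bandwidths = [0.05, None, 1.0]

fig, axes = plt.subplots(1, len(bandwidths), sharey=True, figsize=(9, 3))
for ax, bw in zip(axes, bandwidths):
    ax.violinplot(sample, bw_method=bw, showmedians=True)
    ax.set_title(f"bw_method={bw}")

plt.show()
```

A small value follows the data closely and can look noisy, while a large value may smooth the two modes into one.
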

## Customizing Violin Plots in Matplotlib

When customizing violin plots in Matplotlib, creating the plot with `matplotlib.pyplot.subplots()` provides greater flexibility, because the `Axes` object and the artists returned by `violinplot()` can be styled directly.

### Coloring Violin Plots

You can assign custom colors to the violins by calling `set_facecolor()` on each element of the `bodies` list returned by the `violinplot()` method.

```Python
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
data = [np.random.normal(0, std, 100) for std in range(1, 5)]
colors = ['tab:red', 'tab:blue', 'tab:green', 'tab:orange']

# Creating plot using matplotlib.pyplot.subplots()
fig, ax = plt.subplots()

# Customizing colors of violins
for i in range(len(data)):
    parts = ax.violinplot(data[i], positions=[i], vert=False, showmeans=False, showextrema=False, showmedians=True, widths=0.9, bw_method=0.5)
    for pc in parts['bodies']:
        pc.set_facecolor(colors[i])

# Show plot
plt.show()
```

This code snippet creates a horizontal violin plot with a custom color assigned to each violin, which makes the groups easier to tell apart.

![Coloring violin](images/violenplotnormal.png)

When customizing violin plots using `matplotlib.pyplot.subplots()`, you obtain a `Figure` object `fig` and an `Axes` object `ax`, allowing for extensive customization. Each violin plot consists of several components: the violin bodies and, depending on the options passed, lines marking the medians, means, and extrema. `violinplot()` returns these components as a dictionary of artists. You can style them through their own methods, or draw additional markers such as quartile lines with the methods of the `Axes` object.

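
As a quick way to see which components are available, the sketch below prints the entries of the dictionary returned by `violinplot()` and restyles two of them. The random data and the chosen styles are assumptions for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in range(1, 5)]

fig, ax = plt.subplots()
parts = ax.violinplot(data, showmeans=True, showmedians=True, showextrema=True)

# 'bodies' is a list with one filled shape per violin;
# the other entries are line collections shared by all violins
for name, artist in parts.items():
    print(name, type(artist).__name__)

# Line collections can be restyled directly
parts['cmedians'].set_color('black')
parts['cmeans'].set_linestyle('--')

plt.show()
```

Entries such as `cmeans` or `cmedians` only appear when the corresponding `show...` option is enabled.
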

Here's an example of how to customize violin plots:

```Python
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
data = [np.random.normal(0, std, 100) for std in range(1, 5)]
colors = ['tab:red', 'tab:blue', 'tab:green', 'tab:orange']
positions = np.arange(1, len(data) + 1)  # default x positions of the violins

# Creating plot using matplotlib.pyplot.subplots()
fig, ax = plt.subplots()

# Creating violin plots
parts = ax.violinplot(data, showmeans=False, showextrema=False, showmedians=True, widths=0.9, bw_method=0.5)

# Customizing colors of violins
for i, pc in enumerate(parts['bodies']):
    pc.set_facecolor(colors[i])

# Customizing median lines
parts['cmedians'].set_color('black')

# Adding quartile lines (25th to 75th percentile of each dataset)
q1 = [np.percentile(d, 25) for d in data]
q3 = [np.percentile(d, 75) for d in data]
ax.vlines(positions, q1, q3, colors='black', linestyles='--', linewidth=2)

# Adding mean markers
ax.scatter(positions, [np.mean(d) for d in data], marker='o', color='black', zorder=3)

# Customizing axes labels
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')

# Adding title
ax.set_title('Customized Violin Plot')

# Show plot
plt.show()
```

![Customizing violin](images/violin-plot4.png)

In this example, we customize various components of the violin plot, such as colors, line styles, and markers, to enhance its visual appeal and clarity. Additionally, we modify the axes labels and add a title to provide context to the plot.

### Adding Hatching to Violin Plots

You can add hatching patterns to the violin plots to make them easier to distinguish. This can be achieved by calling `set_hatch()` on each violin body returned by the `violinplot()` function.

```Python
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
data = [np.random.normal(0, std, 100) for std in range(1, 5)]
colors = ['tab:red', 'tab:blue', 'tab:green', 'tab:orange']
hatches = ['/', '\\', '|', '-']

# Creating plot using matplotlib.pyplot.subplots()
fig, ax = plt.subplots()

# Creating violin plots with hatching
parts = ax.violinplot(data, showmeans=False, showextrema=False, showmedians=True, widths=0.9, bw_method=0.5)

for i, pc in enumerate(parts['bodies']):
    pc.set_facecolor(colors[i])
    pc.set_hatch(hatches[i])

# Show plot
plt.show()
```

![violin_hatching](images/violin-hatching.png)

### Labeling Violin Plots

You can add labels to violin plots to provide additional information about the data. This can be achieved by calling `set_label()` on each violin body returned by the `violinplot()` function and then adding a legend.

An example is shown here:

```Python
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
data = [np.random.normal(0, std, 100) for std in range(1, 5)]
labels = ['Group {}'.format(i) for i in range(1, 5)]

# Creating plot using matplotlib.pyplot.subplots()
fig, ax = plt.subplots()

# Creating violin plots
parts = ax.violinplot(data, showmeans=False, showextrema=False, showmedians=True, widths=0.9, bw_method=0.5)

# Adding labels to violin plots
for i, label in enumerate(labels):
    parts['bodies'][i].set_label(label)

# Show plot
plt.legend()
plt.show()
```

![violin_labeling](images/violin-labelling.png)

In this example, each violin plot is labeled according to its group, providing context to the viewer.
These customizations can be combined and further refined to create violin plots that effectively convey the underlying data distributions.
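
When all violins share the default color, legend entries are hard to tell apart. One possible approach, sketched here with assumed colors and group names, is to color each body and build the legend from matching `Patch` handles, while also naming the groups on the axis ticks.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch

rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in range(1, 5)]
labels = [f'Group {i}' for i in range(1, 5)]
colors = ['tab:red', 'tab:blue', 'tab:green', 'tab:orange']

fig, ax = plt.subplots()
parts = ax.violinplot(data, showmedians=True)

# Color each body and build a matching legend entry
handles = []
for body, color, label in zip(parts['bodies'], colors, labels):
    body.set_facecolor(color)
    handles.append(Patch(facecolor=color, alpha=0.3, label=label))

# Tick labels name the groups on the axis itself
ax.set_xticks(range(1, len(labels) + 1))
ax.set_xticklabels(labels)
legend = ax.legend(handles=handles)

plt.show()
```
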

### Stacked Violin Plots

`Stacked violin plots` are useful when you want to compare the distribution of a `single` variable across different categories or groups. The violins for all groups are drawn in the same axes against a shared value scale, allowing for easy visual comparison. The example below places one violin per group side by side.

```Python
import matplotlib.pyplot as plt
import numpy as np

# Generating sample data
np.random.seed(0)
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(2, 1, 100)
data3 = np.random.normal(1, 1, 100)

# Creating a stacked violin plot
plt.violinplot([data1, data2, data3], showmedians=True)

# Adding labels to x-axis ticks
plt.xticks([1, 2, 3], ['Group 1', 'Group 2', 'Group 3'])

# Adding title and labels
plt.title('Stacked Violin Plot')
plt.xlabel('Groups')
plt.ylabel('Values')

# Displaying the plot
plt.show()
```

![stacked violin plots](images/stacked_violin_plots.png)

### Split Violin Plots

`Split violin plots` are effective for comparing the distribution of a `single variable` across `two` different categories or groups. In a true split violin plot, each half of a violin shows the distribution of the variable for one of the two categories. The example below takes the simpler route of drawing the two groups as separate, complete violins next to each other.

```Python
import matplotlib.pyplot as plt
import numpy as np

# Generating sample data
np.random.seed(0)
data_male = np.random.normal(0, 1, 100)
data_female = np.random.normal(2, 1, 100)

# Creating a split violin plot
plt.violinplot([data_male, data_female], showmedians=True)

# Adding labels to x-axis ticks
plt.xticks([1, 2], ['Male', 'Female'])

# Adding title and labels
plt.title('Split Violin Plot')
plt.xlabel('Gender')
plt.ylabel('Values')

# Displaying the plot
plt.show()
```

![Split violin plot](images/split-violin-plot.png)

In both examples, we use Matplotlib's `violinplot()` function to create the violin plots. These layouts offer additional ways to compare data distributions across different groups or categories.
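
For a violin whose two halves show different groups, one possible approach is to draw both groups at the same position and keep only one side of each body. The sketch below does this by clipping the outline vertices; the helper `half_violin` and its arguments are hypothetical names introduced for this example, not part of Matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data_male = rng.normal(0, 1, 100)
data_female = rng.normal(2, 1, 100)

fig, ax = plt.subplots()
position = 1  # both halves share the same x position

def half_violin(ax, values, side, color):
    """Draw one violin and keep only its left or right half (hypothetical helper)."""
    parts = ax.violinplot(values, positions=[position], showextrema=False)
    body = parts['bodies'][0]
    vertices = body.get_paths()[0].vertices.copy()
    if side == 'left':
        vertices[:, 0] = np.minimum(vertices[:, 0], position)
    else:
        vertices[:, 0] = np.maximum(vertices[:, 0], position)
    body.set_verts([vertices])  # replace the outline with the clipped one
    body.set_facecolor(color)
    return body

left = half_violin(ax, data_male, 'left', 'tab:blue')
right = half_violin(ax, data_female, 'right', 'tab:orange')

ax.set_xticks([position])
ax.set_xticklabels(['Male | Female'])
ax.set_ylabel('Values')
ax.set_title('Split Violin Plot')

plt.show()
```

Recent Matplotlib releases also offer a `side` argument to `violinplot()` for drawing half violins; check the documentation of your installed version before relying on it.
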
