ridgeplot is a Python package that provides a simple interface for plotting beautiful and interactive ridgeline plots within the extensive Plotly ecosystem.
Installation
ridgeplot can be installed and updated from PyPI using pip:
pip install -U ridgeplot
For more information, see the installation guide.
Getting started
Take a look at the getting started guide, which provides a quick introduction to the ridgeplot library.
Basic example
For those in a hurry, here’s a very basic example of how to quickly get started with ridgeplot().
import numpy as np
from ridgeplot import ridgeplot
my_samples = [np.random.normal(n / 1.2, size=600) for n in range(9, 0, -1)]
fig = ridgeplot(samples=my_samples)
fig.update_layout(height=500, width=800)
fig.show()
Installation#
Installing from PyPI#
ridgeplot can be installed and updated from PyPI using pip:
pip install -U ridgeplot
Installing from source#
The source code for this project is hosted on GitHub at: https://github.com/tpvasconcelos/ridgeplot
Take a look at the contributing guide for instructions on how to build from the git source. Further, refer to the instructions on creating a development environment if you wish to create a local development environment, or wish to contribute to the project.
Dependencies#
We try to keep the number of dependencies to a minimum and only use common and well-established libraries in the scientific Python ecosystem. Currently, we only depend on the following 3 Python packages:
plotly - The interactive graphing backend that powers ridgeplot
statsmodels - Used for Kernel Density Estimation (KDE)
numpy - Supporting library for multidimensional array manipulations
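To confirm that the three dependencies listed above are present in the active environment, a minimal sketch like the following can help. The `installed_versions` helper is hypothetical (it is not part of ridgeplot), and it assumes Python 3.8 or later, where `importlib.metadata` is in the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("plotly", "statsmodels", "numpy")):
    """Hypothetical helper: map each package name to its installed version (or None)."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```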
Getting started#
This page provides a quick introduction to the ridgeplot library, showcasing some of its features and providing a few practical examples. All examples use the ridgeplot.ridgeplot() function, which is the main entry point to the library. For more information on the available options, take a look at the reference page.
Basic example#
This basic example shows how you can quickly get started with a simple call to the ridgeplot() function.
import numpy as np
from ridgeplot import ridgeplot
my_samples = [np.random.normal(n / 1.2, size=600) for n in range(9, 0, -1)]
fig = ridgeplot(samples=my_samples)
fig.update_layout(height=500, width=800)
fig.show()
Flexible configuration#
In this example, we will try to replicate the first ridgeline plot in this post from Data to Viz. The example in the post was created using the “Perception of Probability Words” dataset (see load_probly()) and the popular ggridges R package. In the end, we will see how the ridgeplot Python library can be used to create a (nearly) identical plot, thanks to its extensive configuration options.
import numpy as np
from ridgeplot import ridgeplot
from ridgeplot.datasets import load_probly
# Load the probly dataset
df = load_probly()
# Let's grab the subset of columns used in the example
column_names = [
    "Almost Certainly",
    "Very Good Chance",
    "We Believe",
    "Likely",
    "About Even",
    "Little Chance",
    "Chances Are Slight",
    "Almost No Chance",
]
df = df[column_names]
# Not only does 'ridgeplot(...)' come configured with sensible defaults
# but is also fully configurable to your own style and preference!
fig = ridgeplot(
    samples=df.values.T,
    bandwidth=4,
    kde_points=np.linspace(-12.5, 112.5, 500),
    colorscale="viridis",
    colormode="row-index",
    coloralpha=0.65,
    labels=column_names,
    linewidth=2,
    spacing=5 / 9,
)
# And you can still update and extend the final
# Plotly Figure using standard Plotly methods
fig.update_layout(
    height=760,
    width=900,
    font_size=16,
    plot_bgcolor="white",
    xaxis_tickvals=[-12.5, 0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100, 112.5],
    xaxis_ticktext=["", "0", "", "25", "", "50", "", "75", "", "100", ""],
    xaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_title="Assigned Probability (%)",
    showlegend=False,
)
# Show us the work!
fig.show()
The resulting ridgeline plot generated by the code above:
The target reference from the from Data to Viz post:
More traces#
In this example, we will dive a bit deeper into the samples parameter and see how it can be used to plot multiple traces per row in a ridgeline plot.
Final result#
For those in a hurry, the entire final code block and the resulting plot are included in this section. They also serve as a reference for the rest of the section and show what the goal of this example is. Throughout the rest of this section, we will dive a bit deeper into the samples parameter and see how flexible it is.
import numpy as np
from ridgeplot import ridgeplot
from ridgeplot.datasets import load_lincoln_weather
# Load test data
df = load_lincoln_weather()
# Transform the data into a 3D (ragged) array format of
# daily min and max temperature samples per month
months = df.index.month_name().unique()
samples = [
    [
        df[df.index.month_name() == month]["Min Temperature [F]"],
        df[df.index.month_name() == month]["Max Temperature [F]"],
    ]
    for month in months
]
# And finish by styling it up to your liking!
fig = ridgeplot(
    samples=samples,
    labels=months,
    coloralpha=0.98,
    bandwidth=4,
    kde_points=np.linspace(-25, 110, 400),
    spacing=0.33,
    linewidth=2,
)
fig.update_layout(
    title="Minimum and maximum daily temperatures in Lincoln, NE (2016)",
    height=650,
    width=950,
    font_size=14,
    plot_bgcolor="rgb(245, 245, 245)",
    xaxis_gridcolor="white",
    yaxis_gridcolor="white",
    xaxis_gridwidth=2,
    yaxis_title="Month",
    xaxis_title="Temperature [F]",
    showlegend=False,
)
fig.show()
Step-by-step#
Let’s start by loading the “Lincoln Weather” test dataset (see load_lincoln_weather()).
>>> from ridgeplot.datasets import load_lincoln_weather
>>> df = load_lincoln_weather()
>>> df[["Min Temperature [F]", "Max Temperature [F]"]].head()
            Min Temperature [F]  Max Temperature [F]
CST
2016-01-01                   11                   37
2016-01-02                    5                   41
2016-01-03                    8                   37
2016-01-04                    4                   30
2016-01-05                   19                   38
The goal will be to plot the KDEs for the minimum and maximum daily temperatures for each month of 2016 (i.e. the year covered by the dataset).
>>> months = df.index.month_name().unique()
>>> months.to_list()
['January', 'February', 'March', 'April', 'May', 'June', 'July',
'August', 'September', 'October', 'November', 'December']
The samples argument in the ridgeplot() function expects a 3D array of shape \((R, T_r, S_t)\), where \(R\) is the number of rows, \(T_r\) is the number of traces per row, and \(S_t\) is the number of samples per trace, with:
Dimension values | Description
---|---
\(R=12\) | One row per month.
\(T_r=2\) (for all rows \(r \in R\)) | Two traces per row (one for the minimum temperatures and one for the maximum temperatures).
\(S_t \in \{29, 30, 31\}\) | One sample per day of the month, where different months have different numbers of days.
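The dimensions in the table above can be checked without the real dataset. The sketch below builds a placeholder ragged array with the same layout, using the standard `calendar` module to get the number of days in each month of 2016. The zero-valued readings are stand-ins, not data from the Lincoln weather dataset.

```python
import calendar

# Placeholder array with the same layout as the weather samples:
# one row per month, two traces per row, one value per day of 2016.
placeholder_samples = [
    [
        [0.0] * calendar.monthrange(2016, month)[1],  # stand-in for min temperatures
        [0.0] * calendar.monthrange(2016, month)[1],  # stand-in for max temperatures
    ]
    for month in range(1, 13)
]

n_rows = len(placeholder_samples)  # R
traces_per_row = {len(row) for row in placeholder_samples}  # T_r
samples_per_trace = {len(trace) for row in placeholder_samples for trace in row}  # S_t

print(n_rows, sorted(traces_per_row), sorted(samples_per_trace))
```

Because 2016 is a leap year, February contributes 29 samples per trace, which is why 28 does not appear among the trace lengths.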
We can create this array using a simple list comprehension, where each element of the list is a list of two arrays, one for the minimum temperatures and one for the maximum temperatures samples, for each month:
samples = [
    [
        df[df.index.month_name() == month]["Min Temperature [F]"],
        df[df.index.month_name() == month]["Max Temperature [F]"],
    ]
    for month in months
]
Note
For other use cases (like in the two previous examples), you could use a numpy ndarray to represent the samples. However, since different months have different numbers of days, we need to use a data container that can hold arrays of different lengths along the same dimension. Irregular arrays like this one are called ragged arrays. There are many different ways you can represent irregular arrays in Python. In this specific example, we used a list of lists of pandas Series. However, ridgeplot is designed to handle any object that implements the Collection[Collection[Collection[Numeric]]] protocol (i.e. any numeric 3D ragged array).
Finally, we can pass the samples list to the ridgeplot() function and specify any other arguments we want to customize the plot, like adjusting the KDE’s bandwidth, the vertical spacing between rows, etc.
fig = ridgeplot(
    samples=samples,
    labels=months,
    coloralpha=0.98,
    bandwidth=4,
    kde_points=np.linspace(-25, 110, 400),
    spacing=0.33,
    linewidth=2,
)
fig.update_layout(
    title="Minimum and maximum daily temperatures in Lincoln, NE (2016)",
    height=650,
    width=950,
    font_size=14,
    plot_bgcolor="rgb(245, 245, 245)",
    xaxis_gridcolor="white",
    yaxis_gridcolor="white",
    xaxis_gridwidth=2,
    yaxis_title="Month",
    xaxis_title="Temperature [F]",
    showlegend=False,
)
fig.show()
API Reference#
Return an interactive ridgeline (Plotly) |
Color utilities#
Get a list with all available colorscale names. |
Data loading utilities#
Load a version of the "Perception of Probability Words" (a.k.a., "probly") dataset. |
|
Load the "Weather in Lincoln, Nebraska in 2016" dataset. |
Internals#
Color utilities#
- ridgeplot._colors.ColorScale#
A colorscale is an iterable of tuples of two elements:
the first element (a scale value) is a float bounded to the interval [0, 1]
the second element (a color) is a string representation of a color parsable by Plotly
For instance, the Viridis colorscale would be defined as
>>> get_colorscale("viridis")
((0.0, 'rgb(68, 1, 84)'),
 (0.1111111111111111, 'rgb(72, 40, 120)'),
 (0.2222222222222222, 'rgb(62, 73, 137)'),
 (0.3333333333333333, 'rgb(49, 104, 142)'),
 (0.4444444444444444, 'rgb(38, 130, 142)'),
 (0.5555555555555556, 'rgb(31, 158, 137)'),
 (0.6666666666666666, 'rgb(53, 183, 121)'),
 (0.7777777777777777, 'rgb(110, 206, 88)'),
 (0.8888888888888888, 'rgb(181, 222, 43)'),
 (1.0, 'rgb(253, 231, 37)'))
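The structure described above (pairs of a scale value in [0, 1] and a color string) can be checked with a few lines of plain Python. This is only a sketch: `looks_like_colorscale` is a hypothetical helper, not the library's validate_colorscale(), and it does not check whether Plotly can actually parse the color strings.

```python
from typing import Iterable, Tuple


def looks_like_colorscale(colorscale: Iterable[Tuple[float, str]]) -> bool:
    """Hypothetical structural check: every item is a (scale value, color) pair."""
    for item in colorscale:
        if not (isinstance(item, tuple) and len(item) == 2):
            return False
        value, color = item
        # The scale value must be a real number (not a bool) within [0, 1]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if not 0 <= value <= 1:
            return False
        # The color is only checked for its type, not for Plotly parsability
        if not isinstance(color, str):
            return False
    return True


# First and last entries of the Viridis colorscale shown above
viridis_ends = ((0.0, "rgb(68, 1, 84)"), (1.0, "rgb(253, 231, 37)"))
print(looks_like_colorscale(viridis_ends))
```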
- ridgeplot._colors.validate_colorscale(colorscale)[source]#
Validate the structure, scale values, and colors of a colorscale.
Adapted from _plotly_utils.colors.validate_colorscale().
- ridgeplot._colors._any_to_rgb(color)[source]#
Convert any color to an rgb string.
- Parameters:
color – A color. This can be a tuple of (r, g, b) values, a hex string, or an rgb string.
- Returns:
An rgb string.
- Return type:
- Raises:
TypeError – If color is not a tuple or a string.
ValueError – If color is a string that does not represent a hex or rgb color.
- ridgeplot._colors.list_all_colorscale_names()[source]#
Get a list with all available colorscale names.
New in version 0.1.21: Replaces the deprecated get_all_colorscale_names().
- Returns:
A list with all available colorscale names.
- Return type:
list[str]
- ridgeplot._colors.get_all_colorscale_names()[source]#
Get a tuple with all available colorscale names.
Deprecated since version 0.1.21: Use list_all_colorscale_names() instead.
- Returns:
A tuple with all available colorscale names.
- Return type:
tuple[str, ...]
- ridgeplot._colors.get_colorscale(name)[source]#
Get a colorscale by name.
- Parameters:
name – The colorscale name. This argument is case-insensitive. For instance, "YlOrRd" and "ylorrd" map to the same colorscale. Colorscale names ending in _r represent a reversed colorscale.
- Returns:
A colorscale.
- Return type:
ColorScale
- Raises:
ValueError – If an unknown name is provided
Figure factory#
- ridgeplot._figure_factory.LabelsArray#
A LabelsArray represents the labels of traces in a ridgeplot.
For instance, the following is a valid LabelsArray:
>>> labels_array: LabelsArray = [
...     ["trace 1", "trace 2", "trace 3"],
...     ["trace 4", "trace 5"],
... ]
alias of Collection[Collection[str]]
- ridgeplot._figure_factory.ShallowLabelsArray#
Shallow type for LabelsArray.
Example:
>>> labels_array: ShallowLabelsArray = ["trace 1", "trace 2", "trace 3"]
alias of Collection[str]
- ridgeplot._figure_factory.ColorsArray#
A ColorsArray represents the colors of traces in a ridgeplot.
For instance, the following is a valid ColorsArray:
>>> colors_array: ColorsArray = [
...     ["red", "blue", "green"],
...     ["orange", "purple"],
... ]
alias of Collection[Collection[str]]
- ridgeplot._figure_factory.ShallowColorsArray#
Shallow type for ColorsArray.
Example:
>>> colors_array: ShallowColorsArray = ["red", "blue", "green"]
alias of Collection[str]
- ridgeplot._figure_factory.MidpointsArray#
A MidpointsArray represents the midpoints of colorscales in a ridgeplot.
For instance, the following is a valid MidpointsArray:
>>> midpoints_array: MidpointsArray = [
...     [0.2, 0.5, 1],
...     [0.3, 0.7],
... ]
alias of Collection[Collection[float]]
- ridgeplot._figure_factory.get_xy_extrema(densities)[source]#
Get the global x-y extrema (x_min, x_max, y_min, y_max) over all traces in the Densities array.
- Parameters:
densities – A Densities array.
- Returns:
A tuple of the form (x_min, x_max, y_min, y_max).
- Return type:
Examples
>>> get_xy_extrema(
...     [
...         [
...             [(0, 0), (1, 1), (2, 2), (3, 3)],
...             [(0, 0), (1, 1), (2, 2)],
...             [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
...         ],
...         [
...             [(-2, 2), (-1, 1), (0, 1)],
...             [(2, 2), (3, 1), (4, 1)],
...         ],
...     ]
... )
(-2, 4, 0, 4)
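One way to compute such extrema is to flatten all \((x, y)\) pairs and take the minimum and maximum of each coordinate. The `xy_extrema` function below is a hypothetical sketch of that idea, not the source of get_xy_extrema(); it reuses the input from the example above.

```python
def xy_extrema(densities):
    """Hypothetical sketch: global (x_min, x_max, y_min, y_max) over all traces."""
    # Flatten rows -> traces -> (x, y) points into two coordinate lists
    xs = [x for row in densities for trace in row for x, _ in trace]
    ys = [y for row in densities for trace in row for _, y in trace]
    return min(xs), max(xs), min(ys), max(ys)


example_densities = [
    [
        [(0, 0), (1, 1), (2, 2), (3, 3)],
        [(0, 0), (1, 1), (2, 2)],
        [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
    ],
    [
        [(-2, 2), (-1, 1), (0, 1)],
        [(2, 2), (3, 1), (4, 1)],
    ],
]
print(xy_extrema(example_densities))  # (-2, 4, 0, 4)
```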
- class ridgeplot._figure_factory.RidgePlotFigureFactory(densities, colorscale, coloralpha, colormode, labels, linewidth, spacing, show_yticklabels, xpad)[source]#
Refer to ridgeplot.ridgeplot().
- draw_base(x, y_shifted)[source]#
Draw the base for a density trace.
Adds an invisible trace at constant y that will serve as the fill-limit for the corresponding density trace.
- draw_density_trace(x, y, y_shifted, label, color)[source]#
Draw a density trace.
Adds a density ‘trace’ to the Figure. The fill="tonexty" option fills the trace until the previously drawn trace (see draw_base()). This is why the base trace must be drawn first.
- _compute_midpoints_row_index()[source]#
colormode='row-index'
Uses the row’s index. E.g., if the ridgeplot has 3 rows of traces, then the midpoints will be [[1, …], [0.5, …], [0, …]].
- _compute_midpoints_trace_index()[source]#
colormode='trace-index'
Uses the trace’s index. E.g., if the ridgeplot has a total of 3 traces (across all rows), then the midpoints will be 0, 0.5, and 1, respectively.
- _compute_midpoints_mean_minmax()[source]#
colormode='mean-minmax'
Uses the min-max normalized (weighted) mean of each density to calculate the midpoints. The normalization min and max values are the minimum and maximum x-values from all densities, respectively.
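The row-index and mean-minmax descriptions above can be made concrete with a short sketch. The function names are hypothetical, and two details are assumptions: the weighted mean uses the density values \(y\) as weights for \(x\), and a ridgeplot with a single row falls back to a midpoint of 0.

```python
def midpoints_row_index(densities):
    """Hypothetical sketch: first row gets 1, last row gets 0 (as in the 3-row example)."""
    n_rows = len(densities)
    return [
        [1 - i / (n_rows - 1) if n_rows > 1 else 0.0 for _ in row]
        for i, row in enumerate(densities)
    ]


def midpoints_mean_minmax(densities):
    """Hypothetical sketch: y-weighted mean of x, min-max normalized over all x-values."""
    xs = [x for row in densities for trace in row for x, _ in trace]
    x_min, x_max = min(xs), max(xs)

    def weighted_mean(trace):
        total_weight = sum(y for _, y in trace)
        return sum(x * y for x, y in trace) / total_weight

    return [
        [(weighted_mean(trace) - x_min) / (x_max - x_min) for trace in row]
        for row in densities
    ]


# Three rows with one triangular trace each, peaking at x = 1, 2, and 3
demo_densities = [
    [[(0, 0), (1, 1), (2, 0)]],
    [[(1, 0), (2, 1), (3, 0)]],
    [[(2, 0), (3, 1), (4, 0)]],
]
print(midpoints_row_index(demo_densities))
print(midpoints_mean_minmax(demo_densities))
```

With these inputs, the mean-minmax midpoints grow with the position of each peak along the x-axis, while the row-index midpoints depend only on the row order.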
Kernel density estimation (KDE)#
Testing#
- ridgeplot._testing.patch_plotly_show()[source]#
Patch the plotly.io.show() function to skip any rendering steps and, instead, simply call plotly.io._utils.validate_coerce_fig_to_dict().
Types#
- ridgeplot._types.CollectionL1#
A TypeAlias for a 1-level-deep Collection.
Example:
>>> c1 = [1, 2, 3]
alias of Collection[_T]
- ridgeplot._types.CollectionL2#
A TypeAlias for a 2-level-deep Collection.
Example:
>>> c2 = [[1, 2, 3], [4, 5, 6]]
alias of Collection[Collection[_T]]
- ridgeplot._types.CollectionL3#
A TypeAlias for a 3-level-deep Collection.
Example:
>>> c3 = [
...     [[1, 2], [3, 4]],
...     [[5, 6], [7, 8]],
... ]
alias of Collection[Collection[Collection[_T]]]
- class ridgeplot._types.NumericT#
A TypeVar variable bound to Numeric types.
alias of TypeVar('NumericT', bound=Union[int, integer, float, floating])
- ridgeplot._types._is_numeric(obj: Numeric) Literal[True] [source]#
- ridgeplot._types._is_numeric(obj: Any) bool
Check if the given object is a Numeric type.
- ridgeplot._types.XYCoordinate#
A 2D \((x, y)\) coordinate, represented as a tuple of two Numeric values.
Example:
>>> xy_coord = (1, 2)
- ridgeplot._types.DensityTrace#
A 2D line/trace represented as a collection of \((x, y)\) coordinates (i.e. XYCoordinates).
These are equivalent:
DensityTrace
CollectionL1[XYCoordinate]
Collection[Tuple[Numeric, Numeric]]
By convention, the \(x\) values should be non-repeating and increasing. For instance, the following is a valid 2D line trace:
>>> density_trace = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]
alias of Collection[Tuple[NumericT, NumericT]]
- ridgeplot._types.DensitiesRow#
A DensitiesRow represents a set of DensityTraces that are to be plotted on a given row of a ridgeplot.
These are equivalent:
DensitiesRow
CollectionL2[XYCoordinate]
Collection[Collection[Tuple[Numeric, Numeric]]]
Example:
>>> densities_row = [
...     [(0, 0), (1, 1), (2, 0)],  # Trace 1
...     [(1, 0), (2, 1), (3, 2), (4, 1)],  # Trace 2
...     [(3, 0), (4, 1), (5, 2), (6, 1), (7, 0)],  # Trace 3
... ]
alias of Collection[Collection[Tuple[NumericT, NumericT]]]
- ridgeplot._types.Densities#
The Densities type represents the entire collection of traces that are to be plotted on a ridgeplot.
In a ridgeplot, several traces can be plotted on different rows. Each row is represented by a DensitiesRow object which, in turn, is a collection of DensityTraces. Therefore, the Densities type is a collection of DensitiesRows.
These are equivalent:
Densities
CollectionL1[DensitiesRow]
CollectionL3[XYCoordinate]
Collection[Collection[Collection[Tuple[Numeric, Numeric]]]]
For instance, the following is a valid Densities object:
>>> densities = [
...     [  # Row 1
...         [(0, 0), (1, 1), (2, 0)],  # Trace 1
...         [(1, 0), (2, 1), (3, 2), (4, 1)],  # Trace 2
...         [(3, 0), (4, 1), (5, 2), (6, 1), (7, 0)],  # Trace 3
...     ],
...     [  # Row 2
...         [(-2, 0), (-1, 1), (0, 0)],  # Trace 4
...         [(0, 0), (1, 1), (2, 1), (3, 0)],  # Trace 5
...     ],
... ]
alias of Collection[Collection[Collection[Tuple[NumericT, NumericT]]]]
- ridgeplot._types.ShallowDensities#
Shallow type for Densities where each row of the ridgeplot contains only a single trace.
These are equivalent:
ShallowDensities
CollectionL1[DensityTrace]
CollectionL2[XYCoordinate]
Collection[Collection[Tuple[Numeric, Numeric]]]
Example:
>>> shallow_densities = [
...     [(0, 0), (1, 1), (2, 0)],  # Trace 1
...     [(1, 0), (2, 1), (3, 0)],  # Trace 2
...     [(2, 0), (3, 1), (4, 0)],  # Trace 3
... ]
alias of Collection[Collection[Tuple[NumericT, NumericT]]]
- ridgeplot._types.is_shallow_densities(obj: ShallowDensities) Literal[True] [source]#
- ridgeplot._types.is_shallow_densities(obj: Any) bool
Check if the given object is a ShallowDensities type.
- ridgeplot._types.SamplesTrace#
A SamplesTrace is a collection of numeric values representing a set of samples from which a DensityTrace can be estimated via KDE.
Example:
>>> samples_trace = [0, 1, 1, 2, 2, 2, 3, 3, 4]
- ridgeplot._types.SamplesRow#
A SamplesRow represents a set of SamplesTraces that are to be plotted on a given row of a ridgeplot.
i.e. a SamplesRow is a collection of SamplesTraces and can be converted into a DensitiesRow by applying KDE to each trace.
Example:
>>> samples_row = [
...     [0, 1, 1, 2, 2, 2, 3, 3, 4],  # Trace 1
...     [1, 2, 2, 3, 3, 3, 4, 4, 5],  # Trace 2
... ]
alias of Collection[Collection[Union[int, integer, float, floating]]]
- ridgeplot._types.Samples#
The Samples type represents the entire collection of samples that are to be plotted on a ridgeplot.
It is a collection of SamplesRow objects. Each row is represented by a SamplesRow type which, in turn, is a collection of SamplesTraces which can be converted into DensityTraces by applying a kernel density estimation algorithm.
Therefore, the Samples type can be converted into a Densities type by applying a kernel density estimation (KDE) algorithm to each trace.
See Densities for more details.
Example:
>>> samples = [
...     [  # Row 1
...         [0, 1, 1, 2, 2, 2, 3, 3, 4],  # Trace 1
...         [1, 2, 2, 3, 3, 3, 4, 4, 5],  # Trace 2
...     ],
...     [  # Row 2
...         [2, 3, 3, 4, 4, 4, 5, 5, 6],  # Trace 3
...         [3, 4, 4, 5, 5, 5, 6, 6, 7],  # Trace 4
...     ],
... ]
alias of Collection[Collection[Collection[Union[int, integer, float, floating]]]]
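To illustrate how a Samples array maps onto a Densities array, here is a minimal sketch that applies a plain Gaussian KDE to every trace. This is a swapped-in, pure-Python estimator for illustration only: the library itself relies on statsmodels for KDE, and the `gaussian_kde` and `samples_to_densities` names, the evaluation grid, and the bandwidth of 1.0 are all assumptions.

```python
import math


def gaussian_kde(samples_trace, points, bandwidth):
    """Hypothetical helper: Gaussian KDE of one trace, as a list of (x, y) pairs."""
    norm = len(samples_trace) * bandwidth * math.sqrt(2 * math.pi)
    return [
        (
            x,
            sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples_trace)
            / norm,
        )
        for x in points
    ]


def samples_to_densities(samples, points, bandwidth=1.0):
    """Apply the KDE to every trace, keeping the rows -> traces nesting."""
    return [
        [gaussian_kde(trace, points, bandwidth) for trace in row] for row in samples
    ]


demo_samples = [
    [  # Row 1
        [0, 1, 1, 2, 2, 2, 3, 3, 4],
        [1, 2, 2, 3, 3, 3, 4, 4, 5],
    ],
    [  # Row 2
        [2, 3, 3, 4, 4, 4, 5, 5, 6],
        [3, 4, 4, 5, 5, 5, 6, 6, 7],
    ],
]
grid = list(range(0, 8))  # evaluation points for every trace
demo_kde = samples_to_densities(demo_samples, grid)
print(max(demo_kde[0][0], key=lambda point: point[1]))  # highest point of the first trace
```

The nesting of the output mirrors the input: each SamplesTrace becomes one DensityTrace with one \((x, y)\) pair per evaluation point.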
- ridgeplot._types.ShallowSamples#
Shallow type for Samples where each row of the ridgeplot contains only a single trace.
Example:
>>> shallow_samples = [
...     [0, 1, 1, 2, 2, 2, 3, 3, 4],  # Trace 1
...     [1, 2, 2, 3, 3, 3, 4, 4, 5],  # Trace 2
... ]
alias of Collection[Collection[Union[int, integer, float, floating]]]
- ridgeplot._types.is_shallow_samples(obj: ShallowSamples) Literal[True] [source]#
- ridgeplot._types.is_shallow_samples(obj: Any) bool
Check if the given object is a ShallowSamples type.
Other utilities#
- ridgeplot._utils.get_collection_array_shape(arr)[source]#
Return the shape of a Collection array.
- Parameters:
arr – The Collection array.
- Returns:
The elements of the shape tuple give the lengths of the corresponding array dimensions. If the length of a dimension is variable, the corresponding element is a set of the variable lengths. Otherwise (if the length of a dimension is fixed), the corresponding element is an int.
- Return type:
Tuple[Union[int, Set[int]], ...]
Examples
>>> get_collection_array_shape([1, 2, 3])
(3,)
>>> get_collection_array_shape([[1, 2, 3], [4, 5]])
(2, {2, 3})
>>> get_collection_array_shape(
...     [
...         [
...             [1, 2, 3], [4, 5]
...         ],
...         [
...             [6, 7, 8, 9],
...         ],
...     ]
... )
(2, {1, 2}, {2, 3, 4})
>>> get_collection_array_shape(
...     [
...         [
...             [1], [2, 3], [4, 5, 6],
...         ],
...         [
...             [7, 8, 9, 10, 11],
...         ],
...     ]
... )
(2, {1, 3}, {1, 2, 3, 5})
>>> get_collection_array_shape(
...     [
...         [
...             [(0, 0), (1, 1), (2, 2), (3, 3)],
...             [(0, 0), (1, 1), (2, 2)],
...             [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
...         ],
...         [
...             [(-2, 2), (-1, 1), (0, 1)],
...             [(2, 2), (3, 1), (4, 1)],
...         ],
...     ]
... )
(2, {2, 3}, {3, 4, 5}, 2)
>>> get_collection_array_shape(
...     [
...         [
...             ["a", "b", "c", "d"], ["e", "f"],
...         ],
...         [
...             ["h", "i", "j", "k", "l"],
...         ],
...     ]
... )
(2, {1, 2}, {2, 4, 5})
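The shape rules above (an int for a fixed-length dimension, a set for a variable-length one) can be sketched with a level-by-level traversal. `collection_shape` is a hypothetical re-implementation written for illustration, not the library's code. It assumes that strings and bytes count as leaf values, which matches the last example above, and it stops at the first level that mixes collections with non-collections.

```python
from collections.abc import Collection


def _is_nested(item):
    # Strings are Collections too, but are treated as leaf values here
    return isinstance(item, Collection) and not isinstance(item, (str, bytes))


def collection_shape(arr):
    """Hypothetical sketch: per-dimension lengths of a (possibly ragged) nested collection."""
    shape = []
    level = [arr]  # all collections found at the current depth
    while level and all(_is_nested(item) for item in level):
        lengths = {len(item) for item in level}
        # A single distinct length means the dimension is fixed
        shape.append(lengths.pop() if len(lengths) == 1 else lengths)
        level = [child for item in level for child in item]
    return tuple(shape)


print(collection_shape([[1, 2, 3], [4, 5]]))
```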
- class ridgeplot._utils.LazyMapping(loader)[source]#
A lazy mapping that loads its contents only when first needed.
- Parameters:
loader – A callable that returns a mapping.
Examples
>>> from typing import Dict
>>>
>>> def my_io_loader() -> Dict[str, int]:
...     print("Loading...")
...     return {"a": 1, "b": 2}
...
>>> lazy_mapping = LazyMapping(my_io_loader)
>>> lazy_mapping
Loading...
{'a': 1, 'b': 2}
- _loader#
- _abc_impl = <_abc._abc_data object>#
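The lazy-loading behaviour described for LazyMapping can be sketched on top of `collections.abc.Mapping`. The `LazyDict` class below is a hypothetical stand-in that only illustrates the idea of deferring the loader call until first access; the real class may differ in its details.

```python
from collections.abc import Mapping


class LazyDict(Mapping):
    """Hypothetical stand-in for LazyMapping: calls `loader` once, on first access."""

    def __init__(self, loader):
        self._loader = loader
        self._data = None  # nothing is loaded yet

    def _load(self):
        if self._data is None:
            self._data = self._loader()
        return self._data

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


loader_calls = []


def my_io_loader():
    loader_calls.append("loaded")  # record every time the loader actually runs
    return {"a": 1, "b": 2}


lazy = LazyDict(my_io_loader)
calls_before_access = len(loader_calls)  # still 0: construction does not load
first, second = lazy["a"], lazy["b"]  # the first lookup triggers the loader
print(calls_before_access, len(loader_calls), first, second)
```

Subclassing `Mapping` means only `__getitem__`, `__iter__`, and `__len__` have to be written; methods such as `keys()`, `items()`, and `get()` come for free.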
Alternatives#
Changelog#
This document outlines the list of changes to ridgeplot between each release. For full details, see the commit logs.
Unreleased changes#
…
0.1.23#
Fix the references to the interactive Plotly IFrames (#129)
0.1.22#
Deprecations#
The colormode='index' value has been deprecated in favor of colormode='row-index', which provides the same functionality but is more explicit and makes it possible to distinguish between the 'row-index' and 'trace-index' modes. (#114)
The show_annotations argument has been deprecated in favor of show_yticklabels. (#114)
The get_all_colorscale_names() function has been deprecated in favor of list_all_colorscale_names(). (#114)
Features#
Documentation#
Major update to the documentation, including more examples, interactive plots, script to generate the HTML and WebP images from the example scripts, improved API reference, and more. (#114)
Internal#
0.1.21#
Features#
Add ridgeplot.datasets.load_probly() helper function to load the probly toy dataset. The probly.csv file is now included in the package under ridgeplot/datasets/data/. (#80)
Documentation#
Internal#
Fixed and improved some type annotations, including the introduction of the ridgeplot._types module for type aliases such as Numeric and NestedNumericSequence. (#80)
Add the blacken-docs pre-commit hook and add the pep8-naming, flake8-pytest-style, flake8-simplify, flake8-implicit-str-concat, flake8-bugbear, flake8-rst-docstrings, etc… plugins to the flake8 pre-commit hook. (#81)
Cleanup and improve some type annotations. (#81)
Update deprecated set-output commands (GitHub Actions) (#87)
0.1.17#
Automate the release process. See .github/workflows/release.yaml, which issues a new GitHub release whenever a new git tag is pushed to the main branch by extracting the release notes from the changelog.
Fix automated release process to PyPI. (#27)
0.1.16#
0.1.14#
Remove named_colorscales from public API (#18)
0.1.13#
Add tests for example scripts (#14)
0.1.12#
Internal#
Update and standardise CI steps (#6)
Documentation#
0.1.11#
colors.json was missing from the final distributions (#2)
0.1.0#
🚀 Initial release!
Contributing#
Thank you for your interest in contributing to ridgeplot! 🚀
The contribution process for ridgeplot should start with filing a GitHub issue. We define three main categories of issues, and each category has its own GitHub issue template:
⭐ Feature requests
🐛 Bug reports
📚 Documentation fixes
After the implementation strategy has been agreed on by a ridgeplot contributor, the next step is to introduce your changes as a pull request (see Pull Request Workflow) against the ridgeplot repository. Once your pull request is merged, your changes will be automatically included in the next ridgeplot release. Every change should be listed in the ridgeplot Changelog.
The following is a set of (slightly opinionated) rules and general guidelines for contributing to ridgeplot. Emphasis on guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.
Development environment#
Here are some guidelines for setting up your development environment. Most of the steps have been abstracted away using the make build automation tool. Feel free to peek inside the Makefile at any time to see exactly what is being run, and in which order.
First, you will need to clone this repository. For this, make sure you have a GitHub account, fork ridgeplot to your GitHub account by clicking the Fork button, and clone the main repository locally (e.g. using SSH)
git clone git@github.com:tpvasconcelos/ridgeplot.git
cd ridgeplot
You will also need to add your fork as a remote to push your work to. Replace {username} with your GitHub username.
git remote add fork git@github.com:{username}/ridgeplot.git
The following command will 1) create a new virtual environment (under .venv), 2) install ridgeplot in editable mode (along with all its dependencies), and 3) set up and install all pre-commit hooks. Make sure you always work within this virtual environment (i.e., $ source .venv/bin/activate). On top of this, you should also set up your IDE to always point to this python interpreter. In PyCharm, open Preferences -> Project: ridgeplot -> Project Interpreter and point the python interpreter to .venv/bin/python.
make init
The default and recommended base python is python3.7. You can change this by exporting the BASE_PYTHON environment variable. For instance, if you are having issues installing scientific packages on macOS for python 3.7, you can try python 3.8 instead:
BASE_PYTHON=python3.8 make init
If you need to use jupyter-lab, you can install all extra requirements, as well as set up the environment and jupyter kernel with
make init-jupyter
Pull Request Workflow#
Always confirm that you have properly configured your Git username and email.
git config --global user.name 'Your name'
git config --global user.email 'Your email address'
Each release series has its own branch (i.e. MAJOR.MINOR.x). If submitting a documentation or bug fix contribution, branch off of the latest release series branch.
git fetch origin
git checkout -b <YOUR-BRANCH-NAME> origin/x.x.x
Otherwise, if submitting a new feature or API change, branch off of the main branch.
git fetch origin
git checkout -b <YOUR-BRANCH-NAME> origin/main
Apply and commit your changes.
Include tests that cover any code changes you make, and make sure the test fails without your patch.
Add an entry to CHANGES.md summarising the changes in this pull request. The entry should follow the same style and format as other entries, i.e.
- Your summary here. (#XXX)
where #XXX should link to the relevant pull request. If you think that the changes in this pull request do not warrant a changelog entry, please state it in your pull request’s description. In such cases, a maintainer should add a skip news label to make CI pass.
Make sure all integration approval steps are passing locally (i.e., tox).
Push your changes to your fork.
git push --set-upstream fork <YOUR-BRANCH-NAME>
Create a pull request. Remember to update the pull request’s description with relevant notes on the changes implemented, and to link to relevant issues (e.g., fixes #XXX or closes #XXX).
Wait for all remote CI checks to pass and for a ridgeplot contributor to approve your pull request.
Continuous Integration#
From GitHub’s Continuous Integration and Continuous Delivery (CI/CD) Fundamentals:
Continuous Integration (CI) automatically builds, tests, and integrates code changes within a shared repository.
The first step to Continuous Integration (CI) is having a version control system (VCS) in place. Luckily, you don’t have to worry about that! As you have already noticed, we use Git and host on GitHub.
On top of this, we also run a series of integration approval steps that allow us to ship code changes faster and more reliably. In order to achieve this, we run automated tests and coverage reports, as well as syntax (and type) checkers, code style formatters, and dependency vulnerability scans.
Running it locally#
Our tool of choice to configure and reliably run all integration approval steps is Tox, which allows us to run each step in reproducible isolated virtual environments. To trigger all checks in parallel, simply run
./bin/tox --parallel auto -m static tests
It’s that simple 🙌 !! Note only that this will take a while the first time you run the command, since it will have to create all the required virtual environments (along with their dependencies) for each CI step.
The configuration for Tox can be found in tox.ini.
Tests and coverage reports#
We use pytest as our testing framework, and pytest-cov to track and measure code coverage. You can find all configuration details in tox.ini. To trigger all tests, simply run
./bin/tox --parallel auto -m tests
If you need more control over which tests are running, or which flags are being passed to pytest, you can also invoke pytest directly, which will run in your current virtual environment. Configuration details can be found in tox.ini.
Linting#
This project uses pre-commit hooks to check and automatically fix any formatting rules. These checks are triggered before creating any git commit. To manually trigger all linting steps (i.e., all pre-commit hooks), run
pre-commit run --all-files
For more information on which hooks will run, have a look inside the .pre-commit-config.yaml configuration file. If you want to manually trigger individual hooks, you can invoke the pre-commit script directly. If you need even more control over the tools used, you could also invoke them directly (e.g., isort .). Remember, however, that this is not the recommended approach.
GitHub Actions#
We use GitHub Actions to automatically run all integration approval steps defined with Tox on every push or pull request event. These checks run on all major operating systems and all supported Python versions. Finally, the generated coverage reports are uploaded to Codecov and Codacy. Check .github/workflows/ci.yaml for more details.
Tools and software#
Here is a quick overview of all CI tools and software in use, some of which have already been discussed in the sections above.
Tool | Category | Config files | Details
---|---|---|---
 | 🔧 Orchestration | | We use Tox to reliably run all integration approval steps in reproducible isolated virtual environments.
 | 🔧 Orchestration | | Workflow automation for GitHub. We use it to automatically run all integration approval steps defined with Tox on every push or pull request event.
 | 🕰 VCS | | The project’s version control system software of choice.
 | 🧪 Testing | | Testing framework for python code.
 | 📊 Coverage | | Coverage plugin for pytest.
 | 📊 Coverage | | Two great services for tracking, monitoring, and alerting on code coverage and code quality.
 | 💅 Linting | | Used to automatically check and fix any formatting rules on every commit.
 | 💅 Linting | | A static type checker for Python. We use quite a strict configuration here, which can be tricky at times. Feel free to ask for help from the community by commenting on your issue or pull request.
 | 💅 Linting | | “The uncompromising Python code formatter”. We use
 | 💅 Linting | | Used to check the style and quality of python code.
 | 💅 Linting | | Used to sort python imports.
 | 💅 Linting | | This repository uses the
Project structure#
Community health files#
GitHub’s community health files allow repository maintainers to set contributing guidelines to help collaborators make meaningful, useful contributions to a project. Read more in the official reference.
CODE_OF_CONDUCT.md - A CODE_OF_CONDUCT file defines standards for how to engage in a community. For more information, see “Adding a code of conduct to your project.”
CONTRIBUTING.md - A CONTRIBUTING file communicates how people should contribute to your project. For more information, see “Setting guidelines for repository contributors.”
Configuration files#
For more context on some of the tools referenced below, refer to the sections on Continuous Integration.
.github/workflows/ci.yaml - Workflow definition for our CI GitHub Actions pipeline.
.pre-commit-config.yaml - List of pre-commit hooks.
.editorconfig - EditorConfig standard configuration file.
mypy.ini - Configuration for the mypy static type checker.
build system requirements (probably won’t need to touch these!) and black configurations.
setup.cfg - Here, we specify the package metadata, requirements, as well as configuration details for flake8 and isort.
Release process#
You need push access to the project’s repository to make releases. The following release steps are here for reference only.
Review the ## Unreleased changes section in CHANGES.md by checking for consistency in format and, if necessary, refactoring related entries into relevant subsections (e.g. Features, Docs, Bugfixes, Security, etc). Take a look at previous release notes for guidance and try to keep it consistent.
Submit a pull request with these changes only and use the "Cleanup release notes for X.X.X release" template for the pull request title. ridgeplot uses the SemVer (MAJOR.MINOR.PATCH) versioning standard. You can determine the latest release version by running git describe --tags --abbrev=0 on the main branch. Based on this, you can determine the next release version by incrementing the MAJOR, MINOR, or PATCH. More on this in the next section. For now, just make sure you merge this pull request into the main branch before continuing.
Use the bumpversion utility to bump the current version. This utility will automatically bump the current version, and issue a relevant commit and git tag. E.g.,
# Bump MAJOR version (e.g., 0.4.2 -> 1.0.0)
bumpversion major
# Bump MINOR version (e.g., 0.4.2 -> 0.5.0)
bumpversion minor
# Bump PATCH version (e.g., 0.4.2 -> 0.4.3)
bumpversion patch
You can always perform a dry-run to see what will happen under the hood.
bumpversion --dry-run --verbose [--allow-dirty] [major,minor,patch]
Push your changes along with all tag references:
git push && git push --tags
At this point a couple of GitHub Actions workflows will be triggered:
.github/workflows/ci.yaml: Runs all CI checks with Tox against the new changes pushed to main.
.github/workflows/release.yaml: Issues a new GitHub release triggered by the new git tag pushed in the previous step.
.github/workflows/publish-pypi.yaml: Builds, packages, and uploads the source and wheel package to PyPI (and test PyPI). This is triggered by the new GitHub release created in the previous step.
Trust but verify!
Verify that all three workflows passed successfully: https://github.com/tpvasconcelos/ridgeplot/actions
Verify that the new git tag is present in the remote repository: https://github.com/tpvasconcelos/ridgeplot/tags
Verify that the new release is present in the remote repository and that the release notes were correctly parsed: https://github.com/tpvasconcelos/ridgeplot/releases
Verify that the new package is available in PyPI: https://pypi.org/project/ridgeplot/
Verify that the docs were updated and published to https://ridgeplot.readthedocs.io/en/stable/
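The MAJOR, MINOR, and PATCH increments used in the release steps above follow simple arithmetic, sketched below. The `bump` function is a hypothetical illustration of the SemVer rule only; it is not how the bumpversion utility is implemented, and it does not create commits or tags.

```python
def bump(version: str, part: str) -> str:
    """Hypothetical helper: increment one SemVer part and reset the lower ones."""
    major, minor, patch = (int(number) for number in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


for part in ("major", "minor", "patch"):
    print(part, bump("0.4.2", part))
```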
Code of Conduct#
Please remember to read and follow our Code of Conduct. 🤝