ridgeplot: beautiful ridgeline plots in Python



ridgeplot is a Python package that provides a simple interface for plotting beautiful and interactive ridgeline plots within the extensive Plotly ecosystem.

Installation

ridgeplot can be installed and updated from PyPI using pip:

pip install -U ridgeplot

For more information, see the installation guide.

Getting started

Take a look at the getting started guide, which provides a quick introduction to the ridgeplot library.

Basic example

For those in a hurry, here’s a very basic example of how to get started with ridgeplot().

import numpy as np
from ridgeplot import ridgeplot

my_samples = [np.random.normal(n / 1.2, size=600) for n in range(9, 0, -1)]
fig = ridgeplot(samples=my_samples)
fig.update_layout(height=500, width=800)
fig.show()

Figure: ridgeline plot example using the ridgeplot Python library

Installation#

Installing from PyPI#

ridgeplot can be installed and updated from PyPI using pip:

pip install -U ridgeplot

Installing from source#

The source code for this project is hosted on GitHub at: https://github.com/tpvasconcelos/ridgeplot

Take a look at the contributing guide for instructions on how to build from the git source. If you wish to contribute to the project, also refer to the instructions on creating a local development environment.

Dependencies#

We try to keep the number of dependencies to a minimum and only use common and well-established libraries in the scientific Python ecosystem. Currently, we only depend on the following three Python packages:

  • plotly - The interactive graphing backend that powers ridgeplot

  • statsmodels - Used for Kernel Density Estimation (KDE)

  • numpy - Supporting library for multidimensional array manipulations

Getting started#

This page provides a quick introduction to the ridgeplot library, showcasing some of its features and providing a few practical examples. All examples use the ridgeplot.ridgeplot() function, which is the main entry point to the library. For more information on the available options, take a look at the reference page.

Basic example#

This basic example shows how you can quickly get started with a simple call to the ridgeplot() function.

import numpy as np
from ridgeplot import ridgeplot

my_samples = [np.random.normal(n / 1.2, size=600) for n in range(9, 0, -1)]
fig = ridgeplot(samples=my_samples)
fig.update_layout(height=500, width=800)
fig.show()

Figure: ridgeline plot example using the ridgeplot Python library

Flexible configuration#

In this example, we will try to replicate the first ridgeline plot in this post from Data to Viz. The example in the post was created using the “Perception of Probability Words” dataset (see load_probly()) and the popular ggridges R package. In the end, we will see how the ridgeplot Python library can be used to create a (nearly) identical plot, thanks to its extensive configuration options.

import numpy as np
from ridgeplot import ridgeplot
from ridgeplot.datasets import load_probly

# Load the probly dataset
df = load_probly()

# Let's grab the subset of columns used in the example
column_names = [
    "Almost Certainly",
    "Very Good Chance",
    "We Believe",
    "Likely",
    "About Even",
    "Little Chance",
    "Chances Are Slight",
    "Almost No Chance",
]
df = df[column_names]

# Not only does 'ridgeplot(...)' come configured with sensible defaults,
# but it is also fully configurable to your own style and preferences!
fig = ridgeplot(
    samples=df.values.T,
    bandwidth=4,
    kde_points=np.linspace(-12.5, 112.5, 500),
    colorscale="viridis",
    colormode="row-index",
    coloralpha=0.65,
    labels=column_names,
    linewidth=2,
    spacing=5 / 9,
)

# And you can still update and extend the final
# Plotly Figure using standard Plotly methods
fig.update_layout(
    height=760,
    width=900,
    font_size=16,
    plot_bgcolor="white",
    xaxis_tickvals=[-12.5, 0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100, 112.5],
    xaxis_ticktext=["", "0", "", "25", "", "50", "", "75", "", "100", ""],
    xaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_title="Assigned Probability (%)",
    showlegend=False,
)

# Show us the work!
fig.show()

Figure: the ridgeline plot generated by the code above (ridgeline plot of the probly dataset using the ridgeplot Python library).

Figure: the target reference from the Data to Viz post (reference ridgeline plot of the probly dataset from Data to Viz).

More traces#

In this example, we will dive a bit deeper into the samples parameter and see how it can be used to plot multiple traces per row in a ridgeline plot.

Final result#

For those in a hurry, this section already includes the complete final code block and the resulting plot. It also serves as a reference for the rest of the section and shows what this example is working towards. The remaining sections then look more closely at the samples parameter and how flexible it is.

Figure: ridgeline plot of the Lincoln Weather dataset using the ridgeplot Python library

import numpy as np
from ridgeplot import ridgeplot
from ridgeplot.datasets import load_lincoln_weather

# Load test data
df = load_lincoln_weather()

# Transform the data into a 3D (ragged) array format of
# daily min and max temperature samples per month
months = df.index.month_name().unique()
samples = [
    [
        df[df.index.month_name() == month]["Min Temperature [F]"],
        df[df.index.month_name() == month]["Max Temperature [F]"],
    ]
    for month in months
]

# And finish by styling it up to your liking!
fig = ridgeplot(
    samples=samples,
    labels=months,
    coloralpha=0.98,
    bandwidth=4,
    kde_points=np.linspace(-25, 110, 400),
    spacing=0.33,
    linewidth=2,
)
fig.update_layout(
    title="Minimum and maximum daily temperatures in Lincoln, NE (2016)",
    height=650,
    width=950,
    font_size=14,
    plot_bgcolor="rgb(245, 245, 245)",
    xaxis_gridcolor="white",
    yaxis_gridcolor="white",
    xaxis_gridwidth=2,
    yaxis_title="Month",
    xaxis_title="Temperature [F]",
    showlegend=False,
)
fig.show()

Step-by-step#

Let’s start by loading the “Lincoln Weather” test dataset (see load_lincoln_weather()).

>>> from ridgeplot.datasets import load_lincoln_weather
>>> df = load_lincoln_weather()
>>> df[["Min Temperature [F]", "Max Temperature [F]"]].head()
            Min Temperature [F]  Max Temperature [F]
CST
2016-01-01                   11                   37
2016-01-02                    5                   41
2016-01-03                    8                   37
2016-01-04                    4                   30
2016-01-05                   19                   38

The goal will be to plot the KDEs for the minimum and maximum daily temperatures for each month of 2016 (i.e. the year covered by the dataset).

>>> months = df.index.month_name().unique()
>>> months.to_list()
['January', 'February', 'March', 'April', 'May', 'June', 'July',
 'August', 'September', 'October', 'November', 'December']

The samples argument in the ridgeplot() function expects a 3D array of shape \((R, T_r, S_t)\), where \(R\) is the number of rows, \(T_r\) is the number of traces per row, and \(S_t\) is the number of samples per trace, with:

  • \(R=12\) – One row per month.

  • \(T_r=2\) (for all rows \(r \in R\)) – Two traces per row (one for the minimum temperatures and one for the maximum temperatures).

  • \(S_t \in \{29, 30, 31\}\) – One sample per day of the month, where different months have different numbers of days.

We can create this array using a simple list comprehension, where each element of the list is a list of two arrays for a given month: one with the minimum temperature samples and one with the maximum temperature samples.

samples = [
    [
        df[df.index.month_name() == month]["Min Temperature [F]"],
        df[df.index.month_name() == month]["Max Temperature [F]"],
    ]
    for month in months
]

Note

For other use cases (like in the two previous examples), you could use a numpy ndarray to represent the samples. However, since different months have different numbers of days, we need a data container that can hold arrays of different lengths along the same dimension. Irregular arrays like this one are called ragged arrays. There are many ways to represent irregular arrays in Python. In this specific example, we used a list of lists of pandas Series. However, ridgeplot is designed to handle any object that implements the Collection[Collection[Collection[Numeric]]] protocol (i.e. any numeric 3D ragged array).
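To make the ragged \((R, T_r, S_t)\) shape concrete, here is a minimal sketch that rebuilds the same nesting with plain Python lists instead of the real dataset. The dummy values and the names toy_samples, traces_per_row, and samples_per_trace are illustrative assumptions, not part of ridgeplot.

```python
import calendar

# Hypothetical stand-in for the weather data: one dummy reading per day of 2016.
days_per_month = [calendar.monthrange(2016, month)[1] for month in range(1, 13)]

# Same nesting as `samples`: rows (months) -> traces (min, max) -> values.
toy_samples = [[[0.0] * n_days, [1.0] * n_days] for n_days in days_per_month]

n_rows = len(toy_samples)                            # R
traces_per_row = {len(row) for row in toy_samples}   # distinct values of T_r
samples_per_trace = {                                # distinct values of S_t
    len(trace) for row in toy_samples for trace in row
}

print(n_rows, traces_per_row, samples_per_trace)
```

The last dimension takes three distinct values, which is exactly why a regular numpy ndarray cannot hold this data without padding.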

Finally, we can pass the samples list to the ridgeplot() function and specify any other arguments we want to customize the plot, like adjusting the KDE’s bandwidth, the vertical spacing between rows, etc.

fig = ridgeplot(
    samples=samples,
    labels=months,
    coloralpha=0.98,
    bandwidth=4,
    kde_points=np.linspace(-25, 110, 400),
    spacing=0.33,
    linewidth=2,
)

fig.update_layout(
    title="Minimum and maximum daily temperatures in Lincoln, NE (2016)",
    height=650,
    width=950,
    font_size=14,
    plot_bgcolor="rgb(245, 245, 245)",
    xaxis_gridcolor="white",
    yaxis_gridcolor="white",
    xaxis_gridwidth=2,
    yaxis_title="Month",
    xaxis_title="Temperature [F]",
    showlegend=False,
)
fig.show()

Figure: ridgeline plot of the Lincoln Weather dataset using the ridgeplot Python library

API Reference#

ridgeplot.ridgeplot

Return an interactive ridgeline (Plotly) Figure.

Color utilities#

ridgeplot.list_all_colorscale_names

Get a list with all available colorscale names.

Data loading utilities#

ridgeplot.datasets.load_probly

Load a version of the "Perception of Probability Words" (a.k.a., "probly") dataset.

ridgeplot.datasets.load_lincoln_weather

Load the "Weather in Lincoln, Nebraska in 2016" dataset.

Internals#

Color utilities#

ridgeplot._colors.ColorScale#

A colorscale is an iterable of tuples of two elements:

  1. the first element (a scale value) is a float bounded to the interval [0, 1]

  2. the second element (a color) is a string representation of a color parsable by Plotly

For instance, the Viridis colorscale would be defined as

>>> get_colorscale("viridis")
((0.0, 'rgb(68, 1, 84)'),
 (0.1111111111111111, 'rgb(72, 40, 120)'),
 (0.2222222222222222, 'rgb(62, 73, 137)'),
 (0.3333333333333333, 'rgb(49, 104, 142)'),
 (0.4444444444444444, 'rgb(38, 130, 142)'),
 (0.5555555555555556, 'rgb(31, 158, 137)'),
 (0.6666666666666666, 'rgb(53, 183, 121)'),
 (0.7777777777777777, 'rgb(110, 206, 88)'),
 (0.8888888888888888, 'rgb(181, 222, 43)'),
 (1.0, 'rgb(253, 231, 37)'))

alias of Iterable[Tuple[float, str]]
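The two rules above can be checked mechanically. The following is a simplified sketch of such a structural check; looks_like_colorscale is a hypothetical helper written for illustration and is not the library's validate_colorscale(). In particular, it does not verify that Plotly can parse the color strings.

```python
from typing import Iterable, Tuple


def looks_like_colorscale(colorscale: Iterable[Tuple[float, str]]) -> bool:
    """Return True if every entry is a (scale value in [0, 1], color string) pair.

    Hypothetical helper for illustration only.
    """
    for entry in colorscale:
        if not isinstance(entry, tuple) or len(entry) != 2:
            return False
        value, color = entry
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if not 0 <= value <= 1 or not isinstance(color, str):
            return False
    return True


two_stop_scale = ((0.0, "rgb(68, 1, 84)"), (1.0, "rgb(253, 231, 37)"))
print(looks_like_colorscale(two_stop_scale))
```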

ridgeplot._colors._colormap_loader()[source]#

ridgeplot._colors.validate_colorscale(colorscale)[source]#

Validate the structure, scale values, and colors of a colorscale.

Adapted from _plotly_utils.colors.validate_colorscale().

ridgeplot._colors._any_to_rgb(color)[source]#

Convert any color to an rgb string.

Parameters:

color – A color. This can be a tuple of (r, g, b) values, a hex string, or an rgb string.

Returns:

An rgb string.

Return type:

str

Raises:
  • TypeError – If color is not a tuple or a string.

  • ValueError – If color is a string that does not represent a hex or rgb color.
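The documented behaviour (accept a tuple, a hex string, or an rgb string; raise TypeError or ValueError otherwise) could be sketched as follows. This is an illustrative re-implementation under assumptions, not the library's code: the name any_to_rgb_sketch is hypothetical, and only six-digit hex strings and integer rgb strings are handled.

```python
import re
from typing import Tuple, Union

# Assumed input formats: "#rrggbb" and "rgb(r, g, b)" with integer channels
_HEX_RE = re.compile(r"^#([0-9a-fA-F]{6})$")
_RGB_RE = re.compile(r"^rgb\(\s*\d+\s*,\s*\d+\s*,\s*\d+\s*\)$")


def any_to_rgb_sketch(color: Union[str, Tuple[int, int, int]]) -> str:
    """Convert an (r, g, b) tuple, hex string, or rgb string to an rgb string."""
    if isinstance(color, tuple):
        r, g, b = color
        return f"rgb({r}, {g}, {b})"
    if not isinstance(color, str):
        raise TypeError(f"Expected a tuple or a string, got {type(color).__name__}.")
    if _RGB_RE.match(color):
        return color  # already an rgb string
    match = _HEX_RE.match(color)
    if match:
        digits = match.group(1)
        r, g, b = (int(digits[i : i + 2], 16) for i in (0, 2, 4))
        return f"rgb({r}, {g}, {b})"
    raise ValueError(f"{color!r} is neither a hex nor an rgb color string.")


print(any_to_rgb_sketch("#ff0000"))
```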

ridgeplot._colors.list_all_colorscale_names()[source]#

Get a list with all available colorscale names.

New in version 0.1.21: Replaces the deprecated get_all_colorscale_names().

Returns:

A list with all available colorscale names.

Return type:

list[str]

ridgeplot._colors.get_all_colorscale_names()[source]#

Get a tuple with all available colorscale names.

Deprecated since version 0.1.21: Use list_all_colorscale_names() instead.

Returns:

A tuple with all available colorscale names.

Return type:

tuple[str, ...]

ridgeplot._colors.get_colorscale(name)[source]#

Get a colorscale by name.

Parameters:

name – The colorscale name. This argument is case-insensitive. For instance, "YlOrRd" and "ylorrd" map to the same colorscale. Colorscale names ending in _r represent a reversed colorscale.

Returns:

A colorscale.

Return type:

ColorScale

Raises:

ValueError – If an unknown name is provided

ridgeplot._colors.get_color(colorscale, midpoint)[source]#

Get a color from a colorscale at a given midpoint.

Given a colorscale, it interpolates the expected color at a given midpoint, on a scale from 0 to 1.
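As a rough illustration of what interpolating a color at a midpoint can mean, the sketch below linearly interpolates each RGB channel between the two neighbouring colorscale stops. The function names, the restriction to rgb strings, and the rounding to integer channels are assumptions made for this example; the library's own get_color() may differ in these details.

```python
import re
from typing import Sequence, Tuple


def parse_rgb(color: str) -> Tuple[int, int, int]:
    """Extract the three integer channels from an 'rgb(r, g, b)' string."""
    r, g, b = (int(n) for n in re.findall(r"\d+", color))
    return r, g, b


def interpolate_color(colorscale: Sequence[Tuple[float, str]], midpoint: float) -> str:
    """Linearly interpolate a color at `midpoint` (hypothetical helper)."""
    if not 0 <= midpoint <= 1:
        raise ValueError("midpoint should be a value between 0 and 1")
    stops = sorted(colorscale, key=lambda stop: stop[0])
    # Find the pair of neighbouring stops that brackets the midpoint
    for (lo, lo_color), (hi, hi_color) in zip(stops, stops[1:]):
        if lo <= midpoint <= hi:
            t = 0.0 if hi == lo else (midpoint - lo) / (hi - lo)
            channels = [
                round(a + t * (b - a))
                for a, b in zip(parse_rgb(lo_color), parse_rgb(hi_color))
            ]
            return "rgb({}, {}, {})".format(*channels)
    raise ValueError("midpoint is outside the range covered by the colorscale")


toy_scale = [(0.0, "rgb(0, 0, 0)"), (1.0, "rgb(100, 50, 200)")]
print(interpolate_color(toy_scale, 0.5))
```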

ridgeplot._colors.apply_alpha(color, alpha)[source]#

Figure factory#

ridgeplot._figure_factory.LabelsArray#

A LabelsArray represents the labels of traces in a ridgeplot.

For instance, the following is a valid LabelsArray:

>>> labels_array: LabelsArray = [
...     ["trace 1", "trace 2", "trace 3"],
...     ["trace 4", "trace 5"],
... ]

alias of Collection[Collection[str]]

ridgeplot._figure_factory.ShallowLabelsArray#

Shallow type for LabelsArray.

Example:

>>> labels_array: ShallowLabelsArray = ["trace 1", "trace 2", "trace 3"]

alias of Collection[str]

ridgeplot._figure_factory.ColorsArray#

A ColorsArray represents the colors of traces in a ridgeplot.

For instance, the following is a valid ColorsArray:

>>> colors_array: ColorsArray = [
...     ["red", "blue", "green"],
...     ["orange", "purple"],
... ]

alias of Collection[Collection[str]]

ridgeplot._figure_factory.ShallowColorsArray#

Shallow type for ColorsArray.

Example:

>>> colors_array: ShallowColorsArray = ["red", "blue", "green"]

alias of Collection[str]

ridgeplot._figure_factory.MidpointsArray#

A MidpointsArray represents the midpoints of colorscales in a ridgeplot.

For instance, the following is a valid MidpointsArray:

>>> midpoints_array: MidpointsArray = [
...     [0.2, 0.5, 1],
...     [0.3, 0.7],
... ]

alias of Collection[Collection[float]]

ridgeplot._figure_factory.get_xy_extrema(densities)[source]#

Get the global x-y extrema (x_min, x_max, y_min, y_max) over all traces in a Densities array.

Parameters:

densities – A Densities array.

Returns:

A tuple of the form (x_min, x_max, y_min, y_max).

Return type:

Tuple[Numeric, Numeric, Numeric, Numeric]

Examples

>>> get_xy_extrema(
...     [
...         [
...             [(0, 0), (1, 1), (2, 2), (3, 3)],
...             [(0, 0), (1, 1), (2, 2)],
...             [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
...         ],
...         [
...             [(-2, 2), (-1, 1), (0, 1)],
...             [(2, 2), (3, 1), (4, 1)],
...         ],
...     ]
... )
(-2, 4, 0, 4)
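For intuition, the same result can be reproduced with a few lines of plain Python. This is only a sketch of the idea (flatten every row and trace, then take the minima and maxima); xy_extrema_sketch is a hypothetical name and not the library function.

```python
from typing import Sequence, Tuple

Trace = Sequence[Tuple[float, float]]


def xy_extrema_sketch(densities: Sequence[Sequence[Trace]]) -> Tuple[float, float, float, float]:
    """Return (x_min, x_max, y_min, y_max) over all rows and traces."""
    xs = [x for row in densities for trace in row for x, _ in trace]
    ys = [y for row in densities for trace in row for _, y in trace]
    if not xs:
        raise ValueError("densities should contain at least one (x, y) point")
    return min(xs), max(xs), min(ys), max(ys)


example = [
    [
        [(0, 0), (1, 1), (2, 2), (3, 3)],
        [(0, 0), (1, 1), (2, 2)],
        [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
    ],
    [
        [(-2, 2), (-1, 1), (0, 1)],
        [(2, 2), (3, 1), (4, 1)],
    ],
]
print(xy_extrema_sketch(example))
```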
ridgeplot._figure_factory._mul(a, b)[source]#

Multiply two tuples element-wise.

class ridgeplot._figure_factory.RidgePlotFigureFactory(densities, colorscale, coloralpha, colormode, labels, linewidth, spacing, show_yticklabels, xpad)[source]#

Refer to ridgeplot.ridgeplot().

property colormode_maps: Dict[str, Callable[[], MidpointsArray]]#

draw_base(x, y_shifted)[source]#

Draw the base for a density trace.

Adds an invisible trace at constant y that will serve as the fill-limit for the corresponding density trace.

draw_density_trace(x, y, y_shifted, label, color)[source]#

Draw a density trace.

Adds a density ‘trace’ to the Figure. The fill="tonexty" option fills the trace until the previously drawn trace (see draw_base()). This is why the base trace must be drawn first.

update_layout(y_ticks)[source]#

Update figure’s layout.

_compute_midpoints_row_index()[source]#

colormode='row-index'

Uses the row’s index. e.g. if the ridgeplot has 3 rows of traces, then the midpoints will be [[1, …], [0.5, …], [0, …]].

_compute_midpoints_trace_index()[source]#

colormode='trace-index'

Uses the trace’s index. e.g. if the ridgeplot has a total of 3 traces (across all rows), then the midpoints will be 0, 0.5, and 1, respectively.
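The two index-based modes can be sketched in plain Python as follows. The helpers take the number of traces in each row as input; their names and the handling of the single-row and single-trace cases (a midpoint of 0.0) are assumptions made for this illustration, not the library's implementation.

```python
from typing import List


def row_index_midpoints(traces_per_row: List[int]) -> List[List[float]]:
    """Every trace in row i gets 1 - i / (R - 1), so the first row maps to 1."""
    n_rows = len(traces_per_row)
    if n_rows == 1:
        return [[0.0] * traces_per_row[0]]  # assumption: avoid dividing by zero
    return [[1 - i / (n_rows - 1)] * n for i, n in enumerate(traces_per_row)]


def trace_index_midpoints(traces_per_row: List[int]) -> List[List[float]]:
    """Traces are numbered across all rows and spread evenly over [0, 1]."""
    n_traces = sum(traces_per_row)
    midpoints: List[List[float]] = []
    k = 0  # running trace index across all rows
    for n in traces_per_row:
        row = []
        for _ in range(n):
            row.append(0.0 if n_traces == 1 else k / (n_traces - 1))
            k += 1
        midpoints.append(row)
    return midpoints


print(row_index_midpoints([2, 1, 3]))
print(trace_index_midpoints([2, 1]))
```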

_compute_midpoints_mean_minmax()[source]#

colormode='mean-minmax'

Uses the min-max normalized (weighted) mean of each density to calculate the midpoints. The normalization min and max values are the minimum and maximum x-values from all densities, respectively.

_compute_midpoints_mean_means()[source]#

colormode='mean-means'

Uses the min-max normalized (weighted) mean of each density to calculate the midpoints. The normalization min and max values are the minimum and maximum mean values from all densities, respectively.
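The only difference between the two mean-based modes is the range used for normalisation. The sketch below shows this on a flat list of toy traces; the helper names and the flat (rather than row-nested) input are simplifying assumptions, and the code does not guard against the degenerate case where the minimum and maximum coincide.

```python
from typing import List, Sequence, Tuple

Trace = Sequence[Tuple[float, float]]


def weighted_mean(trace: Trace) -> float:
    """Mean of the x-values, weighted by the density values y."""
    return sum(x * y for x, y in trace) / sum(y for _, y in trace)


def normalise(val: float, min_: float, max_: float) -> float:
    """Min-max normalisation (assumes max_ > min_)."""
    return (val - min_) / (max_ - min_)


def mean_minmax_midpoints(traces: Sequence[Trace]) -> List[float]:
    """Normalise each mean by the global x-range of all traces."""
    xs = [x for trace in traces for x, _ in trace]
    return [normalise(weighted_mean(t), min(xs), max(xs)) for t in traces]


def mean_means_midpoints(traces: Sequence[Trace]) -> List[float]:
    """Normalise each mean by the range spanned by the means themselves."""
    means = [weighted_mean(t) for t in traces]
    return [normalise(m, min(means), max(means)) for m in means]


toy_traces = [[(0, 1), (2, 1)], [(2, 1), (4, 1)], [(4, 1), (6, 1)]]
minmax = mean_minmax_midpoints(toy_traces)
means = mean_means_midpoints(toy_traces)
print(minmax, means)
```

With 'mean-means', the traces with the smallest and largest means always map to 0 and 1, so the full colorscale is used; with 'mean-minmax', the midpoints only reach 0 or 1 if a mean sits at the edge of the global x-range.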

pre_compute_colors()[source]#

make_figure()[source]#

Kernel density estimation (KDE)#

ridgeplot._kde.estimate_density_trace(trace_samples, points, kernel, bandwidth)[source]#

Estimates a density trace from a set of samples.

For a given set of sample values, computes the kernel densities (KDE) at the given points.

ridgeplot._kde.estimate_densities(samples, points, kernel, bandwidth)[source]#

Perform KDE for a set of samples.
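To illustrate what evaluating a KDE at given points produces, here is a self-contained Gaussian KDE written with the standard library only. It is a stand-in for illustration: ridgeplot relies on statsmodels for its KDE, and the function name, the fixed Gaussian kernel, and the scalar bandwidth used here are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_kde_trace(
    samples: Sequence[float],
    points: Sequence[float],
    bandwidth: float,
) -> List[Tuple[float, float]]:
    """Return a density trace [(x, density), ...] evaluated at `points`."""
    if bandwidth <= 0:
        raise ValueError("bandwidth should be a positive number")
    if not samples:
        raise ValueError("samples should not be empty")
    # Normalisation so that the estimated density integrates to 1
    norm = 1 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    trace = []
    for x in points:
        kernel_sum = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        trace.append((x, norm * kernel_sum))
    return trace


toy_trace = gaussian_kde_trace(samples=[0.0], points=[0.0, 1.0], bandwidth=1.0)
print(toy_trace)
```

The output has the same shape as a DensityTrace: a collection of \((x, y)\) pairs whose x-values are the evaluation points.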

Testing#

ridgeplot._testing.patch_plotly_show()[source]#

Patch the plotly.io.show() function to skip any rendering steps and, instead, simply call plotly.io._utils.validate_coerce_fig_to_dict().

Types#

ridgeplot._types.CollectionL1#

A TypeAlias for a 1-level-deep Collection.

Example:

>>> c1 = [1, 2, 3]

alias of Collection[_T]

ridgeplot._types.CollectionL2#

A TypeAlias for a 2-level-deep Collection.

Example:

>>> c2 = [[1, 2, 3], [4, 5, 6]]

alias of Collection[Collection[_T]]

ridgeplot._types.CollectionL3#

A TypeAlias for a 3-level-deep Collection.

Example:

>>> c3 = [
...     [[1, 2], [3, 4]],
...     [[5, 6], [7, 8]],
... ]

alias of Collection[Collection[Collection[_T]]]

ridgeplot._types.Float#

A TypeAlias for float types.

alias of Union[float, floating]

ridgeplot._types.Int#

A TypeAlias for int types.

alias of Union[int, integer]

ridgeplot._types.Numeric#

A TypeAlias for numeric types.

alias of Union[int, integer, float, floating]

class ridgeplot._types.NumericT#

A TypeVar variable bound to Numeric types.

alias of TypeVar('NumericT', bound=Union[int, integer, float, floating])

ridgeplot._types._is_numeric(obj: Numeric) → Literal[True][source]#
ridgeplot._types._is_numeric(obj: Any) → bool

Check if the given object is a Numeric type.

ridgeplot._types.XYCoordinate#

A 2D \((x, y)\) coordinate, represented as a tuple of two Numeric values.

Example:

>>> xy_coord = (1, 2)

alias of Tuple[NumericT, NumericT]

ridgeplot._types.DensityTrace#

A 2D line/trace represented as a collection of \((x, y)\) coordinates (i.e. XYCoordinates).

These are equivalent:

  • DensityTrace

  • CollectionL1[XYCoordinate]

  • Collection[Tuple[Numeric, Numeric]]

By convention, the \(x\) values should be non-repeating and increasing. For instance, the following is a valid 2D line trace:

>>> density_trace = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]

alias of Collection[Tuple[NumericT, NumericT]]
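The convention that x-values are non-repeating and increasing is easy to check. The following sketch uses a hypothetical helper, has_increasing_x, which is not part of ridgeplot.

```python
from typing import Sequence, Tuple


def has_increasing_x(trace: Sequence[Tuple[float, float]]) -> bool:
    """Return True if the x-values are strictly increasing (no repeats)."""
    xs = [x for x, _ in trace]
    return all(a < b for a, b in zip(xs, xs[1:]))


valid_trace = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]
repeated_x = [(0, 0), (0, 1), (1, 0)]
print(has_increasing_x(valid_trace), has_increasing_x(repeated_x))
```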

ridgeplot._types.DensitiesRow#

A DensitiesRow represents a set of DensityTraces that are to be plotted on a given row of a ridgeplot.

These are equivalent:

  • DensitiesRow

  • CollectionL2[XYCoordinate]

  • Collection[Collection[Tuple[Numeric, Numeric]]]

Example:

>>> densities_row = [
...     [(0, 0), (1, 1), (2, 0)],                 # Trace 1
...     [(1, 0), (2, 1), (3, 2), (4, 1)],         # Trace 2
...     [(3, 0), (4, 1), (5, 2), (6, 1), (7, 0)], # Trace 3
... ]

alias of Collection[Collection[Tuple[NumericT, NumericT]]]

ridgeplot._types.Densities#

The Densities type represents the entire collection of traces that are to be plotted on a ridgeplot.

In a ridgeplot, several traces can be plotted on different rows. Each row is represented by a DensitiesRow object which, in turn, is a collection of DensityTraces. Therefore, the Densities type is a collection of DensitiesRows.

These are equivalent:

  • Densities

  • CollectionL1[DensitiesRow]

  • CollectionL3[XYCoordinate]

  • Collection[Collection[Collection[Tuple[Numeric, Numeric]]]]

For instance, the following is a valid Densities object:

>>> densities = [
...     [                                             # Row 1
...         [(0, 0), (1, 1), (2, 0)],                 # Trace 1
...         [(1, 0), (2, 1), (3, 2), (4, 1)],         # Trace 2
...         [(3, 0), (4, 1), (5, 2), (6, 1), (7, 0)], # Trace 3
...     ],
...     [                                             # Row 2
...         [(-2, 0), (-1, 1), (0, 0)],               # Trace 4
...         [(0, 0), (1, 1), (2, 1), (3, 0)],         # Trace 5
...     ],
... ]

alias of Collection[Collection[Collection[Tuple[NumericT, NumericT]]]]

ridgeplot._types.ShallowDensities#

Shallow type for Densities where each row of the ridgeplot contains only a single trace.

These are equivalent:

  • ShallowDensities

  • CollectionL1[DensityTrace]

  • CollectionL2[XYCoordinate]

  • Collection[Collection[Tuple[Numeric, Numeric]]]

Example:

>>> shallow_densities = [
...     [(0, 0), (1, 1), (2, 0)], # Trace 1
...     [(1, 0), (2, 1), (3, 0)], # Trace 2
...     [(2, 0), (3, 1), (4, 0)], # Trace 3
... ]

alias of Collection[Collection[Tuple[NumericT, NumericT]]]

ridgeplot._types.is_shallow_densities(obj: ShallowDensities) → Literal[True][source]#
ridgeplot._types.is_shallow_densities(obj: Any) → bool

Check if the given object is a ShallowDensities type.

ridgeplot._types.SamplesTrace#

A SamplesTrace is a collection of numeric values representing a set of samples from which a DensityTrace can be estimated via KDE.

Example:

>>> samples_trace = [0, 1, 1, 2, 2, 2, 3, 3, 4]

alias of Collection[Union[int, integer, float, floating]]

ridgeplot._types.SamplesRow#

A SamplesRow represents a set of SamplesTraces that are to be plotted on a given row of a ridgeplot.

i.e. a SamplesRow is a collection of SamplesTraces and can be converted into a DensitiesRow by applying KDE to each trace.

Example:

>>> samples_row = [
...     [0, 1, 1, 2, 2, 2, 3, 3, 4], # Trace 1
...     [1, 2, 2, 3, 3, 3, 4, 4, 5], # Trace 2
... ]

alias of Collection[Collection[Union[int, integer, float, floating]]]

ridgeplot._types.Samples#

The Samples type represents the entire collection of samples that are to be plotted on a ridgeplot.

It is a collection of SamplesRow objects. Each row is represented by a SamplesRow type which, in turn, is a collection of SamplesTraces, each of which can be converted into a DensityTrace by applying a kernel density estimation algorithm.

Therefore, the Samples type can be converted into a Densities type by applying a kernel density estimation (KDE) algorithm to each trace.

See Densities for more details.

Example:

>>> samples = [
...     [                                # Row 1
...         [0, 1, 1, 2, 2, 2, 3, 3, 4], # Trace 1
...         [1, 2, 2, 3, 3, 3, 4, 4, 5], # Trace 2
...     ],
...     [                                # Row 2
...         [2, 3, 3, 4, 4, 4, 5, 5, 6], # Trace 3
...         [3, 4, 4, 5, 5, 5, 6, 6, 7], # Trace 4
...     ],
... ]

alias of Collection[Collection[Collection[Union[int, integer, float, floating]]]]

ridgeplot._types.ShallowSamples#

Shallow type for Samples where each row of the ridgeplot contains only a single trace.

Example:

>>> shallow_samples = [
...     [0, 1, 1, 2, 2, 2, 3, 3, 4], # Trace 1
...     [1, 2, 2, 3, 3, 3, 4, 4, 5], # Trace 2
... ]

alias of Collection[Collection[Union[int, integer, float, floating]]]

ridgeplot._types.is_shallow_samples(obj: ShallowSamples) → Literal[True][source]#
ridgeplot._types.is_shallow_samples(obj: Any) → bool

Check if the given object is a ShallowSamples type.

ridgeplot._types.is_flat_str_collection(obj: Collection[str]) → Literal[True][source]#
ridgeplot._types.is_flat_str_collection(obj: Any) → bool

Check if the given object is a CollectionL1[str] type but not a string itself.

ridgeplot._types.nest_shallow_collection(shallow_collection)[source]#

Internal helper to convert a shallow collection type into a deep collection type.

This function should really only be used in the ridgeplot._ridgeplot module to normalize user input.
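Conceptually, converting a shallow collection into a deep one means wrapping every trace in its own single-trace row. The sketch below shows that idea; nest_shallow is a hypothetical name, and the actual helper may differ in its details.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def nest_shallow(shallow_collection: Sequence[T]) -> List[List[T]]:
    """Wrap each element in its own list, i.e. one trace per row."""
    return [[trace] for trace in shallow_collection]


shallow_samples = [
    [0, 1, 1, 2, 2, 2, 3, 3, 4],  # Trace 1
    [1, 2, 2, 3, 3, 3, 4, 4, 5],  # Trace 2
]
deep_samples = nest_shallow(shallow_samples)
print(deep_samples)
```

After nesting, the two traces sit in two separate rows, which matches the 3D Samples layout with a single trace per row.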

Other utilities#

ridgeplot._utils.normalise_min_max(val, min_, max_)[source]#

ridgeplot._utils.get_collection_array_shape(arr)[source]#

Return the shape of a Collection array.

Parameters:

arr – The Collection array.

Returns:

The elements of the shape tuple give the lengths of the corresponding array dimensions. If the length of a dimension is variable, the corresponding element is a set of the variable lengths. Otherwise, (if the length of a dimension is fixed), the corresponding element is an int.

Return type:

Tuple[Union[int, Set[int]], ...]

Examples

>>> get_collection_array_shape([1, 2, 3])
(3,)
>>> get_collection_array_shape([[1, 2, 3], [4, 5]])
(2, {2, 3})
>>> get_collection_array_shape(
...     [
...         [
...             [1, 2, 3], [4, 5]
...         ],
...         [
...             [6, 7, 8, 9],
...         ],
...     ]
... )
(2, {1, 2}, {2, 3, 4})
>>> get_collection_array_shape(
...     [
...         [
...             [1], [2, 3], [4, 5, 6],
...         ],
...         [
...             [7, 8, 9, 10, 11],
...         ],
...     ]
... )
(2, {1, 3}, {1, 2, 3, 5})
>>> get_collection_array_shape(
...     [
...         [
...             [(0, 0), (1, 1), (2, 2), (3, 3)],
...             [(0, 0), (1, 1), (2, 2)],
...             [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
...         ],
...         [
...             [(-2, 2), (-1, 1), (0, 1)],
...             [(2, 2), (3, 1), (4, 1)],
...         ],
...     ]
... )
(2, {2, 3}, {3, 4, 5}, 2)
>>> get_collection_array_shape(
...     [
...         [
...             ["a", "b", "c", "d"], ["e", "f"],
...         ],
...         [
...             ["h", "i", "j", "k", "l"],
...         ],
...     ]
... )
(2, {1, 2}, {2, 4, 5})

class ridgeplot._utils.LazyMapping(loader)[source]#

A lazy mapping that loads its contents only when first needed.

Parameters:

loader – A callable that returns a mapping.

Examples

>>> from typing import Dict
>>>
>>> def my_io_loader() -> Dict[str, int]:
...     print("Loading...")
...     return {"a": 1, "b": 2}
...
>>> lazy_mapping = LazyMapping(my_io_loader)
>>> lazy_mapping
Loading...
{'a': 1, 'b': 2}
_loader#
_inner_mapping: Mapping[KT, VT] | None#
property _mapping: Mapping[KT, VT]#
_abc_impl = <_abc._abc_data object>#
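A lazy mapping of this kind can be built on top of collections.abc.Mapping by deferring the loader call until the first access and caching its result. The class below is a minimal sketch of that pattern, not the library's implementation; LazyMappingSketch and counting_loader are hypothetical names.

```python
from collections.abc import Mapping
from typing import Any, Callable, Dict, Iterator, Optional


class LazyMappingSketch(Mapping):
    """Mapping whose contents are loaded on first access and then cached."""

    def __init__(self, loader: Callable[[], Dict[str, Any]]) -> None:
        self._loader = loader
        self._inner: Optional[Dict[str, Any]] = None

    @property
    def _mapping(self) -> Dict[str, Any]:
        if self._inner is None:
            self._inner = self._loader()  # the loader runs only once
        return self._inner

    def __getitem__(self, key: str) -> Any:
        return self._mapping[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._mapping)

    def __len__(self) -> int:
        return len(self._mapping)


load_calls = []


def counting_loader() -> Dict[str, int]:
    load_calls.append("loaded")
    return {"a": 1, "b": 2}


lazy = LazyMappingSketch(counting_loader)
calls_before_access = len(load_calls)  # nothing has been loaded yet
first_value = lazy["a"]
second_value = lazy["b"]
calls_after_access = len(load_calls)   # loaded once, despite two lookups
```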

Alternatives#

Changelog#

This document outlines the list of changes to ridgeplot between each release. For full details, see the commit logs.

Unreleased changes#


0.1.23#

  • Fix the references to the interactive Plotly IFrames (#129)


0.1.22#

Deprecations#

  • The colormode='index' value has been deprecated in favor of colormode='row-index', which provides the same functionality but is more explicit and makes it possible to distinguish between the 'row-index' and 'trace-index' modes. (#114)

  • The show_annotations argument has been deprecated in favor of show_yticklabels. (#114)

  • The get_all_colorscale_names() function has been deprecated in favor of list_all_colorscale_names(). (#114)
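For readers unfamiliar with how such deprecations usually surface to users, here is a generic sketch of the pattern using the standard warnings module. The function normalise_colormode is hypothetical and only illustrates the idea of mapping the old 'index' value to 'row-index' while emitting a DeprecationWarning; the warning class and message used by ridgeplot itself may differ.

```python
import warnings


def normalise_colormode(colormode: str) -> str:
    """Map the deprecated 'index' value to 'row-index' (hypothetical helper)."""
    if colormode == "index":
        warnings.warn(
            "colormode='index' is deprecated; use colormode='row-index' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "row-index"
    return colormode


# Record the warning instead of printing it, so it can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = normalise_colormode("index")

print(resolved, len(caught))
```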

Features#

  • Add functionality to allow plotting of multiple traces per row. (#114)

  • Add ridgeplot.datasets.load_lincoln_weather() helper function to load the “Lincoln Weather” toy dataset. (#114)

  • Add more versions of the probly dataset ("wadefagen" and "illinois"). (#114)

  • Add support for Python 3.11.

Documentation#

  • Major update to the documentation, including more examples, interactive plots, script to generate the HTML and WebP images from the example scripts, improved API reference, and more. (#114)

Internal#

  • Remove mdformat from the automated CI checks. It can still be triggered manually. (#114)

  • Improved type annotations and type checking. (#114)


0.1.21#

Features#

  • Add ridgeplot.datasets.load_probly() helper function to load the probly toy dataset. The probly.csv file is now included in the package under ridgeplot/datasets/data/. (#80)

Documentation#

  • Change to numpydoc style docstrings. (#81)

  • Add a robots.txt to the docs site. (#81)

  • Auto-generate a site map for the docs site using sphinx_sitemap. (#81)

  • Change the sphinx theme to furo. (#81)

  • Improve the internal documentation and add some of these internals to the API reference. (#81)

Internal#

  • Fixed and improved some type annotations, including the introduction of ridgeplot._types module for type aliases such as Numeric and NestedNumericSequence. (#80)

  • Add the blacken-docs pre-commit hook and add the pep8-naming, flake8-pytest-style, flake8-simplify, flake8-implicit-str-concat, flake8-bugbear, flake8-rst-docstrings, etc. plugins to the flake8 pre-commit hook. (#81)

  • Cleanup and improve some type annotations. (#81)

  • Update deprecated set-output commands (GitHub Actions) (#87)


0.1.17#

  • Automate the release process. See .github/workflows/release.yaml, which issues a new GitHub release whenever a new git tag is pushed to the main branch by extracting the release notes from the changelog.

  • Fix automated release process to PyPI. (#27)


0.1.16#

  • Upgrade project structure, improve testing and CI checks, and start basic Sphinx docs. (#21)

  • Implement LazyMapping helper to allow ridgeplot._colors.PLOTLY_COLORSCALES to lazy-load from colors.json (#20)


0.1.14#

  • Remove named_colorscales from public API (#18)


0.1.13#

  • Add tests for example scripts (#14)


0.1.12#

Internal#

  • Update and standardise CI steps (#6)

Documentation#

  • Publish official contribution guidelines (CONTRIBUTING.md) (#8)

  • Publish an official Code of Conduct (CODE_OF_CONDUCT.md) (#7)

  • Publish an official release/change log (CHANGES.md) (#6)


0.1.11#

  • colors.json was missing from the final distributions (#2)


0.1.0#

  • 🚀 Initial release!

Contributing#

Thank you for your interest in contributing to ridgeplot! 🚀

The contribution process for ridgeplot should start with filing a GitHub issue. We define three main categories of issues, and each category has its own GitHub issue template:

  • ⭐ Feature requests

  • 🐛 Bug reports

  • 📚 Documentation fixes

After the implementation strategy has been agreed on by a ridgeplot contributor, the next step is to introduce your changes as a pull request (see Pull Request Workflow) against the ridgeplot repository. Once your pull request is merged, your changes will be automatically included in the next ridgeplot release. Every change should be listed in the ridgeplot Changelog.

The following is a set of (slightly opinionated) rules and general guidelines for contributing to ridgeplot. Emphasis on guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

Development environment#

Here are some guidelines for setting up your development environment. Most of the steps have been abstracted away using the make build automation tool. Feel free to peek inside the Makefile at any time to see exactly what is being run, and in which order.

First, you will need to clone this repository. For this, make sure you have a GitHub account, fork ridgeplot to your GitHub account by clicking the Fork button, and clone the main repository locally (e.g. using SSH):

git clone git@github.com:tpvasconcelos/ridgeplot.git
cd ridgeplot

You will also need to add your fork as a remote to push your work to. Replace {username} with your GitHub username.

git remote add fork git@github.com:{username}/ridgeplot.git

The following command will 1) create a new virtual environment (under .venv), 2) install ridgeplot in editable mode (along with all its dependencies), and 3) set up and install all pre-commit hooks. Make sure you always work within this virtual environment (i.e., $ source .venv/bin/activate). On top of this, you should also set up your IDE to always point to this Python interpreter. In PyCharm, open Preferences -> Project: ridgeplot -> Project Interpreter and point the Python interpreter to .venv/bin/python.

make init

The default and recommended base python is python3.7. You can change this by exporting the BASE_PYTHON environment variable. For instance, if you are having issues installing scientific packages on macOS for python 3.7, you can try python 3.8 instead:

BASE_PYTHON=python3.8 make init

If you need to use jupyter-lab, you can install all extra requirements, as well as set up the environment and jupyter kernel, with:

make init-jupyter

Pull Request Workflow#

  1. Always confirm that you have properly configured your Git username and email.

    git config --global user.name 'Your name'
    git config --global user.email 'Your email address'
    
  2. Each release series has its own branch (i.e. MAJOR.MINOR.x). If submitting a documentation or bug fix contribution, branch off of the latest release series branch.

    git fetch origin
    git checkout -b <YOUR-BRANCH-NAME> origin/x.x.x
    

    Otherwise, if submitting a new feature or API change, branch off of the main branch:

    git fetch origin
    git checkout -b <YOUR-BRANCH-NAME> origin/main
    
  3. Apply and commit your changes.

  4. Include tests that cover any code changes you make, and make sure those tests fail without your patch.

  5. Add an entry to CHANGES.md summarising the changes in this pull request. The entry should follow the same style and format as other entries, i.e.

    - Your summary here. (#XXX)

    where #XXX should link to the relevant pull request. If you think that the changes in this pull request do not warrant a changelog entry, please state it in your pull request’s description. In such cases, a maintainer should add a skip news label to make CI pass.

  6. Make sure all integration approval steps are passing locally (i.e., tox).

  7. Push your changes to your fork:

    git push --set-upstream fork <YOUR-BRANCH-NAME>
    
  8. Create a pull request. Remember to update the pull request’s description with relevant notes on the changes implemented, and to link to relevant issues (e.g., fixes #XXX or closes #XXX).

  9. Wait for all remote CI checks to pass and for a ridgeplot contributor to approve your pull request.
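
Two of the conventions in the workflow above can be sketched in a few lines of Python: the MAJOR.MINOR.x naming of release series branches (step 2) and the `- Your summary here. (#XXX)` shape of a changelog entry (step 5). This is only an illustration under stated assumptions. The helper names and the regular expression are hypothetical and are not part of the ridgeplot tooling or its CI checks.

```python
import re

# Assumed shape of a changelog entry: "- Your summary here. (#XXX)".
# This regex is illustrative only; it is not taken from the ridgeplot repository.
CHANGELOG_ENTRY = re.compile(r"^- \S.*\(#\d+\)$")


def release_series_branch(version: str) -> str:
    """Map a MAJOR.MINOR.PATCH version to its MAJOR.MINOR.x release series name."""
    major, minor, _patch = version.split(".")
    return f"{major}.{minor}.x"


def is_valid_changelog_entry(line: str) -> bool:
    """Return True if the line looks like '- Summary text. (#123)'."""
    return CHANGELOG_ENTRY.match(line.strip()) is not None


if __name__ == "__main__":
    print(release_series_branch("0.4.2"))
    print(is_valid_changelog_entry("- Fix a typo in the docs. (#42)"))
    print(is_valid_changelog_entry("Fix a typo in the docs"))
```

A check like this could be handy when reviewing your own entry before pushing, but the maintainers' review remains the source of truth for what a good entry looks like.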

Continuous Integration#

From GitHub’s Continuous Integration and Continuous Delivery (CI/CD) Fundamentals:

Continuous Integration (CI) automatically builds, tests, and integrates code changes within a shared repository.

The first step to Continuous Integration (CI) is having a version control system (VCS) in place. Luckily, you don’t have to worry about that! As you have already noticed, we use Git and host on GitHub.

On top of this, we also run a series of integration approval steps that allow us to ship code changes faster and more reliably. In order to achieve this, we run automated tests and coverage reports, as well as syntax (and type) checkers, code style formatters, and dependency vulnerability scans.

Running it locally#

Our tool of choice to configure and reliably run all integration approval steps is Tox, which allows us to run each step in reproducible isolated virtual environments. To trigger all checks in parallel, simply run

./bin/tox --parallel auto -m static tests

It’s that simple 🙌! Note that this will take a while the first time you run the command, since it has to create all the required virtual environments (along with their dependencies) for each CI step.

The configuration for Tox can be found in tox.ini.

Tests and coverage reports#

We use pytest as our testing framework, and pytest-cov to track and measure code coverage. You can find all configuration details in tox.ini. To trigger all tests, simply run

./bin/tox --parallel auto -m tests

If you need more control over which tests are running, or which flags are being passed to pytest, you can also invoke pytest directly, which will run in your current virtual environment. Configuration details can be found in tox.ini.
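
As a rough sketch of what a direct pytest invocation collects, here is a minimal test module. Everything in it is hypothetical: the file name, the `normalise` helper, and the tests themselves are made up for illustration and do not exist in the ridgeplot code base.

```python
# test_example.py (hypothetical file name; not part of the ridgeplot test suite)


def normalise(values):
    """Hypothetical helper: scale the values so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("Cannot normalise values that sum to zero.")
    return [v / total for v in values]


def test_normalise_sums_to_one():
    # pytest collects functions whose names start with "test_".
    assert normalise([1, 1, 2]) == [0.25, 0.25, 0.5]


def test_normalise_rejects_zero_total():
    # Plain try/except keeps this sketch free of any pytest import.
    try:
        normalise([0, 0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected a ValueError")
```

Saved under a name matching pytest's default `test_*.py` pattern, a module like this could be run on its own, and a single test could be selected with the `-k` option (e.g., `pytest test_example.py -k zero_total`).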

Linting#

This project uses pre-commit hooks to check for and automatically fix any formatting issues. These checks are triggered before creating any git commit. To manually trigger all linting steps (i.e., all pre-commit hooks), run

pre-commit run --all-files

For more information on which hooks will run, have a look inside the .pre-commit-config.yaml configuration file. If you want to manually trigger individual hooks, you can invoke the pre-commit script directly. If you need even more control over the tools used, you could also invoke them directly (e.g., isort .). Remember, however, that this is not the recommended approach.

GitHub Actions#

We use GitHub Actions to automatically run all integration approval steps defined with Tox on every push or pull request event. These checks run on all major operating systems and all supported Python versions. Finally, the generated coverage reports are uploaded to Codecov and Codacy. Check .github/workflows/ci.yaml for more details.

Tools and software#

Here is a quick overview of all CI tools and software in use, some of which have already been discussed in the sections above.

| Tool | Category | Config files | Details |
| --- | --- | --- | --- |
| Tox | 🔧 Orchestration | tox.ini | We use Tox to reliably run all integration approval steps in reproducible isolated virtual environments. |
| GitHub Actions | 🔧 Orchestration | .github/workflows/ci.yaml | Workflow automation for GitHub. We use it to automatically run all integration approval steps defined with Tox on every push or pull request event. |
| Git | 🕰 VCS | .gitignore | The project’s version control system of choice. |
| pytest | 🧪 Testing | tox.ini | Testing framework for python code. |
| pytest-cov | 📊 Coverage | tox.ini | Coverage plugin for pytest. |
| Codecov and Codacy | 📊 Coverage | | Two great services for tracking, monitoring, and alerting on code coverage and code quality. |
| pre-commit hooks | 💅 Linting | .pre-commit-config.yaml | Used to automatically check and fix any formatting issues on every commit. |
| mypy | 💅 Linting | mypy.ini | A static type checker for Python. We use quite a strict configuration here, which can be tricky at times. Feel free to ask for help from the community by commenting on your issue or pull request. |
| black | 💅 Linting | pyproject.toml | “The uncompromising Python code formatter”. We use black to automatically format Python code in a deterministic manner. We use a maximum line length of 100 characters. |
| flake8 | 💅 Linting | setup.cfg | Used to check the style and quality of python code. |
| isort | 💅 Linting | setup.cfg | Used to sort python imports. |
| EditorConfig | 💅 Linting | .editorconfig | This repository uses the .editorconfig standard configuration file, which aims to ensure consistent style across multiple programming environments. |

Project structure#

Community health files#

GitHub’s community health files allow repository maintainers to set contributing guidelines to help collaborators make meaningful, useful contributions to a project. Read more in the official reference.

Configuration files#

For more context on some of the tools referenced below, refer to the sections on Continuous Integration.

Release process#

You need push access to the project’s repository to make releases. The following release steps are here for reference only.

  1. Review the ## Unreleased changes section in CHANGES.md by checking for consistency in format and, if necessary, refactoring related entries into relevant subsections (e.g. Features, Docs, Bugfixes, Security, etc). Take a look at previous release notes for guidance and try to keep it consistent.

  2. Submit a pull request with these changes only and use the "Cleanup release notes for X.X.X release" template for the pull request title. ridgeplot uses the SemVer (MAJOR.MINOR.PATCH) versioning standard. You can determine the latest release version by running git describe --tags --abbrev=0 on the main branch. Based on this, you can determine the next release version by incrementing the MAJOR, MINOR, or PATCH component. More on this in the next step. For now, just make sure you merge this pull request into the main branch before continuing.

  3. Use the bumpversion utility to bump the current version. This utility will automatically bump the current version, and issue a relevant commit and git tag. E.g.,

    # Bump MAJOR version (e.g., 0.4.2 -> 1.0.0)
    bumpversion major
    
    # Bump MINOR version (e.g., 0.4.2 -> 0.5.0)
    bumpversion minor
    
    # Bump PATCH version (e.g., 0.4.2 -> 0.4.3)
    bumpversion patch
    

    You can always perform a dry-run to see what will happen under the hood.

    bumpversion --dry-run --verbose [--allow-dirty] [major,minor,patch]
    
  4. Push your changes along with all tag references:

    git push && git push --tags
    
  5. At this point, three GitHub Actions workflows will be triggered:

    1. .github/workflows/ci.yaml: Runs all CI checks with Tox against the new changes pushed to main.

    2. .github/workflows/release.yaml: Issues a new GitHub release triggered by the new git tag pushed in the previous step.

    3. .github/workflows/publish-pypi.yaml: Builds, packages, and uploads the source and wheel package to PyPI (and test PyPI). This is triggered by the new GitHub release created in the previous step.

  6. Trust but verify!

    1. Verify that all three workflows passed successfully: https://github.com/tpvasconcelos/ridgeplot/actions

    2. Verify that the new git tag is present in the remote repository: https://github.com/tpvasconcelos/ridgeplot/tags

    3. Verify that the new release is present in the remote repository and that the release notes were correctly parsed: https://github.com/tpvasconcelos/ridgeplot/releases

    4. Verify that the new package is available in PyPI: https://pypi.org/project/ridgeplot/

    5. Verify that the docs were updated and published to https://ridgeplot.readthedocs.io/en/stable/
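
The version arithmetic behind step 3 can be sketched as follows. This is a minimal illustration of the SemVer increment rules for plain MAJOR.MINOR.PATCH strings, not a description of how the bumpversion utility is implemented; the `bump` function is a hypothetical helper that does not create commits or tags.

```python
def bump(version: str, part: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version after bumping the given part.

    Hypothetical helper for illustration only. Bumping a component resets
    every lower-order component to zero, as in the examples from step 3.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


if __name__ == "__main__":
    for part in ("major", "minor", "patch"):
        print(f"0.4.2 -> {bump('0.4.2', part)} ({part})")
```

Running a quick calculation like this before the dry-run can help confirm which of the three commands matches the release you intend to make.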

Code of Conduct#

Please remember to read and follow our Code of Conduct. 🤝