ridgeplot: beautiful ridgeline plots in Python
The ridgeplot python library aims at providing a simple API for plotting beautiful ridgeline plots within the extensive Plotly interactive graphing environment.
Bumper stickers:
Do one thing, and do it well!
Use sensible defaults, but allow for extensive configuration!
Installation#
The source code is currently hosted on GitHub at: https://github.com/tpvasconcelos/ridgeplot
Install and update using pip:
pip install -U ridgeplot
Dependencies#
plotly - the interactive graphing backend that powers ridgeplot
statsmodels - Used for Kernel Density Estimation (KDE)
numpy - Supporting library for multi-dimensional array manipulations
Getting started#
Sensible defaults#
Get started with a simple function call to ridgeplot() with sensible (plotly) defaults.
import numpy as np
from ridgeplot import ridgeplot
# Put your real samples here...
np.random.seed(0)
synthetic_samples = [np.random.normal(n / 1.2, size=600) for n in range(9, 0, -1)]
# Call the `ridgeplot()` helper, packed with sensible defaults
fig = ridgeplot(samples=synthetic_samples)
# The returned Plotly `Figure` is still fully customizable
fig.update_layout(height=500, width=800)
# show us the work!
fig.show()
Fully configurable#
In this example, we will replicate the first ridgeline plot example in this From Data to Viz post, which uses the probly dataset. You can find the probly dataset in multiple sources, such as the bokeh python interactive visualization library. I’ll be using the same source used in the original post.
import numpy as np
import pandas as pd
from ridgeplot import ridgeplot
# Get the raw data
df = pd.read_csv("https://raw.githubusercontent.com/bokeh/bokeh/main/bokeh/sampledata/_data/probly.csv")
# Let's grab only the subset of columns displayed in the example
column_names = [
    "Almost Certainly", "Very Good Chance", "We Believe", "Likely",
    "About Even", "Little Chance", "Chances Are Slight", "Almost No Chance",
]
df = df[column_names]
# Not only does 'ridgeplot(...)' come configured with sensible defaults
# but is also fully configurable to your own style and preference!
fig = ridgeplot(
    samples=df.values.T,
    bandwidth=4,
    kde_points=np.linspace(-12.5, 112.5, 400),
    colorscale="viridis",
    colormode="index",
    coloralpha=0.6,
    labels=column_names,
    spacing=5 / 9,
)
# Again, update the figure layout to your liking here
fig.update_layout(
    title="What probability would you assign to the phrase <i>“Highly likely”</i>?",
    height=650,
    width=800,
    plot_bgcolor="rgba(255, 255, 255, 0.0)",
    xaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_gridcolor="rgba(0, 0, 0, 0.1)",
    yaxis_title="Assigned Probability (%)",
)
fig.show()
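The samples=df.values.T argument in the example above relies on a transpose: the DataFrame holds one phrase per column, while the call appears to expect one array of samples per trace (matching the eight entries in labels). Here is a minimal pure-Python sketch of that reshaping. The toy numbers and shortened column list are assumptions for illustration only, not values from the real probly dataset.

```python
# Toy table: each inner list is one survey response (a row) and
# each position is one phrase (a column). The values are made up.
column_names = ["Almost Certainly", "Likely", "About Even"]
rows = [
    [95, 70, 50],
    [98, 75, 45],
    [90, 80, 50],
    [99, 65, 55],
]

# Transpose so that each entry holds all samples for a single phrase,
# mirroring what `df.values.T` does for a 2-D NumPy array.
samples = [list(column) for column in zip(*rows)]

for name, values in zip(column_names, samples):
    print(f"{name}: {values}")
```

After the transpose, the number of sample arrays equals the number of labels, which is the condition the labels argument requires.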
API Reference#
- ridgeplot.ridgeplot(samples=None, densities=None, kernel: str = 'gau', bandwidth='normal_reference', kde_points=500, colorscale: Union[str, Iterable[Tuple[float, str]]] = 'plasma', colormode: str = 'mean-means', coloralpha: Optional[float] = None, labels=None, linewidth: float = 1.4, spacing: float = 0.5, show_annotations: bool = True, xpad: float = 0.05) plotly.graph_objs._figure.Figure [source]#
Creates and returns a Plotly figure with a beautiful ridgeline plot.
You have to specify one of: samples or densities. If you specify both samples and densities arguments, a ValueError exception will be raised! One of these arguments should always remain set to None. See samples and densities below for more information.
- Args:
- samples
If samples data is specified, Kernel Density Estimation (KDE) will be computed. See kernel, bandwidth, and kde_points for more details and KDE configuration options.
- densities
If densities arrays are specified instead, the KDE step will be skipped and all associated arguments ignored. Each density array should have shape \((2, N)\), but \(N\) may vary with each array.
- kernel
The Kernel to be used during Kernel Density Estimation. The default is a Gaussian Kernel ("gau"). Choices are:
"biw" for biweight
"cos" for cosine
"epa" for Epanechnikov
"gau" for Gaussian
"tri" for triangular
"triw" for triweight
"uni" for uniform
- bandwidth
The bandwidth to use during Kernel Density Estimation. The default is "normal_reference". Choices are:
"scott" - 1.059 * A * nobs ** (-1/5.), where A is min(std(x), IQR/1.34)
"silverman" - .9 * A * nobs ** (-1/5.), where A is min(std(x), IQR/1.34)
"normal_reference" - C * A * nobs ** (-1/5.), where C is calculated from the kernel. Equivalent (up to 2 dp) to the "scott" bandwidth for gaussian kernels. See bandwidths.py.
If a float is given, its value is used as the bandwidth.
If a callable is given, its return value is used. The callable should take exactly two parameters, i.e., fn(x, kern), and return a float, where:
x: the clipped input data
kern: the kernel instance used
- kde_points
This argument controls the points at which KDE is computed. If an int value is passed (default), the densities will be evaluated at kde_points evenly spaced points between the min and max of each set of samples. However, you may also specify a custom range by instead passing an array of points. This array should be one-dimensional.
- colorscale
Any valid Plotly color-scale or a str with a valid named color-scale. Use get_all_colorscale_names() to see which names are available or check out Plotly’s built-in color-scales (https://plotly.com/python/builtin-colorscales/).
- colormode
This argument controls the logic for choosing the color filling of each ridgeline trace. Each option provides a different method for calculating the colorscale midpoint of each trace. The default mode is "mean-means". Choices are:
"index" - uses the trace’s index. e.g. if 3 traces are specified, then the midpoints will be [0, 0.5, 1].
"mean-minmax" - uses the min-max normalized (weighted) mean of each density to calculate the midpoints. The normalization min and max values are the minimum and maximum x-values from all densities, respectively.
"mean-means" - uses the min-max normalized (weighted) mean of each density to calculate the midpoints. The normalization min and max values are the minimum and maximum mean values from all densities, respectively.
- coloralpha
If None (default), this argument will be ignored and the transparency values of the specified color-scale will remain untouched. Otherwise, if a float value is passed, it will be used to overwrite the transparency (alpha) of the color-scale’s colors.
- labels
A list of string labels for each trace. The default value is None, which will result in auto-generated labels of form “Trace n”. If, instead, a list of labels is specified, it must be of the same size/length as the number of traces.
- linewidth
The traces’ line width (in px).
- spacing
The vertical spacing between density traces, which is defined in units of the highest distribution (i.e. the maximum y-value).
- show_annotations
If True (default), it will show the label names as “y-tick-labels”.
- xpad
Specifies the extra padding to use on the x-axis. It is defined in units of the range between the minimum and maximum x-values from all distributions.
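To make the "scott" and "silverman" rules listed under bandwidth more concrete, here is a minimal sketch that evaluates both formulas with the Python standard library only. It is not the statsmodels implementation: the helper names are hypothetical, and the way the standard deviation and the IQR are estimated here (sample standard deviation, stdlib quartiles) is an assumption that may differ from what statsmodels does internally.

```python
import statistics


def rule_of_thumb_bandwidth(x, factor):
    """Return factor * A * nobs ** (-1/5), with A = min(std(x), IQR / 1.34)."""
    nobs = len(x)
    std = statistics.stdev(x)  # sample standard deviation (assumption)
    q1, _, q3 = statistics.quantiles(x, n=4)  # quartiles, stdlib default method
    a = min(std, (q3 - q1) / 1.34)
    return factor * a * nobs ** (-1 / 5)


def scott(x):
    # "scott" rule: 1.059 * A * nobs ** (-1/5)
    return rule_of_thumb_bandwidth(x, 1.059)


def silverman(x):
    # "silverman" rule: 0.9 * A * nobs ** (-1/5)
    return rule_of_thumb_bandwidth(x, 0.9)


# Made-up sample values, for illustration only
data = [2.1, 2.4, 2.9, 3.3, 3.8, 4.0, 4.6, 5.2, 5.9, 7.5]
print(f"scott:     {scott(data):.4f}")
print(f"silverman: {silverman(data):.4f}")
```

Both rules share the same A and nobs terms, so for any given data set they differ only by the constant factor (1.059 versus 0.9).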
- Returns:
plotly.graph_objects.Figure
A Plotly Figure with a ridgeline plot. You can further customize this figure to your liking (e.g. using the update_layout() method).
- Raises:
ValueError
If both samples and densities arguments are not None, or if neither samples nor densities are specified.
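The three colormode options described in the arguments above differ only in how each trace is mapped to a midpoint in [0, 1]. The following is a rough pure-Python sketch of that logic as the descriptions read, not ridgeplot's actual code. All function names are hypothetical, each density is represented as an (x-values, y-values) pair, and at least two traces with distinct means are assumed.

```python
def weighted_mean(x, y):
    """Mean of the x-values, weighted by the density values y."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(y)


def minmax_normalize(values, lo, hi):
    """Rescale values linearly so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]


def index_midpoints(n_traces):
    # "index": evenly spread the traces over [0, 1] by position
    return [i / (n_traces - 1) for i in range(n_traces)]


def mean_minmax_midpoints(densities):
    # "mean-minmax": normalize the means by the global x-range
    means = [weighted_mean(x, y) for x, y in densities]
    all_x = [xi for x, _ in densities for xi in x]
    return minmax_normalize(means, min(all_x), max(all_x))


def mean_means_midpoints(densities):
    # "mean-means": normalize the means by the range of the means
    means = [weighted_mean(x, y) for x, y in densities]
    return minmax_normalize(means, min(means), max(means))


# Three toy densities with weighted means 1, 3, and 7
densities = [
    ([0, 1, 2], [1, 2, 1]),
    ([2, 3, 4], [1, 2, 1]),
    ([6, 7, 8], [1, 2, 1]),
]

print(index_midpoints(len(densities)))
print(mean_minmax_midpoints(densities))
print(mean_means_midpoints(densities))
```

With "mean-means" the lowest and highest means always land on the two ends of the color-scale, whereas "mean-minmax" only reaches the ends if a mean coincides with the extreme x-values.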
Changelog#
This document outlines the list of changes to ridgeplot between each release. For full details, see the commit logs.
Unreleased changes#
…
0.1.17#
🔧 Automate the release process. See .github/workflows/release.yaml, which issues a new GitHub release whenever a new git tag is pushed to the master branch by extracting the release notes from the changelog.
🔧 Fix automated release process to PyPI. (#27)
0.1.16#
0.1.14#
📦 Remove named_colorscales from public API (#18)
0.1.13#
🧪 Add tests for example scripts (#14)
0.1.12#
Internal#
📦 Update and standardise CI steps (#6)
Documentation#
0.1.11#
🐛 colors.json was missing from the final distributions (#2)
0.1.0#
🚀 Initial release!
Alternatives#
Contributing#
Thank you for your interest in contributing to ridgeplot! 🚀
The contribution process for ridgeplot should start with filing a GitHub issue. We define three main categories of issues, and each category has its own GitHub issue template:
⭐ Feature requests
🐛 Bug reports
📚 Documentation fixes
After the implementation strategy has been agreed on by a ridgeplot contributor, the next step is to introduce your changes as a pull request (see Pull Request Workflow) against the ridgeplot repository. Once your pull request is merged, your changes will be automatically included in the next ridgeplot release. Every change should be listed in the ridgeplot Changelog.
The following is a set of (slightly opinionated) rules and general guidelines for contributing to ridgeplot. Emphasis on guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.
Development environment#
Here are some guidelines for setting up your development environment. Most of the steps have been abstracted away using the make build automation tool. Feel free to peek inside Makefile at any time to see exactly what is being run, and in which order.
First, you will need to clone this repository. For this, make sure you have a GitHub account, fork ridgeplot to your GitHub account by clicking the Fork button, and clone the main repository locally (e.g. using SSH)
git clone git@github.com:tpvasconcelos/ridgeplot.git
cd ridgeplot
You will also need to add your fork as a remote to push your work to. Replace {username} with your GitHub username.
git remote add fork git@github.com:{username}/ridgeplot.git
The following command will 1) create a new virtual environment (under .venv), 2) install ridgeplot in editable mode (along with all its dependencies), and 3) set up and install all pre-commit hooks. Make sure you always work within this virtual environment (i.e., $ source .venv/bin/activate). On top of this, you should also set up your IDE to always point to this python interpreter. In PyCharm, open Preferences -> Project: ridgeplot -> Project Interpreter and point the python interpreter to .venv/bin/python.
make init
The default and recommended base python is python3.7. You can change this by exporting the BASE_PYTHON environment variable. For instance, you could instead run:
BASE_PYTHON=python3.8 make init
If you need to use jupyter-lab, you can install all extra requirements, as well as set up the environment and jupyter kernel with
make init-jupyter
Bonus: If you need to use plotly inside a jupyter-lab notebook, just run
make jupyter-plotly
Pull Request Workflow#
Always confirm that you have properly configured your Git username and email.
git config --global user.name 'Your real name'
git config --global user.email 'Your email address'
Each release series has its own branch (i.e. MAJOR.MINOR.x). If submitting a documentation or bug fix contribution, branch off of the latest release series branch.
git fetch origin
git checkout -b <YOUR-BRANCH-NAME> origin/2.0.x
Otherwise, if submitting a new feature or API change, branch off of the “master” branch
git fetch origin
git checkout -b <YOUR-BRANCH-NAME> origin/main
Apply and commit your changes.
Include tests that cover any code changes you make, and make sure the test fails without your patch.
Add an entry to CHANGES.md summarising the changes in this pull request. The entry should follow the same style and format as other entries, i.e.
- Your summary here. (#XXX)
where #XXX should link to the relevant pull request. If you think that the changes in this pull request do not warrant a changelog entry, please state it in your pull request’s description. In such cases, a maintainer should add a skip news label to make CI pass.
Make sure all integration approval steps are passing locally (i.e., tox).
Push your changes to your fork
git push --set-upstream fork <YOUR-BRANCH-NAME>
Create a pull request. Remember to update the pull request’s description with relevant notes on the changes implemented, and to link to relevant issues (e.g., fixes #XXX or closes #XXX).
Wait for all remote CI checks to pass and for a ridgeplot contributor to approve your pull request.
Continuous Integration#
From GitHub’s Continuous Integration and Continuous Delivery (CI/CD) Fundamentals:
Continuous Integration (CI) automatically builds, tests, and integrates code changes within a shared repository.
The first step to Continuous Integration (CI) is having a version control system (VCS) in place. Luckily, you don’t have to worry about that! As you have already noticed, we use Git and host on GitHub.
On top of this, we also run a series of integration approval steps that allow us to ship code changes faster and more reliably. In order to achieve this, we run automated tests and coverage reports, as well as syntax (and type) checkers, code style formatters, and dependency vulnerability scans.
Running it locally#
Our tool of choice to configure and reliably run all integration approval steps is Tox, which allows us to run each step in reproducible isolated virtual environments. To trigger all checks, simply run
tox
It’s that simple 🙌 !! Just note that this will take a while the first time you run the command, since it will have to create all the required virtual environments (along with their dependencies) for each CI step.
The configuration for Tox can be found in tox.ini.
Tests and coverage reports#
We use pytest as our testing framework, and pytest-cov to track and measure code coverage. You can find all configuration details in tox.ini. To trigger all tests, simply run
tox -e py
You can also run your tests against any other supported python versions (e.g., tox -e py38). If you need more control over which tests are running, or which flags are being passed to pytest, you can also invoke pytest directly, which will run on your current virtual environment. Configuration details can be found in tox.ini.
Linting#
This project uses pre-commit hooks to check and automatically fix any formatting rules. These checks are triggered before creating any git commit. To manually trigger all linting steps (i.e., all pre-commit hooks), run
tox -e lint
For more information on which hooks will run, have a look inside the .pre-commit-config.yaml configuration file. If you want to manually trigger individual hooks, you can invoke the pre-commit script directly. If you need even more control over the tools used, you could also invoke them directly (e.g., isort .). Remember, however, that this is not the recommended approach.
GitHub Actions#
We use GitHub Actions to automatically run all integration approval steps defined with Tox on every push or pull request event. These checks run on all major operating systems and all supported Python versions. Finally, the generated coverage reports are uploaded to Codecov and Codacy. Check .github/workflows/ci.yaml for more details.
Tools and software#
Here is a quick overview of all CI tools and software in use, some of which have already been discussed in the sections above.
Tool | Category | config files | Details |
---|---|---|---|
Tox | 🔧 Orchestration | tox.ini | We use Tox to reliably run all integration approval steps in reproducible isolated virtual environments. |
GitHub Actions | 🔧 Orchestration | .github/workflows/ci.yaml | Workflow automation for GitHub. We use it to automatically run all integration approval steps defined with Tox on every push or pull request event. |
Git | 🕰 VCS | | Project’s version control system software of choice. |
pytest | 🧪 Testing | tox.ini | Testing framework for python code. |
pytest-cov | 📊 Coverage | tox.ini | Coverage plugin for pytest. |
Codecov and Codacy | 📊 Coverage | | Two great services for tracking, monitoring, and alerting on code coverage and code quality. |
pre-commit | 💅 Linting | .pre-commit-config.yaml | Used to automatically check and fix any formatting rules on every commit. |
mypy | 💅 Linting | mypy.ini | A static type checker for Python. We use quite a strict configuration here, which can be tricky at times. Feel free to ask for help from the community by commenting on your issue or pull request. |
black | 💅 Linting | | “The uncompromising Python code formatter”. We use |
flake8 | 💅 Linting | setup.cfg | Used to check the style and quality of python code. |
isort | 💅 Linting | setup.cfg | Used to sort python imports. |
 | 💅 Linting | | This repository uses the |
Project structure#
Community health files#
GitHub’s community health files allow repository maintainers to set contributing guidelines to help collaborators make meaningful, useful contributions to a project. Read more on this official reference.
CODE_OF_CONDUCT.md - A CODE_OF_CONDUCT file defines standards for how to engage in a community. For more information, see “Adding a code of conduct to your project.”
CONTRIBUTING.md - A CONTRIBUTING file communicates how people should contribute to your project. For more information, see “Setting guidelines for repository contributors.”
Configuration files#
For more context on some of the tools referenced below, refer to the sections on Continuous Integration.
.github/workflows/ci.yaml - Workflow definition for our CI GitHub Actions pipeline.
.pre-commit-config.yaml - List of pre-commit hooks.
.editorconfig - EditorConfig standard configuration file.
mypy.ini - Configuration for the mypy static type checker.
build system requirements (probably won’t need to touch these!) and black configurations.
setup.cfg - Here, we specify the package metadata, requirements, as well as configuration details for flake8 and isort.
Release process#
Review the ## Unreleased changes section in CHANGES.md by checking for consistency in format and, if necessary, refactoring related entries into relevant subsections (e.g. Features, Docs, Bugfixes, Security, etc). Take a look at previous release notes for guidance and try to keep it consistent.
Submit a pull request with these changes only and use the "Cleanup release notes for X.X.X release" template for the pull request title. ridgeplot uses the SemVer (MAJOR.MINOR.PATCH) versioning standard. You can determine the latest release version by running git describe --tags --abbrev=0 on the master branch. Based on this, you can determine the next release version by incrementing the MAJOR, MINOR, or PATCH. More on this in the next section. For now, just make sure you merge this pull request into the master branch before continuing.
Use the bumpversion utility to bump the current version. This utility will automatically bump the current version, and issue a relevant commit and git tag. E.g.,
# Bump MAJOR version (e.g., 0.4.2 -> 1.0.0)
bumpversion major
# Bump MINOR version (e.g., 0.4.2 -> 0.5.0)
bumpversion minor
# Bump PATCH version (e.g., 0.4.2 -> 0.4.3)
bumpversion patch
You can always perform a dry-run to see what will happen under the hood.
bumpversion --dry-run --verbose [--allow-dirty] [major,minor,patch]
Push your changes along with all tag references:
git push --tags
Open a pull request titled "Release version X.X.X"
Wait for all CI checks to pass.
A ridgeplot main contributor should sign off and merge this pull request.
Create a new release using the GitHub UI.
Copy the raw markdown section in CHANGES.md corresponding to this release and use it as the description of the GitHub Release.
Use the same X.X.X tag used in the release.
At this point a GitHub Actions workflow will be triggered which will build and publish new wheels to PyPI. Be sure to check whether all workflows passed successfully.
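The MAJOR, MINOR, and PATCH increments used in the release steps above follow simple SemVer arithmetic: bumping one part resets every part to its right to zero. The sketch below illustrates that rule in plain Python. The bump() helper is hypothetical and is not how the bumpversion utility is implemented; unlike bumpversion, it does not create commits or tags.

```python
def bump(version, part):
    """Return the next MAJOR.MINOR.PATCH version after bumping `part`."""
    major, minor, patch = (int(number) for number in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part {part!r}; expected major, minor, or patch")


# Same starting version as in the bumpversion examples above
for part in ("major", "minor", "patch"):
    print(f"{part}: 0.4.2 -> {bump('0.4.2', part)}")
```

This can be handy for double-checking which X.X.X value to use in the release pull request titles before running the real tool.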
Code of Conduct#
Please remember to read and follow our Code of Conduct. 🤝