Module pandas_profiling

Main module of pandas-profiling.

Pandas Profiling


Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great but a little basic for serious exploratory data analysis. pandas_profiling extends the pandas DataFrame with df.profile_report() for quick data analysis.

For each column the following statistics - if relevant for the column type - are presented in an interactive HTML report:

  • Type inference: detect the types of columns in a dataframe.
  • Essentials: type, unique values, missing values
  • Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent values
  • Histogram
  • Correlations: highlighting of highly correlated variables; Spearman, Pearson and Kendall matrices
  • Missing values: matrix, count, heatmap and dendrogram of missing values
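To make the quantile and descriptive statistics in the list above concrete, here is a minimal sketch that computes a few of them with the Python standard library. It is an illustration only and not how pandas-profiling computes them; the sample values and the `summary` dictionary are made up for this example.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Quantile statistics; the "inclusive" method interpolates linearly
# between data points
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
summary = {
    "minimum": min(values),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "maximum": max(values),
    "range": max(values) - min(values),
    "interquartile range": q3 - q1,
}

# Descriptive statistics
mean = statistics.mean(values)
std = statistics.stdev(values)  # sample standard deviation
summary.update(
    {
        "mean": mean,
        "standard deviation": std,
        "sum": sum(values),
        "coefficient of variation": std / mean,
    }
)

for name, value in summary.items():
    print(f"{name}: {value}")
```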

Announcements

With your help, we got approved for GitHub Sponsors! It's extra exciting that GitHub matches your contribution for the first year. Therefore, we welcome you to support the project through GitHub!

The v2.4 release includes many new features (performance, exporting, GUI and datasets) and stability improvements.

January 7, 2020


Contents: Examples | Installation | Documentation | Large datasets | Command line usage | Advanced usage | Types | How to contribute | Editor Integration | Dependencies


Examples

The following examples can give you an impression of what the package can do:

Installation

Using pip


You can install using the pip package manager by running

pip install pandas-profiling[notebook,html]

Alternatively, you could install directly from GitHub:

pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

Using conda


You can install using the conda package manager by running

conda install -c conda-forge pandas-profiling

From source

Download the source code by cloning the repository or by pressing 'Download ZIP' on this page. Install by navigating to the proper directory and running

python setup.py install

Documentation

The documentation for pandas_profiling can be found here.

Getting started

Start by loading in your pandas DataFrame, e.g. by using

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame(
    np.random.rand(100, 5),
    columns=['a', 'b', 'c', 'd', 'e']
)

To generate the report, run:

profile = ProfileReport(df, title='Pandas Profiling Report', style={'full_width':True})

Jupyter Notebook

We recommend generating reports interactively by using the Jupyter notebook. There are two interfaces (see animations below): through widgets and through an HTML report.

Notebook Widgets

This is achieved by simply displaying the report. In the Jupyter Notebook, run:

profile

The HTML report can be included in a Jupyter notebook:

HTML

Run the following code:

profile.to_notebook_iframe()

Saving the report

If you want to generate an HTML report file, save the ProfileReport to an object and use the to_file() function:

profile.to_file(output_file="your_report.html")

Alternatively, you can obtain the data as JSON:

# As a string
json_data = profile.to_json()

# As a file
profile.to_file(output_file="your_report.json")

Large datasets

Version 2.4 introduces minimal mode. This is a default configuration that disables expensive computations (such as correlations and dynamic binning). Use the following syntax:

profile = ProfileReport(large_dataset, minimal=True)
profile.to_file(output_file="output.html")
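According to the constructor shown in the source listing on this page, `minimal=True` cannot be combined with a custom `config_file`. The sketch below mirrors that check in isolation; `resolve_config_file` and `MINIMAL_CONFIG` are hypothetical names introduced for this illustration and are not part of the package.

```python
from pathlib import Path
from typing import Optional

# Hypothetical stand-in for the packaged minimal configuration file
MINIMAL_CONFIG = Path("config_minimal.yaml")


def resolve_config_file(
    minimal: bool = False, config_file: Optional[Path] = None
) -> Optional[Path]:
    """Pick the configuration file, rejecting conflicting arguments."""
    if config_file is not None and minimal:
        raise ValueError(
            "Arguments `config_file` and `minimal` are mutually exclusive."
        )
    if minimal:
        return MINIMAL_CONFIG
    # May be None, meaning "use the default configuration"
    return config_file


print(resolve_config_file(minimal=True))
print(resolve_config_file(config_file=Path("my_config.yaml")))
```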

Command line usage

For standard formatted CSV files that can be read immediately by pandas, you can use the pandas_profiling executable. Run

pandas_profiling -h

for information about options and arguments.

Advanced usage

A set of options is available in order to adapt the report generated.

  • title (str): Title for the report ('Pandas Profiling Report' by default).
  • pool_size (int): Number of workers in thread pool. When set to zero, it is set to the number of CPUs available (0 by default).
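The `pool_size` convention above (zero means "use every available CPU") can be sketched as follows. The helper name `resolve_pool_size` is hypothetical, and the fallback to 1 when the CPU count is unknown is an assumption of this sketch rather than documented package behaviour.

```python
import os


def resolve_pool_size(pool_size: int = 0) -> int:
    """Return the worker count, mapping 0 to the number of CPUs."""
    if pool_size == 0:
        # os.cpu_count() may return None on some platforms; fall back to 1
        return os.cpu_count() or 1
    return pool_size


print(resolve_pool_size(4))  # an explicit value is used as given
print(resolve_pool_size(0))  # zero resolves to the CPU count
```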

More settings can be found in the default configuration file, minimal configuration file and dark themed configuration file.

Example

profile = df.profile_report(title='Pandas Profiling Report', plot={'histogram': {'bins': 8}})
profile.to_file(output_file="output.html")
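The example above passes a nested dictionary so that only one setting (`bins`) changes while its siblings keep their defaults. A rough sketch of that idea is a recursive merge of overrides into defaults. This is not how the package's configuration layer is implemented; `merge_settings` and the default values below are made up for illustration.

```python
import copy


def merge_settings(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Descend so that sibling keys in the nested dict survive
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


# Made-up defaults, for illustration only
defaults = {
    "title": "Report",
    "plot": {"histogram": {"bins": 50, "x_axis_labels": True}},
}
overrides = {
    "title": "Pandas Profiling Report",
    "plot": {"histogram": {"bins": 8}},
}

settings = merge_settings(defaults, overrides)
print(settings["plot"]["histogram"])
```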

Types

Types are a powerful abstraction for effective data analysis that goes beyond the logical data types (integer, float etc.). pandas-profiling currently recognizes the following types:

  • Boolean
  • Numerical
  • Date
  • Categorical
  • URL
  • Path

We have developed a type system for Python, tailored for data analysis: visions. Selecting the right typeset drastically reduces the complexity of the code for your analysis. Future versions of pandas-profiling will have extended type support through visions!
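To illustrate why semantic types go beyond logical data types, the following toy sketch assigns one of the type names listed above to a column of strings. It is a deliberately naive illustration using only the standard library, not the inference used by pandas-profiling or visions; `infer_type` and its helper functions are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse


def _is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text: str) -> bool:
    try:
        datetime.strptime(text, "%Y-%m-%d")  # ISO dates only, for brevity
        return True
    except ValueError:
        return False


def _is_url(text: str) -> bool:
    parts = urlparse(text)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def _is_path(text: str) -> bool:
    # Only absolute paths are recognised in this sketch
    return PurePosixPath(text).is_absolute() or PureWindowsPath(text).is_absolute()


def infer_type(values: list) -> str:
    """Return the first type whose check passes for every value."""
    if not values:
        raise ValueError("cannot infer a type from an empty column")
    checks = [
        ("Boolean", lambda v: v.lower() in {"true", "false"}),
        ("Numerical", _is_number),
        ("Date", _is_date),
        ("URL", _is_url),
        ("Path", _is_path),
    ]
    for name, check in checks:
        if all(check(v) for v in values):
            return name
    return "Categorical"


print(infer_type(["https://github.com/pandas-profiling"]))
print(infer_type(["red", "green", "red"]))
```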

How to contribute

Questions: Stackoverflow "pandas-profiling"

The package is actively maintained and developed as open-source software. If pandas-profiling was helpful or interesting to you, you might want to get involved. There are several ways of contributing and helping our thousands of users. If you would like to be an industry partner or sponsor, please drop us a line.

The documentation is generated using pdoc3. If you are contributing to this project, you can rebuild the documentation using:

make docs

or on Windows:

make.bat docs

Read more on getting involved in the Contribution Guide.

Editor integration

PyCharm integration

  1. Install pandas-profiling via the instructions above
  2. Locate your pandas-profiling executable.

    On macOS / Linux / BSD:

    $ which pandas_profiling
    (example) /usr/local/bin/pandas_profiling

    On Windows:

    $ where pandas_profiling
    (example) C:\ProgramData\Anaconda3\Scripts\pandas_profiling.exe

  3. In PyCharm, go to Settings (or Preferences on macOS) > Tools > External tools

  4. Click the + icon to add a new external tool
  5. Insert the following values
    • Name: Pandas Profiling
    • Program: The location obtained in step 2
    • Arguments: "$FilePath$" "$FileDir$/$FileNameWithoutAllExtensions$_report.html"
    • Working Directory: $ProjectFileDir$
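The second argument in step 5 places the report next to the dataset and names it after the file with all extensions removed. A small sketch of how that macro expression resolves, assuming POSIX-style paths; `report_path_for` is a hypothetical helper, not something PyCharm or the package provides.

```python
from pathlib import PurePosixPath


def report_path_for(dataset: str) -> str:
    """Mimic "$FileDir$/$FileNameWithoutAllExtensions$_report.html"."""
    path = PurePosixPath(dataset)
    # Drop every extension, e.g. "sales.csv.gz" -> "sales"
    stem = path.name.split(".", 1)[0]
    return str(path.parent / f"{stem}_report.html")


print(report_path_for("/data/sales.csv"))
print(report_path_for("/data/sales.csv.gz"))
```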


To use the PyCharm Integration, right click on any dataset file: External Tools > Pandas Profiling.

Other integrations

Other editor integrations may be contributed via pull requests.

Dependencies

The profile report is written in HTML and CSS, which means pandas-profiling requires a modern browser.

You need Python 3 to run this package. Other dependencies can be found in the requirements files:

| Filename | Requirements |
|----------|--------------|
| requirements.txt | Package requirements |
| requirements-dev.txt | Requirements for development |
| requirements-test.txt | Requirements for testing |
| setup.py | Requirements for Widgets etc. |
"""Main module of pandas-profiling.

.. include:: ../../README.md
"""
import sys
import warnings
import json
from pathlib import Path
from datetime import datetime

import pandas as pd
import numpy as np

from pandas_profiling.version import __version__
from pandas_profiling.utils.dataframe import clean_column_names, rename_index
from pandas_profiling.utils.paths import get_config_default, get_config_minimal
from pandas_profiling.config import config
from pandas_profiling.controller import pandas_decorator
from pandas_profiling.model.describe import describe as describe_df
from pandas_profiling.report import get_report_structure


class ProfileReport(object):
    """Generate a profile report from a Dataset stored as a pandas `DataFrame`.
    
    Used as is, it will output its content as an HTML report in a Jupyter notebook.
    """

    html = ""
    """the HTML representation of the report, without the wrapper (containing `<head>` etc.)"""

    def __init__(self, df, minimal=False, config_file: Path = None, **kwargs):
        if config_file is not None and minimal:
            raise ValueError(
                "Arguments `config_file` and `minimal` are mutually exclusive."
            )

        if minimal:
            config_file = get_config_minimal()

        if config_file:
            config.config.set_file(str(config_file))
        config.set_kwargs(kwargs)

        self.date_start = datetime.utcnow()

        # Treat index as any other column
        if (
            not pd.Index(np.arange(0, len(df))).equals(df.index)
            or df.index.dtype != np.int64
        ):
            df = df.reset_index()

        # Rename reserved column names
        df = rename_index(df)

        # Remove spaces and colons from column names
        df = clean_column_names(df)

        # Sort names according to config (asc, desc, no sort)
        df = self.sort_column_names(df)
        config["column_order"] = df.columns.tolist()

        # Get dataset statistics
        description_set = describe_df(df)

        # Build report structure
        self.sample = self.get_sample(df)
        self.title = config["title"].get(str)
        self.description_set = description_set

        self.date_end = datetime.utcnow()
        self.report = get_report_structure(
            self.date_start, self.date_end, self.sample, description_set
        )

    def sort_column_names(self, df):
        sort = config["sort"].get(str)
        if sys.version_info[1] <= 5 and sort != "None":
            warnings.warn("Sorting is supported from Python 3.6+")

        if sort in ["asc", "ascending"]:
            df = df.reindex(sorted(df.columns, key=lambda s: s.casefold()), axis=1)
        elif sort in ["desc", "descending"]:
            df = df.reindex(
                reversed(sorted(df.columns, key=lambda s: s.casefold())), axis=1
            )
        elif sort != "None":
            raise ValueError('"sort" should be "ascending", "descending" or None.')
        return df

    def get_sample(self, df: pd.DataFrame) -> dict:
        sample = {}
        n_head = config["samples"]["head"].get(int)
        if n_head > 0:
            sample["head"] = df.head(n=n_head)

        n_tail = config["samples"]["tail"].get(int)
        if n_tail > 0:
            sample["tail"] = df.tail(n=n_tail)

        return sample

    def get_description(self) -> dict:
        """Return the description (a raw statistical summary) of the dataset.
        
        Returns:
            Dict containing a description for each variable in the DataFrame.
        """
        return self.description_set

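    def get_rejected_variables(self) -> list:
        """Return the names of the variables that were rejected."""
        # MessageType is not imported at module level; its location in
        # pandas_profiling.model.messages is assumed here.
        from pandas_profiling.model.messages import MessageType

        return [
            message.column_name
            for message in self.description_set["messages"]
            if message.message_type == MessageType.REJECTED
        ]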

    def to_file(self, output_file: Path, silent: bool = True) -> None:
        """Write the report to a file.
        
        By default a name is generated.

        Args:
            output_file: The name or the path of the file to generate including the extension (.html, .json).
            silent: if False, opens the file in the default browser
        """
        if not isinstance(output_file, Path):
            output_file = Path(str(output_file))

        if output_file.suffix == ".html":
            data = self.to_html()
        elif output_file.suffix == ".json":
            data = self.to_json()
        else:
            raise ValueError("Extension not supported (please use .html, .json)")

        with output_file.open("w", encoding="utf8") as f:
            f.write(data)

        if not silent:
            import webbrowser

            webbrowser.open_new_tab(output_file.absolute().as_uri())

    def to_html(self) -> str:
        """Generate and return complete template as lengthy string
            for using with frameworks.

        Returns:
            Profiling report html including wrapper.
        
        """
        from pandas_profiling.report.presentation.flavours import HTMLReport
        from pandas_profiling.report.presentation.flavours.html import templates

        use_local_assets = config["html"]["use_local_assets"].get(bool)

        html = HTMLReport(self.report).render()

        # TODO: move to structure
        wrapped_html = templates.template("wrapper/wrapper.html").render(
            content=html,
            title=self.title,
            correlation=len(self.description_set["correlations"]) > 0,
            missing=len(self.description_set["missing"]) > 0,
            scatter=len(self.description_set["scatter"]) > 0,
            sample=len(self.sample) > 0,
            version=__version__,
            offline=use_local_assets,
            primary_color=config["html"]["style"]["primary_color"].get(str),
            logo=config["html"]["style"]["logo"].get(str),
            theme=config["html"]["style"]["theme"].get(str),
        )

        minify_html = config["html"]["minify_html"].get(bool)
        if minify_html:
            from htmlmin.main import minify

            wrapped_html = minify(
                wrapped_html, remove_all_empty_space=True, remove_comments=True
            )
        return wrapped_html

    def to_json(self) -> str:
        class CustomEncoder(json.JSONEncoder):
            def default(self, o):
                if isinstance(o, pd.core.series.Series) or isinstance(
                    o, pd.core.frame.DataFrame
                ):
                    return {"__{}__".format(o.__class__.__name__): o.to_json()}
                if isinstance(o, np.integer):
                    return {"__{}__".format(o.__class__.__name__): o.tolist()}

                return {"__{}__".format(o.__class__.__name__): str(o)}

        return json.dumps(self.description_set, indent=4, cls=CustomEncoder)

    def to_notebook_iframe(self):
        """Used to output the HTML representation to a Jupyter notebook.
        When config.notebook.iframe.attribute is "src", this function creates a temporary HTML file
        in `./tmp/profile_[hash].html` and returns an Iframe pointing to its contents.
        When config.notebook.iframe.attribute is "srcdoc", the same HTML is injected in the "srcdoc" attribute of
        the Iframe.

        Notes:
            This construction solves problems with conflicting stylesheets and navigation links.
        """
        from pandas_profiling.report.presentation.flavours.widget.notebook import (
            get_notebook_iframe,
        )
        from IPython.core.display import display

        display(get_notebook_iframe(self))

    def to_widgets(self):
        """The ipython notebook widgets user interface."""
        from pandas_profiling.report.presentation.flavours import WidgetReport
        from IPython.core.display import display, HTML

        report = WidgetReport(self.report).render()

        display(report)
        # TODO: move to report structure
        display(
            HTML(
                'Report generated with <a href="https://github.com/pandas-profiling/pandas-profiling">pandas-profiling</a>.'
            )
        )

    def _repr_html_(self):
        """The ipython notebook widgets user interface gets called by the jupyter notebook."""
        self.to_notebook_iframe()

    def __repr__(self):
        """Override so that Jupyter Notebook does not print the object."""
        return ""

    def to_app(self):
        """
        (Experimental) PyQt5 user interface, not ready to be used.
        You are welcome to contribute a pull request if you like this feature.
        """
        from pandas_profiling.report.presentation.flavours.qt.app import get_app
        from pandas_profiling.report.presentation.flavours import QtReport

        from PyQt5 import QtCore
        from PyQt5.QtWidgets import QApplication

        app = QtCore.QCoreApplication.instance()
        if app is None:
            app = QApplication([])

        app_widgets = QtReport(self.report).render()

        app = get_app(app, self.title, app_widgets)

Sub-modules

pandas_profiling.config

Configuration for the package is handled in this wrapper for confuse.

pandas_profiling.controller

The controller module handles all user interaction with the package (console, jupyter, etc.).

pandas_profiling.model

The model module handles all logic/calculations, e.g. calculate statistics, testing for special conditions.

pandas_profiling.report

All functionality concerned with presentation to the user.

pandas_profiling.utils

Utility functions for the complete package.

pandas_profiling.version

This file is auto-generated by setup.py, please do not alter.

pandas_profiling.visualisation

Code for generating plots

Classes

class ProfileReport (df, minimal=False, config_file=None, **kwargs)

Generate a profile report from a Dataset stored as a pandas DataFrame.

Used as is, it will output its content as an HTML report in a Jupyter notebook.

class ProfileReport(object):
    """Generate a profile report from a Dataset stored as a pandas `DataFrame`.
    
    Used as is, it will output its content as an HTML report in a Jupyter notebook.
    """

    html = ""
    """the HTML representation of the report, without the wrapper (containing `<head>` etc.)"""

    def __init__(self, df, minimal=False, config_file: Path = None, **kwargs):
        if config_file is not None and minimal:
            raise ValueError(
                "Arguments `config_file` and `minimal` are mutually exclusive."
            )

        if minimal:
            config_file = get_config_minimal()

        if config_file:
            config.config.set_file(str(config_file))
        config.set_kwargs(kwargs)

        self.date_start = datetime.utcnow()

        # Treat index as any other column
        if (
            not pd.Index(np.arange(0, len(df))).equals(df.index)
            or df.index.dtype != np.int64
        ):
            df = df.reset_index()

        # Rename reserved column names
        df = rename_index(df)

        # Remove spaces and colons from column names
        df = clean_column_names(df)

        # Sort names according to config (asc, desc, no sort)
        df = self.sort_column_names(df)
        config["column_order"] = df.columns.tolist()

        # Get dataset statistics
        description_set = describe_df(df)

        # Build report structure
        self.sample = self.get_sample(df)
        self.title = config["title"].get(str)
        self.description_set = description_set

        self.date_end = datetime.utcnow()
        self.report = get_report_structure(
            self.date_start, self.date_end, self.sample, description_set
        )

    def sort_column_names(self, df):
        sort = config["sort"].get(str)
        if sys.version_info[1] <= 5 and sort != "None":
            warnings.warn("Sorting is supported from Python 3.6+")

        if sort in ["asc", "ascending"]:
            df = df.reindex(sorted(df.columns, key=lambda s: s.casefold()), axis=1)
        elif sort in ["desc", "descending"]:
            df = df.reindex(
                reversed(sorted(df.columns, key=lambda s: s.casefold())), axis=1
            )
        elif sort != "None":
            raise ValueError('"sort" should be "ascending", "descending" or None.')
        return df

    def get_sample(self, df: pd.DataFrame) -> dict:
        sample = {}
        n_head = config["samples"]["head"].get(int)
        if n_head > 0:
            sample["head"] = df.head(n=n_head)

        n_tail = config["samples"]["tail"].get(int)
        if n_tail > 0:
            sample["tail"] = df.tail(n=n_tail)

        return sample

    def get_description(self) -> dict:
        """Return the description (a raw statistical summary) of the dataset.
        
        Returns:
            Dict containing a description for each variable in the DataFrame.
        """
        return self.description_set

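    def get_rejected_variables(self) -> list:
        """Return the names of the variables that were rejected."""
        # MessageType is not imported at module level; its location in
        # pandas_profiling.model.messages is assumed here.
        from pandas_profiling.model.messages import MessageType

        return [
            message.column_name
            for message in self.description_set["messages"]
            if message.message_type == MessageType.REJECTED
        ]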

    def to_file(self, output_file: Path, silent: bool = True) -> None:
        """Write the report to a file.
        
        By default a name is generated.

        Args:
            output_file: The name or the path of the file to generate including the extension (.html, .json).
            silent: if False, opens the file in the default browser
        """
        if not isinstance(output_file, Path):
            output_file = Path(str(output_file))

        if output_file.suffix == ".html":
            data = self.to_html()
        elif output_file.suffix == ".json":
            data = self.to_json()
        else:
            raise ValueError("Extension not supported (please use .html, .json)")

        with output_file.open("w", encoding="utf8") as f:
            f.write(data)

        if not silent:
            import webbrowser

            webbrowser.open_new_tab(output_file.absolute().as_uri())

    def to_html(self) -> str:
        """Generate and return complete template as lengthy string
            for using with frameworks.

        Returns:
            Profiling report html including wrapper.
        
        """
        from pandas_profiling.report.presentation.flavours import HTMLReport
        from pandas_profiling.report.presentation.flavours.html import templates

        use_local_assets = config["html"]["use_local_assets"].get(bool)

        html = HTMLReport(self.report).render()

        # TODO: move to structure
        wrapped_html = templates.template("wrapper/wrapper.html").render(
            content=html,
            title=self.title,
            correlation=len(self.description_set["correlations"]) > 0,
            missing=len(self.description_set["missing"]) > 0,
            scatter=len(self.description_set["scatter"]) > 0,
            sample=len(self.sample) > 0,
            version=__version__,
            offline=use_local_assets,
            primary_color=config["html"]["style"]["primary_color"].get(str),
            logo=config["html"]["style"]["logo"].get(str),
            theme=config["html"]["style"]["theme"].get(str),
        )

        minify_html = config["html"]["minify_html"].get(bool)
        if minify_html:
            from htmlmin.main import minify

            wrapped_html = minify(
                wrapped_html, remove_all_empty_space=True, remove_comments=True
            )
        return wrapped_html

    def to_json(self) -> str:
        class CustomEncoder(json.JSONEncoder):
            def default(self, o):
                if isinstance(o, pd.core.series.Series) or isinstance(
                    o, pd.core.frame.DataFrame
                ):
                    return {"__{}__".format(o.__class__.__name__): o.to_json()}
                if isinstance(o, np.integer):
                    return {"__{}__".format(o.__class__.__name__): o.tolist()}

                return {"__{}__".format(o.__class__.__name__): str(o)}

        return json.dumps(self.description_set, indent=4, cls=CustomEncoder)

    def to_notebook_iframe(self):
        """Used to output the HTML representation to a Jupyter notebook.
        When config.notebook.iframe.attribute is "src", this function creates a temporary HTML file
        in `./tmp/profile_[hash].html` and returns an Iframe pointing to its contents.
        When config.notebook.iframe.attribute is "srcdoc", the same HTML is injected in the "srcdoc" attribute of
        the Iframe.

        Notes:
            This construction solves problems with conflicting stylesheets and navigation links.
        """
        from pandas_profiling.report.presentation.flavours.widget.notebook import (
            get_notebook_iframe,
        )
        from IPython.core.display import display

        display(get_notebook_iframe(self))

    def to_widgets(self):
        """The ipython notebook widgets user interface."""
        from pandas_profiling.report.presentation.flavours import WidgetReport
        from IPython.core.display import display, HTML

        report = WidgetReport(self.report).render()

        display(report)
        # TODO: move to report structure
        display(
            HTML(
                'Report generated with <a href="https://github.com/pandas-profiling/pandas-profiling">pandas-profiling</a>.'
            )
        )

    def _repr_html_(self):
        """The ipython notebook widgets user interface gets called by the jupyter notebook."""
        self.to_notebook_iframe()

    def __repr__(self):
        """Override so that Jupyter Notebook does not print the object."""
        return ""

    def to_app(self):
        """
        (Experimental) PyQt5 user interface, not ready to be used.
        You are welcome to contribute a pull request if you like this feature.
        """
        from pandas_profiling.report.presentation.flavours.qt.app import get_app
        from pandas_profiling.report.presentation.flavours import QtReport

        from PyQt5 import QtCore
        from PyQt5.QtWidgets import QApplication

        app = QtCore.QCoreApplication.instance()
        if app is None:
            app = QApplication([])

        app_widgets = QtReport(self.report).render()

        app = get_app(app, self.title, app_widgets)

Class variables

var html

the HTML representation of the report, without the wrapper (containing <head> etc.)

Methods

def get_description(self)

Return the description (a raw statistical summary) of the dataset.

Returns

Dict containing a description for each variable in the DataFrame.

def get_description(self) -> dict:
    """Return the description (a raw statistical summary) of the dataset.
    
    Returns:
        Dict containing a description for each variable in the DataFrame.
    """
    return self.description_set
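def get_rejected_variables(self)
def get_rejected_variables(self) -> list:
    """Return the names of the variables that were rejected."""
    # MessageType is not imported at module level; its location in
    # pandas_profiling.model.messages is assumed here.
    from pandas_profiling.model.messages import MessageType

    return [
        message.column_name
        for message in self.description_set["messages"]
        if message.message_type == MessageType.REJECTED
    ]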
def get_sample(self, df)
def get_sample(self, df: pd.DataFrame) -> dict:
    sample = {}
    n_head = config["samples"]["head"].get(int)
    if n_head > 0:
        sample["head"] = df.head(n=n_head)

    n_tail = config["samples"]["tail"].get(int)
    if n_tail > 0:
        sample["tail"] = df.tail(n=n_tail)

    return sample
def sort_column_names(self, df)
def sort_column_names(self, df):
    sort = config["sort"].get(str)
    if sys.version_info[1] <= 5 and sort != "None":
        warnings.warn("Sorting is supported from Python 3.6+")

    if sort in ["asc", "ascending"]:
        df = df.reindex(sorted(df.columns, key=lambda s: s.casefold()), axis=1)
    elif sort in ["desc", "descending"]:
        df = df.reindex(
            reversed(sorted(df.columns, key=lambda s: s.casefold())), axis=1
        )
    elif sort != "None":
        raise ValueError('"sort" should be "ascending", "descending" or None.')
    return df
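The sorting in `sort_column_names` above is case-insensitive because it uses `str.casefold` as the sort key. A minimal standalone sketch of the difference, using made-up column names and plain lists instead of a DataFrame:

```python
columns = ["beta", "Alpha", "gamma", "Delta"]

# Plain sorting compares code points, so uppercase names come first
plain = sorted(columns)

# Case-insensitive ordering, as in the "ascending" branch above
ascending = sorted(columns, key=lambda s: s.casefold())

# The "descending" branch reverses the case-insensitive ordering
descending = list(reversed(sorted(columns, key=lambda s: s.casefold())))

print(plain)       # ['Alpha', 'Delta', 'beta', 'gamma']
print(ascending)   # ['Alpha', 'beta', 'Delta', 'gamma']
print(descending)  # ['gamma', 'Delta', 'beta', 'Alpha']
```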
def to_app(self)

(Experimental) PyQt5 user interface, not ready to be used. You are welcome to contribute a pull request if you like this feature.

def to_app(self):
    """
    (Experimental) PyQt5 user interface, not ready to be used.
    You are welcome to contribute a pull request if you like this feature.
    """
    from pandas_profiling.report.presentation.flavours.qt.app import get_app
    from pandas_profiling.report.presentation.flavours import QtReport

    from PyQt5 import QtCore
    from PyQt5.QtWidgets import QApplication

    app = QtCore.QCoreApplication.instance()
    if app is None:
        app = QApplication([])

    app_widgets = QtReport(self.report).render()

    app = get_app(app, self.title, app_widgets)
def to_file(self, output_file, silent=True)

Write the report to a file.

By default a name is generated.

Args

output_file
The name or the path of the file to generate including the extension (.html, .json).
silent
if False, opens the file in the default browser
def to_file(self, output_file: Path, silent: bool = True) -> None:
    """Write the report to a file.
    
    By default a name is generated.

    Args:
        output_file: The name or the path of the file to generate including the extension (.html, .json).
        silent: if False, opens the file in the default browser
    """
    if not isinstance(output_file, Path):
        output_file = Path(str(output_file))

    if output_file.suffix == ".html":
        data = self.to_html()
    elif output_file.suffix == ".json":
        data = self.to_json()
    else:
        raise ValueError("Extension not supported (please use .html, .json)")

    with output_file.open("w", encoding="utf8") as f:
        f.write(data)

    if not silent:
        import webbrowser

        webbrowser.open_new_tab(output_file.absolute().as_uri())
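`to_file` above chooses the output format from the file extension. The same dispatch pattern can be tried in isolation with the sketch below; `write_report` is a hypothetical stand-in that takes ready-made HTML and data instead of a `ProfileReport`.

```python
import json
import tempfile
from pathlib import Path


def write_report(output_file, html: str, data: dict) -> Path:
    """Write `html` or `data` depending on the extension of `output_file`."""
    output_file = Path(str(output_file))

    if output_file.suffix == ".html":
        payload = html
    elif output_file.suffix == ".json":
        payload = json.dumps(data, indent=4)
    else:
        raise ValueError("Extension not supported (please use .html, .json)")

    with output_file.open("w", encoding="utf8") as f:
        f.write(payload)
    return output_file


with tempfile.TemporaryDirectory() as tmp:
    written = write_report(Path(tmp) / "report.json", "<h1>Report</h1>", {"n": 100})
    print(written.read_text(encoding="utf8"))
```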
def to_html(self)

Generate and return complete template as lengthy string for using with frameworks.

Returns

Profiling report html including wrapper.

def to_html(self) -> str:
    """Generate and return complete template as lengthy string
        for using with frameworks.

    Returns:
        Profiling report html including wrapper.
    
    """
    from pandas_profiling.report.presentation.flavours import HTMLReport
    from pandas_profiling.report.presentation.flavours.html import templates

    use_local_assets = config["html"]["use_local_assets"].get(bool)

    html = HTMLReport(self.report).render()

    # TODO: move to structure
    wrapped_html = templates.template("wrapper/wrapper.html").render(
        content=html,
        title=self.title,
        correlation=len(self.description_set["correlations"]) > 0,
        missing=len(self.description_set["missing"]) > 0,
        scatter=len(self.description_set["scatter"]) > 0,
        sample=len(self.sample) > 0,
        version=__version__,
        offline=use_local_assets,
        primary_color=config["html"]["style"]["primary_color"].get(str),
        logo=config["html"]["style"]["logo"].get(str),
        theme=config["html"]["style"]["theme"].get(str),
    )

    minify_html = config["html"]["minify_html"].get(bool)
    if minify_html:
        from htmlmin.main import minify

        wrapped_html = minify(
            wrapped_html, remove_all_empty_space=True, remove_comments=True
        )
    return wrapped_html
def to_json(self)
def to_json(self) -> str:
    class CustomEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, pd.core.series.Series) or isinstance(
                o, pd.core.frame.DataFrame
            ):
                return {"__{}__".format(o.__class__.__name__): o.to_json()}
            if isinstance(o, np.integer):
                return {"__{}__".format(o.__class__.__name__): o.tolist()}

            return {"__{}__".format(o.__class__.__name__): str(o)}

    return json.dumps(self.description_set, indent=4, cls=CustomEncoder)
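The encoder in `to_json` above relies on `json.JSONEncoder.default`, which is only called for objects the `json` module cannot serialise natively, and wraps each such object in a dictionary keyed by its class name. A standalone sketch of that tagging pattern, using standard-library objects instead of pandas ones (`TaggingEncoder` and the payload are made up for this example):

```python
import json
from datetime import datetime
from pathlib import PurePosixPath


class TaggingEncoder(json.JSONEncoder):
    def default(self, o):
        # Reached only for objects json cannot serialise natively
        return {"__{}__".format(o.__class__.__name__): str(o)}


payload = {
    "n": 100,
    "started": datetime(2020, 1, 7),
    "source": PurePosixPath("data/sales.csv"),
}

encoded = json.dumps(payload, indent=4, cls=TaggingEncoder)
decoded = json.loads(encoded)

print(decoded["started"])  # {'__datetime__': '2020-01-07 00:00:00'}
print(decoded["source"])   # {'__PurePosixPath__': 'data/sales.csv'}
```

A consumer of the JSON can use the `__ClassName__` keys to decide how to rebuild each value.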
def to_notebook_iframe(self)

Used to output the HTML representation to a Jupyter notebook. When config.notebook.iframe.attribute is "src", this function creates a temporary HTML file in ./tmp/profile_[hash].html and returns an Iframe pointing to its contents. When config.notebook.iframe.attribute is "srcdoc", the same HTML is injected in the "srcdoc" attribute of the Iframe.

Notes

This construction solves problems with conflicting stylesheets and navigation links.

def to_notebook_iframe(self):
    """Used to output the HTML representation to a Jupyter notebook.
    When config.notebook.iframe.attribute is "src", this function creates a temporary HTML file
    in `./tmp/profile_[hash].html` and returns an Iframe pointing to its contents.
    When config.notebook.iframe.attribute is "srcdoc", the same HTML is injected in the "srcdoc" attribute of
    the Iframe.

    Notes:
        This construction solves problems with conflicting stylesheets and navigation links.
    """
    from pandas_profiling.report.presentation.flavours.widget.notebook import (
        get_notebook_iframe,
    )
    from IPython.core.display import display

    display(get_notebook_iframe(self))
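The docstring above distinguishes two ways of filling the iframe: pointing `src` at a file, or embedding the HTML in `srcdoc`. The sketch below builds both variants as plain strings to show the difference. It is not the package's `get_notebook_iframe`; the function names and the default width and height are assumptions made for this example.

```python
import html


def iframe_src(url: str, width: str = "100%", height: str = "800px") -> str:
    """Iframe that loads the report from a separate file or URL."""
    return '<iframe width="{}" height="{}" src="{}"></iframe>'.format(
        width, height, html.escape(url, quote=True)
    )


def iframe_srcdoc(report_html: str, width: str = "100%", height: str = "800px") -> str:
    """Iframe that carries the whole report inline in its srcdoc attribute."""
    # The document must be escaped so it can live inside an attribute value
    escaped = html.escape(report_html, quote=True)
    return '<iframe width="{}" height="{}" srcdoc="{}"></iframe>'.format(
        width, height, escaped
    )


print(iframe_src("tmp/profile_example.html"))
print(iframe_srcdoc("<h1>Report</h1>"))
```

Either way, the report's stylesheets and links stay inside the iframe, which is the isolation the note in the docstring refers to.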
def to_widgets(self)

The ipython notebook widgets user interface.

def to_widgets(self):
    """The ipython notebook widgets user interface."""
    from pandas_profiling.report.presentation.flavours import WidgetReport
    from IPython.core.display import display, HTML

    report = WidgetReport(self.report).render()

    display(report)
    # TODO: move to report structure
    display(
        HTML(
            'Report generated with <a href="https://github.com/pandas-profiling/pandas-profiling">pandas-profiling</a>.'
        )
    )