{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Pnorm and qnorm Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Motivation for Normal Functions\n", "\n", "Distributions are indispensable tools for modeling data: they let us handle uncertainty, quantify variability, and make inferential statements. One of the most widely used is the Normal Distribution, characterized by its mean \\(μ\\) (any real number) and standard deviation \\(σ > 0\\). It is frequently used to model real-world phenomena in which data cluster around a central value, such as heights, weights, test scores, and measurement errors.\n", "\n", "To visualize these distributions, the Probability Density Function (PDF) and the Cumulative Distribution Function (CDF) are commonly used. The PDF describes the relative likelihood of each value of a continuous random variable; for the normal distribution it is bell-shaped. The CDF, obtained by integrating the PDF, gives the probability that a random variable is less than or equal to a specified value.\n", "\n", "The functions `pnorm` and `qnorm` are inverses of each other: `pnorm` returns the cumulative probability for a specified quantile, and `qnorm` returns the quantile for a specified cumulative probability. Additionally, these functions plot both the PDF and the CDF, providing a visual representation of the cumulative probability associated with a particular quantile. The CDF graph plots the cumulative probability against the quantile, while the PDF graph shows the cumulative probability up to that quantile as the area under the curve." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Usage of pnorm and qnorm\n", "\n", "Here we will demonstrate how to use `pnorm` and `qnorm` to answer simple statistical questions using a dataset. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us take a look at our Palmer penguins dataset! The Palmer penguins dataset by Allison Horst, Alison Hill, and Kristen Gorman was made publicly available as an R package. We shall use the dataset to conduct exploratory data analysis and utilize our developed functions. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g |
|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 |