Statistical Literacy Tools

Milo Schield, Editor

Tome Burnham

This page contains various tools that may be helpful. All these tools are web-accessible and are freely available for use. Tom Burnham (pictured above) created the VB tools (Reading/Writing drill); Milo Schield created the Excel demonstrations and documentation.  Burnham accomplishments

  1. Chance Demonstrations: Coincidence (Runs, Clusters, Birthday problem),  Four Envelope problem,  Law of Very Large Numbers
  2. Surveys: These help people evaluate their strengths and opportunities for improvement in various aspects of statistical literacy.
  3. Describe & Compare Percentages and Rates using Ordinary English.
  4. Standardizing: Using graphs, these web programs show users the impact of controlling for a related factor.
  5. Part-whole Grammar Reading and Writing Drills: This web programs help users develop their skills in using ordinary English to describe and compare rates and percentages as presented in pie charts, in statements, in tables and in graphs. The author (Tom Burnham) of these cutting-edge programs is shown above.
  6. Odysseys2sense: A lively game of civil discourse.
  7. Using Excel pivot tables to create summary statistics.
  8. Model Data Using Excel
  9. Create Histograms using Excel
  10. Create Discrete Distributions using Excel
  11. Create Log-Normal Distributions using Excel
  12. Create Distributions Empirically using Excel
  13. Create Sampling Distributions using Excel
  14. Create and Compare Confidence Intervals using Excel.
  15. Simple Models for Statistical Significance
  16. Create Hypothesis Tests using Excel
  17. Approximate Cutoffs for Statistical Significance
  18. Using Microsoft Word

0. BASICS/TOOLS:

1. CHANCE-COINCIDENCE DEMONSTRATIONS:

2. SURVEYS

  • Five Table survey.  This short survey (~7 minutes) asks respondent which statements accurately describe the circled percentage in five different tables. This survey is available in a web version with immediate grading and feedback on errors, and in a paper version.
  • 2008 Statistical Literacy Skills Survey (20 Questions).  This survey was based on the longer 2002 W. M. Keck Statistical Literacy Survey. For details on the construction of this instrument, see Schield (2008) titled “Statistical Literacy Skills Survey.”
  • 2002 W. M. Keck Statistical Literacy Survey.  This very long survey (~50 minutes) is available in Perseus-powered web version that acquires the data but gives no interactive feedback, and in a paper version. The results and analysis of this survey are also available.

3. DESCRIBE AND COMPARE PERCENTAGES, RATES AND AVERAGES USING ORDINARY ENGLISH:
   Isaacson data set (240 rows, 8 columns)

4. STANDARDIZING

Standardizing is a new graphical technique for taking into account the influence of a related factor. This site contains several articles on confounding, standardization and Simpson's Paradox.

Here are two web-based interactive Excel programs that demonstrate standardization.

For a broader mathematical background on Simpson's paradox, see:

5. RATIO GRAMMAR READING-WRITING DRILLS  Tom Burnham Accomplishments (1995-2014)

Go to the Grammar Read-Write Index page for details on browser choice, compatibility mode, and for various demos (video, PowerPoint w audio and PowerPoint slides).

6. ODYSSEYS2SENSE:  a game of lively discourse.  Website

7. USING EXCEL TO PRESENT AND CREATE SUMMARY STATISTICS

Presentation: Describe and Summarize

  • XL0Z: Moodle Split Screen: Excel vs. URL.  Slides
  • XL1A: Use 1D nominal frequency data to create pie & bar count & percent charts:  OutputSlides (12)Data Input
  • XL1B: Use 2D count data to create pie & bar charts:  Count (2C):  OutputSlides (10)Input  Count & %: OutputInput
  • XL1B2: Use ordinal pivot-table data to create pie & bar count & percent charts:  Output  Slides  Input
  • XL1C: Create time-based line graphs. Linear and Log-Ratio display.  OutputSlides (12)Data

Creation of statistical summary data:

  • XL1D: Create Pivot tables of counts, averages, statistics and percentages (100% column, 100% row and half table of percentages)  Excel 2013  OutputSlides (24)Data.  Excel 2008 OutputSlidesData.  Excel 2003 Demo slides: 1up6up
  • XL1E: Create Ranks and Percentiles.  OutputInput
  • XL1F: Create Histogram using Excel HISTOGRAM command: Excel 2013  OutputSlides (13)DataData (pdf) X
  • XL1G: Create Histogram using COUNTIF and FREQUENCY functions:  Excel 2013:  OutputSlides (18)Data (xlsx)Data (pdf)
  • XL1H: Creating Funnel Charts using Excel 2013: In-cell, stacked bar and stacked area.  DemoData
  • XL1J: Create centered stacked 100% bar-charts for ordinal data using Excel 2013.  Even and Odd #: Output  Even #:  OutputSlidesData  Odd #:  OutputSlidesData
  • XL1K: Using CORREL function  Output,  Slides  Data
  • XL1L: Interpreting zero correlation.  Output,  Slides,  Data

8. MODEL DATA USING EXCEL LINEAR REGRESSION Pulse dataset

  • XL2A: Chart Trendline: Linear  1Y1X (3 graphs)  DemoSlides (23)
  • XL2B: Chart Trendline: Non-Linear  1Y1X  (6 graphs)  DemoSlides (12)
  • XL2C: Chart Trendline: Linear  Two different Y scales on same raph.  2Y1X (4 graphs)  DemoSlides (36)
  • XL2D: Chart Trendline: Linear  Three-factor 1Y/2Groups/1X.  New (pivot table, 2 graphs):  DemoSlides (12)  [Do XL2C first]
  • XL2G: Association vs. Causation.  Explains vs. Causes using Excel.  Output
  • Linear regression using functions:   Can do on Mac.
    • XL2I: OLS1A  Weight|Height.  Demo  Slides (12)  Data.  Using Intercept, Slope and R-sq.
    • XL2J: OLS1A  Weight|Height.  Demo  Slides (12)  Data.  Using Predict, Trend and R-sq.
    • XL2K: OLS1A  Weight|Height.  OLS1B  Weight|Ht+Male.  OLS1C  Weight|Height+Rest Pulse.  Demo  Slides (12)  Data.  Using LINEST.
    • XL2M: OLS1A  Weight|Height.  Demo  Slides (12)  Data.  Prediction Intervals using LINEST function
    • XL2N: OLS   Demo  Slides (12)  Data.  Non-linear Regression using LINEST
    • XL2O: OLS1D  Simpson's Reversal Demo  Slides (12)  Data
  • Changed names: 3a Wt vs Ht; 3b Wt vs Ht & smoke;  3c Wt vs Ht & Pulse1 (1/18/2018)
    Multiple Regression using Data Analysis Toolpak:
  • XL3A: Toolpak: Regression using Data Analysis 1Y1C  OutputSlides (12)Input  Weight by Height
  • XL3B: Toolpak: Regression using Data Analysis 1Y2BC  OutputSlides (18)Input  Weight by Height and Gender
  • XL3C: Toolpak: Regression using Data Analysis 1Y2C  OutputInput  [No slides]  Weight by Height and Pulse1
  • XL3D: Toolpak: Regression MV4  Example of Simpson's Paradox.  Output  Slides  Data
  • XL3E: Toolpak: Regression MV5 Assess data.  Explain multivariate regression as standardizing.  DemoData
  • XL3F: Segmented OLS Linear Regression:  Slides (13)Math vs. Statistics NNN (2016 draft)
  • XL3G: Toolpak Regression: Key Bivariate OLS Relationships  Output  S1: SE of regression; 1-R^2.  S2: CI and PI
  • XL3M: Modelling using simple reasoning.  Summarizing/Modelling Production of CO2 gases by County.  XLSX

9. MODEL DATA USING EXCEL LOGISTIC REGRESSION

  • Logistic Regression using OLS1: Nudged probabilities.  Handles multiple X.  Accuracy better than linear OLS; less than logit MLE.
    • XL4A: OLS1A Male|Ht  DemoSlides (12)Data
    • XL4B: OLS1B  Male|Ht+Smoker  DemoSlides (12)Data
    • XL4C: OLS1C Male|Wt  DemoSlides (12)Data
    • XL4D: OLS1D  Male|Ht+Wt  DemoSlides (18)Data
    • XL4E: OLS1E Outcome | Predictor + Confounder.  Results in a Simpson's Reversal  OutputData
    • XL4F: Run logistic demo given underlying coefficients. Sheet 1C: Single continuous predictor. Sheet 2CB: One continuous and one binary.
    • XL4G: Run logistic demo given X50 and slope. Sheet 1C: single continuous predictor.  Sheet 2CB: One continuous and one binary
  • Logistic Regression:  MLE using Excel Solver
  • XL4M: Logistic Regression using MLE: Simpson's Paradox
  • XL4N: Discriminant Analysis  using MLE Logistic Regression:  Demo  Pulse 1Y2X.  Using OLS1D: Slides (18)
  • XL4P: Construct Model of World CO2 Carbon Emissions from Fuel Combustion  Data
  • XL4S: Logistic Regression using OLS2: Grouped data.  Chart Trendline 1Y1X  Data Analysis 1Y2X  Log-odds-Grouped Data OLS
  • XL4T Logistic Regression using OLS3 shortcut.  Output  [Slides  missing]  Input  [Interesting but can't handle multiple X]
  • XL4U: Logistic Regression: Compare MLE with OLS1.  Slides (16)DemoData (xlsx)  [Handles multiple X]
  • XL4V: Logistic Regression: Compare MLE with OLS3.  Slides (18)Data (xlsx)  [Interesting but can't handle multiple X]
  • R^2 and Partial Correlation as a function of Added Variable Correlations.  (1/22/2017)

10. DISCRETE DISTRIBUTIONS USING EXCEL

  • XL1P: Create Binomial, Hypergeometric and Poisson Distributions using Excel.  Discrete Probabilities: OverviewOutput (pdf)  Slides (10)Input
  • XL1Q: Explore Binomial, Hypergeometric and Poisson Distribution PDF and CDF for statistical significance.  #Excel 2013:  Slides  Demo Output (pdf)  PDF & CDF Calculator Input
  • XL1R: Calculator for three discrete distributions:  Excel file

11. CONTINUOUS DISTRIBUTIONS USING EXCEL:

  • Lognormal Distribution of HH by Income given Mean and Median of distribution:  XL5A: using Excel 2013:  Demo OutputSlides (6)Demo Input (xlsx)
  • Distribution of Total Income assuming HH by income are log-normal (Excel 2013)  Irfan (2014)Schield-Income-Inequality
  • XL5C: LogNormal Income:  XLSX by Percentiles  Protected. Input Mean and Median.
  • XL5C1: LogNormal Income Demo: Demo2Demo4Demo5Input  Schield (2014)
  • XL5D: Log-Normal Distribution of Subjects by Value: Percentages only. Spreadsheet
  • XL5D1: Log-Normal Distribution of Households by Assets (Excel 2013)  Spreadsheet  Slides
  • XL5D2: Log-Normal Distribution of  (Excel 2013)  Spreadsheet.  Slides
  • XL5D3: Log-Normal characteristics as a function of Mean-Median ratio: SpreadsheetPDF
  • XL5D9: Log-Normal Distribution of VIX (Excel 2013)  Spreadsheet.  Slides
  • XL5E: Generating Mean or Median of Log-Normal distribution given Standard Deviation.  PDF
  • XL5F: Generate Distribution of Losses given Lognormal Distributions of Frequency and Severity  [Old]  Demo  Slides  DataDemo-SS
  • XL5G: Run Demo to Generate Distribution of Losses given Lognormal Distributions of Frequency and Severity  Demo XLSXDemo PDF
  • XL5H: Distribution of Product of Two Random Variables: Normal and Log-Normal (Excel Simulation).  Output
  • XL5S: Normal Distribution and CDF (Excel 2013)  Spreadsheet.  Slides
  • XL5T: Folded-Normal Distribution and CDF
  • XL5U: Exponential Distribution and CDF
  • XL5V: Chi-squared Distribution and CDF

12. CREATE DISTRIBUTIONS EMPIRICALLY FROM RANDOM PROCESSES USING EXCEL  NNN 2014: Slides (pdf)  PPT

13. SAMPLE STATISTICS FOR PERFECTLY REPRESENTATIVE SAMPLES:

14. CREATE SAMPLING DISTRIBUTIONS USING EXCEL:

15. CREATE AND COMPARE CONFIDENCE INTERVALS USING EXCEL:

  • XL8A: Create Confidence Intervals using Excel 2013:  ResultsSlides (12)Data  Excel 2010:  DemoSlides (12)Data
  • XL8B: Compare overlap of Two Confidence Intervals using Excel 2010.  PDFxlsx
  • XL8C: Display Paired Confidence Intervals using Excel with summary statistics.  Template (xls)  Demo: 6up1up
  • XL8J Reference: Compare confidence functions in Excel 2010 vs. Excel 2003.  PDFxlsx
  • XL8K Reference: Compare Z and T Statistics using Excel 2003.  PDF

16. CREATE HYPOTHESIS TESTS USING EXCEL:

Isaacson data set (240 rows, 8 columns)

  • XL9A: Hypothesis test for a single Population: Z-Test function.  Excel 2008 demo:  Slides (12)6up1up
  • XL9B: Hypothesis test (two populations):  T-Test function.  Excel 2008  DemoSlides (24)6up1up
  • XL9C: Hypothesis test (two populations):  T-Test command.  Excel 2013 Slides (19)6up1up.  Excel 2003 6up1up

17. APPROXIMATE CUTOFFS FOR STATISTICAL SIGNIFICANCE: Sufficient but not necessary.

18. USING MICROSOFT WORD:

  • 20 Minute Quick Start using Word 2013 Styles.  Slides (pdf)  Schield, May 2014

MISCELLANEOUS

To print the SQL statements in a MS Access query, checkout Print Access Query SQL.

Alan Shepard's Sales Rank Express: www.salesrankexpress.com