#!/usr/bin/env python
# coding: utf-8

# # A Graduate Introduction to Probability and Statistics for Scientists and Engineers
#
# ## [Philip B. Stark](http://www.stat.berkeley.edu/~stark), Department of Statistics, University of California, Berkeley
#
# ## First offering: a 10-hour short course at the University of Tokyo, August 2015
#
# ## Software requirements
# + Jupyter: http://continuum.io/downloads and a Python 2 kernel for Jupyter; see https://ipython.org/install.html
#
# ## Supplemental Texts
# + Stark, P.B., 1997–2015. [_SticiGui: Statistical Tools for Internet and Classroom Instruction with a Graphical User Interface_](http://www.stat.berkeley.edu/~stark/SticiGui/index.htm).
# + Stark, P.B., 1990–2010. Lecture notes for Nonparametrics, [Statistics 240](https://www.stat.berkeley.edu/~stark/Teach/S240/Notes/index.htm).

# # Index
#
# **These notes are in draft form, with large gaps.**
# I'm happy to hear about any errors, and I hope eventually to fill in some of the missing pieces.
#
# 1. [Overview](overview.ipynb)
# 1. [Introduction to Jupyter and Python](jupyter.ipynb)
# 1. [Sets, Combinatorics, & Probability](prob.ipynb)
# 1. [Theories of Probability](probTheory.ipynb)
# 1. [Random Variables, Expectation, Random Vectors, and Stochastic Processes](rv.ipynb)
# 1. [Probability Inequalities](ineq.ipynb)
# 1. [Inference](inference.ipynb)
# 1. [Confidence Sets](conf.ipynb)

# # Rough Syllabus for Tokyo Short Course
#
# ## [Preamble: Introduction to Jupyter and Python](jupyter.ipynb)
# 1. The Jupyter notebook
#     + Cells, markdown, MathJax
# 1. Less Python than you need
#
# ## [Lecture 1: Probability](prob.ipynb)
# 1. What's the difference between Probability and Statistics?
# 1. Counting and combinatorics
#     + Sets: unions, intersections, partitions
#     + De Morgan's Laws
#     + The Inclusion-Exclusion principle
#     + The Fundamental Rule of Counting
#     + Combinations
#     + Permutations
#     + Strategies for counting
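# An illustrative aside (not part of the original outline): the Inclusion-Exclusion
# principle gives the number of *derangements* (permutations with no fixed point) of
# $n$ items as $D(n) = n! \sum_{k=0}^{n} (-1)^k/k!$; brute-force enumeration
# confirms the formula for small $n$.

```python
# Inclusion-exclusion count of derangements, checked against brute force.
from itertools import permutations
from math import factorial

def derangements_ie(n):
    """D(n) via inclusion-exclusion: n! * sum_{k=0}^{n} (-1)^k / k!."""
    return round(factorial(n) * sum((-1) ** k / factorial(k) for k in range(n + 1)))

def derangements_brute(n):
    """Enumerate all permutations of n items; keep those with no fixed point."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

for n in range(1, 8):
    assert derangements_ie(n) == derangements_brute(n)

print([derangements_ie(n) for n in range(1, 8)])  # → [0, 1, 2, 9, 44, 265, 1854]
```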
# 2. Axiomatic Probability
#     + Outcome space and events, events as sets
#     + Kolmogorov's axioms (finite and countable)
#     + Analogies between probability and area or mass
#     + Consequences of the axioms
#         - Probabilities of unions and intersections
#         - Bounds on probabilities
#         - Bonferroni's inequality
#         - The inclusion-exclusion rule for probabilities
#     + Conditional probability
#         - The Multiplication Rule
#         - Independence
#         - Bayes' Rule
#
# ## Lecture 2: Probability, continued
# 3. Theories of probability
#     + Equally likely outcomes
#     + Frequency Theory
#     + Subjective Theory
#     + Shortcomings of the theories
#     + Rates versus probabilities
#     + Measurement error
#     + Where does probability come from in physical problems?
#     + Making sense of geophysical probabilities
#         - Earthquake probabilities
#         - Probability of magnetic reversals
#         - Probability that Earth is more than 5B years old
# 4. Random variables
#     + Probability distributions of real-valued random variables
#     + Cumulative distribution functions
#     + Discrete random variables
#         - Probability mass functions
#         - The uniform distribution on a finite set
#         - Bernoulli random variables
#         - Random variables derived from the Bernoulli
#             * Binomial random variables
#             * Geometric
#             * Negative binomial
#         - Hypergeometric random variables
#         - Poisson random variables: countably infinite outcome spaces
#
# ## Lecture 3: Random variables, continued
# 5. Random variables, continued
#     + Continuous and "mixed" random variables
#     + Probability densities
#         - The uniform distribution on an interval
#         - The Gaussian distribution
#     + The CDF of discrete, continuous, and mixed distributions
#     + Distribution of measurement errors
#         - The box model for random error
#         - Systematic and stochastic error
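# A quick sketch (illustrative only; the box of error values is hypothetical): the
# box model treats each measurement error as a draw with replacement from a "box"
# of numbers. The empirical mean of many simulated draws is compared with the
# average of the box.

```python
# Box model for random measurement error: draws with replacement from a box.
import random
from math import sqrt

random.seed(12345)          # reproducible pseudo-random draws
box = [-2, -1, 0, 1, 2]     # hypothetical box of equally likely error values

avg_box = sum(box) / len(box)
sd_box = sqrt(sum((x - avg_box) ** 2 for x in box) / len(box))

draws = [random.choice(box) for _ in range(100_000)]
avg_draws = sum(draws) / len(draws)

print(avg_box, round(sd_box, 3))   # → 0.0 1.414 (SD of the box is sqrt(2))
print(round(avg_draws, 2))         # close to 0: the draws average out the box
```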
# 6. Independence of random variables
#     + Events derived from random variables
#     + Definitions of independence
#     + Independence and "informativeness"
#     + Examples of independent and dependent random variables
#     + IID random variables
#     + Exchangeability of random variables
# 7. Marginal distributions
# 8. Point processes
#     + Poisson processes
#         - Homogeneous and inhomogeneous Poisson processes
#         - Spatially heterogeneous, temporally homogeneous Poisson processes as a model for seismicity
#         - The conditional distribution of Poisson processes given N
#     + Marked point processes
#     + Inter-arrival times and inter-arrival distributions
#     + Branching processes
#         - ETAS
#
# ## Lecture 4: Expectation, Probability Inequalities, and Simulation
# 9. Expectation
#     + The Law of Large Numbers
#     + The Expected Value
#         - Expected value of a discrete univariate distribution
#             * Special cases: Bernoulli, Binomial, Geometric, Hypergeometric, Poisson
#         - Expected value of a continuous univariate distribution
#             * Special cases: uniform, exponential, normal
#         - Expected value of a multivariate distribution
#     + Standard Error and Variance
#         - Discrete examples
#         - Continuous examples
#         - The square-root law
#         - Standardization and Studentization
#         - The Central Limit Theorem
#     + The tail-sum formula for the expected value
#     + Conditional expectation
#         - The conditional expectation is a random variable
#         - The expectation of the conditional expectation is the unconditional expectation
#     + Useful probability inequalities
#         - Markov's Inequality
#         - Chebychev's Inequality
#         - Hoeffding's Inequality
#         - Jensen's inequality
# 10. Simulation
#     + Pseudo-random number generation
#         - Importance of the PRNG: period, DIEHARD
#     + Assumptions
#     + Uncertainties
#     + Sampling distributions
#
# ## Lecture 5: Testing
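# An illustrative check (uniform draws on [0, 1] are an assumption for the example,
# not from the original notes): Chebychev's inequality says
# $P(|X - E(X)| \ge k \cdot SD(X)) \le 1/k^2$. Comparing the bound with the
# empirical frequency for simulated draws shows how conservative it can be.

```python
# Empirical check of Chebychev's inequality for 100,000 uniform draws.
import random
from math import sqrt

random.seed(2015)                               # reproducible draws
n = 100_000
draws = [random.random() for _ in range(n)]     # uniform on [0, 1]
mu, sd = 0.5, sqrt(1 / 12)                      # exact mean and SD of U[0, 1]

bounds = {}
for k in (1.5, 2, 3):
    freq = sum(1 for x in draws if abs(x - mu) >= k * sd) / n
    bounds[k] = (freq, 1 / k ** 2)
    print(k, round(freq, 4), '<=', round(1 / k ** 2, 4))
```

# For k = 2 and k = 3 the empirical frequency is exactly 0 here, since 2 SD of a
# uniform variable already exceeds its half-range; the bound still holds.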
# 11. Hypothesis tests
#     + Null and alternative hypotheses, "omnibus" hypotheses
#     + Type I and Type II errors
#     + Significance level and power
#     + Approximate, exact, and conservative tests
#     + Families of tests
#     + P-values
#         - Estimating P-values by simulation
#     + Test statistics
#         - Selecting a test statistic
#         - The null distribution of a test statistic
#         - One-sided and two-sided tests
#     + Null hypotheses involving actual, hypothetical, and counterfactual randomness
#     + Multiplicity
#         - Per-comparison error rate (PCER)
#         - Familywise error rate (FWER)
#         - The False Discovery Rate (FDR)
#
# ## Lecture 6: Tests and Confidence Sets
# 12. Tests, continued
#     + Parametric and nonparametric tests
#         - The Kolmogorov-Smirnov test and the MDKW inequality
#         - Example: testing for uniformity
#         - Conditional test for Poisson behavior
#     + Permutation and randomization tests
#         - Invariances of distributions
#         - Exchangeability
#         - The permutation distribution of test statistics
#         - Approximating permutation distributions by simulation
#         - The two-sample problem
#     + Testing when there are nuisance parameters
# 13. Confidence sets
#     + Definition
#     + Interpretation
#     + Duality between hypothesis tests and confidence sets
#     + Tests and confidence sets for Binomial p
#     + Pivoting
#         - Confidence sets for a normal mean
#             * known variance
#             * unknown variance; Student's t distribution
#     + Approximate confidence intervals using the normal approximation
#         - Empirical coverage
#         - Failures
#     + Nonparametric confidence bounds for the mean of a nonnegative population
#     + Multiplicity
#         - Simultaneous coverage
#         - Selective coverage

# # Rough Syllabus for the complete 45-hour course
#
# ---
# ### Descriptive Statistics
#
# 1. Summarizing data
#     1. Types of data: categorical, ordinal, quantitative
#     1. Univariate data
#         1. Measures of location and spread: mean, median, mode, quantiles, inter-quartile range, range, standard deviation, RMS
#         1. Markov's and Chebychev's inequalities for quantitative lists
#         1. Ranks and ordinal categorical data
#         1. Frequency tables and histograms
#         1. Bar charts
#     1. Multivariate data
#         1. Scatterplots
#         1. Measures of association: Pearson and Spearman correlation coefficients
#         1. Linear regression
#             1. The Least Squares principle
#             1. The Projection Theorem
#             1. The Normal Equations
#             1. Numerical solution of the normal equations
#                 1. Numerical linear algebra is not the same as abstract linear algebra
#                 1. Condition number
#                 1. Do not invert matrices to solve linear systems: use backsubstitution or factorization
#             1. Errors in regression: RMS error of linear regression
#         1. Least Absolute Value regression
#         1. Principal components and approximation by subspaces: another application of the Projection Theorem
#         1. Clustering
#             1. Distance functions
#             1. Hierarchical methods, tree-based methods
#             1. Centroid methods: K-means
#             1. Density-based clustering: kernel methods, DBSCAN
#
# ---
# ### Probability
#
# 1. Counting and combinatorics
#     1. Sets: unions, intersections, partitions
#     1. De Morgan's Laws
#     1. The Inclusion-Exclusion principle
#     1. The Fundamental Rule of Counting
#     1. Combinations. Application (using the Inclusion-Exclusion Principle): counting derangements
#     1. Permutations
#     1. Strategies for complex counting problems
#
# 1. Theories of probability
#     1. Equally likely outcomes
#     1. Frequency Theory
#     1. Subjective Theory
#     1. Shortcomings of the theories
#
# 1. Axiomatic Probability
#     1. Outcome space and events, events as sets
#     1. Kolmogorov's axioms (finite and countable)
#     1. Analogies between probability and area or mass
#     1. Consequences of the axioms
#         1. Probabilities of unions and intersections
#         1. Bounds on probabilities
#         1. Bonferroni's inequality
#         1. The inclusion-exclusion rule for probabilities
#     1. Conditional probability
#         1. The Multiplication Rule
#         1. Independence
#         1. Bayes' Rule
#
# 1. Random variables
#     1. Probability distributions
#     1. Cumulative distribution functions for real-valued random variables
#     1. Discrete random variables
#         1. Probability mass functions
#         1. The uniform distribution on a finite set
#         1. Bernoulli random variables
#         1. Random variables derived from the Bernoulli
#             1. Binomial random variables
#             1. Geometric
#             1. Negative binomial
#         1. Poisson random variables: countably infinite outcome spaces
#         1. Hypergeometric random variables
#         1. Examples of other discrete random variables
#     1. Continuous and "mixed" random variables
#         1. Probability densities
#             1. The uniform distribution on an interval
#             1. The exponential distribution and double-exponential distributions
#             1. The Gaussian distribution
#         1. The CDF of discrete, continuous, and mixed distributions
#         1. Survival functions and hazard functions
#     1. Counting processes
#     1. Joint distributions of collections of random variables, random vectors
#         1. The multivariate uniform distribution
#         1. The multivariate normal distribution
#     1. Independence of random variables
#         1. Events derived from random variables
#         1. Definitions of independence
#     1. Marginal distributions
#     1. Conditional distributions
#         1. The "memoryless property" of the exponential distribution
#     1. The Central Limit Theorem
#     1. Stochastic processes
#         1. Point processes
#             1. Intensity functions and conditional intensity functions
#             1. Poisson processes
#                 1. Homogeneous and inhomogeneous Poisson processes
#                 1. The conditional distribution of Poisson processes given N
#             1. Marked point processes
#             1. Inter-arrival times and inter-arrival distributions
#             1. The conditional distribution of a Poisson process
#         1. Random walks
#         1. Markov chains
#         1. Brownian motion
#
# 1. Expectation
#     1. The Law of Large Numbers
#     1. The Expected Value
#         1. Expected value of a discrete univariate distribution
#             1. Special cases: Bernoulli, Binomial, Geometric, Hypergeometric, Poisson
#         1. Expected value of a continuous univariate distribution
#             1. Special cases: uniform, exponential, normal
#             1. (Aside: measurability, Lebesgue integration, and the CDF as a measure)
#         1. Expected value of a multivariate distribution
#         1. Expected values of functions of a random variable
#             1. Change-of-variables formulas for probability mass functions and densities
#     1. Standard Error and Variance
#         1. Discrete examples
#         1. Continuous examples
#         1. The square-root law
#     1. The tail-sum formula for the expected value
#     1. Conditional expectation
#         1. The expectation of the conditional expectation is the unconditional expectation
#     1. Useful probability inequalities
#         1. Markov's Inequality
#         1. Chebychev's Inequality
#         1. Hoeffding's Inequality
#
# ---
# ### Sampling
#
# 1. Empirical distributions
#     1. The ECDF for univariate distributions
#     1. The Kolmogorov-Smirnov statistic and the Massart-Dvoretzky-Kiefer-Wolfowitz inequality
#     1. Inference: inverting the MDKW inequality
#     1. Q-Q plots
#
# 1. Random sampling
#     1. Types of samples
#         1. Samples of convenience
#         1. Quota sampling
#         1. Systematic sampling
#         1. The importance of random sampling: stirring the soup
#         1. Systematic random sampling
#         1. Random sampling with replacement
#         1. Simple random sampling
#         1. Stratified random sampling
#         1. Cluster sampling
#         1. Multistage sampling
#         1. Weighted random samples
#             1. Sampling with probability proportional to size
#     1. Sampling frames
#     1. Nonresponse and missing data
#     1. Sampling bias
#
# 1. Simulation
#     1. Pseudo-random number generators
#         1. Why the PRNG matters
#         1. Uniformity, period, independence
#         1. Assessing PRNGs: DIEHARD and other tests
#         1. Linear congruential PRNGs, including the Wichmann-Hill. Group-induced patterns
#         1. Statistically "adequate" PRNGs, including the Mersenne Twister
#         1. Cryptographic-quality PRNGs, including cryptographic hashes
#     1. Generating pseudorandom permutations
#     1. Taking pseudorandom samples
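# A short sketch (the population of 1, ..., 100 is hypothetical; not from the
# original notes): draw many simple random samples with `random.sample` and look
# at the simulated sampling distribution of the sample mean, which centers on the
# population mean.

```python
# Simulating the sampling distribution of the sample mean under SRS.
import random

random.seed(7)
population = list(range(1, 101))            # hypothetical population: 1, ..., 100
pop_mean = sum(population) / len(population)

n, reps = 25, 10_000
means = [sum(random.sample(population, n)) / n for _ in range(reps)]
avg_of_means = sum(means) / reps

print(pop_mean)                  # → 50.5
print(round(avg_of_means, 1))    # close to 50.5: the sample mean is unbiased
```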
#     1. Simulating sampling distributions
#
# ---
# ### Estimation and Inference
#
# 1. Estimating parameters using random samples
#     1. Sampling distributions
#         1. The Central Limit Theorem
#     1. Measures of accuracy: mean squared error, median absolute deviation, etc.
#     1. Maximum likelihood
#     1. Loss functions, Risk, and decision theory
#         1. Minimax estimates
#         1. Bayes estimates
#     1. The Bootstrap
#     1. Shrinkage and regularization
#
# 1. Inference
#     1. Hypothesis tests
#         1. Null and alternative hypotheses, "omnibus" hypotheses
#         1. Type I and Type II errors
#         1. Significance level and power
#         1. Approximate, exact, and conservative tests
#         1. Families of tests
#         1. P-values
#             1. Estimating P-values by simulation
#         1. Test statistics
#             1. Selecting a test statistic
#             1. The null distribution of a test statistic
#             1. One-sided and two-sided tests
#         1. Null hypotheses involving actual, hypothetical, and counterfactual randomness
#         1. Multiplicity
#             1. Per-comparison error rate
#             1. Familywise error rate
#             1. The False Discovery Rate
#     1. Approaches to testing
#         1. Parametric and nonparametric tests
#         1. Likelihood ratio tests
#         1. Permutation and randomization tests
#             1. Invariances of distributions
#                 1. Exchangeability
#                 1. Other symmetries
#             1. The permutation distribution of test statistics
#             1. Approximating permutation distributions by simulation
#     1. Confidence sets
#         1. Duality between hypothesis tests and confidence sets
#     1. Conditional tests, conditional and unconditional significance levels
#
# 1. Tests of particular hypotheses
#     1. The Neyman model of a randomized experiment
#         1. Strong and weak null hypotheses
#         1. Testing the strong null hypothesis
#             1. The distribution of a test statistic under the strong null
#         1. "Interference"
#         1. Blocking and other designs
#         1. Ensuring that the null hypothesis matches the experiment
#     1. Tests for Binomial p
#     1. The Sign test
#         1. The sign test for the median; tests for other quantiles
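# An illustration (the data are hypothetical; not from the original notes): the
# sign test for H0: median = m counts the data points above m; ignoring ties, that
# count is Binomial(n, 1/2) under the null, so the one-sided P-value is a binomial
# upper-tail probability.

```python
# Exact sign test for a hypothesized median, via the Binomial(n, 1/2) null.
from math import comb

def sign_test_p(data, m):
    """One-sided P-value for H0: median = m vs H1: median > m, discarding ties."""
    kept = [x for x in data if x != m]
    n = len(kept)
    s = sum(1 for x in kept if x > m)              # number of values above m
    # P(Binomial(n, 1/2) >= s)
    return sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n

data = [3.1, 4.2, 5.0, 6.3, 2.8, 7.1, 5.9, 6.6]    # hypothetical measurements
print(round(sign_test_p(data, 3.0), 4))            # → 0.0352, i.e. 9/256
```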
#         1. The sign test for a difference in medians
#     1. Tests based on the normal approximation
#         1. The Z statistic and the Z test
#         1. The t statistic and the t test
#         1. Two-sample problems, paired and unpaired tests
#     1. Tests based on ranks
#         1. The Wilcoxon test
#         1. The Wilcoxon signed rank test
#     1. Tests using actual values
#     1. Tests of association
#         1. The hypothesis of exchangeability
#         1. The Spearman test
#         1. The permutation distribution of the Pearson correlation
#     1. Tests of randomness and independence
#         1. The runs test
#     1. Tests of symmetry
#         1. Tests of exchangeability
#         1. Tests of spherical symmetry
#     1. The two-sample problem
#         1. Selecting the test statistic: what's the alternative?
#             1. Mean, sum, Student t
#             1. Smirnov statistic
#             1. Other choices
#         1. The permutation distribution of the test statistic
#     1. The two-sample problem for complex data
#         1. Test statistics
#     1. The k-sample problem
#     1. Stratified permutation tests
#     1. Fisher's Exact Test
#     1. Tests of homogeneity and ANOVA
#         1. The F statistic
#         1. The permutation distribution of the F statistic
#         1. Other statistics
#         1. Ordered alternatives
#     1. Tests based on the distribution function: the Kolmogorov-Smirnov Test
#         1. The universality of the null distribution for continuous variables
#         1. Using the K-S test to test for Poisson behavior
#     1. Sequential tests and Wald's SPRT
#         1. Random walks and Gambler's ruin
#         1. Wald's Theorem
#
# 1. Confidence intervals for particular parameters
#     1. Confidence intervals for a shift in the Neyman model
#     1. Confidence intervals for Binomial p
#         1. Application: confidence bounds for P-values estimated by simulation
#         1. Application: intervals for quantiles by inverting binomial tests
#     1. Confidence intervals for a Normal mean using the Z and t distributions
#     1. Confidence intervals for the mean
#         1. Nonparametric confidence bounds for a population mean
#             1. The need for a priori bounds
#             1. Nonnegative random variables
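# A minimal sketch (the grid search and the example of 7 successes in 20 trials are
# hypothetical choices; not from the original notes): a conservative confidence
# interval for Binomial p can be found by inverting exact binomial tests — keep
# every p0 that an equal-tailed exact test does not reject at level alpha.

```python
# Confidence interval for Binomial p by inverting exact equal-tailed tests.
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))

def invert_binom_test(x, n, alpha=0.05, grid=2000):
    """Grid-search the values p0 not rejected by an equal-tailed exact test."""
    kept = [p0 for p0 in (i / grid for i in range(grid + 1))
            if binom_cdf(x, n, p0) > alpha / 2
            and 1 - binom_cdf(x - 1, n, p0) > alpha / 2]
    return min(kept), max(kept)

lo, hi = invert_binom_test(7, 20)      # observed 7 successes in 20 trials
print(round(lo, 3), round(hi, 3))      # brackets the observed proportion 0.35
```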
#             1. Bounded random variables
#     1. Confidence sets for multivariate parameters
#
# 1. Density estimation
#     1. Histogram estimates
#     1. Kernel estimates
#     1. Confidence bounds for monotone and shape-restricted densities
#     1. Lower confidence bounds on the number of modes
#
# 1. Function estimation
#     1. Splines and penalized splines
#         1. Polynomial splines
#         1. Periodic splines
#         1. Smoothing splines as least-squares
#         1. B-splines
#         1. L1 splines
#     1. Constraints
#         1. Balls and ellipsoids
#         1. Smoothness and norms
#             1. Lipschitz conditions
#             1. Sobolev conditions
#         1. Cones
#             1. Nonnegativity
#             1. Shape restrictions
#                 1. Monotonicity
#                 1. Convexity
#                 1. Star-shaped constraints
#         1. Sparsity and minimum L1 methods
#
# ---
# ### *Sketchy from here down*
#
# ### Experiments
#
# 1. Experiments versus observational studies
#     1. Controls and the Method of Comparison
#     1. Randomization
#     1. Blinding
#
# 1. Experimental design
#     1. Blocking
#     1. Orthogonal designs
#     1. Latin hypercube design

# In[1]:

# Version information
get_ipython().run_line_magic('load_ext', 'version_information')
get_ipython().run_line_magic('version_information', 'scipy, numpy, pandas, matplotlib')

# In[ ]: