Parameter Estimation and Evaluation
Introduction

● Machine learning, which has arisen from the availability of Big data, shares common ground with statistical inference, particularly in using a sample to make inferences about a population. Like any method of statistical inference, machine learning may suffer from sample bias.
● As an algorithm-based approach, machine learning is much more general and flexible than statistical parametric modelling, including in how it determines the set of important explanatory variables.
● Statistical nonparametric modelling can provide meaningful interpretations for some important machine learning algorithms, such as decision trees and artificial neural networks.
● The synthesis of machine learning and statistical inference is expected to open several new directions for statistical sciences.

Big Data, Machine Learning and Statistics    Introduction to Statistics and Econometrics    July 8, 2020    6/70
CONTENTS

10.1 Introduction
10.2 Empirical Studies and Statistical Inference
10.3 Important Features of Big Data
10.4 Big Data Analysis and Statistics
10.5 Machine Learning and Statistics
10.6 Conclusion
Empirical Studies and Statistical Inference

● The basic idea of statistical inference is to assume that the system under study is a stochastic process governed by some probability law, and that the data observed in practice are realizations of this underlying system, which is then called a data generating process (DGP).
● The main objective of statistical analysis is to use the observed data to make inferences about the probability law of the DGP, and then to use that law in applications such as explaining important empirical stylized facts, testing theories and hypotheses, forecasting future trends and changes, and evaluating programs and policies.
● In statistical modelling and inference, it is usually assumed that the probability law of the DGP can be adequately characterized by a unique mathematical model that links the dependent variable to a small set of explanatory variables, or covariates.
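The DGP idea above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the probability law is normal with mean `mu` and standard deviation `sigma`; the function names `simulate_dgp` and `estimate_law` and all parameter values are hypothetical and do not come from the slides.

```python
import random
import statistics

def simulate_dgp(n, mu=2.0, sigma=1.5, seed=0):
    """Draw n realizations from a hypothetical DGP: X_i ~ i.i.d. N(mu, sigma^2)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

def estimate_law(sample):
    """Estimate the parameters of the assumed normal law from observed data."""
    return statistics.fmean(sample), statistics.stdev(sample)

# The analyst sees only `data`, never the true (mu, sigma).
data = simulate_dgp(n=5000)
mu_hat, sigma_hat = estimate_law(data)
print(f"estimated mean = {mu_hat:.3f}, estimated std. dev. = {sigma_hat:.3f}")
```

With a large sample the estimates should lie close to the values used in the simulation, which is the sense in which observed data let us infer the probability law of the DGP.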
● Often the mathematical model is assumed to have a known functional form that is subject to a low-dimensional vector of unknown parameters. The main objective of statistical inference is then to use the observed data to estimate the unknown model parameters and to conduct hypothesis tests about them.
● A popular procedure in empirical studies is to use a prespecified significance level (say 5%), or equivalently to compare a P-value with that level, to judge whether an estimated parameter is statistically significant. If it is, the associated explanatory variable is considered an important factor and is retained in the model. A statistically significant variable that is not included in the model is called an omitted variable.
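The significance-testing procedure described above might be sketched as follows. This is an illustration under stated assumptions, not the slides' own method: a single-regressor linear model fitted by ordinary least squares, a large-sample normal approximation for the P-value, and simulated data. The function `ols_slope_test` and the variable names are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def ols_slope_test(x, y, alpha=0.05):
    """Fit y = a + b*x by OLS and test H0: b = 0 with a large-sample z-test."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)   # residual variance estimate
    se_b = (s2 / sxx) ** 0.5                     # standard error of the slope
    z = b / se_b
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided, normal approximation
    return b, se_b, p_value, p_value < alpha

# Simulated data (assumption): y depends on x_relevant only.
rng = random.Random(42)
n = 500
x_relevant = [rng.gauss(0, 1) for _ in range(n)]
x_irrelevant = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * xr + rng.gauss(0, 1) for xr in x_relevant]

for name, x in [("x_relevant", x_relevant), ("x_irrelevant", x_irrelevant)]:
    b, se_b, p, keep = ols_slope_test(x, y)
    print(f"{name}: slope={b:.3f}, se={se_b:.3f}, p={p:.4f}, retain={keep}")
```

By construction, a test at the 5% level will flag a truly irrelevant variable as significant in roughly 5% of samples, so the retain/drop rule is a convention rather than a guarantee.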
● Commonly used examples of standard models include:
  - classical linear regression models;
  - probit or logit models for discrete choices;
  - Cox's (1972) proportional hazards model in survival or duration analysis.
● The important inputs, the recorded data, are often observational in nature; that is, they are not produced by controlled experiments. This is usually the case in the social sciences and economics. Such observed data typically have moderate sample sizes.
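For reference, the three standard models listed above are commonly written in the following textbook forms. The notation is an assumption introduced here and does not appear in the slides: $Y_i$ is the dependent variable, $X_i$ the vector of covariates, $\beta$ the low-dimensional unknown parameter vector, $\Phi$ the standard normal distribution function, and $\lambda_0(t)$ an unspecified baseline hazard.

```latex
% Classical linear regression
Y_i = X_i'\beta + \varepsilon_i, \qquad E(\varepsilon_i \mid X_i) = 0

% Probit and logit models for a binary choice Y_i \in \{0, 1\}
P(Y_i = 1 \mid X_i) = F(X_i'\beta), \qquad
F(u) = \Phi(u) \ \text{(probit)}, \quad
F(u) = \frac{1}{1 + e^{-u}} \ \text{(logit)}

% Cox proportional hazards model for duration T_i
\lambda(t \mid X_i) = \lambda_0(t)\, \exp(X_i'\beta)
```

In each case the functional form is taken as known and only $\beta$ (together with $\lambda_0$ in the Cox model) is left to be estimated from the data.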