ECTS credits: 5
ECTS hours (Rules/Memories): Student's work: 85; Tutorials: 5; Expository classes: 20; Interactive classes: 15; Total: 125
Languages of instruction: Spanish, Galician
Type: Ordinary subject Master’s Degree RD 1393/2007 - 822/2021
Departments: Statistics, Mathematical Analysis and Optimisation
Areas: Statistics and Operations Research
Center: Faculty of Mathematics
Call: Second Semester
Teaching: With teaching
Enrolment: Enrollable | 1st year (Yes)
The objective of this subject is for students to become familiar with nonparametric and semi-parametric regression techniques, with special emphasis on their practical application. These objectives are specified in the following learning outcomes:
- To know the main nonparametric and semi-parametric techniques for estimating the regression function.
- To know how to choose the appropriate nonparametric or semi-parametric regression model to analyze dependence in complex data from real situations.
- To be aware of the limitations of nonparametric techniques in the analysis of real situations with a high number of variables.
- To become autonomous in data analysis in multidisciplinary applied environments using nonparametric and semi-parametric techniques.
Chapter 1. Introduction to kernel estimation
Kernel density estimation. Smoothing parameter selection. Modifications of the kernel density estimator. Estimation of the sparsity. Kernel distribution estimation. Estimation of the density derivatives. Exploratory data analysis based on kernel estimation.
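As an illustration of the first two topics of this chapter, here is a minimal sketch of a kernel density estimator with a normal-reference bandwidth. It is written in plain Python for self-containment (the course itself works in R); the Gaussian kernel, the function names and the small data set are assumptions made for this example, not course material.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, sample, h):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5).

    Reasonable only when the data are roughly Gaussian; plug-in or
    cross-validation selectors are the usual alternatives.
    """
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

# Illustrative data (hypothetical, not from the course).
sample = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
h = rule_of_thumb_bandwidth(sample)

# Riemann sum of the estimate over a wide grid; it should be close to 1.
step = 0.01
grid = [-6.0 + step * i for i in range(1201)]
area = sum(kde(x, sample, h) for x in grid) * step
```

The final Riemann sum is a quick sanity check that the estimate integrates to approximately one, as a density should.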
Chapter 2. Kernel regression
The regressogram. Nadaraya-Watson estimator. Local polynomial estimation. Smoothing parameter selection: plug-in and cross-validation. Estimation of the regression derivatives. Nonparametric estimation of the multivariate regression. Local likelihood: nonparametric logistic regression.
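The Nadaraya-Watson estimator and cross-validation bandwidth selection can be sketched as follows. This is a plain-Python illustration under assumed choices (Gaussian kernel, leave-one-out criterion, a small grid of candidate bandwidths, synthetic data); none of these specifics come from the course material.

```python
import math

def gaussian_weight(u):
    # The normalising constant of the Gaussian kernel cancels in the ratio below.
    return math.exp(-0.5 * u * u)

def nadaraya_watson(x0, xs, ys, h):
    """Locally weighted average of ys with kernel weights centred at x0."""
    weights = [gaussian_weight((x0 - xi) / h) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def loo_cv_score(xs, ys, h):
    """Leave-one-out cross-validation: mean squared prediction error."""
    n = len(xs)
    err = 0.0
    for i in range(n):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        err += (ys[i] - nadaraya_watson(xs[i], x_rest, y_rest, h)) ** 2
    return err / n

# Synthetic data: a smooth trend plus a deterministic wiggle standing in for noise.
xs = [i / 10 for i in range(21)]
ys = [math.sin(2 * x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]

# Hypothetical candidate bandwidths; pick the one minimising the CV score.
candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
best_h = min(candidates, key=lambda h: loo_cv_score(xs, ys, h))
fit_at_one = nadaraya_watson(1.0, xs, ys, best_h)
```

Because the estimator is a weighted average with non-negative weights, its value always lies between the smallest and largest observed responses.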
Chapter 3. Nearest-neighbors regression
k-Nearest neighbors estimation. Variants of the nearest-neighbors estimator. Loess: a comparison with ordinary smoothing.
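A minimal sketch of the basic k-nearest-neighbors regression estimator, in plain Python with a one-dimensional covariate and unweighted averaging. The function name and the toy data are assumptions made for illustration only.

```python
def knn_regression(x0, xs, ys, k):
    """Average the responses of the k sample points closest to x0.

    Ties in distance are broken by the order of the points in the sample.
    """
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    nearest = order[:k]
    return sum(ys[i] for i in nearest) / k

# Toy data (hypothetical): y = x^2 observed without noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]

one_neighbor = knn_regression(2.1, xs, ys, k=1)    # uses only x = 2
three_neighbors = knn_regression(2.1, xs, ys, k=3) # uses x = 2, 3, 1
all_neighbors = knn_regression(2.1, xs, ys, k=5)   # global mean of ys
```

Here k plays the role of the smoothing parameter: k = 1 interpolates the nearest observation, while k equal to the sample size returns the global mean.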
Chapter 4. Spline regression
Penalized least squares. Cubic splines. Interpolation and smoothing splines. Spline regression: B-splines basis. Wavelet regression, Fourier basis and other estimations through basis representations. Penalized regression.
Chapter 5. Partially linear and additive models
Partially linear and additive models. Estimation algorithms: backfitting and iterated least squares. Model testing: likelihood ratio with approximate degrees of freedom. Nonparametric interactions.
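The backfitting idea for an additive model y = alpha + f1(x1) + f2(x2) + error can be sketched as below. This is a simplified plain-Python illustration, assuming a Nadaraya-Watson smoother with a fixed bandwidth, a fixed number of iterations and a synthetic balanced design; it is not the course's implementation.

```python
import math

def nw_smooth(x, r, h):
    """Nadaraya-Watson smooth of r against x, evaluated at the sample points."""
    fitted = []
    for x0 in x:
        w = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in x]
        fitted.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return fitted

def centre(values):
    """Subtract the mean so each component is identifiable."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def backfit(x1, x2, y, h=0.3, n_iter=20):
    """Cycle over the components, smoothing the partial residuals of each."""
    n = len(y)
    alpha = sum(y) / n
    f1 = [0.0] * n
    f2 = [0.0] * n
    for _ in range(n_iter):
        r1 = [y[i] - alpha - f2[i] for i in range(n)]
        f1 = centre(nw_smooth(x1, r1, h))
        r2 = [y[i] - alpha - f1[i] for i in range(n)]
        f2 = centre(nw_smooth(x2, r2, h))
    return alpha, f1, f2

# Synthetic balanced design on a 7 x 7 grid (hypothetical data).
grid = [-1 + i / 3 for i in range(7)]
x1 = [a for a in grid for b in grid]
x2 = [b for a in grid for b in grid]
y = [2 + a * a - b for a in grid for b in grid]

alpha, f1, f2 = backfit(x1, x2, y)
n = len(y)
rss = sum((y[i] - alpha - f1[i] - f2[i]) ** 2 for i in range(n))
tss = sum((yi - sum(y) / n) ** 2 for yi in y)
```

Centring each component after every update keeps the intercept and the component functions separately identifiable, which is why alpha stays equal to the sample mean of y.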
Chapter 6. Generalized additive models
Generalized additive models. Estimation algorithms and tests on generalized additive models.
Chapter 7. Single-index models
Single-index models. The identification problem. Estimation algorithms and tests.
BASIC BIBLIOGRAPHY
[1] Bowman, A.W. & Azzalini, A. (1997). Applied smoothing techniques for data analysis. Oxford University Press.
[2] Härdle, W. (1990). Applied nonparametric regression. Econometric society monographs, Cambridge University Press.
[3] Härdle, W., Müller, M., Sperlich, S. & Werwatz, A. (2004). Nonparametric and Semiparametric Models. Springer.
[4] Wand, M.P. & Jones, M.C. (1995). Kernel Smoothing. Chapman and Hall.
[5] Wasserman, L. (2005). All of Nonparametric Statistics. Springer.
[6] Wood, S.N. (2006). Generalized Additive Models. An Introduction with R. Chapman and Hall.
COMPLEMENTARY BIBLIOGRAPHY
[1] Fan, J. & Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall.
[2] Green, P.J. & Silverman, B.W. (1994). Nonparametric regression and generalized linear models: A roughness penalty approach. Chapman and Hall.
[3] Hastie, T. & Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall.
[4] Scott, D.W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley and Sons.
[5] Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall.
[6] Simonoff, J.S. (1996). Smoothing Methods in Statistics. Springer.
[7] Wahba, G. (1990). Spline Models for Observational Data. Society for Industrial and Applied Mathematics.
This course develops the basic, general and transversal competences set out in the degree report of the MSc. The specific competences to be promoted in this area are indicated below:
E1 - Know, identify, model, study and solve complex statistical and operational research problems, in a scientific, technological or professional context, arising from real applications.
E2 - Develop autonomy for the practical resolution of complex problems arising in real applications and for the interpretation of results in order to assist decision-making.
E3 - Acquire advanced knowledge of the theoretical foundations underlying the different methodologies of statistics and operational research, which allow their specialized professional development.
E4 - Acquire the necessary skills in the theoretical-practical management of probability theory and random variables that allow their professional development in the scientific / academic, technological or specialized and multidisciplinary professional field.
E5 - Deepen the knowledge in the specialized theoretical-practical foundations of modeling and study of different types of dependency relationships between statistical variables.
E6 - Acquire advanced theoretical-practical knowledge of different mathematical techniques, specifically oriented to assist decision-making, and develop reflective capacity to evaluate and decide between different perspectives in complex contexts.
E8 - Acquire advanced theoretical-practical knowledge of techniques for making inferences and contrasts related to variables and parameters of a statistical model, and know how to apply them with sufficient autonomy in a scientific, technological or professional context.
E9 - Know, and know how to apply autonomously in scientific, technological or professional contexts, machine learning techniques and techniques for the analysis of large data sets (big data).
E10 - Acquire advanced knowledge on methodologies for obtaining and processing data from different sources, such as surveys, the internet, or "cloud" environments.
The teaching methodology will consist of lectures and interactive classes, together with tutoring of the students' learning and of the assignments set for them. Course follow-up material will be provided, as well as guidance material for learning the software. During the lectures and interactive classes, examples will be solved using the R software, so students need to have a computer.
The on-campus activity, together with the personal work students need to prepare for it, is valued at three ECTS credits. This workload includes the final exam. One and a half hours of personal work is considered sufficient to prepare each theoretical-practical session. The other two ECTS credits of the subject correspond to practical work that the students will have to do throughout the course.
Below is an approximation of the hours that will be devoted to each topic.
CHAPTER 1. INTRODUCTION (4h expository, 5h laboratory)
CHAPTER 2. KERNEL REGRESSION (2h expository, 3h laboratory)
CHAPTER 3. KNN (2h expository, 2h laboratory)
CHAPTER 4. SPLINES (2h expository, 3h laboratory)
CHAPTER 5. PLM AND ADDITIVE MODELS (2h expository, 3h laboratory)
CHAPTER 6. GAM (1h expository, 2h laboratory)
CHAPTER 7. SINGLE-INDEX (2h expository, 2h laboratory)
Continuous assessment (50%): continuous assessment will be based on practical cases solved by the students. For these problems, students will use R and write the corresponding reports. In some cases there will also be oral presentations of the work done. In-class exercises, assessed through evaluation forms, may also be included. The continuous assessment grade is kept for both opportunities (ordinary and extraordinary) within the call of each academic year. The activities proposed throughout the course will be used to assess the level of acquisition of the basic and general skills CB6-CB10 and CG1-CG5. The level reached in the transversal skills CT1-CT5 and the specific skills E2, E6, E9 and E10 will also be evaluated.
Final test (50%): the final test will consist of several theoretical-practical questions about the contents of the subject, which may include the interpretation of results obtained with the program used in interactive teaching. The exam will assess the acquisition of specific skills E1, E3, E4, E5 and E8.
Presentation to the evaluation: a student is considered to have presented for evaluation when he or she takes part in any evaluation activity, either continuous assessment or the exam. The weight of the continuous assessment in the extraordinary opportunity (July tests) will be the same as in the ordinary evaluation. In the second assessment opportunity, an exam will be taken and the final grade will be the maximum of three quantities: the grade of the ordinary evaluation, the grade of the new exam, and the weighted average of the new exam and the continuous assessment.
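The second-opportunity grading rule can be written out as a short calculation. The sketch below is plain Python and assumes the 50%/50% weights stated for the continuous assessment and the final test; the function name and the example grades are hypothetical.

```python
def second_opportunity_grade(ordinary_grade, new_exam, continuous, w_continuous=0.5):
    """Best of: ordinary grade, new exam alone, weighted average of new exam and continuous assessment."""
    weighted = w_continuous * continuous + (1 - w_continuous) * new_exam
    return max(ordinary_grade, new_exam, weighted)

# Hypothetical grades on a 0-10 scale.
grade_a = second_opportunity_grade(ordinary_grade=4.0, new_exam=6.0, continuous=8.0)
grade_b = second_opportunity_grade(ordinary_grade=4.0, new_exam=7.0, continuous=3.0)
```

In the first case the weighted average (7.0) is the largest quantity; in the second, the new exam alone (7.0) exceeds both the ordinary grade and the weighted average (5.0).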
It should be noted that in cases of fraudulent performance of exercises or tests, the provisions of the “Regulations for evaluating student academic performance and reviewing grades” will apply.
Each ECTS credit corresponds to 7 hours of lectures and interactive classes. It is estimated that students will need one hour to prepare the material for each classroom hour before the class itself. Afterwards, a further hour and a half will be needed to fully understand the content, including exercises and other tasks. This gives a total of 24.5 hours per ECTS credit.
To pass the course, it is advisable to attend the sessions, and it is essential to keep up daily with the work done in them. Students are also encouraged to use the R statistical software to explore the possibilities of the various nonparametric techniques explained throughout the course. In addition, to learn the subject well, it helps to keep in mind the practical meaning of the methods introduced throughout the course.
If the teaching is carried out on-campus, the lecturers will inform the students about the learning objectives to be achieved and the contents to be worked on during the corresponding week. If the teaching is carried out partially or totally on-line, a “weekly working plan” will be prepared and provided to the students at the beginning of each week. This plan will give a series of guidelines to help students in their autonomous learning, and will specify the contents to work on and the recommended activities.
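The per-credit workload estimate follows from a short calculation, sketched here in plain Python. The constant names are chosen for this illustration only.

```python
CLASS_HOURS_PER_CREDIT = 7          # lecture/interactive hours per ECTS credit
PREP_HOURS_PER_CLASS_HOUR = 1.0     # preparation before each classroom hour
REVIEW_HOURS_PER_CLASS_HOUR = 1.5   # study and exercises after each classroom hour

# Each classroom hour carries 1 + 1.0 + 1.5 = 3.5 hours of total work.
hours_per_credit = CLASS_HOURS_PER_CREDIT * (
    1 + PREP_HOURS_PER_CLASS_HOUR + REVIEW_HOURS_PER_CLASS_HOUR
)
```

That is, 7 classroom hours, 7 hours of preparation and 10.5 hours of subsequent study add up to the stated 24.5 hours.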
The development of the contents of the subject will be carried out taking into account that the competences to be acquired by the students must meet the MECES3 level. The contents included in this course are novel and highly specialized. Work will be done on the correct formulation of models, the construction of estimators and the validation and analysis of the different proposals studied.
In the case of fraudulent exercises or tests, the provisions of the respective regulations of the universities participating in the Master in Statistical Techniques will apply.
This guide and the criteria and methodologies described in it are subject to the modifications derived from the regulations and guidelines of the universities participating in the Master in Statistical Techniques.
Rosa María Crujeiras Casais
Coordinator
- Department: Statistics, Mathematical Analysis and Optimisation
- Area: Statistics and Operations Research
- Phone: 881813212
- E-mail: rosa.crujeiras [at] usc.es
- Category: Professor: University Professor
Jose Ameijeiras Alonso
- Department: Statistics, Mathematical Analysis and Optimisation
- Area: Statistics and Operations Research
- Phone: 881813165
- E-mail: jose.ameijeiras [at] usc.es
- Category: Professor: LOU (Organic Law for Universities) PhD Assistant Professor