BOYI HU

Title: Functional Regression Models
Date: Friday, August 25th, 2023
Time: 10:00AM
Location: Hybrid (LIB 2020 / Zoom)
Supervised by: Dr. Jiguo Cao

Abstract: The conventional method for functional quantile regression is to fit the regression model separately for each quantile of interest. The slope function of the regression, although a bivariate function indexed by time and quantile, is then estimated as a univariate function of time only, with the quantile held fixed. This estimation strategy has two major limitations: the monotonicity of the conditional quantiles cannot be guaranteed, and the smoothness of the slope estimator as a bivariate function cannot be controlled. We develop a new framework for functional quantile regression to overcome these two limitations. We propose to fit the functional quantile regression model for multiple quantiles simultaneously, under constraints that make the estimated quantiles satisfy the monotonicity conditions, while also controlling the smoothness of the slope estimator.
Motivated by an application that models the impact of daily temperature, annual precipitation and irrigation system on soybean yield, we propose two locally sparse estimation methods under a semiparametric functional quantile regression model. In the target application, daily temperature is a functional predictor, and its influence on soybean yield may not be present throughout the whole growing season. We aim to identify the time regions where the influence exists. For this purpose, in two projects, we use two different penalized estimation methods, functional SCAD and a modified group lasso, to obtain locally sparse estimates of the bivariate slope function associated with the functional predictor.
Focusing on the same soybean yield application, we further propose a novel semiparametric functional generalized linear model (FGLM) to analyze the relationship between the environmental factors and the soybean yield. In this project, we treat the data from different years as coming from different populations, because climate conditions can differ greatly from year to year. Under this assumption, the main challenge is that only a limited number of observations is available for each year. To address this, we combine a density ratio model with the proposed semiparametric FGLM so that the new framework can be fitted using the pooled data. We propose to fit the model with a combination of penalized B-splines and the empirical likelihood method. The proposed method is highly flexible and robust to model misspecification.
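To illustrate the monotonicity issue raised in the abstract, the sketch below shows the standard quantile (check) loss and a simple test for whether a set of fitted quantile curves cross. It is only an illustration, not the method proposed in the thesis: the function names `check_loss` and `is_monotone_in_tau` are hypothetical, and the fitted values are assumed to be arranged with one row per quantile level, in increasing order of the level.

```python
import numpy as np


def check_loss(residual, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    # Weight is tau for non-negative residuals and tau - 1 for negative ones.
    weight = np.where(residual < 0, tau - 1.0, tau)
    return residual * weight


def is_monotone_in_tau(fitted):
    """Check that fitted quantiles do not cross.

    `fitted` is assumed to have shape (n_quantiles, n_obs), with rows
    ordered by increasing quantile level. Returns True when every
    observation's fitted quantiles are non-decreasing in the level.
    """
    fitted = np.asarray(fitted, dtype=float)
    return bool(np.all(np.diff(fitted, axis=0) >= 0))
```

When each quantile level is fitted separately, nothing forces a check such as `is_monotone_in_tau` to pass; fitting the levels jointly under constraints, as the abstract describes, is one way to enforce it.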