QUESTION -
Now, consider the reduced model with just MA_NOX_total (call this model M2). Create a 1 by 3 plotting window with a plot of the standardized residuals for M2, a Normal QQ plot of the residuals for M2 (with confidence bands), and a plot of the sample ACF of the residuals for M2. Do the regression assumptions for M2 appear to be reasonable based on these plots? Explain your answer.
CONDITION -
Air pollution is a major problem in many parts of the world. In
particular, extreme levels of surface-level ozone (O3) are highly
problematic. In this exam, you will attempt to model the annual
maximum daily O3 value at a location in Massachusetts based on a
number of other explanatory variables. These data are for the years
1996-2020 and are given in the R dataframe MA_df that is saved in
the file MA_air_pollution_DF.RData (available in Canvas). The
variables are as follows.
Response Variable
MA_O3: Annual maximum daily O3 value at this location
Explanatory Variables
MA_NOX_total: Total NOx emitted in MA
MA_VOC_total: Total volatile organic compound (VOC) emitted in MA
MA_temp: Average summer temp in MA
MA_rh: Average summer relative humidity in MA
MA_dswrf: Average summer downward shortwave radiative flux in MA
MA_apcp: Average summer precipitation in MA
MA_lftx: Average summer lifted index in MA
MA_hpbl: Average summer height of the planetary boundary layer in MA
MA_SO2_total: Total SO2 emitted in MA
MA_CO_total: Total CO emitted in MA
You will mostly be asked to analyze these data using linear regression models with Normal errors. Importantly, this may not be an appropriate way to analyze annual maxima data. Extreme value theory offers models and approaches specifically designed to analyze data sets like this!
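One way to produce the three diagnostic plots the question asks for is sketched below in base R. It is a minimal sketch under stated assumptions, not the course's reference solution. The helper name plot_m2_diagnostics and its arguments n_sim and level are hypothetical. The rows of MA_df are assumed to be in year order (1996-2020), so the observation index stands in for time. The confidence bands on the QQ plot are a pointwise envelope simulated from standard Normal samples, which is one common choice but may differ from the bands your course software draws.

```r
# Hypothetical helper (not part of the exam): fits M2 and draws the three
# diagnostic plots in a 1-by-3 plotting window.
plot_m2_diagnostics <- function(df, n_sim = 2000, level = 0.95) {
  # M2: reduced model with MA_NOX_total as the only explanatory variable
  M2 <- lm(MA_O3 ~ MA_NOX_total, data = df)
  r  <- rstandard(M2)                  # internally standardized residuals
  n  <- length(r)

  old_par <- par(mfrow = c(1, 3))      # 1-by-3 plotting window
  on.exit(par(old_par))                # restore the previous layout on exit

  # Panel 1: standardized residuals in observation order
  # (assumes rows are sorted by year)
  plot(r, type = "b", xlab = "Observation index",
       ylab = "Standardized residual", main = "Standardized residuals (M2)")
  abline(h = 0, lty = 2)

  # Panel 2: Normal QQ plot with a pointwise simulated envelope
  qqnorm(r, main = "Normal QQ plot (M2)")
  abline(0, 1)                         # reference line for N(0, 1) residuals
  sims  <- replicate(n_sim, sort(rnorm(n)))   # n x n_sim matrix of order statistics
  alpha <- 1 - level
  band  <- apply(sims, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))
  theo  <- qnorm(ppoints(n))           # same theoretical quantiles qqnorm uses
  lines(theo, band[1, ], lty = 2)
  lines(theo, band[2, ], lty = 2)

  # Panel 3: sample ACF of the standardized residuals
  acf(r, main = "Sample ACF of residuals (M2)")

  invisible(M2)
}

# Usage, assuming the .RData file is in the working directory and
# loading it creates the data frame MA_df described above.
if (file.exists("MA_air_pollution_DF.RData")) {
  load("MA_air_pollution_DF.RData")
  M2 <- plot_m2_diagnostics(MA_df)
}
```

When reading the plots, the regression assumptions look reasonable if the standardized residuals scatter around zero with roughly constant spread and no trend, the QQ points stay inside the dashed envelope, and the ACF spikes at nonzero lags stay within the dashed bounds. With only 25 annual observations, these checks have limited power to detect departures.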