Downscaling Satellite Observations of Dust with Deep Learning
by
Lichen Du
A Thesis Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(BIOSTATISTICS)
May 2020
Copyright 2020 Lichen Du
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
INTRODUCTION
MATERIALS
Study Region
Data
METHOD
Feature Engineering
Train-test Split
Standardization and Counter Standardization
Evaluation
Ridge Regression
Neural Network
RESULTS
Summary of Satellite Observations of Dust
Model Comparison
Downscaled Visualizations
DISCUSSION
REFERENCES
LIST OF FIGURES
Figure 1 Middle East Region
Figure 2 Feature Engineering
Figure 3 Model Building and Downscaling
Figure 4 Structure of the Neural Network
Figure 5 Scatterplot of Dust_7 and Dust_50 from 2005-2006
Figure 6 Downscaling visualization in summer dataset from 06/2005-08/2005
Figure 7 Downscaling visualization in winter dataset from 12/2005-02/2006
LIST OF TABLES
Table 1 Descriptive statistics of seasonal dust data
Table 2 Performance of Ridge Models for seasonal downscaling
Table 3 Performance of Neural Networks for seasonal downscaling
ABSTRACT
Dust particles play an important role in both climate change and human health. Accurate estimation
of the spatiotemporal variability of dust is therefore important for improving our understanding of
its role in climate change and its adverse health effects. The Modern-Era Retrospective analysis for
Research and Applications, version 2 (MERRA-2) is a global atmospheric reanalysis produced by
the NASA Global Modeling and Assimilation Office (GMAO). MERRA-2 has a coarse horizontal
resolution of 50 km. The Goddard Earth Observing System, version 5 (GEOS-5) Nature Run
(G5NR) is a special run of MERRA-2 providing a 2-year global non-hydrostatic mesoscale
simulation for the period June 2005 through May 2007 with a 7 km horizontal resolution. Both
GEOS-5 and G5NR provide data on dust aerosols. In this study we use a deep learning method to
downscale dust aerosols from 50 km resolution to 7 km resolution based on both MERRA-2 and
G5NR. Neural network models and ridge models were developed for each season in each year to
examine seasonal variability. Random sampling and leave-one-day-out sampling (holding out all
Thursdays as the testing set and training on the remaining days) were implemented to validate the
models. The R² values of the models on the summer dataset were much larger than those of the
models for the other seasons. Additionally, the neural network models outperformed the ridge
models in downscaling dust extinction from a 50 km resolution to a 7 km resolution. However,
overfitting appeared when the neural network models were validated using leave-one-day-out
sampling.
INTRODUCTION
Particulate matter (PM) arises in the atmosphere from a wide variety of sources. Both the size
and the chemical composition of PM vary widely in relation to the nature of the source and its
transport history. Particles with an aerodynamic diameter of less than 2.5 µm (PM2.5) are
defined as fine particles, and those with an aerodynamic diameter between 2.5 and 10 µm are
defined as coarse particles (PM10–2.5). Some particles can attract water vapor and grow to form
small droplets under humid conditions. The term 'aerosol' is often used to describe solid
particles and droplets suspended in air (UK Air Pollution Information System, 2019).
Dust is typically characterized by coarse or large, non-spherical particles (Kahn and Gaitley,
2015) and is projected to have a significant impact on a changing climate, particularly in arid
regions such as the Middle East (Kok et al., 2018). Middle East desert dust events can raise PM10
concentrations above 5,000 µg/m3 (for reference, the U.S. National Ambient Air Quality Standard
for 24-hour PM10 is 150 µg/m3) (Shahsavani et al., 2012). The indirect effects of dust particles on
climate include the modification of cloud properties (Rosenfeld et al., 2001), and as droughts
become more common, dust levels are expected to increase.
Dust also affects health, but the health burden of dust differs from that of anthropogenic
PM emissions. Dust storms can intensify local pollution and can also have direct health effects.
However, ground-level dust measurements are not routinely conducted, so researchers typically
rely on examining differential health effects on dust storm and non-dust storm days. For example,
on dust storm days a 3.28% (95% CI 2.42, 4.15%) increase in daily mortality was observed per
10 µg/m3 increase in PM10 in Ahvaz, Iran. In comparison, on non-dust days there was a
non-statistically significant 1.03% (95% CI -0.02, 2.08%) increase in daily mortality
(Shahsavani et al., 2020). Another study found that cardiovascular hospital admissions
increased by 10.4% in Nicosia, Cyprus on Middle Eastern dust storm days (Middleton et al., 2008).
Reliable estimates of dust are important for understanding its role in climate change
(Voiland, 2010) and for facilitating additional health effects studies. Given the paucity of dust
measurements, particularly in regions of the world where routine PM monitoring networks are
sparse, satellite data coupled with historical records and deterministic models are a viable
alternative. The Modern-Era Retrospective analysis for Research and Applications, version 2
(MERRA-2) is a global atmospheric reanalysis produced by the NASA Global Modeling and
Assimilation Office (GMAO) (Gelaro et al., 2017). It incorporates observations from satellite
instruments and records of the global atmosphere from 1980 to the present. It also incorporates
additional aspects of the climate system including cryospheric processes, improved land surface
representation, and trace gas constituents (MERRA-2, 2014). The data are provided on a grid of
½° latitude by ⅝° longitude with 72 model levels at hourly temporal resolution, which corresponds
to a relatively coarse horizontal resolution of about 50 km (Gelaro et al., 2017). In addition to
standard meteorological parameters (wind, surface pressure, moisture, temperature), MERRA-2
also simulates 15 aerosol tracers (including dust, sea salt, and black and organic carbon), ozone
(O3), carbon monoxide (CO), and carbon dioxide (CO2).
The Goddard Earth Observing System, version 5 (GEOS-5) Nature Run (G5NR, Ganymed
Release) is a special run of MERRA-2: a 2-year global, non-hydrostatic mesoscale simulation
covering June 2005 through May 2007 at 7 km horizontal resolution (GEOS-5 Nature Run, 2014).
Essentially, it provides a finer-resolution representation of MERRA-2’s global aerosol
concentrations and climate (Gelaro et al., 2015). Dust extinction aerosol optical thickness (AOT)
is one of the 15 aerosol components (Randles et al., 2017) and is the target variable for this
study. It is a unitless measure of dust aerosol loading in the column from the surface to the top
of the atmosphere and mostly ranges between 0 and 1.
In this study we developed a deep learning neural network to downscale dust extinction
from 50 km resolution to 7 km resolution based on the overlapping period of MERRA-2 and G5NR.
We included elevation as an auxiliary variable to aid prediction and compared the neural
networks with ridge regression models. By leveraging the overlapping data in this region over
the 2-year period, our trained and tested deep learning model can be applied to downscale the
full MERRA-2 record from 50 km to 7 km, which will facilitate health effects studies of dust
in the Middle East region (Figure 1), which is highly impacted by dust storms.
Figure 1 Middle East Region
MATERIALS
Study Region
The study region encompasses the Middle East from Northeastern Africa to Afghanistan
(Longitude 25.69 to 74.88 degrees East; latitude 10.94 to 42.06 degrees North).
Data
Both the MERRA-2 (50 km resolution, 2000-2018) and the G5NR (7 km resolution, 2005-2007)
data were obtained from the Global Modeling and Assimilation Office (GMAO) at the NASA
Goddard Space Flight Center (GSFC, 2018) as NetCDF-4 files.
There were 32,284,980 observations in the MERRA-2 dataset. At 50 km resolution, the grid
included 78 pixels in longitude (east-west) and 63 pixels in latitude (north-south). A total of
6,570 days were observed in this dataset, from 2000-05-16 to 2018-05-16. MERRA-2 is provided
on a grid of ½° latitude by ⅝° longitude with 72 model levels at hourly temporal resolution,
corresponding to a 50 km horizontal resolution. The variable “DUEXTTAU” represented the value
of dust extinction AOT. For this analysis daily data were derived from hourly observations.
There were 287,044,760 observations in the G5NR dataset. At 7 km resolution, the grid
included 788 pixels in longitude (east-west) and 499 pixels in latitude (north-south). A total of
730 days were observed in this dataset, from 2005-05-16 to 2007-05-16. Again, the variable
“DUEXTTAU” represented the value of dust extinction AOT. Daily data were derived from hourly
observations.
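As a quick consistency check, the observation counts reported above equal the number of grid pixels multiplied by the number of days. The short sketch below reproduces that arithmetic; the function name is hypothetical and the inputs are the grid sizes and day counts stated in the text.

```python
def n_observations(n_lon: int, n_lat: int, n_days: int) -> int:
    """Number of daily observations: one value per grid pixel per day."""
    return n_lon * n_lat * n_days

# Grid dimensions and day counts as stated in the Data section.
merra2 = n_observations(n_lon=78, n_lat=63, n_days=6570)
g5nr = n_observations(n_lon=788, n_lat=499, n_days=730)

print(merra2)  # 32284980
print(g5nr)    # 287044760
```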
ETOPO1 is a one-arcminute global relief model of the Earth's surface that integrates
land topography and ocean bathymetry, developed by the National Geophysical Data Center
(NGDC), an office of the National Oceanic and Atmospheric Administration (NOAA) (Amante
and Eakins, 2009). Built from global and regional datasets, it is available in "Ice Surface"
(top of the Antarctic and Greenland ice sheets) and "Bedrock" (base of the ice sheets) versions.
The ETOPO1 Global Relief Model has been used to calculate the volumes of the world's oceans
and to derive a hypsographic curve of the Earth's surface (NOAA, ETOPO1 Global Relief Model).
The variable “elevation” provides land-surface elevation. The resolution of the elevation data
(2/125° latitude by 2/125° longitude, or approximately 2 km) is much higher than that of the
MERRA-2 and G5NR datasets. The elevation variable used in our analysis has no temporal
dimension.
METHOD
Feature Engineering
The G5NR grid is nested within the MERRA-2 grid, so the longitude and latitude of the
G5NR data were first rounded to the resolution of MERRA-2, which allowed the MERRA-2 and
G5NR datasets to be merged. The longitude and latitude of the elevation data were then rounded
to the resolution of the G5NR, and elevation was merged into the “MERRA-2+G5NR” dataset.
The resulting dataset had five numeric variables (Latitude_7, Longitude_7, Dust_50, Dust_7,
Elevation) and one date variable (Date_7).
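The coordinate-rounding step can be illustrated with a small sketch. This is not the code used in the thesis: the function names, the grid origins, and the toy values are assumptions made for illustration, and only the grid spacing (½° latitude by ⅝° longitude) comes from the Data section.

```python
def snap_to_grid(value: float, origin: float, step: float) -> float:
    """Return the coarse-grid coordinate nearest to `value`."""
    return origin + round((value - origin) / step) * step

# MERRA-2 grid spacing as described in the Data section.
LAT_STEP, LON_STEP = 0.5, 0.625
# Hypothetical grid origins, chosen only for this illustration.
LAT_ORIGIN, LON_ORIGIN = 0.0, 0.0

def merge_key(lat_7: float, lon_7: float) -> tuple:
    """Key of the nearest 50 km grid point for a 7 km pixel."""
    return (snap_to_grid(lat_7, LAT_ORIGIN, LAT_STEP),
            snap_to_grid(lon_7, LON_ORIGIN, LON_STEP))

# Toy 50 km field (one cell) and two 7 km pixels that fall near that cell.
dust_50 = {(30.0, 45.0): 0.21}
pixels_7 = [(30.06, 45.06), (29.94, 44.94)]

# Each 7 km pixel inherits the Dust_50 value of its matching coarse cell.
merged = [(lat, lon, dust_50[merge_key(lat, lon)]) for lat, lon in pixels_7]
```

With real data the same key would typically be used as the join column when merging the two tables.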
Hourly data were averaged to daily data. The variables “Year”, “Season”, and “dayofweek”
were derived from the variable “Date_7”. “Season” was coded into four categories: “Spring”
(March, April, May), “Summer” (June, July, August), “Fall” (September, October, November) and
“Winter” (December, January, February). One-hot encoding was then applied in Python to convert
all categorical variables (“Year”, “Season” and “dayofweek”) into dummy variables. The
finalized dataset had five numeric variables and 13 binary variables.
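The count of 13 binary variables follows from 2 year categories, 4 seasons, and 7 days of the week. A minimal sketch of the recoding is given below using only the standard library; the helper names and the year labels are hypothetical, since the thesis does not state how “Year” was labeled.

```python
from datetime import date

SEASONS = ["Spring", "Summer", "Fall", "Winter"]
# Hypothetical labels for the two study years (assumption, not from the thesis).
YEARS = ["2005-2006", "2006-2007"]

def season_of(d: date) -> str:
    """Map a date to its season using the month groupings given in the text."""
    if d.month in (3, 4, 5):
        return "Spring"
    if d.month in (6, 7, 8):
        return "Summer"
    if d.month in (9, 10, 11):
        return "Fall"
    return "Winter"  # December, January, February

def one_hot(value, categories) -> list:
    """Return a 0/1 indicator list with a 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]

def calendar_dummies(d: date, study_year: str) -> list:
    """Concatenate year, season, and day-of-week dummies (2 + 4 + 7 = 13)."""
    return (one_hot(study_year, YEARS)
            + one_hot(season_of(d), SEASONS)
            + one_hot(d.weekday(), range(7)))  # Monday=0 ... Sunday=6

row = calendar_dummies(date(2005, 7, 14), "2005-2006")
print(len(row))  # 13
```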
Figure 2 Feature Engineering
Figure 3 Model Building and Downscaling
Train-test Split
Two sampling methods were used to split the dataset into training and testing sets. The first
method was random sampling: 15% of the observations were sampled as the testing set and the rest
served as the training set. Because the testing set was randomly sampled from the whole dataset,
the training and testing sets were similar. The second sampling method was used to assess
downscaling performance on a nearly new dataset. The data from all Thursdays were selected as
the testing set, and the remaining days served as the training set. The proportion of data in this
testing set is 14.3% (1/7), close to the proportion used for random sampling. This testing set
can be regarded as an independent testing set.
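The two splitting schemes can be sketched as follows. This is an illustrative sketch rather than the thesis code: the function names, the random seed, and the toy rows (one row per day, whereas the real data has one row per pixel per day) are assumptions.

```python
import random
from datetime import date, timedelta

def random_split(rows, test_fraction=0.15, seed=0):
    """Randomly hold out `test_fraction` of the rows as the testing set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def thursday_split(rows):
    """Hold out every Thursday (weekday() == 3) as the testing set."""
    train = [r for r in rows if r["date"].weekday() != 3]
    test = [r for r in rows if r["date"].weekday() == 3]
    return train, test

# Toy data: 52 whole weeks, one row per day.
start = date(2005, 6, 1)
rows = [{"date": start + timedelta(days=i)} for i in range(364)]

train_r, test_r = random_split(rows)
train_t, test_t = thursday_split(rows)
print(len(test_t) / len(rows))  # 1/7 of the days are Thursdays
```

Because the Thursday split holds out entire days, no pixel from a test day is ever seen during training, which is what makes it a stricter check than random sampling.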
Standardization and Counter Standardization
Numeric variables were standardized before model building by subtracting the mean and
dividing by the standard deviation. Both the training set and the testing set were standardized
using the mean and standard deviation of the training set. After the downscaled estimates of the
target variable were obtained for the testing set, they were counter-standardized to the original
scale by multiplying by the standard deviation and adding the mean of the training set, which
makes the comparisons more meaningful and realistic.
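A minimal NumPy sketch of standardization and counter-standardization is shown below. The function names and toy values are hypothetical, and the population standard deviation (NumPy's default) is an assumption; the key point is that the mean and standard deviation come from the training set only.

```python
import numpy as np

def fit_scaler(train: np.ndarray):
    """Mean and standard deviation estimated from the training set only."""
    return train.mean(axis=0), train.std(axis=0)

def standardize(x: np.ndarray, mean, std) -> np.ndarray:
    """Subtract the training mean and divide by the training standard deviation."""
    return (x - mean) / std

def counter_standardize(z: np.ndarray, mean, std) -> np.ndarray:
    """Map standardized values back to the original scale."""
    return z * std + mean

# Toy target values (unitless dust extinction AOT).
train_y = np.array([0.10, 0.20, 0.30, 0.40])
test_y = np.array([0.25, 0.50])

mean, std = fit_scaler(train_y)
z_test = standardize(test_y, mean, std)             # testing set uses training statistics
restored = counter_standardize(z_test, mean, std)   # back to the original scale
```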
Evaluation
Three measures, computed from the counter-standardized estimates and the original values of
the target variable in the testing set, were used to evaluate the performance of all models: R²,
the mean squared error (MSE), and the root mean squared error (RMSE). R² typically ranges from
0 to 1, and a larger R² means that more of the variance is explained by the model; thus, the
larger the R², the better the model. MSE and RMSE are related measures of the deviation of the
downscaled values from the true values. Smaller MSE and RMSE indicate better predictions.
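The three measures can be written out directly. The sketch below uses the standard definitions (R² as one minus the ratio of residual to total sum of squares); the function names and toy values are illustrative only.

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error between true and downscaled values."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error, in the same units as the target."""
    return float(np.sqrt(mse(y_true, y_pred)))

def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy example on the original (counter-standardized) scale.
y_true = np.array([0.10, 0.20, 0.30, 0.40])
y_pred = np.array([0.15, 0.15, 0.35, 0.35])

scores = {"R2": r2(y_true, y_pred),
          "MSE": mse(y_true, y_pred),
          "RMSE": rmse(y_true, y_pred)}
```

Note that with this definition R² can fall below 0 on a testing set when predictions are worse than simply predicting the mean.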
Ridge Regression
Ridge regression is a linear regression method with a penalty on the size of the
coefficients (L2 regularization). An L2 regularization parameter α is added to the model to
control the amount of shrinkage: the larger the value of α, the greater the shrinkage, and the
more robust the coefficients are to collinearity. The scikit-learn library (v0.21.3) in Python
(v3.7) was used to perform the ridge regression. The α was set to 1, and all other
hyperparameters were left at the defaults of the Ridge class of scikit-learn.
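To make the role of α concrete, the sketch below implements the closed-form ridge estimator in NumPy, with an unpenalized intercept handled by centering. It is an illustration of the technique under these assumptions, not the scikit-learn implementation used in the thesis; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def ridge_fit(X: np.ndarray, y: np.ndarray, alpha: float = 1.0):
    """Closed-form ridge: minimize ||y - Xw - b||^2 + alpha * ||w||^2."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean            # centering leaves the intercept unpenalized
    n_features = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic standardized features and target (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.2, 0.0]) + 0.1 + rng.normal(scale=0.05, size=200)

coef_small, _ = ridge_fit(X, y, alpha=1.0)      # alpha = 1, as in the thesis
coef_large, _ = ridge_fit(X, y, alpha=1000.0)   # much stronger shrinkage
```

Comparing the two fits shows the shrinkage effect described above: the coefficient vector for the larger α has a smaller norm.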
Neural Network
Neural networks are a set of algorithms, modeled loosely after the human brain, that are
designed to recognize patterns. They interpret input data through a kind of machine
perception, labeling or clustering raw input. Deep learning neural networks are built from a
set of layers that progressively extract different levels of features from the input. In
this project, a five-layer neural network model was implemented to perform the downscaling.
The “keras” package was used to build the neural network model in this study. The
optimizer was “Adam”, an algorithm that uses adaptive learning rate methods to find individual
learning rates for each parameter. It combines the advantages of Adagrad, which works well in
settings with sparse gradients but struggles in the non-convex optimization of neural networks,
and RMSprop, which addresses that problem and works well in on-line settings (Adam, 2018).
Early stopping was also added to the model to prevent overfitting. The batch size was set
to 256 and the number of epochs to 30. For early stopping, patience was set to 5, min_delta to
0.0001, and the monitored quantity to “val_loss”. Other details of the neural network can be
found in Figure 4.
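The early-stopping rule can be sketched in plain Python. This is a simplified illustration of the idea (stop once the validation loss has failed to improve by at least min_delta for `patience` consecutive epochs), not the Keras callback itself; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-4):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    wait = 0  # consecutive epochs without sufficient improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, wait = loss, 0   # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch       # stop early
    return len(val_losses)         # ran through every epoch

# Validation loss plateaus at 0.34 from epoch 4 onward.
plateau = [0.50, 0.40, 0.35, 0.34, 0.34, 0.34, 0.34, 0.34, 0.34, 0.33]
stop_at = early_stopping_epoch(plateau)

# A steadily improving loss runs for the full 30 epochs.
improving = [1.0 / (i + 1) for i in range(30)]
full_run = early_stopping_epoch(improving)
```

In the first case training stops at epoch 9, five epochs after the last sufficient improvement, so the late improvement at epoch 10 is never reached.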
Figure 4 Structure of the Neural Network
RESULTS
Summary of Satellite Observations of Dust
Descriptive statistics of observed dust extinction by year (2005-2006, 2006-2007) and season
(Spring, Summer, Fall and Winter) are shown in Table 1. Mean values of Dust_50 did not differ
much between the first year (2005-2006) and the second year (2006-2007), while the second-year
Dust_7 means were larger than those of the first year, especially in the spring and summer.
Moreover, Dust_7 had a larger maximum and a smaller minimum than Dust_50, which is expected
since Dust_50 has a coarser grid and extreme values are likely smoothed out.
The means of dust extinction in the summer were much higher than those in the rest of
the year (spring second, fall third and winter last). The ranking of the Dust_7 standard
deviations across seasons followed the same pattern as the means. For Dust_50, however, the
winter had smaller means but larger SDs than the fall, indicating higher variability and
therefore potential difficulty in prediction.
Table 1 Descriptive statistics of seasonal dust data
Scatterplots of Dust_7 versus Dust_50 from 2005 to 2006 (Figure 5) indicate that the
relationship between the two is weak and shows substantial variability.
Figure 5 Scatterplot of Dust_7 and Dust_50 from 2005-2006
Model Comparison
Performance statistics of the ridge models are shown in Table 2. For both sampling
methods, the summer dataset had the largest R², whereas the winter dataset had the smallest,
which showed that the linear relationship between Dust_7 and Dust_50 was stronger in the summer
dataset than in the winter dataset. The model performances based on random sampling and subset
sampling did not differ greatly. The R² values of the training set and the testing set were very
close and were both under 40%, indicating that there was no strong linear relationship between
the target variable and the features, which was consistent with the scatterplot. The R² values
also suggested that the models had high bias and low variance. Although the R² values of the
training set and testing set were very close, the testing R² exceeded the training R² in some
models based on leave-one-day-out sampling (first year spring, first year winter and first year
whole). One possible reason is that the Thursday data contained few extreme values and were
therefore easy to predict. Training on the whole dataset averaged the performance of all models
(R² over 33%), which came very close to the performance of the best seasonal model.
Overall, the ridge models did not perform very well in downscaling. The R² was below 40%,
meaning that less than 40% of the variance could be explained by the model.
Table 2 Performance of Ridge Models for seasonal downscaling
Performance statistics for the neural network models are shown in Table 3. Unlike the
ridge regression results, the models based on the spring dataset showed the worst performance.
For random sampling, the training R² and testing R² were very close. All testing R² values were
larger than 40% except for the models based on the spring dataset (around 39%). The R² of the
summer model was the largest, at more than 55%. This indicated that the models had acceptable
bias and low variance. However, for leave-one-day-out sampling, overfitting occurred in every
model. The model based on the second-year winter dataset showed the worst overfitting, with a
training R² (41.6%) more than twice the testing R² (19.9%). Although the first-year spring
dataset had a smaller training R² (38.9%) than the winter dataset (46.1%), the testing R² of
spring (32.5%) was larger than that of winter (28.9%). The second-year spring dataset had a
training R² (40.23%) similar to that of the winter dataset (41.6%), while the testing R² of
spring (26.37%) was again larger than that of winter (19.96%). The models based on the winter
dataset showed the strongest tendency to overfit and performed worst. Training on the whole
dataset averaged the performance of the seasonal models but gave no improvement.
Overall, the neural network models performed much better than the ridge models when
random sampling was used. However, the neural networks showed overfitting on an
independent new dataset (the Thursday dataset) when subset sampling was used.
Table 3 Performance of Neural Networks for seasonal downscaling
Downscaled Visualizations
The spatial performance of the deep learning models was visualized separately for the spring,
summer, fall and winter of the first year (from May 2005 to April 2006). All plots were based on
the average values of the original Dust_50, the original Dust_7 and the downscaled Dust_7 from
the testing set. Results for summer and winter are shown in Figure 6 and Figure 7. In the plots,
the high-density regions of average Dust_50 and average Dust_7 were very similar, so it was
reasonable to downscale from Dust_50 to Dust_7. The high-density areas of Dust_7 and downscaled
Dust_7 were close for random sampling but slightly different for subset sampling. The ranges of
Dust_7 and downscaled Dust_7 differed only slightly in the summer dataset but differed greatly
in the winter dataset, indicating that the models performed better on the summer dataset than on
the winter dataset. All conclusions were consistent with the results in Table 3.
Figure 6 Downscaling visualization in summer dataset from 06/2005-08/2005
Figure 7 Downscaling visualization in winter dataset from 12/2005-02/2006
DISCUSSION
In this study, dust extinction showed seasonal variability, and its mean was much higher
in the summer than in the other seasons. There may be a relationship between dust extinction
and temperature, which could be analyzed further. Model performance also differed by season,
with the summer models performing best. Training on the whole dataset averaged the performance
of the seasonal models, but also hid seasonal patterns of the target variable.
Sampling methods also made a significant difference in this analysis. The models with
leave-one-day-out sampling performed worse than the models with random sampling, because in
random sampling the testing set and training set came from the same distribution, while the
other method effectively created an independent dataset. The sampling method had a large effect
on the neural network models, but only minor effects on the ridge regression models in this
project. Overfitting occurred when developing deep learning models with leave-one-day-out
sampling: the deep learning models did not predict the downscaled data on Thursdays well.
Parameter tuning and cross-validation of the neural network could be applied in further research.
REFERENCES
1. UK Air Pollution Information System (2019). Dusts. Retrieved from
http://www.apis.ac.uk/overview/pollutants/overview_particles.htm
2. Kahn, R. A., & Gaitley, B. J. (2015). An analysis of global aerosol type as retrieved by
MISR. Journal of Geophysical Research: Atmospheres, 120(9), 4248–4281.
https://doi.org/10.1002/2015JD023322
3. Kok, J. F., Ward, D. S., Mahowald, N. M., & Evan, A. T. (2018). Global and regional
importance of the direct dust-climate feedback. Nature Communications, 9(1).
https://doi.org/10.1038/s41467-017-02620-y
4. Shahsavani, A., Naddafi, K., Jafarzade Haghighifard, N., Mesdaghinia, A., Yunesian, M.,
Nabizadeh, R., Arahami, M., Sowlat, M. H., Yarahmadi, M., Saki, H., Alimohamadi, M., Nazmara,
S., Motevalian, S. A., & Goudarzi, G. (2012). The evaluation of PM10, PM2.5, and PM1
concentrations during the Middle Eastern Dust (MED) events in Ahvaz, Iran, from April through
September 2010. Journal of Arid Environments, 77(1), 72–83.
https://doi.org/10.1016/j.jaridenv.2011.09.007
5. Rosenfeld, D., Y. Rudich, and R. Lahav (2001), Desert dust suppressing precipitation: A
possible desertification feedback loop, Proc. Natl. Acad. Sci. U.S.A., 98(11), 5975–5980.
6. Shahsavani, A., Tobías, A., Querol, X., Stafoggia, M., Abdolshahnejad, M., Mayvaneh, F.,
Guo, Y., Hadei, M., Saeed Hashemi, S., Khosravi, A., Namvar, Z., Yarahmadi, M., & Emam, B.
(2020). Short-term effects of particulate matter during desert and non-desert dust days on mortality
in Iran. Environment International, 134,105299. https://doi.org/10.1016/j.envint.2019.105299
7. Middleton, N.; Yiallouros, P.; Kleanthous, S.; Kolokotroni, O.; Schwartz, J.; Dockery,
D.W.; Demokritou, P.; Koutrakis, P. A 10-year time-series analysis of respiratory and
cardiovascular morbidity in Nicosia, Cyprus: The effect of short-term changes in air pollution and
dust storms. Environ. Health 2008, 7, 1–16. [CrossRef]
https://www.ncbi.nlm.nih.gov/pubmed/18647382
8. Voiland, A. (2010). Aerosols: Tiny Particles, Big Impact.
https://www.studocu.com/en-ca/document/simon-fraser-university/behavior-in-organizations/tutorial-work/nasa-eo-voiland-2010-aerosols-tiny-particles-big-impact/7218283/view
9. Gelaro R, McCarty W, Suárez MJ, et al. The modern-era retrospective analysis for research
and applications, version 2 (MERRA-2). J Clim. 2017;30(14):5419-5454.
doi:10.1175/JCLI-D-16-0758.1.
https://journals.ametsoc.org/doi/pdf/10.1175/JCLI-D-16-0758.1
10. NASA MERRA-2 (2014). Retrieved from
https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/
11. NASA GEOS-5 Nature Run (2014), Ganymed Release. Retrieved from
https://gmao.gsfc.nasa.gov/global_mesoscale/7km-G5NR/
12. Gelaro, R., Putman, W.M., Pawson, S., Draper, C., Molod, A., Norris, P.M., Ott, L., Prive,
N., Reale, O., Achuthavarier, D. and Bosilovich, M., 2015. Evaluation of the 7-km GEOS-5 Nature
Run.
https://ntrs.nasa.gov/search.jsp?R=20150011486
13. Randles CA, da Silva AM, Buchard V, et al. (2017). The MERRA-2 aerosol reanalysis,
1980 onward. Part I: System description and data assimilation evaluation. J Clim.
30(17):6823-6850. doi:10.1175/JCLI-D-16-0609.1.
https://doi.org/10.1175/JCLI-D-16-0609.1
14. NASA, GSFC (2018). Retrieved from
https://gmao.gsfc.nasa.gov
15. Amante and Eakins, 2009. Retrieved from
https://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/docs/ETOPO1.pdf
16. Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
17. Adam, 2018. deep learning optimization. Retrieved from
https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c
Asset Metadata
Creator: Du, Lichen (author)
Core Title: Downscaling satellite observations of dust with deep learning
School: Keck School of Medicine
Degree: Master of Science
Degree Program: Biostatistics
Publication Date: 05/03/2020
Defense Date: 05/01/2020
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: downscaling, dust, G5NR, MERRA-2, Middle East region, neural networks, OAI-PMH Harvest
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Franklin, Meredith (committee chair), Lewinger, Juan (committee member), Marjoram, Paul (committee member)
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-293775
Unique identifier: UC11666003
Identifier: etd-DuLichen-8408.pdf (filename), usctheses-c89-293775 (legacy record id)
Legacy Identifier: etd-DuLichen-8408.pdf
Dmrecord: 293775
Document Type: Thesis
Rights: Du, Lichen
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA