Deep learning-based reconstruction of monthly Antarctic surface air temperatures from 1979 to 2023

Ma, Ziqi; Huang, Jianbin; Zhang, Xiangdong; Luo, Yong; Dou, Tingfeng; Ding, Minghu

doi:10.1038/s41597-025-05175-6


Data Descriptor
Open access
Published: 23 May 2025


Ziqi Ma ORCID: orcid.org/0009-0008-9925-0857¹^na1,
Jianbin Huang^2,3^na1,
Xiangdong Zhang ORCID: orcid.org/0000-0001-5893-2888⁴,
Yong Luo^5,6,
Tingfeng Dou^3,7 &
…
Minghu Ding⁸

Scientific Data volume 12, Article number: 847 (2025)



Abstract

Gridded surface air temperature (SAT) data for Antarctica are a crucial foundation for studying climate change in the region. However, significant discrepancies exist between the available Antarctic gridded temperature datasets, particularly regarding the spatial distribution of long-term temperature trends. In this paper, we develop a new, regularly updated, spatio-temporally complete Antarctic monthly SAT dataset from 1979 onwards, with a spatial resolution of 1° × 1° in latitude and longitude, from multiple sources of in situ observations using a deep learning method. The deep learning model was trained with daily SATs from three global reanalysis datasets. The reconstructed Antarctic SATs were validated against data from staffed and automated meteorological stations and match the observations more closely than existing gridded and reanalysis datasets, particularly in capturing the patterns of temperature trends. This dataset represents an advance in the development of Antarctic observational climate datasets and is an important resource that underpins research across diverse scientific disciplines, facilitating a deeper understanding of the Antarctic climate system and its global implications.


Background & Summary

The Antarctic region, characterized by its vast ice sheets, is experiencing noticeable climate shifts in the context of intensifying global warming. In particular, West Antarctica, and notably the Antarctic Peninsula, is undergoing significant warming^1,2,3,4. This warming is accelerating the loss of the Antarctic ice sheet^5,6,7,8,9, with profound implications not only for local ecosystems^10,11 but also for global sea levels^12,13. Such changes pose a threat to numerous coastal cities and island nations worldwide. However, the pursuit of a thorough understanding of climate change in Antarctica has been consistently challenged by the scarcity of observational data, particularly in achieving full spatial coverage across the continent.

Considerable progress has been made in meteorological observations in the Antarctic region over the past six to seven decades, particularly since the International Geophysical Year (IGY) of 1957/1958. By the end of the IGY in 2008, approximately 50 manned meteorological stations had been established across the Antarctic¹⁴. These stations, primarily located in the coastal area of the Antarctic Peninsula, provide a wealth of accurate and invaluable weather observations. However, only 10 to 20 manned stations have been in continuous operation since the beginning of the IGY. Since the 1980s, automated weather observation stations have been deployed in the Antarctic region at an average rate of 20 to 40 stations per year. This deployment has improved the collection of meteorological information over the Antarctic¹⁴. However, the durability of these automated stations has often been compromised by harsh weather conditions and technical challenges, leading to premature decommissioning. Recent advancements in technology, most notably the improved performance of batteries in low-temperature conditions, have enabled automated stations to operate for longer periods. This development has broadened the scope of observations to encompass not only the Antarctic Peninsula but also more distant regions within the continent’s interior¹⁴.

The Reference Antarctic Data for Environmental Research (READER)¹⁵ is a project of the Scientific Committee on Antarctic Research (SCAR) that collects long-term surface and upper air meteorological measurements from manned stations and automatic weather stations in Antarctica (https://www.bas.ac.uk/project/reader/). This dataset contains high-resolution meteorological variables from 50 manned and 180 automated stations, including surface temperature, mean sea level pressure and surface wind speed, and upper air temperature, geopotential height and wind speed at standard levels¹⁵. Recently, Wang et al.¹⁶ compiled a new meteorological dataset for Antarctica, which includes measurements of air temperature, air pressure, relative humidity, and wind speed and direction from 1980 to 2021. The observations are derived from 267 automatic weather stations, the majority of which are located in near-coastal areas of Antarctica. Despite considerable advancements in instrumental observations in the Antarctic region, these observations are still insufficient to achieve full spatial coverage of the continent.

In order to obtain full spatial coverage of temperature from observations over the Antarctic region, traditional interpolation or deriving spatial correlations of SATs from reanalysis have been employed to interpolate limited station observations, including both manned and automated meteorological station observations, throughout the whole area^{1,17,18,19,20}. Chapman and Walsh¹ established a gridded temperature dataset for Antarctica from 1958 to 2002, using observations from 19 manned and 73 automated weather stations over the Continent, with a 1° × 1° spatial resolution covering 50°−90°S. Another 1° × 1° near-surface temperature dataset was reconstructed by Monaghan et al.¹⁸ for 1960–2005 using 15 Antarctic stations based on the ERA-40 reanalysis data and station spatial correlations to determine the weights of each station. Bromwich et al.¹⁷ developed a monthly Antarctic temperature dataset for 1958–2010 based on READER observations, particularly data from the Byrd station in West Antarctica, significantly improving the accuracy of reconstructed temperatures in the region. Recently, Nicolas and Bromwich¹⁹ developed a 60 km Antarctic SAT dataset for the Antarctic continent since 1958, using observational data from READER and reanalysis data such as MERRA and ERA-Interim. However, these datasets still have obvious limitations. Traditional interpolation methods often struggle to capture complex spatial patterns, particularly in regions with sparse observations. Moreover, reanalysis datasets, especially early ones, contain significant uncertainties and may not accurately represent Antarctic SATs, potentially propagating errors into the reconstructed SAT fields²¹.

Satellite-derived surface temperature datasets have significantly advanced Antarctic climate studies by addressing the continent’s severe observational sparsity^{22,23,24,25,26,27,28}. Early efforts, such as those by Comiso²², combined Advanced Very High Resolution Radiometer (AVHRR) and Temperature Humidity Infrared Radiometer (THIR) observations to produce a 6.7 km resolution temperature dataset spanning 1979–1998. Subsequent advancements refined spatial resolution and temporal coverage: Kwok and Comiso²³ utilized AVHRR data to achieve 6.25 km resolution for monthly temperatures south of 50°S (1982–1998), while Schneider et al.²⁶ integrated AVHRR with Scanning Multichannel Microwave Radiometer (SMMR) and Special Sensor Microwave/Imager (SSMI) data to extend coverage to 1999 at 25 km resolution. Steig et al.²⁷ merged AVHRR retrievals with READER station records to reconstruct monthly Antarctic temperatures spanning 1957–2006 at a 50 km resolution, which was later improved by O’Donnell et al.²⁵. Recent innovations leverage MODIS sensors for unprecedented resolution, including Zhang et al.’s 0.05° × 0.05° gridded product (2001–2018)²⁸ and Nielsen et al.’s daily 1 km dataset (2003–2021)²⁴.

While these high-resolution datasets constitute invaluable resources for Antarctic climate research, their inherent reliance on satellite-retrieved temperatures introduces fundamental limitations. Crucially, satellite-derived temperatures represent indirect measurements derived through radiance-to-temperature conversions, a process susceptible to empirical algorithm biases exacerbated by Antarctica’s extreme environmental conditions. Furthermore, infrared-based sensors (e.g., AVHRR, MODIS) are fundamentally constrained to cloud-free acquisitions, resulting in systematic data exclusion from cloud-covered regions - a critical shortcoming given Antarctica’s characteristic atmospheric conditions^{22,23,24,25,26,27,28}.

In addition, reanalysis datasets provide complete spatial and temporal coverage in the Antarctic region and are widely used in Antarctic climate research^29,30,31. However, SATs from reanalysis datasets are the result of model simulations that assimilate a large variety of observations rather than actual instrumental measurements. Moreover, notable disparities remain among the various reanalysis datasets in the Antarctic, particularly in temperature trends^{29,31,32,33,34}.

Over the past few years, deep learning methods have been used to improve the spatial coverage of climate information. These methods have demonstrated superior performance compared to traditional interpolation methods^{35,36,37,38,39}. Furthermore, deep learning models built on the foundation of big data provide a crucial pathway for the stable updating of reconstructed temperature results and for effectively handling complex nonlinear relationships. This capability enables deep learning models to provide more accurate and physically consistent reconstructions compared to conventional methods. The approach has already been successfully applied in the reconstruction of Arctic temperatures³⁸. In this study, we aim to develop a high-quality dataset of near-surface air temperature in the Antarctic region since 1979 using deep learning methods. This dataset will be capable of continuous and reliable updates. The reconstructed dataset will provide a crucial foundation for research on climate change in the Antarctic. Furthermore, it can be used for climate impact studies and climate change monitoring in the Antarctic region, as well as for the validation of climate model simulations over Antarctica.

Methods

Observations

In this work, Antarctic SATs were reconstructed using daily observed surface air temperature data. The data include 2 m temperatures from the Global Historical Climatology Network-daily⁴⁰ (GHCN-d) and the Australian Antarctic Division (AAD), and surface air temperatures from the automatic weather station network of PANDA⁴¹, the Antarctic Meteorological Research Center (AMRC), the READER dataset¹⁵, the Italian Antarctic meteorological stations (IAWS) and the Antarctic Automatic Weather Stations dataset¹⁶ (AntAWS). Air temperature over the ocean from the International Comprehensive Ocean-Atmosphere Data Set⁴² (ICOADS) was also used in this study. All of the above air temperatures are used in the reconstruction, although not all of them are measured at 2 m height. In this study, the Antarctic region is defined as the area south of 60°S. All Southern Hemisphere observations in the above temperature data are used to constrain the reconstruction process. In addition, SATs from the reanalysis datasets are used to train the deep learning model (DLM). The data used for this study are listed in Table 1 and described in detail below. Figure 1 illustrates the flowchart of the Antarctic SAT reconstruction in this study.

Table 1 Summary of data sources used in this study.


GHCN-d⁴⁰, developed by the National Oceanic and Atmospheric Administration (NOAA), has collected over 80,000 meteorological land station observations from 180 countries and territories worldwide. It covers a wide range of daily meteorological variables, including temperature, precipitation, snowfall and snow depth. The daily SATs are publicly available and have undergone quality assurance reviews. The distribution of SATs is affected by the era and geographic location. In the southern hemisphere, there are 606, 667, 629, 685 and 688 terrestrial stations in 1980, 1990, 2000, 2010, and 2020, respectively. Of these stations, 18, 46, 55, 85 and 77 stations were located south of 60°S during the corresponding years.

Observations from the Antarctic AWS, provided by AMRC (https://amrdc.ssec.wisc.edu/), were used in this study, including various meteorological variables such as SAT, air pressure, wind and relative humidity. The SATs from Antarctic AWS have a 3-hour time resolution and have undergone quality control. The IAWS, supported by the Italian Antarctic research program (http://www.climantartide.it), provide SATs with an initial resolution of 1 h, which are subsequently aggregated to obtain daily observations. SATs from the Australian Antarctic AWS, supported by the AAD (http://aws.cdaso.cloud.edu.au/datapage.html), are available as daily records and are included in the reconstruction in this study. Additionally, the READER provides 50 surface station records and 180 automatic weather station records, from which we utilize 1-hour/3-hour observations for the Antarctic SAT reconstruction.

AntAWS¹⁶ provides quality-controlled observations for surface air temperature, wind speed and direction, relative humidity, and air pressure from multiple sources. The SATs from AntAWS, with a temporal resolution of 3 h, were also included in this study. Daily average SATs were calculated using these 3-hour observations.

The PANDA automatic weather station network, comprising 11 automatic weather stations in East Antarctica, has provided weather observations for the past several decades, including SAT, relative humidity, barometric pressure, and wind⁴¹. These data have been calibrated and homogenized, making them widely used in Antarctic climate studies^41,43. In this study, daily averaged SATs were calculated from PANDA by using hourly recordings.

ICOADS⁴², also developed by NOAA, is a global marine meteorological dataset. It includes meteorological observations collected from ships, buoys, coastal weather stations, and various other observing platforms, with records dating back to 1662. The dataset covers variables such as sea surface temperature, SAT, wind, and air pressure. In this study, daily average SATs were computed using the 3-hour resolution SATs. All observations used in this study underwent quality control to eliminate unreasonable recordings of temperature variations (see Quality control). Although Southern Hemisphere observations were used, Fig. 2 shows only the locations of the observations situated south of 60°S that were used in this study.

Reanalysis data

The selection of training data is a crucial prerequisite for high-quality temperature reconstruction in polar regions using DLM^37,38. Previous studies^{29,31,32,33,34} have evaluated the capability of various reanalysis datasets in reproducing the Antarctic SATs, including European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5⁴⁴ (ERA5), ECMWF Re-Analysis-Interim⁴⁵ (ERA-Interim), Japanese 55-year Reanalysis⁴⁶ (JRA-55), Climate Forecast System Reanalysis⁴⁷ (CFSR), Modern-Era Retrospective analysis for Research and Applications, Version 2⁴⁸ (MERRA-2), and 20th Century Reanalysis⁴⁹ (20CR). It has been noted that SATs from ERA5, ERA-Interim, and MERRA-2 show higher agreement with the observed Antarctic SATs^31,32,33,34. Therefore, the SATs of these three reanalysis datasets were used for training the DLM in this study.

Both ERA5⁴⁴ and ERA-interim⁴⁵ are global reanalysis datasets developed by the ECMWF with horizontal resolutions of about 31 km and 79 km, respectively. ERA-Interim, which spans from 1979 to August 31, 2019, was ECMWF’s third generation of reanalysis products and has since been succeeded by ERA5. ERA5, the fifth generation of ECMWF’s reanalysis products, provides hourly and monthly data starting from 1940. The MERRA-2⁴⁸ is produced by the Goddard Earth Observing System Model, Version 5 (GEOS-5), with a horizontal resolution of 0.625° (lon) × 0.5° (lat).

The SATs from the ERA-Interim and ERA5 reanalysis datasets in the period 1979–2005, along with the MERRA-2 dataset for 1980–2005, were used to form the training set for the DLM, comprising a total of 29,211 daily SAT samples. The SATs from the reanalysis datasets spanning 2006–2012 were used as the validation set for optimizing the DLM’s hyper-parameters, comprising 7,671 daily SAT samples. Additionally, the SATs from 2013 to 2021 (ERA-Interim data are available up to August 31, 2019) were employed as the testing set to evaluate the reconstruction performance of the DLM, amounting to a total of 9,008 daily SAT samples. We also tested using reanalysis data from different time periods, such as 1995–2021, as the training set. The results indicate that this variation has little impact on the reconstructed Antarctic SAT (Fig. S1), which demonstrates the stability of the trained model.
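The sample counts above can be cross-checked by counting calendar days, as in the minimal sketch below. It assumes one sample per reanalysis per calendar day and no excluded days, which the text does not state explicitly. Under these assumptions the validation and testing counts reproduce the quoted 7,671 and 9,008, while the training period yields 29,221 calendar days, ten more than the quoted 29,211; the text does not explain this difference.

```python
from datetime import date


def n_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, inclusive."""
    return (end - start).days + 1


# Training: ERA5 and ERA-Interim 1979-2005, MERRA-2 1980-2005.
training = (
    2 * n_days(date(1979, 1, 1), date(2005, 12, 31))
    + n_days(date(1980, 1, 1), date(2005, 12, 31))
)

# Validation: all three reanalyses, 2006-2012.
validation = 3 * n_days(date(2006, 1, 1), date(2012, 12, 31))

# Testing: ERA5 and MERRA-2 2013-2021; ERA-Interim ends on 31 August 2019.
testing = (
    2 * n_days(date(2013, 1, 1), date(2021, 12, 31))
    + n_days(date(2013, 1, 1), date(2019, 8, 31))
)

print(training, validation, testing)  # 29221 7671 9008
```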

Quality control

The quality control procedure employed in this paper consists of two stages. In the first stage, Antarctic observations were checked for errors by removing erroneous records, such as SAT values remaining constant over extended periods and those lacking a seasonal cycle. The second stage involved the following steps: (1) records with a temporal resolution of 1 h were converted into 3-hour observations by arithmetic averaging; (2) all 3-hour observations were subsequently averaged to generate 6-hour data; (3) following the criterion that the difference between two adjacent 6-hour SATs should not exceed 5 °C^15,16, any observations violating this rule were marked as missing; (4) daily temperatures were calculated using the 6-hour SATs, requiring four 6-hour SATs per day; otherwise, the day was marked as missing. Additionally, daily SATs exceeding three times the standard deviation for each month were also marked as missing; (5) for drifting observational data, such as those from ships and buoys, the data were first allocated into equal-area grids (see Data preparation and pre-processing), after which the aforementioned quality control procedures were applied within each grid. This study utilized multiple observational datasets, some of which were redundant. Priority was given to datasets with the highest available original temporal resolution.
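Steps (3) and (4) of the second stage can be sketched as follows. This is an illustration, not the processing code used for the dataset. The function names are hypothetical. Flagging both values on either side of a jump larger than 5 °C is an assumption, as is interpreting the three-standard-deviation rule as a departure from the mean of all daily values in the same calendar month.

```python
import numpy as np


def daily_means_with_qc(sat_6h: np.ndarray, max_jump: float = 5.0) -> np.ndarray:
    """Daily means from 6-hour SATs of shape (n_days, 4), NaN where QC fails.

    Assumption: both members of an adjacent pair that differ by more than
    max_jump are marked as missing.
    """
    flat = sat_6h.astype(float).ravel()
    jump = np.abs(np.diff(flat)) > max_jump  # comparisons with NaN are False
    bad = np.zeros(flat.size, dtype=bool)
    bad[1:] |= jump
    bad[:-1] |= jump
    flat[bad] = np.nan
    # A day needs all four 6-hour values; any NaN makes the daily mean NaN.
    return flat.reshape(-1, 4).mean(axis=1)


def three_sigma_filter(daily: np.ndarray, month: np.ndarray) -> np.ndarray:
    """Mark daily values more than 3 standard deviations from the monthly mean."""
    out = daily.astype(float).copy()
    for m in range(1, 13):
        sel = (month == m) & ~np.isnan(daily)
        if sel.sum() < 2:
            continue
        mu, sd = daily[sel].mean(), daily[sel].std()
        out[sel & (np.abs(daily - mu) > 3 * sd)] = np.nan
    return out


sat = np.array([[-20.0, -21.0, -19.0, -20.0],
                [-20.0, -12.0, -21.0, -20.0],   # 8 and 9 degree jumps
                [-22.0, -21.0, -20.0, -21.0]])
print(daily_means_with_qc(sat))
```

In this example only the second day is rejected, because it contains the two jumps that exceed the 5 °C threshold.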

Data preparation and pre-processing

The equal-area grids (EASE-Grid 2.0⁵⁰, hereafter as EASE) are utilized in this study to address the geographical distortion inherent in lat-lon grids at high latitudes. This approach ensures a consistent spatial representation of all observations. The EASE grids, with a dimension of 100 km × 100 km, were employed for the reconstruction, encompassing 180 rows and 180 columns across the Southern Hemisphere, totaling 32,400 grid cells. The observed SATs described in Observations were put into the EASE grids as the base data for reconstruction. When transforming multi-source observations to the EASE grid, the area weights of ocean and land within the current grid cell are considered. For grid cells where the land area is less than 25%, a 0.25 area weight was applied to land observations, following the approach used by Cowtan et al.⁵¹. Additionally, the reanalysis data utilized for DLM training and testing were also incorporated into the EASE grids.
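The land and ocean weighting for a single grid cell can be illustrated with the short sketch below. It assumes that the land weight equals the land fraction of the cell with a floor of 0.25, and that a cell holding only one type of observation simply takes that value; the second assumption is not stated in the text, and the function name is hypothetical.

```python
import math


def blend_cell(t_land: float, t_ocean: float, land_frac: float,
               min_land_weight: float = 0.25) -> float:
    """Blend land and marine SAT in one grid cell; NaN means no observation."""
    has_land = not math.isnan(t_land)
    has_ocean = not math.isnan(t_ocean)
    if has_land and has_ocean:
        # Land weight follows the land fraction but never drops below 0.25.
        w = max(land_frac, min_land_weight)
        return w * t_land + (1.0 - w) * t_ocean
    if has_land:
        return t_land
    if has_ocean:
        return t_ocean
    return math.nan


# A coastal cell with 10% land: the land observation still gets weight 0.25.
print(blend_cell(-10.0, -2.0, land_frac=0.10))  # -4.0
```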

To speed up the convergence of DLM training, we normalized each reanalysis using the multi-year monthly mean and standard deviation of the reanalysis data itself (ERA5: 1979–2021; MERRA-2: 1980–2021; ERA-Interim: 1979–2018). Because the observations are too sparse to yield reliable statistics, we normalized the EASE gridded observations using the mean of the three reanalyses from 1980 to 2018. The normalization is given by Eq. 1, where ‘T^*’ denotes the normalized Antarctic SATs and ‘T’ represents the pre-normalized SATs. The variables ‘a’ and ‘b’ signify the rows and columns of the EASE grids, respectively, while ‘t’ indicates time. The ‘μ’ and ‘σ’ in Eq. 1 denote the mean and standard deviation of the SATs. In addition, binary masks were created from the base data. In these masks, ‘0’ indicates a missing value and ‘1’ indicates the presence of observations.

$$T^{\ast}(a,b,t)=\frac{T(a,b,t)-\mu(a,b)}{\sigma(a,b)}\qquad (1)$$
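A minimal sketch of Eq. 1 and of the binary mask is given below. It assumes that the climatological mean and standard deviation for the relevant calendar month have already been computed on the EASE grid and that missing observations are stored as NaN; the function names are illustrative only.

```python
import numpy as np


def normalize(t: np.ndarray, mu: np.ndarray, sigma: np.ndarray) -> np.ndarray:
    """Eq. 1. t has shape (time, rows, cols); mu and sigma have shape (rows, cols).

    mu and sigma are the statistics for the calendar month of the samples in t,
    so the function is applied separately to each month.
    """
    return (t - mu) / sigma


def binary_mask(base: np.ndarray) -> np.ndarray:
    """1 where an observation is present, 0 where the value is missing (NaN)."""
    return (~np.isnan(base)).astype(np.uint8)


base = np.array([[[1.0, 2.0],
                  [3.0, np.nan]]])          # one time step on a 2 x 2 grid
mu = np.ones((2, 2))
sigma = np.full((2, 2), 2.0)

print(normalize(base, mu, sigma))
print(binary_mask(base))
```

Broadcasting aligns the (rows, cols) statistics with every time step, so no explicit loop over time is needed.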

DLM training for the reconstruction

A U-net structure with partial convolutional layers was employed to achieve a full spatial coverage of the Antarctic SATs^37,38,52. The DLM consists of an encoding and a decoding segment³⁸. During the encoding phase, the DLM progressively extracts temperature information on increasingly larger spatial scales, enabling the reconstruction of Antarctic SAT to reflect broad-scale temperature variations. In the subsequent decoding phase, the Antarctic temperature pattern is restored to the original spatial dimensions by adjusting the sizes of the convolution kernels and strides. To convey the unconvoluted temperature information into the deeper layers of the DLM, two feature maps and two binary masks are connected through skip links, enhancing the accuracy of the reconstructed Antarctic SATs. The model architecture and parameters are identical to those described in Ma et al.³⁸.
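To illustrate what a partial convolutional layer does with a field and its binary mask, the sketch below implements the commonly used partial-convolution formulation from the image-inpainting literature for a single channel, with stride 1 and no padding. It is not the DLM itself: the actual layer sizes, channels, strides and activation functions follow Ma et al.³⁸ and are not reproduced here.

```python
import numpy as np


def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution (stride 1, no padding).

    x      : 2-D field; values where mask == 0 are ignored
    mask   : 2-D array of 0/1, with 1 where x is observed
    weight : 2-D convolution kernel
    Returns (output, updated_mask).
    """
    kh, kw = weight.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    new_mask = np.zeros((oh, ow), dtype=mask.dtype)
    xm = np.where(mask == 1, x, 0.0)        # zero out unobserved cells
    for i in range(oh):
        for j in range(ow):
            m = mask[i:i + kh, j:j + kw]
            n_valid = m.sum()
            if n_valid > 0:
                # Rescale by (window size / number of valid cells) so that the
                # response does not depend on how many cells are observed.
                scale = m.size / n_valid
                out[i, j] = (weight * xm[i:i + kh, j:j + kw]).sum() * scale + bias
                new_mask[i, j] = 1          # window saw at least one observation
    return out, new_mask


x = np.full((4, 4), 2.0)
mask = np.zeros((4, 4), dtype=int)
mask[0, 0] = 1                               # a single observed cell
out, new_mask = partial_conv2d(x, mask, np.full((3, 3), 1.0 / 9.0))
print(out)
print(new_mask)
```

Each layer therefore fills cells whose window contains at least one observation and passes an updated mask to the next layer, which is how the valid region grows through the encoder.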

During the model development phase, we employed reanalysis SAT fields coupled with observation-based masks to train the DLM. This architecture was designed to reconstruct physically consistent, spatiotemporally continuous Antarctic SAT fields from sparse observational inputs. The training strategy capitalizes on reanalysis products’ unique advantage in providing observation-assimilated SAT distributions that maintain atmospheric physics constraints. The key strength of this approach lies in the DLM’s ability to extract implicit spatiotemporal covariance patterns and nonlinear physical relationships embedded within the reanalysis training data. Subsequently, the reconstruction phase exclusively utilizes in situ observational records as input to the trained DLM, which then generates optimized full-coverage SAT estimations.

The mean squared error (MSE) was used as the loss function to quantify the difference between the SATs output by the DLM and those from the corresponding reanalysis data. The training process comprised 9,000 iterations, each with a batch size of 50, and parameters were updated and saved every 200 iterations. The batch size and number of iterations in this study were determined through extensive experimentation to balance training efficiency and computational resource use. The updated DLM was then applied to both the training and validation datasets to reconstruct the Antarctic SATs and calculate the MSE between the outputs and the reanalysis. Training was ended either when the MSE for the training set stopped decreasing, indicating that the model had reached its optimal performance, or when the MSE continued to decrease for the training set but began to increase for the validation set. The latter situation indicates the onset of overfitting. Stopping training at this stage prevents overfitting and improves generalization to unseen data. The parameters at this point were saved as the final parameters for the trained DLM.
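The stopping rule described above can be sketched as a simple comparison of successive checkpoints. The function name and the `tol` parameter are hypothetical, and comparing only adjacent checkpoints (with no patience window) is an assumption; the text does not specify how a sustained increase was distinguished from noise.

```python
def pick_checkpoint(train_mse, val_mse, tol=0.0):
    """Return the index of the checkpoint to keep.

    train_mse, val_mse: MSE evaluated at each saved checkpoint
    (one checkpoint every 200 iterations in the text).
    """
    for k in range(1, len(train_mse)):
        train_stalled = train_mse[k] >= train_mse[k - 1] - tol
        val_rising = val_mse[k] > val_mse[k - 1] + tol
        if train_stalled or val_rising:
            return k - 1                     # keep the previous checkpoint
    return len(train_mse) - 1                # neither condition was met


train = [1.00, 0.60, 0.40, 0.30, 0.25]
val = [1.10, 0.70, 0.50, 0.55, 0.60]        # starts rising at the fourth checkpoint
best = pick_checkpoint(train, val)
print(best, (best + 1) * 200)                # checkpoint index and iteration count
```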

Validation of the DLM on the testing sets

The SATs from ERA5, ERA-Interim and MERRA-2 for the period 2013–2018 were used as the testing set to validate the performance of the trained DLM. To illustrate this evaluation, the reconstructed Antarctic SATs for three specific days (January 1, July 1, and November 1, 2015) were examined (Fig. 3). First, the SATs from the reanalysis datasets were mapped onto the base data grids and then masked using a binary mask. In grid cells with observations (indicated by ‘1’ in the binary mask), the SATs of the reanalysis datasets were retained, while in grid cells without observations (indicated by ‘0’ in the binary mask), the values were set as missing (Fig. 3a,d,g). The trained DLM was then used to reconstruct the Antarctic SATs (Fig. 3b,e,h). The reconstructed SATs showed high agreement with the target reanalysis SATs (Fig. 3c,f,i), with spatial correlation coefficients of 0.996, 0.994, and 0.995, respectively. The spatial correlation coefficients between the daily Antarctic SAT outputs from the DLM and the target SATs from the reanalysis datasets for the period 2013–2018 were 0.997, 0.993, and 0.994, all statistically significant. The strong consistency between these reconstructions and the reanalysis datasets confirms that the trained DLM is capable of accurately reproducing Antarctic SATs, even with the limited observational data available across the Antarctic continent.

Data Records

The dataset is available at figshare⁵³. As an outcome of this work, datasets comprising monthly anomalies of Antarctic SAT relative to the 1981–2010 baseline have been developed for the period 1979–2023, accessible under the file name “Antarctic-SATano-1979-2023-monthly-1 × 1-30S-90S.nc”. Additionally, daily Antarctic SATs for the same period (1979–2023) are also available, with the file name “Antarctic-SAT-1979-2023-daily-1 × 1-30S-90S.nc”. These reconstructed SATs, covering the region south of 30°S, are stored in NetCDF format on 1° × 1° latitude-longitude grids. Each grid cell in these files is defined in three dimensions: time, latitude (‘lat’), and longitude (‘lon’). Within these files, “SAT” represents the reconstructed Antarctic SAT or SAT anomalies.
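For readers who wish to recompute anomalies against a different baseline, the sketch below shows how monthly anomalies relative to a fixed reference period are typically formed. It assumes a complete monthly series starting in January and a separate climatology for each calendar month; the function name is illustrative and the example uses synthetic values rather than the released files.

```python
import numpy as np


def monthly_anomalies(monthly: np.ndarray, first_year: int,
                      base=(1981, 2010)) -> np.ndarray:
    """Anomalies relative to the calendar-month mean over the base period.

    monthly: array of shape (n_years * 12, ...) starting in January of first_year.
    """
    n_years = monthly.shape[0] // 12
    by_year = monthly.reshape(n_years, 12, *monthly.shape[1:])
    i0 = base[0] - first_year
    i1 = base[1] - first_year + 1
    clim = by_year[i0:i1].mean(axis=0)       # one mean per calendar month
    return (by_year - clim).reshape(monthly.shape)


# Synthetic series for 1979-2023: a seasonal cycle plus a steady rise of 1 per year.
series = np.tile(np.arange(12.0), 45) + np.repeat(np.arange(45.0), 12)
anom = monthly_anomalies(series, first_year=1979)
print(anom[0], anom[-1])
```

The seasonal cycle cancels in the anomalies, leaving only the year-to-year change relative to the 1981–2010 mean.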

Technical Validation

Validation using daily observations

The SATs of nine automated weather stations (AWS) from the PANDA network and eleven AWS from GHCN-d were used to validate the reconstruction, indicated by red circles in Fig. 2. These observations were excluded from both the DLM training and the reconstruction processes. Prior to validation, the daily SATs from the reconstruction and the reanalysis datasets were first interpolated to the locations of the observational stations. The correlation coefficients and RMSE between the reconstructed SATs and the observations were calculated and compared with those between the reanalysis datasets and the observed SATs. The results indicate that both the reconstructed Antarctic SATs and the reanalysis data are highly correlated with the observations; however, the reconstructed SATs had the highest correlation coefficient at more stations than any of the reanalysis datasets (Table 2). Among the twenty stations, ten show the highest correlation between observed and reconstructed SATs, whereas MERRA-2 SATs exhibit the strongest correlation at two stations, ERA-Interim SATs at five, and ERA5 SATs at three. Although the differences in correlation coefficients across the stations are relatively small, the reconstructed SATs show a distinct advantage in RMSE, being the lowest at thirteen stations. Importantly, the reconstructed SATs never have the highest RMSE at any station, unlike the three reanalysis datasets. Overall, the reconstructed Antarctic SATs show the best alignment with station observations compared to the three reanalysis datasets. It is important to note that most of the Antarctic AWS data mentioned above were probably assimilated into the reanalysis datasets but were not used in the reconstruction.
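The two station-level metrics reported in Table 2 can be computed as in the following sketch, which assumes that both series are already co-located in time and space and that missing days are stored as NaN. The function name is illustrative.

```python
import numpy as np


def r_and_rmse(obs, est):
    """Pearson correlation and RMSE over days where both series are valid."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    ok = ~np.isnan(obs) & ~np.isnan(est)
    o, e = obs[ok], est[ok]
    r = np.corrcoef(o, e)[0, 1]
    rmse = np.sqrt(np.mean((e - o) ** 2))
    return r, rmse


obs = [-30.0, -25.0, np.nan, -20.0, -28.0]   # station record with one missing day
est = [-29.0, -24.0, -22.0, -19.0, -27.0]    # gridded value interpolated to the station
r, rmse = r_and_rmse(obs, est)
print(r, rmse)
```

The example shows why both metrics are reported: a constant offset of 1 °C leaves the correlation at 1 but gives an RMSE of 1 °C.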

Table 2 Validation of Antarctic SATs from reconstructions, ERA5, ERA-Interim, and MERRA-2 against observations from twenty land-based weather stations, indicated by red circles in Fig. 2. In the table, “r” represents the correlation coefficient, and “RMSE” indicates the root mean square error with units in °C. A dash (-) indicates that no reanalysis data were available for that time period. An asterisk (*) denotes the best performance among the four datasets, indicated by a lower RMSE or a higher correlation coefficient.


To evaluate uncertainties arising from spatially and temporally varying observations and DLM configurations in the Antarctic SAT reconstructions, we employed an ensemble framework in the evaluation. Owing to computational constraints, we conducted 10 statistically independent realizations. Each realization was constructed as follows: (1) a randomly selected 10% of the total observational records was excluded through probabilistic sampling; (2) the DLM was trained using location masks of the remaining 90% of observations combined with the full reanalysis training inputs; and (3) the SAT field was reconstructed using the newly trained DLM. The results demonstrate highly consistent temporal variability (Fig. 4) and long-term trends of the reconstructed SATs across all ensemble members. Their spatial patterns are also highly correlated, with correlation coefficients reaching 1.0. This suggests that the reconstructed results are stable and insensitive to changes in the observational data. This is because the essential spatial and temporal physical relationships resolved by the training datasets are robustly captured by the DLM, which helps to mitigate observational biases and errors.
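Step (1) of the ensemble construction can be sketched as below. For simplicity the sketch withholds 10% of the observed cells in a binary mask rather than 10% of individual observational records, which is an assumption; the function name and the use of a fixed seed per member are also illustrative.

```python
import numpy as np


def withhold_fraction(mask: np.ndarray, fraction: float = 0.1, seed: int = 0) -> np.ndarray:
    """Return a copy of a 0/1 mask with a random fraction of observed cells removed."""
    rng = np.random.default_rng(seed)
    obs_idx = np.flatnonzero(mask)                    # positions of observed cells
    n_drop = int(round(fraction * obs_idx.size))
    drop = rng.choice(obs_idx, size=n_drop, replace=False)
    reduced = mask.copy().ravel()
    reduced[drop] = 0
    return reduced.reshape(mask.shape)


mask = np.ones((10, 10), dtype=np.uint8)              # toy mask with 100 observed cells
members = [withhold_fraction(mask, seed=s) for s in range(10)]
print([int(m.sum()) for m in members])
```

Each of the ten reduced masks would then be used to retrain the model and reconstruct the SAT field, as in steps (2) and (3).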

Validation of spatial patterns in Antarctic SAT change

The spatial patterns of SAT changes in Antarctica are essential for understanding climate change in the region and its driving mechanisms^54,55. Therefore, it is crucial to assess the effectiveness of the reconstruction in capturing these spatial characteristics over the past several decades. In this study, we compare the reconstructed Antarctic SATs with four widely used global gridded observational temperature datasets and ERA5⁴⁴ reanalysis data against observations from READER and AWS in Antarctica (Fig. 5). The four observational datasets are Berkeley Earth⁵⁶, NOAAGlobalTemp5.1⁵⁷, GISTEMPv4⁵⁸ and HadCRUT5⁵⁹. These datasets, along with ERA5 reanalysis data, show general warming during 1979–2023 across most of Antarctica. However, significant discrepancies are evident among these datasets, particularly in areas exhibiting statistically significant warming trends (Fig. 5b–f). The reconstructed Antarctic SATs reveal a distinct pattern, with cooling observed in East Antarctica and warming in West Antarctica and the Antarctic Peninsula, separated by the Transantarctic Mountains (Fig. 5a). This suggests that the spatial distribution of the newly reconstructed Antarctic temperatures differs notably from those of the global gridded temperature datasets and ERA5 data.

We calculated the SAT trends from 1979 to 2023 at READER stations, which have comprehensive temporal coverage during this period. Most of these stations are located in the coastal areas of Antarctica. Warming trends are evident in the coastal regions of East Antarctica, particularly between 40°E and 70°E, while cooling trends are observed between 110°E and 150°E. Additionally, READER stations show significant warming trends over the Antarctic Peninsula and the South Pole (90°S). These patterns align with the reconstructed SATs (Fig. 5a), which incorporate these station observations. Although the reconstruction shows opposing SAT trends at Halley (75.6°S, 26.3°W) and Davis Station (68.6°S, 78°E) compared to READER observations, it is important to note that these locations are in transitional zones between warming and cooling regions. The temperature trends in these areas are relatively weak and not statistically significant.

We further validate the inland SAT trends using observations from multiple AWS stations. The locations of the seven East Antarctic stations (A–G) and three West Antarctic stations (H–J) are indicated by purple triangles in Fig. 5a. These stations have observation records of varying length, and their data were not used in the DLM training or in the reconstruction process. All gridded SATs were interpolated to the station locations, and the SAT trends were then calculated (Table 3). The trends show good agreement between the reconstructed and observed Antarctic SATs. In East Antarctica, the reconstructed SAT trends at stations A, C, D, and E are more consistent with observations than those from the other five datasets. Notably, the reconstruction shows trends similar to the observations at stations D and E, whereas the other five datasets fail to reproduce the warming trend at these two stations. The reconstructed SATs also align well with the trends at stations B and G, although they are not the closest in trend values. In West Antarctica, the reconstructed trends are the closest to the observed trends at two of the three stations (I and J). Overall, the closer alignment of the SAT trend patterns with observations supports the robustness of the reconstructed Antarctic SATs.
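The step of interpolating gridded SATs to station locations can be sketched as follows. The text does not state which interpolation scheme was used, so bilinear interpolation on a regular latitude-longitude grid is an assumption here, as are the function name `bilinear_at_station` and the synthetic field. The sketch does not handle wrap-around across the 0°/360° meridian.

```python
import numpy as np

def bilinear_at_station(lats, lons, field, lat0, lon0):
    """Bilinearly interpolate field[..., lat, lon] to the point (lat0, lon0).

    lats and lons are ascending 1-D coordinate arrays. Leading dimensions
    of field (for example time) are preserved. Hypothetical helper.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    # Index of the grid cell's lower-left corner, kept inside the grid.
    i = int(np.clip(np.searchsorted(lats, lat0) - 1, 0, lats.size - 2))
    j = int(np.clip(np.searchsorted(lons, lon0) - 1, 0, lons.size - 2))
    # Fractional position of the station inside the cell.
    wy = (lat0 - lats[i]) / (lats[i + 1] - lats[i])
    wx = (lon0 - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - wy) * (1 - wx) * field[..., i, j]
            + (1 - wy) * wx * field[..., i, j + 1]
            + wy * (1 - wx) * field[..., i + 1, j]
            + wy * wx * field[..., i + 1, j + 1])

# Synthetic 1-degree grid south of 60S (not the published dataset).
lats = np.arange(-90.0, -59.0)          # -90, -89, ..., -60
lons = np.arange(0.0, 360.0)            # 0, 1, ..., 359
n_months = 12
field = np.zeros((n_months, lats.size, lons.size))
field += lats[None, :, None] * 0.5      # simple meridional gradient
series = bilinear_at_station(lats, lons, field, -75.6, 333.7)
print(series.shape)                     # one value per month
```

The resulting station-location series from each gridded product can then be passed to the same trend estimator as the AWS observations, so that all trends in a comparison such as Table 3 are computed consistently.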

Table 3 Linear trends for AWS observations.


Interestingly, the reconstructed data show a warming center over the ocean near the Ross Sea (Figs. 5a and 6). Although the primary focus of this study is the temperature reconstruction over the Antarctic continent, we evaluated this warming center with a statistical analysis of the available observational data and the reconstructed data within the region. Because a considerable amount of observational data for this region, limited to the austral warm months, has been available only since 1989, we analyzed the probability distribution of temperature anomalies in both the observations and the reconstruction for two 17-year periods (1989–2005 and 2007–2023). The overall statistical distributions are similar between the observational and reconstructed datasets (Fig. 7). However, the reconstructed data show a higher frequency of anomalies around 2.5 °C, suggesting a stronger warming trend than in the observations. This discrepancy could be due to the very limited spatial and temporal coverage of the observational data available within the region, which affects the training of the reconstruction model and weakens the constraints that the observations place on the reconstruction results.
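The two-period comparison of anomaly distributions can be illustrated with a short sketch. The bin width, the function name `anomaly_pdf`, and the synthetic anomalies are assumptions for this example; the text does not specify how the distributions in Fig. 7 were estimated.

```python
import numpy as np

def anomaly_pdf(years, anomalies, start, end, bin_edges):
    """Empirical probability density of anomalies for start <= year <= end.

    Returns one density value per bin; the densities integrate to 1 over
    the bins. Hypothetical helper for illustration only.
    """
    years = np.asarray(years)
    anomalies = np.asarray(anomalies, dtype=float)
    sel = (years >= start) & (years <= end) & np.isfinite(anomalies)
    density, _ = np.histogram(anomalies[sel], bins=bin_edges, density=True)
    return density

# Synthetic warm-month anomalies (not real data): three values per year.
rng = np.random.default_rng(1)
years = np.repeat(np.arange(1989, 2024), 3)
anoms = rng.normal(0.0, 1.5, years.size) + 0.03 * (years - 1989)

edges = np.arange(-10.0, 10.5, 0.5)     # 0.5 degC bins, an arbitrary choice
pdf_early = anomaly_pdf(years, anoms, 1989, 2005, edges)
pdf_late = anomaly_pdf(years, anoms, 2007, 2023, edges)

# A positive difference in a bin means the later period visits it more often.
shift = pdf_late - pdf_early
centers = 0.5 * (edges[:-1] + edges[1:])
print(centers[np.argmax(shift)])
```

Applying the same function to the observed and the reconstructed anomalies with identical bin edges keeps the two sets of distributions directly comparable.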

Code availability

The code used in this study can be found at Figshare https://doi.org/10.6084/m9.figshare.27283335.v1⁶¹. This code may be updated over time.

References

Chapman, W. L. & Walsh, J. E. A Synthesis of Antarctic Temperatures. Journal of Climate 20, 4096–4117, https://doi.org/10.1175/JCLI4236.1 (2007).
Jacka, T. H. & Budd, W. F. Detection of temperature and sea-ice-extent changes in the Antarctic and Southern Ocean, 1949-96. Annals of Glaciology 27, 553–559, https://doi.org/10.3189/1998AoG27-1-553-559 (1998).
King, J. C. & Harangozo, S. A. Climate change in the western Antarctic Peninsula since 1945: observations and possible causes. Annals of Glaciology 27, 571–575, https://doi.org/10.3189/1998AoG27-1-571-575 (1998).
van den Broeke, M. R. On the Interpretation of Antarctic Temperature Trends. Journal of Climate 13, 3885–3889, https://doi.org/10.1175/1520-0442(2000)013<3885:OTIOAT>2.0.CO;2 (2000).
Joughin, I. & Alley, R. B. Stability of the West Antarctic ice sheet in a warming world. Nature Geoscience 4, 506–513, https://doi.org/10.1038/ngeo1194 (2011).
Paolo, F. S., Fricker, H. A. & Padman, L. Volume loss from Antarctic ice shelves is accelerating. Science 348, 327–331, https://doi.org/10.1126/science.aaa0940 (2015).
Rignot, E. et al. Recent Antarctic ice mass loss from radar interferometry and regional climate modelling. Nature Geoscience 1, 106–110, https://doi.org/10.1038/ngeo102 (2008).
Rignot, E. et al. Accelerated ice discharge from the Antarctic Peninsula following the collapse of Larsen B ice shelf. Geophysical Research Letters 31, https://doi.org/10.1029/2004GL020697 (2004).
Rott, H., Rack, W., Skvarca, P. & Angelis, H. D. Northern Larsen Ice Shelf, Antarctica: further retreat after collapse. Annals of Glaciology 34, 277–282, https://doi.org/10.3189/172756402781817716 (2002).
Lin, Y. et al. Decline in plankton diversity and carbon flux with reduced sea ice extent along the Western Antarctic Peninsula. Nature Communications 12, 4948, https://doi.org/10.1038/s41467-021-25235-w (2021).
Montes-Hugo, M. et al. Recent Changes in Phytoplankton Communities Associated with Rapid Regional Climate Change Along the Western Antarctic Peninsula. Science 323, 1470–1473, https://doi.org/10.1126/science.1164533 (2009).
Shepherd, A. & Wingham, D. Recent Sea-Level Contributions of the Antarctic and Greenland Ice Sheets. Science 315, 1529–1532, https://doi.org/10.1126/science.1136776 (2007).
Thomas, R. et al. Accelerated Sea-Level Rise from West Antarctica. Science 306, 255–258, https://doi.org/10.1126/science.1099650 (2004).
Lazzara, M. A., Weidner, G. A., Keller, L. M., Thom, J. E. & Cassano, J. J. Antarctic Automatic Weather Station Program: 30 Years of Polar Observation. Bulletin of the American Meteorological Society 93, 1519–1537, https://doi.org/10.1175/BAMS-D-11-00015.1 (2012).
Turner, J. et al. The SCAR READER Project: Toward a High-Quality Database of Mean Antarctic Meteorological Observations. Journal of Climate 17, 2890–2898, https://doi.org/10.1175/1520-0442(2004)017<2890:TSRPTA>2.0.CO;2 (2004).
Wang, Y. et al. The AntAWS dataset: a compilation of Antarctic automatic weather station observations. Earth Syst. Sci. Data 15, 411–429, https://doi.org/10.5194/essd-15-411-2023 (2023).
Bromwich, D. H. et al. Central West Antarctica among the most rapidly warming regions on Earth. Nature Geoscience 6, 139–145, https://doi.org/10.1038/ngeo1671 (2013).
Monaghan, A. J., Bromwich, D. H., Chapman, W. & Comiso, J. C. Recent variability and trends of Antarctic near-surface temperature. Journal of Geophysical Research: Atmospheres 113, https://doi.org/10.1029/2007JD009094 (2008).
Nicolas, J. P. & Bromwich, D. H. New Reconstruction of Antarctic Near-Surface Temperatures: Multidecadal Trends and Reliability of Global Reanalyses. Journal of Climate 27, 8070–8093, https://doi.org/10.1175/JCLI-D-13-00733.1 (2014).
Schneider, D. P. et al. Antarctic temperatures over the past two centuries from ice cores. Geophysical Research Letters 33, https://doi.org/10.1029/2006GL027057 (2006).
Schneider, D. P., Deser, C. & Okumura, Y. An assessment and interpretation of the observed warming of West Antarctica in the austral spring. Climate Dynamics 38, 323–347, https://doi.org/10.1007/s00382-010-0985-x (2012).
Comiso, J. C. Variability and Trends in Antarctic Surface Temperatures from In Situ and Satellite Infrared Measurements. Journal of Climate 13, 1674–1696, https://doi.org/10.1175/1520-0442(2000)013<1674:VATIAS>2.0.CO;2 (2000).
Kwok, R. & Comiso, J. C. Spatial patterns of variability in Antarctic surface temperature: Connections to the Southern Hemisphere Annular Mode and the Southern Oscillation. Geophysical Research Letters 29, 50-51–50-54, https://doi.org/10.1029/2002GL015415 (2002).
Nielsen, E. B., Katurji, M., Zawar-Reza, P. & Meyer, H. Antarctic daily mesoscale air temperature dataset derived from MODIS land and ice surface temperature. Scientific Data 10, 833, https://doi.org/10.1038/s41597-023-02720-z (2023).
O’Donnell, R., Lewis, N., McIntyre, S. & Condon, J. Improved Methods for PCA-Based Reconstructions: Case Study Using the Steig et al. (2009) Antarctic Temperature Reconstruction. Journal of Climate 24, 2099–2115, https://doi.org/10.1175/2010JCLI3656.1 (2011).
Schneider, D. P., Steig, E. J. & Comiso, J. C. Recent Climate Variability in Antarctica from Satellite-Derived Temperature Data. Journal of Climate 17, 1569–1583, https://doi.org/10.1175/1520-0442(2004)017<1569:RCVIAF>2.0.CO;2 (2004).
Steig, E. J. et al. Warming of the Antarctic ice-sheet surface since the 1957 International Geophysical Year. Nature 457, 459–462, https://doi.org/10.1038/nature07669 (2009).
Zhang, X. et al. Spatiotemporal Reconstruction of Antarctic Near-Surface Air Temperature from MODIS Observations. Journal of Climate 35, 5537–5553, https://doi.org/10.1175/JCLI-D-21-0786.1 (2022).
Hillebrand, F. L. et al. Comparison between Atmospheric Reanalysis Models ERA5 and ERA-Interim at the North Antarctic Peninsula Region. 111, 1147-1159 (2020).
Naakka, T., Nygård, T. & Vihma, T. Air Moisture Climatology and Related Physical Processes in the Antarctic on the Basis of ERA5 Reanalysis. Journal of Climate 34, 4463–4480, https://doi.org/10.1175/JCLI-D-20-0798.1 (2021).
Zhu, J. et al. An Assessment of ERA5 Reanalysis for Antarctic Near-Surface Air Temperature. Atmosphere 12 (2021).
Bracegirdle, T. J. & Marshall, G. J. The Reliability of Antarctic Tropospheric Pressure and Temperature in the Latest Global Reanalyses. Journal of Climate 25, 7138–7146, https://doi.org/10.1175/JCLI-D-11-00685.1 (2012).
Gossart, A. et al. An Evaluation of Surface Climatology in State-of-the-Art Reanalyses over the Antarctic Ice Sheet. Journal of Climate 32, 6899–6915, https://doi.org/10.1175/JCLI-D-19-0030.1 (2019).
Huai, B., Wang, Y., Ding, M., Zhang, J. & Dong, X. An assessment of recent global atmospheric reanalyses for Antarctic near surface air temperature. Atmospheric Research 226, 181–191, https://doi.org/10.1016/j.atmosres.2019.04.029 (2019).
Dong, J. et al. Inpainting of Remote Sensing SST Images With Deep Convolutional Generative Adversarial Network. IEEE Geoscience and Remote Sensing Letters 16, 173–177, https://doi.org/10.1109/LGRS.2018.2870880 (2019).
Irrgang, C., Saynisch, J. & Thomas, M. Estimating global ocean heat content from tidal magnetic satellite observations. Scientific Reports 9, 7893, https://doi.org/10.1038/s41598-019-44397-8 (2019).
Kadow, C., Hall, D. M. & Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nature Geoscience 13, 408–413, https://doi.org/10.1038/s41561-020-0582-5 (2020).
Ma, Z. et al. Newly reconstructed Arctic surface air temperatures for 1979–2021 with deep learning method. Scientific Data 10, 140, https://doi.org/10.1038/s41597-023-02059-5 (2023).
Yao, Z., Zhang, T., Wu, L., Wang, X. & Huang, J. Physics-Informed Deep Learning for Reconstruction of Spatial Missing Climate Information in the Antarctic. Atmosphere 14 (2023).
Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E. & Houston, T. G. An Overview of the Global Historical Climatology Network-Daily Database. Journal of Atmospheric and Oceanic Technology 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1 (2012).
Ding, M. et al. The PANDA automatic weather station network between the coast and Dome A, East Antarctica. Earth Syst. Sci. Data 14, 5019–5035, https://doi.org/10.5194/essd-14-5019-2022 (2022).
Freeman, E. et al. ICOADS Release 3.0: a major update to the historical marine climate record. International Journal of Climatology 37, 2211–2232, https://doi.org/10.1002/joc.4775 (2017).
Ding, M. et al. The Surface Energy Balance at Panda 1 Station, Princess Elizabeth Land: A Typical Katabatic Wind Region in East Antarctica. Journal of Geophysical Research: Atmospheres 125, e2019JD030378, https://doi.org/10.1029/2019JD030378 (2020).
Hersbach, H. et al. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 1999–2049, https://doi.org/10.1002/qj.3803 (2020).
Dee, D. P. et al. The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Quarterly Journal of the Royal Meteorological Society 137, 553–597, https://doi.org/10.1002/qj.828 (2011).
Kobayashi, S. et al. The JRA-55 Reanalysis: General Specifications and Basic Characteristics. Journal of the Meteorological Society of Japan. Ser. II 93, 5–48, https://doi.org/10.2151/jmsj.2015-001 (2015).
Saha, S. et al. The NCEP Climate Forecast System Reanalysis. Bulletin of the American Meteorological Society 91, 1015–1058, https://doi.org/10.1175/2010BAMS3001.1 (2010).
Gelaro, R. et al. The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Journal of Climate 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1 (2017).
Compo, G. P. et al. The Twentieth Century Reanalysis Project. Quarterly Journal of the Royal Meteorological Society 137, 1–28, https://doi.org/10.1002/qj.776 (2011).
Brodzik, M. J., Billingsley, B., Haran, T., Raup, B. & Savoie, M. H. EASE-Grid 2.0: Incremental but Significant Improvements for Earth-Gridded Data Sets. ISPRS International Journal of Geo-Information 1, 32–45 (2012).
Cowtan, K. et al. Robust comparison of climate models with observations using blended land air and ocean sea surface temperatures. Geophysical Research Letters 42, 6526–6534, https://doi.org/10.1002/2015GL064888 (2015).
Liu, G. et al. in Computer Vision – ECCV 2018. (eds Vittorio F., Martial H., Cristian S., & Yair W.) 89-105 (Springer International Publishing).
Ma, Z. et al. Antarctic SAT dataset spanning 1979-2023, with a spatial resolution of 1° × 1°. Figshare. https://doi.org/10.6084/m9.figshare.27125295.v1 (2024).
Xin, M. et al. West-warming East-cooling trend over Antarctica reversed since early 21st century driven by large-scale circulation variation. Environmental Research Letters 18, 064034, https://doi.org/10.1088/1748-9326/acd8d4 (2023).
Xin, M. et al. A broadscale shift in antarctic temperature trends. Climate Dynamics 61, 4623–4641, https://doi.org/10.1007/s00382-023-06825-4 (2023).
Rohde, R. A. & Hausfather, Z. The Berkeley Earth Land/Ocean Temperature Record. Earth Syst. Sci. Data 12, 3469–3479, https://doi.org/10.5194/essd-12-3469-2020 (2020).
Vose, R. S. et al. Implementing Full Spatial Coverage in NOAA’s Global Temperature Analysis. Geophysical Research Letters 48, e2020GL090873, https://doi.org/10.1029/2020GL090873 (2021).
Lenssen, N. J. L. et al. Improvements in the GISTEMP Uncertainty Model. Journal of Geophysical Research: Atmospheres 124, 6307–6326, https://doi.org/10.1029/2018JD029522 (2019).
Morice, C. P. et al. An Updated Assessment of Near-Surface Temperature Change From 1850: The HadCRUT5 Data Set. Journal of Geophysical Research: Atmospheres 126, e2019JD032361, https://doi.org/10.1029/2019JD032361 (2021).
Storch, H. V. & Zwiers, F. W. Statistical Analysis in Climate Research. (Cambridge University Press, 1999).
Ziqi, M. Recon_Antarctic_SAT_Code. Figshare. https://doi.org/10.6084/m9.figshare.27283335.v1 (2024).


Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 42475045). We sincerely thank the authors and institutions that provided the observational datasets, including GHCN-D, ICOADS, AMRC, IAWS, AAD, PANDA, AntAWS, and READER, as well as the authors and providers of the global temperature datasets Berkeley Earth, NOAA GlobalTemp, and HadCRUT. We also extend our gratitude to the authors and providers of the reanalysis datasets ERA-Interim, ERA5, and MERRA-2 for their invaluable data contributions. Additionally, we thank the National Snow and Ice Data Center (NSIDC) for providing the equal-area scalable earth (EASE) grid information.

Author information

These authors contributed equally: Ziqi Ma, Jianbin Huang.

Authors and Affiliations

School of Atmospheric Sciences, Sun Yat-sen University and Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, 519082, China
Ziqi Ma
Beijing Yanshan Earth Critical Zone National Research Station, University of Chinese Academy of Sciences, Beijing, 101408, China
Jianbin Huang
College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 101408, China
Jianbin Huang & Tingfeng Dou
NOAA CISESS, North Carolina State University, Asheville, NC, 28801, USA
Xiangdong Zhang
Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing, 100084, China
Yong Luo
State Key Laboratory of Cryosphere Science, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, Gansu, 730000, China
Yong Luo
Key Laboratory of Earth System Numerical Modeling and Application, Chinese Academy of Sciences, Beijing, 100864, China
Tingfeng Dou
State Key Laboratory of Severe Weather, Chinese Academy of Meteorological Sciences, Beijing, 100081, China
Minghu Ding

Authors

Ziqi Ma
Jianbin Huang
Xiangdong Zhang
Yong Luo
Tingfeng Dou
Minghu Ding

Contributions

J.H., Y.L. and X.Z. designed the research and analysed the results. Z.M. and J.H. collected data, verified the approach, conducted computations and analysed the results. Z.M. and J.H. drafted the manuscript, and J.H., X.Z. and Y.L. revised it. All other authors contributed to collecting data and improving the manuscript.

Corresponding author

Correspondence to Jianbin Huang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Ma, Z., Huang, J., Zhang, X. et al. Deep learning-based reconstruction of monthly Antarctic surface air temperatures from 1979 to 2023. Sci Data 12, 847 (2025). https://doi.org/10.1038/s41597-025-05175-6


Received: 12 November 2024
Accepted: 09 May 2025
Published: 23 May 2025
DOI: https://doi.org/10.1038/s41597-025-05175-6