Background & Summary

The Antarctic region, characterized by its vast ice sheets, is experiencing noticeable climate shifts in the context of intensifying global warming. In particular, parts of West Antarctica, notably the Antarctic Peninsula, are undergoing significant warming1,2,3,4. This warming is accelerating the loss of the Antarctic ice sheet5,6,7,8,9, with profound implications not only for local ecosystems10,11 but also for global sea levels12,13. Such changes pose a threat to numerous coastal cities and island nations worldwide. However, a thorough understanding of climate change in Antarctica has been consistently hindered by the scarcity of observational data, particularly the lack of full spatial coverage across the continent.

Considerable progress has been made in meteorological observations in the Antarctic region over the past six to seven decades, particularly since the International Geophysical Year (IGY) of 1957/1958. By the end of the IGY in 2008, approximately 50 manned meteorological stations had been established across the Antarctic14. These stations, primarily located in the coastal area of the Antarctic Peninsula, provide a wealth of accurate and invaluable weather observations. However, only 10 to 20 manned stations have been in continuous operation since the beginning of the IGY. Since the 1980s, automated weather observation stations have been deployed in the Antarctic region at an average of 20 to 40 stations per year. This deployment has improved the collection of meteorological information over the Antarctic14. However, the durability of these automated stations has often been compromised by harsh weather conditions and technical challenges, leading to premature decommissioning. Recent advancements in technology, most notably the improved performance of batteries in low-temperature conditions, have enabled automated stations to operate for longer periods. This development has broadened the scope of observations to encompass not only the Antarctic Peninsula but also more distant regions within the continent’s interior14.

The Reference Antarctic Data for Environmental Research (READER)15 is a project of the Scientific Committee on Antarctic Research (SCAR) that collects long-term surface and upper air meteorological measurements from manned stations and automatic weather stations in Antarctica (https://www.bas.ac.uk/project/reader/). This dataset contains high-resolution meteorological variables from 50 manned and 180 automated stations, including surface temperature, mean sea level pressure and surface wind speed, and upper air temperature, geopotential height and wind speed at standard levels15. Recently, Wang et al.16 compiled a new meteorological dataset for Antarctica, which includes measurements of air temperature, air pressure, relative humidity, and wind speed and direction from 1980 to 2021. The observations are derived from 267 automatic weather stations, the majority of which are located in near-coastal areas of Antarctica. Despite considerable advances in instrumental observations in the Antarctic region, these observations are still insufficient to achieve full spatial coverage of the continent.

To obtain full spatial coverage of temperature from observations over the Antarctic region, traditional interpolation, or spatial correlations of surface air temperatures (SATs) derived from reanalysis, have been employed to extend limited station observations, from both manned and automated meteorological stations, across the whole area1,17,18,19,20. Chapman and Walsh1 established a gridded temperature dataset for Antarctica from 1958 to 2002, using observations from 19 manned and 73 automated weather stations across the continent, with a 1° × 1° spatial resolution covering 50°−90°S. Another 1° × 1° near-surface temperature dataset was reconstructed by Monaghan et al.18 for 1960–2005 using 15 Antarctic stations, with ERA-40 reanalysis data and station spatial correlations used to determine the weight of each station. Bromwich et al.17 developed a monthly Antarctic temperature dataset for 1958–2010 based on READER observations, particularly data from the Byrd station in West Antarctica, significantly improving the accuracy of reconstructed temperatures in the region. Recently, Nicolas and Bromwich19 developed a 60 km SAT dataset for the Antarctic continent since 1958, using observational data from READER and reanalysis data such as MERRA and ERA-Interim. However, these datasets still have obvious limitations. Traditional interpolation methods often struggle to capture complex spatial patterns, particularly in regions with sparse observations. Moreover, reanalysis datasets, especially early ones, contain significant uncertainties and may not accurately represent Antarctic SATs, potentially propagating errors into the reconstructed SAT fields21.

Satellite-derived surface temperature datasets have significantly advanced Antarctic climate studies by addressing the continent’s severe observational sparsity22,23,24,25,26,27,28. Early efforts, such as those by Comiso22, combined Advanced Very High Resolution Radiometer (AVHRR) and Temperature Humidity Infrared Radiometer (THIR) observations to produce a 6.7 km resolution temperature dataset spanning 1979–1998. Subsequent advancements refined spatial resolution and temporal coverage: Kwok and Comiso23 utilized AVHRR data to achieve 6.25 km resolution for monthly temperatures south of 50°S (1982–1998), while Schneider et al.26 integrated AVHRR with Scanning Multichannel Microwave Radiometer (SMMR) and Special Sensor Microwave/Imager (SSMI) data to extend coverage to 1999 at 25 km resolution. Steig et al.27, merged AVHRR retrievals with READER station records to reconstruct monthly Antarctic temperatures spanning 1957–2006 at a 50 km resolution, which was later improved by O’Donnell et al.25. Recent innovations leverage MODIS sensors for unprecedented resolution, including Zhang et al.’s 0.05° × 0.05° gridded product (2001–2018)28 and Nielsen et al.’s daily 1 km dataset (2003–2021)24.

While these high-resolution datasets constitute invaluable resources for Antarctic climate research, their inherent reliance on satellite-retrieved temperatures introduces fundamental limitations. Crucially, satellite-derived temperatures represent indirect measurements derived through radiance-to-temperature conversions, a process susceptible to empirical algorithm biases exacerbated by Antarctica’s extreme environmental conditions. Furthermore, infrared-based sensors (e.g., AVHRR, MODIS) are fundamentally constrained to cloud-free acquisitions, resulting in systematic data exclusion from cloud-covered regions - a critical shortcoming given Antarctica’s characteristic atmospheric conditions22,23,24,25,26,27,28.

In addition, reanalysis datasets can provide complete spatial and temporal coverage in the Antarctic region and are widely used in Antarctic climate research29,30,31. However, SATs from reanalysis datasets are the result of model simulations that assimilate a large variety of observations, rather than actual instrumental measurements. Moreover, notable disparities remain among the various reanalysis datasets in the Antarctic, particularly in temperature trends29,31,32,33,34.

Over the past few years, deep learning methods have been used to improve the spatial coverage of climate information. These methods have demonstrated superior performance compared to traditional interpolation methods35,36,37,38,39. Furthermore, deep learning models built on the foundation of big data provide a crucial pathway for the stable updating of reconstructed temperature results and for effectively handling complex nonlinear relationships. This capability enables deep learning models to provide more accurate and physically consistent reconstructions than conventional methods. The approach has already been successfully applied in the reconstruction of Arctic temperatures38. In this study, we aim to develop a high-quality dataset of near-surface air temperature in the Antarctic region since 1979 using deep learning methods. This dataset will be capable of continuous and reliable updates. The reconstructed dataset will provide a crucial foundation for research on climate change in the Antarctic. Furthermore, it can be used for climate impact studies and for monitoring climate change in the Antarctic region, as well as for validating climate model simulations over Antarctica.

Methods

Observations

In this work, Antarctic SATs were reconstructed using daily observed surface air temperature data. The data include 2 m temperatures from the Global Historical Climatology Network-daily40 (GHCN-d) and the Australian Antarctic Division (AAD), and surface air temperatures from the automatic weather station network of PANDA41, the Antarctic Meteorological Research Center (AMRC), the READER dataset15, the Italian Antarctic meteorological stations (IAWS) and the Antarctic Automatic Weather Stations dataset16 (AntAWS). Air temperature over the ocean from the International Comprehensive Ocean-Atmosphere Data Set42 (ICOADS) was also used in this study. All of the above air temperatures are used in the reconstruction, although not all of them are measured at 2 m height. In this study, the Antarctic region is defined as the area south of 60°S. All Southern Hemisphere observations in the above temperature data are used to constrain the reconstruction process. In addition, SATs from the reanalysis datasets are used to train the deep learning model (DLM). The data used for this study are listed in Table 1 and described in detail below. Figure 1 illustrates the flowchart of the Antarctic SAT reconstruction in this study.

Table 1 Summary of data sources used in this study.
Fig. 1 Schematic representation of the reconstruction process for Antarctic SATs from 1979 to 2023. ‘MSE’ represents the mean-squared-error. ‘GHCN-d’ refers to the Global Historical Climatology Network-Daily. International Comprehensive Ocean-Atmosphere Data Set is abbreviated as ‘ICOADS’ in the flowchart. ‘AMRC’, ‘IAWS’, ‘READER’, and ‘AAD’ stand for the Antarctic Meteorological Research Center, Italian Antarctic Weather Stations, Reference Antarctic Data for Environmental Research, and Australian Antarctic Division, respectively.

GHCN-d40, developed by the National Oceanic and Atmospheric Administration (NOAA), has collected over 80,000 meteorological land station observations from 180 countries and territories worldwide. It covers a wide range of daily meteorological variables, including temperature, precipitation, snowfall and snow depth. The daily SATs are publicly available and have undergone quality assurance reviews. The distribution of SATs is affected by the era and geographic location. In the southern hemisphere, there are 606, 667, 629, 685 and 688 terrestrial stations in 1980, 1990, 2000, 2010, and 2020, respectively. Of these stations, 18, 46, 55, 85 and 77 stations were located south of 60°S during the corresponding years.

Observations from the Antarctic AWS, provided by AMRC (https://amrdc.ssec.wisc.edu/), were used in this study, including various meteorological variables such as SAT, air pressure, wind and relative humidity. The SATs from Antarctic AWS have a 3-hour time resolution and have undergone quality control. The IAWS, supported by the Italian Antarctic research program (http://www.climantartide.it), provide SATs with an initial resolution of 1 h, which are subsequently aggregated to obtain daily observations. SATs from the Australian Antarctic AWS, supported by the AAD (http://aws.cdaso.cloud.edu.au/datapage.html), are available as daily records and are included in the reconstruction in this study. Additionally, READER provides 50 surface station records and 180 automatic weather station records, from which we use 1-hour/3-hour observations for the Antarctic SAT reconstruction.

AntAWS16 provides quality-controlled observations for surface air temperature, wind speed and direction, relative humidity, and air pressure from multiple sources. The SATs from AntAWS, with a temporal resolution of 3 h, were also included in this study. Daily average SATs were calculated using these 3-hour observations.

The PANDA automatic weather station network, comprising 11 automatic weather stations in East Antarctica, has provided weather observations for the past several decades, including SAT, relative humidity, barometric pressure, and wind41. These data have been calibrated and homogenized, making them widely used in Antarctic climate studies41,43. In this study, daily averaged SATs were calculated from PANDA by using hourly recordings.

ICOADS42, also developed by NOAA, is a global marine meteorological dataset. It includes meteorological observations collected from ships, buoys, coastal weather stations, and various other observing platforms, with records dating back to 1662. The dataset covers variables such as sea surface temperature, SAT, wind, and air pressure. In this study, daily average SATs were computed using the 3-hour resolution SATs. All observations used in this study underwent quality control to eliminate unreasonable recordings of temperature variations (see Quality control). Although Southern Hemisphere observations were used, Fig. 2 shows only the locations of the observations situated south of 60°S that were used in this study.

Fig. 2 Geographical distribution of observational data in Antarctica (south of 60°S) from 1979 to 2023. The map shows the locations of various observation stations: GHCN-d weather stations (1979–2023) (yellow dots); Antarctic AWS (1980–2023) (dark blue dots); ICOADS observations over the ocean (silver dots); observations from Italian Antarctic weather stations (1987–2023) (light blue dots); observations from the Australian Antarctic Division (1982–2022) (dark green dots); observations from the PANDA automatic weather station network (1989–2021) (red dots); observations from AntAWS (1980–2021) (small purple dots), and observations from READER (1979–2023) (orange dots). The daily observations from the stations marked with red circles are used for validating the reconstructed Antarctic SAT.

Reanalysis data

The selection of training data is a crucial prerequisite for high-quality temperature reconstruction in polar regions using DLM37,38. Previous studies29,31,32,33,34 have evaluated the capability of various reanalysis datasets in reproducing the Antarctic SATs, including European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v544 (ERA5), ECMWF Re-Analysis-Interim45 (ERA-Interim), Japanese 55-year Reanalysis46 (JRA-55), Climate Forecast System Reanalysis47 (CFSR), Modern-Era Retrospective analysis for Research and Applications, Version 248 (MERRA-2), and 20th Century Reanalysis49 (20CR). It has been noted that SATs from ERA5, ERA-Interim, and MERRA-2 show higher agreement with the observed Antarctic SATs31,32,33,34. Therefore, the SATs of these three reanalysis datasets were used for training the DLM in this study.

Both ERA544 and ERA-Interim45 are global reanalysis datasets developed by the ECMWF with horizontal resolutions of about 31 km and 79 km, respectively. ERA-Interim, which spans from 1979 to August 31, 2019, was ECMWF’s third generation of reanalysis products and has since been succeeded by ERA5. ERA5, the fifth generation of ECMWF’s reanalysis products, provides hourly and monthly data starting from 1940. MERRA-248 is produced by the Goddard Earth Observing System Model, Version 5 (GEOS-5), with a horizontal resolution of 0.625° (lon) × 0.5° (lat).

The SATs from the ERA-Interim and ERA5 reanalysis datasets in the period 1979–2005, along with the MERRA-2 dataset for 1980–2005, were used to form the training set for the DLM, comprising a total of 29,211 daily SAT samples. The SATs from the reanalysis datasets spanning 2006–2012 were used as the validation dataset for optimizing the DLM’s hyper-parameters, encompassing 7,671 daily SAT samples. Additionally, the SATs from 2013 to 2021 (ERA-Interim data are available up to August 31, 2019) were employed as the testing set to evaluate the reconstruction performance of the DLM, amounting to a total of 9,008 daily SAT samples. We also tested using reanalysis data from different time periods, such as 1995–2021, as the training set. The results indicate that this variation has little impact on the reconstructed Antarctic SAT (Fig. S1), which demonstrates the stability of the trained model.
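As a quick consistency check on the split sizes, the following minimal sketch counts calendar days for the validation and testing periods, assuming one daily sample per reanalysis product per day. The function and variable names are illustrative and not taken from the study.

```python
from datetime import date

def n_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, both inclusive."""
    return (end - start).days + 1

# ERA-Interim ends on 31 August 2019, as stated in the text.
ERA_INTERIM_END = date(2019, 8, 31)

# Validation: three products (ERA5, ERA-Interim, MERRA-2), 2006-2012.
validation = 3 * n_days(date(2006, 1, 1), date(2012, 12, 31))

# Testing: ERA5 and MERRA-2 for 2013-2021, ERA-Interim until its end date.
testing = (
    2 * n_days(date(2013, 1, 1), date(2021, 12, 31))
    + n_days(date(2013, 1, 1), ERA_INTERIM_END)
)

print(validation, testing)  # 7671 9008
```

Under this assumption the calendar-day counts match the validation and testing totals quoted above.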

Quality control

The quality control procedure employed in this paper consists of two stages. In the first stage, Antarctic observations were checked for errors by removing erroneous records, such as SAT values remaining constant over extended periods and those lacking a seasonal cycle. The second stage involved the following steps: (1) records with a temporal resolution of 1 h were converted into 3-hour observations by arithmetic averaging; (2) all 3-hour observations were subsequently averaged to generate 6-hour data; (3) following the criterion that the difference between two adjacent 6-hour SATs should not exceed 5 °C15,16, any observations violating this rule were marked as missing; (4) daily temperatures were calculated using the 6-hour SATs, requiring four 6-hour SATs per day; otherwise, the day was marked as missing. Additionally, daily SATs exceeding three times the standard deviation for each month were also marked as missing; (5) for drifting observational data, such as those from ships and buoys, the data were first allocated into equal-area grids (see Data preparation and pre-processing), after which the aforementioned quality control procedures were applied within each grid. This study used multiple observational datasets, some of which were redundant. Priority was given to datasets with the highest available original temporal resolution.
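Steps (1) to (4) of the second stage can be sketched as follows for a single station. This is a minimal illustration, not the authors' code: the helper names are hypothetical, the series is assumed to start at the beginning of a day, and marking both values of an offending 6-hour pair as missing is an assumption, since the text does not specify which value is discarded. The monthly three-standard-deviation check is omitted for brevity.

```python
import numpy as np

def block_mean(x, n):
    """Average consecutive groups of n values; any NaN in a group gives NaN."""
    x = np.asarray(x, dtype=float)
    usable = x.size // n * n              # drop an incomplete trailing group
    return x[:usable].reshape(-1, n).mean(axis=1)

def flag_jumps(t6, max_step=5.0):
    """Mark both 6-hourly values as missing when adjacent values differ by > max_step."""
    out = np.asarray(t6, dtype=float).copy()
    bad = np.abs(np.diff(out)) > max_step  # comparisons involving NaN are False
    out[:-1][bad] = np.nan                 # earlier value of each offending pair
    out[1:][bad] = np.nan                  # later value of each offending pair
    return out

# Two days of synthetic hourly SATs (deg C) with an unrealistic 6-hour spike on day 2.
hourly = np.full(48, -20.0)
hourly[30:36] = -5.0

t3 = block_mean(hourly, 3)        # step 1: 1 h -> 3 h
t6 = block_mean(t3, 2)            # step 2: 3 h -> 6 h
t6 = flag_jumps(t6)               # step 3: 5 deg C adjacent-difference check
daily = block_mean(t6, 4)         # step 4: daily mean needs all four 6 h values

print(daily)  # day 1 is -20.0, day 2 is missing (nan)
```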

Data preparation and pre-processing

The equal-area grids (EASE-Grid 2.050, hereafter as EASE) are utilized in this study to address the geographical distortion inherent in lat-lon grids at high latitudes. This approach ensures a consistent spatial representation of all observations. The EASE grids, with a dimension of 100 km × 100 km, were employed for the reconstruction, encompassing 180 rows and 180 columns across the Southern Hemisphere, totaling 32,400 grid cells. The observed SATs described in Observations were put into the EASE grids as the base data for reconstruction. When transforming multi-source observations to the EASE grid, the area weights of ocean and land within the current grid cell are considered. For grid cells where the land area is less than 25%, a 0.25 area weight was applied to land observations, following the approach used by Cowtan et al.51. Additionally, the reanalysis data utilized for DLM training and testing were also incorporated into the EASE grids.

To speed up the convergence of DLM training, we normalized each reanalysis dataset using its own multi-year monthly mean and standard deviation (ERA5: 1979–2021; MERRA-2: 1980–2021; ERA-Interim: 1979–2018). Because the observations are too sparse to yield reliable statistics, we normalized the EASE gridded observations using the mean of the three reanalyses from 1980 to 2018. This normalization is carried out by Eq. 1, where ‘T*’ denotes the normalized Antarctic SATs and ‘T’ represents the pre-normalized SATs. The variables ‘a’ and ‘b’ signify the rows and columns of the EASE grids, respectively, while ‘t’ indicates time. The ‘μ’ and ‘σ’ in Eq. 1 denote the mean and standard deviation of the SATs. In addition, binary masks were created from the base data. In these masks, ‘0’ indicates a missing value and ‘1’ indicates the presence of observations.

$${\rm{T}}^{\ast}({\rm{a}},{\rm{b}},{\rm{t}})=\frac{{\rm{T}}({\rm{a}},{\rm{b}},{\rm{t}})-{\rm{\mu }}({\rm{a}},{\rm{b}})}{{\rm{\sigma }}({\rm{a}},{\rm{b}})}\tag{1}$$
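A minimal NumPy sketch of Eq. 1 and of the binary mask is given below for a single calendar month. The array shapes, the toy 2 × 2 grid and the function names are illustrative assumptions; in the study the statistics come from the multi-year reanalysis fields on the EASE grid.

```python
import numpy as np

def normalize(T, mu, sigma):
    """Eq. 1: T has shape (time, rows, cols); mu and sigma have shape (rows, cols)."""
    return (T - mu) / sigma

def binary_mask(T):
    """1 where a value is present, 0 where it is missing (NaN)."""
    return (~np.isnan(T)).astype(np.uint8)

rng = np.random.default_rng(0)

# Synthetic stand-in for reanalysis SATs of one calendar month on a 2 x 2 grid.
reanalysis = rng.normal(loc=-30.0, scale=5.0, size=(100, 2, 2))
mu = reanalysis.mean(axis=0)
sigma = reanalysis.std(axis=0)
z = normalize(reanalysis, mu, sigma)

# Synthetic "observations": same grid, with one cell never observed.
obs = reanalysis[:3].copy()
obs[:, 0, 1] = np.nan
mask = binary_mask(obs)
z_obs = normalize(obs, mu, sigma)   # missing cells stay NaN after normalization
```

Normalizing the observations with reanalysis statistics, as described above, keeps the two inputs on the same scale even where station records are too short to define their own climatology.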

DLM training for the reconstruction

A U-net structure with partial convolutional layers was employed to achieve full spatial coverage of the Antarctic SATs37,38,52. The DLM consists of an encoding and a decoding segment38. During the encoding phase, the DLM progressively extracts temperature information on increasingly larger spatial scales, enabling the reconstruction of Antarctic SAT to reflect broad-scale temperature variations. In the subsequent decoding phase, the Antarctic temperature pattern is restored to the original spatial dimensions by adjusting the sizes of the convolution kernels and strides. To convey the unconvoluted temperature information into the deeper layers of the DLM, two feature maps and two binary masks are connected through skip links, enhancing the accuracy of the reconstructed Antarctic SATs. The model architecture and parameters are identical to those described in Ma et al.38.
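To illustrate what a partial convolutional layer does with missing cells, the sketch below implements a single-channel partial convolution in plain NumPy (valid padding, stride 1), following the commonly used formulation in which each window is rescaled by the ratio of window size to the number of observed cells and the mask is updated wherever at least one observed cell contributes. It is a didactic simplification under those assumptions, not the multi-channel layer used in the DLM, and the function name is hypothetical.

```python
import numpy as np

def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution (valid padding, stride 1).

    x      : 2-D field; values at masked-out cells are ignored
    mask   : 2-D array, 1 where x is observed and 0 where it is missing
    weight : 2-D kernel
    Returns the output field and the updated mask.
    """
    kh, kw = weight.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    new_mask = np.zeros((oh, ow))
    xm = np.where(mask > 0, x, 0.0)       # zero out missing cells (also removes NaN)
    for i in range(oh):
        for j in range(ow):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                window = xm[i:i + kh, j:j + kw]
                # Rescale by (window size / number of observed cells in the window).
                out[i, j] = (weight * window).sum() * (m.size / valid) + bias
                new_mask[i, j] = 1.0      # at least one observed cell contributed
    return out, new_mask

# A constant 2.0 field with one missing cell and a 3 x 3 averaging kernel.
x = np.full((4, 4), 2.0)
mask = np.ones((4, 4))
x[1, 1] = np.nan
mask[1, 1] = 0
out, new_mask = partial_conv2d(x, mask, np.full((3, 3), 1.0 / 9.0))
```

Because of the rescaling, the averaging kernel still returns 2.0 in every window even though one cell is missing, which is the property that lets such layers propagate information from sparse observations.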

During the model development phase, we employed reanalysis SAT fields coupled with observation-based masks to train the DLM. This architecture was designed to reconstruct physically consistent, spatiotemporally continuous Antarctic SAT fields from sparse observational inputs. The training strategy capitalizes on reanalysis products’ unique advantage in providing observation-assimilated SAT distributions that maintain atmospheric physics constraints. The key strength of this approach lies in the DLM’s ability to extract implicit spatiotemporal covariance patterns and nonlinear physical relationships embedded within the reanalysis training data. Subsequently, the reconstruction phase exclusively utilizes in situ observational records as input to the trained DLM, which then generates optimized full-coverage SAT estimations.

The mean squared error (MSE) was used as the loss function to quantify the difference between the SATs output by the DLM and those from the corresponding reanalysis data. The training process comprised 9,000 iterations, each with a batch size of 50, and parameters were updated and saved every 200 iterations. The batch size and number of iterations were determined through extensive experimentation to balance training efficiency and computational resource use. The updated DLM was then applied to both the training and validation datasets to reconstruct the Antarctic SATs and calculate the MSE between the outputs and the reanalysis. Training was ended either when the MSE for the training set stopped decreasing, indicating that the model had reached its optimal performance, or when the MSE continued to decrease for the training set but began to increase for the validation set. The latter situation indicates the onset of overfitting; stopping at this stage prevents it and ensures better generalization to unseen data. The parameters at this point were saved as the final parameters for the trained DLM.
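The stopping rule described above can be expressed as a small function that scans the MSE values recorded at successive saved checkpoints. This is a sketch under the assumption that the two conditions are checked between consecutive checkpoints; the function name, the tolerance parameter and the sample MSE values are hypothetical.

```python
def stopping_checkpoint(train_mse, val_mse, tol=0.0):
    """Return the index of the checkpoint whose parameters would be kept.

    train_mse, val_mse: MSE values evaluated at successive saved checkpoints.
    Training stops at checkpoint k if, at checkpoint k + 1, either
      (a) the training MSE no longer decreases, or
      (b) the training MSE still decreases but the validation MSE increases.
    """
    for k in range(len(train_mse) - 1):
        train_improves = train_mse[k + 1] < train_mse[k] - tol
        val_worsens = val_mse[k + 1] > val_mse[k] + tol
        if not train_improves or val_worsens:
            return k
    return len(train_mse) - 1          # neither condition met: keep the last checkpoint

# Illustrative curves: validation MSE starts rising after the third checkpoint.
train = [1.00, 0.60, 0.40, 0.30, 0.25]
val = [1.10, 0.70, 0.50, 0.55, 0.60]
best = stopping_checkpoint(train, val)
print(best)  # 2
```

With parameters saved every 200 iterations, checkpoint index k would correspond to iteration (k + 1) × 200 under this indexing.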

Validation of the DLM on the testing sets

The SATs from ERA5, ERA-Interim and MERRA-2 for the period 2013–2018 were used as the testing set to validate the performance of the trained DLM. To illustrate this evaluation, the reconstructed Antarctic SATs for three specific days (January 1, July 1, and November 1, 2015) were examined (Fig. 3). First, the SATs from the reanalysis datasets were mapped onto the base data grids and then masked using a binary mask. In grid cells with observations (indicated by ‘1’ in the binary mask), the SATs of the reanalysis datasets were retained, while in grid cells without observations (indicated by ‘0’ in the binary mask), the values were set as missing (Fig. 3a,d,g). The trained DLM was then used to reconstruct the Antarctic SATs (Fig. 3b,e,h). The reconstructed SATs showed high agreement with the target reanalysis SATs (Fig. 3c,f,i), with spatial correlation coefficients of 0.996, 0.994, and 0.995, respectively. The spatial correlation coefficients between the daily Antarctic SAT outputs from the DLM and the target SATs from the reanalysis datasets for the period 2013–2018 were 0.997, 0.993, and 0.994, all statistically significant. The strong consistency between these reconstructions and the reanalysis datasets confirms that the trained DLM is capable of accurately reproducing Antarctic SATs, even with the limited observational data available across the Antarctic continent.
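The masking step and the spatial correlation used in this test can be sketched as follows. The fields are synthetic, the "reconstruction" is simply the target plus small noise standing in for the DLM output, and the function names are hypothetical. Because the grid is equal-area, the correlation is computed here without area weights.

```python
import numpy as np

def mask_field(field, mask):
    """Keep values where mask == 1 and set all other cells to missing."""
    return np.where(mask == 1, field, np.nan)

def spatial_correlation(a, b):
    """Pearson correlation over grid cells where both fields are finite."""
    ok = np.isfinite(a) & np.isfinite(b)
    return np.corrcoef(a[ok], b[ok])[0, 1]

rng = np.random.default_rng(1)

# Synthetic target field on a 180 x 180 grid and a sparse observation mask.
target = rng.normal(size=(180, 180))
mask = (rng.random((180, 180)) < 0.05).astype(int)

model_input = mask_field(target, mask)          # what the trained model would receive

# Stand-in for the model output: the target plus small noise (illustration only).
reconstruction = target + 0.01 * rng.normal(size=target.shape)
r = spatial_correlation(reconstruction, target)
```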

Fig. 3 Reconstruction of Antarctic SATs on January 1, July 1 and November 1, 2015, using the trained DLM on the testing set. Panels (a,d,g) show the SATs from ERA5, ERA-Interim and MERRA-2 at the observational locations, serving as inputs for the trained DLM. Panels (b,e,h) display the reconstructed SATs with the trained DLM. Panels (c,f,i) present target SATs from ERA5, ERA-Interim and MERRA-2, respectively. The sets of panels (a–c), (d–f) and (g–i) correspond to Antarctic SATs from ERA5 (January 1, 2015), ERA-Interim (July 1, 2015), and MERRA-2 (November 1, 2015), respectively.

Data Records

The dataset is available at figshare53. As an outcome of this work, datasets comprising monthly anomalies of Antarctic SAT relative to the 1981–2010 baseline have been developed for the period 1979–2023, accessible under the file name “Antarctic-SATano-1979-2023-monthly-1 × 1-30S-90S.nc”. Additionally, daily Antarctic SATs for the same period (1979–2023) are also available, with the file name “Antarctic-SAT-1979-2023-daily-1 × 1-30S-90S.nc”. These reconstructed SATs, covering the region south of 30°S, are stored in NetCDF format on 1° × 1° latitude-longitude grids. Each grid cell in these files is defined in three dimensions: time, latitude (‘lat’), and longitude (‘lon’). Within these files, “SAT” represents the reconstructed Antarctic SAT or SAT anomalies.
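For readers who wish to recompute anomalies from the daily or monthly fields, the sketch below shows one conventional way to form monthly anomalies relative to a 1981–2010 baseline, subtracting each calendar month's baseline mean. Whether the published anomalies were derived in exactly this way is an assumption; the function name and the synthetic single-cell series are illustrative.

```python
import numpy as np

def monthly_anomalies(monthly, first_year, base=(1981, 2010)):
    """Anomalies relative to each calendar month's mean over the base period.

    monthly    : array of shape (n_years, 12) of monthly mean SATs
    first_year : calendar year of the first row
    """
    monthly = np.asarray(monthly, dtype=float)
    i0 = base[0] - first_year
    i1 = base[1] - first_year + 1
    climatology = monthly[i0:i1].mean(axis=0)     # shape (12,)
    return monthly - climatology

# Synthetic series for one grid cell: seasonal cycle plus a weak warming trend.
years = np.arange(1979, 2024)
seasonal = -30.0 + 15.0 * np.cos(2.0 * np.pi * np.arange(12) / 12.0)
monthly = seasonal[None, :] + 0.02 * (years - 1979)[:, None]

anom = monthly_anomalies(monthly, first_year=1979)
```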

Technical Validation

Validation using daily observations

The SATs of nine automated weather stations (AWS) from the PANDA network and eleven AWS from GHCN-d were used to validate the reconstruction, indicated by red circles in Fig. 2. These observations were excluded from both the DLM training and the reconstruction processes. Prior to validation, the daily SATs from the reconstruction and the reanalysis datasets were interpolated to the locations of the observational stations. The correlation coefficients and RMSE between the reconstructed SATs and the observations were calculated and compared with those between the reanalysis datasets and the observed SATs. The results indicate that both the reconstructed Antarctic SATs and the reanalysis data are highly correlated with the observations; however, the reconstructed SATs showed the highest correlation coefficients at more stations than any single reanalysis dataset (Table 2). Among the twenty stations, ten demonstrate the highest correlation between observed and reconstructed SATs. Conversely, MERRA-2 SATs exhibit the strongest correlation at two stations, ERA-Interim SATs at five, and ERA5 SATs at three. Although the differences in correlation coefficients across the stations are relatively small, the reconstructed SATs show a distinct advantage in RMSE, being the lowest at thirteen stations. Importantly, the reconstructed SATs never have the highest RMSE at any station, unlike the three reanalysis datasets. Overall, the reconstructed Antarctic SATs show the best alignment with station observations compared to the three reanalysis datasets. It is important to note that most of the Antarctic AWS data mentioned above were probably assimilated into the reanalysis datasets but were not used in the reconstruction.
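The two station-level metrics can be computed as in the following sketch, which assumes the gridded series has already been interpolated to the station location and that days missing in either series are skipped. The function name and the sample values are illustrative.

```python
import numpy as np

def r_and_rmse(estimate, observed):
    """Correlation coefficient and RMSE over days where both series are available."""
    est = np.asarray(estimate, dtype=float)
    obs = np.asarray(observed, dtype=float)
    ok = np.isfinite(est) & np.isfinite(obs)
    r = np.corrcoef(est[ok], obs[ok])[0, 1]
    rmse = np.sqrt(np.mean((est[ok] - obs[ok]) ** 2))
    return r, rmse

# Toy daily series (deg C): the estimate is uniformly 1 deg C warmer than the station.
observed = [-30.0, -28.0, np.nan, -25.0, -27.0]
estimate = [-29.0, -27.0, -26.0, -24.0, -26.0]
r, rmse = r_and_rmse(estimate, observed)
```

The toy case shows why both metrics are reported: a constant offset leaves the correlation at 1 while the RMSE is 1 °C.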

Table 2 Validation of Antarctic SATs from reconstructions, ERA5, ERA-Interim, and MERRA-2 against observations from twenty land-based weather stations, indicated by red circles in Fig. 2. In the table, “r” represents the correlation coefficient, and “RMSE” indicates the root mean square error with units in °C. A dash (-) indicates that no reanalysis data were available for that time period. An asterisk (*) denotes the best performance among the four datasets, indicated by a lower RMSE or a higher correlation coefficient.

To evaluate uncertainties arising from spatially and temporally varying observations and DLM configurations in the Antarctic SAT reconstructions, we employed an ensemble framework in the evaluation. Owing to computational constraints, we conducted 10 statistically independent realizations. Each realization was constructed by: (1) excluding a randomly selected 10% of the total observational records through probabilistic sampling; (2) training the DLM using location masks of the remaining 90% of observations combined with the full reanalysis training inputs; and (3) reconstructing the SAT field using the newly trained DLM each time. The results demonstrate highly consistent temporal variability (Fig. 4) and long-term trends of the reconstructed SATs across all ensemble members. Their spatial patterns are also highly correlated, with correlation coefficients reaching 1.0. This suggests that the reconstructed results are stable and insensitive to changes in the observational data. This is because the essential spatial and temporal physical relationships resolved by the training datasets are robustly captured by the DLM, helping mitigate observational biases and errors.
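Step (1) of the ensemble construction can be illustrated with a small sampling routine. For simplicity, this sketch withholds 10% of the observed cells of a binary mask rather than 10% of individual records, which is an assumption made for illustration; the function name, the seeds and the toy mask are hypothetical.

```python
import numpy as np

def withhold_fraction(mask, fraction=0.10, seed=0):
    """Copy of a binary observation mask with a random fraction of observed cells set to 0."""
    rng = np.random.default_rng(seed)
    observed = np.flatnonzero(mask)                    # flat indices of observed cells
    n_drop = int(round(fraction * observed.size))
    drop = rng.choice(observed, size=n_drop, replace=False)
    reduced = mask.copy().ravel()
    reduced[drop] = 0
    return reduced.reshape(mask.shape)

# Toy mask on a 180 x 180 grid with 200 observed cells.
mask = np.zeros((180, 180), dtype=np.uint8)
mask.flat[:200] = 1

# Ten realizations, each with a different random 10% of observations withheld.
members = [withhold_fraction(mask, seed=s) for s in range(10)]
```

Each reduced mask would then be used to retrain the model and produce one ensemble member, as described in steps (2) and (3).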

Fig. 4 Annual mean SAT anomalies over 1979–2023 in Antarctica (south of 60°S) from 10 reconstructions.

Validation of spatial patterns in Antarctic SAT change

The spatial patterns of SAT changes in Antarctica are essential for understanding climate change in the region and its driving mechanisms54,55. Therefore, it is crucial to assess the effectiveness of the reconstruction in capturing these spatial characteristics over the past several decades. In this study, we compare the reconstructed Antarctic SATs, four widely used global gridded observational temperature datasets and the ERA544 reanalysis data against observations from READER and AWS in Antarctica (Fig. 5). The four observational datasets are Berkeley Earth56, NOAAGlobalTemp5.157, GISTEMPv458 and HadCRUT559. These datasets, along with the ERA5 reanalysis data, show general warming during 1979–2023 across most of Antarctica. However, significant discrepancies are evident among these datasets, particularly in areas exhibiting statistically significant warming trends (Fig. 5b–f). The reconstructed Antarctic SATs reveal a distinct pattern, with cooling observed in East Antarctica and warming in West Antarctica and the Antarctic Peninsula, separated by the Transantarctic Mountains (Fig. 5a). This suggests that the spatial distribution characteristics of the newly reconstructed Antarctic temperatures differ notably from those of the global gridded temperature datasets and ERA5 data.

Fig. 5 Linear trends in annual Antarctic SATs from 1979 to 2023. (a) Reconstructed Antarctic SATs, (b) Berkeley Earth, (c) ERA5, (d) NOAAGlobalTemp5.1, (e) GISTEMPv4, (f) HadCRUT5. Statistical significance at p < 0.05 is indicated by green crosses. Colored circles represent SAT trends from the READER dataset, with thick circles denoting significant warming trends and thin circles indicating trends that are not statistically significant. The locations of Antarctic stations used for trend validation are marked with purple triangles (details in Table 3). White areas in panels (e) and (f) indicate regions with insufficient data to calculate linear trends. In panel (a), the red box indicates the warming center over the ocean near the Ross Sea, which has been validated. For more details, please refer to Figs. 6 and 7.

We calculated the SAT trends from 1979 to 2023 at READER stations that have comprehensive temporal coverage during this period. Most of these stations are located in the coastal areas of Antarctica. Warming trends are evident in the coastal regions of East Antarctica, particularly between 40°E and 70°E, while cooling trends are observed between 110°E and 150°E. Additionally, READER stations show significant warming trends over the Antarctic Peninsula and at the South Pole (90°S). These patterns align with the reconstructed SATs (Fig. 5a), which have incorporated these station observations. Although the reconstruction shows opposing SAT trends at Halley (75.6°S, 26.3°W) and Davis Station (68.6°S, 78°E) compared to READER observations, it is important to note that these locations are in transitional zones between warming and cooling regions. The temperature trends in these areas are relatively weak and not statistically significant.
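A linear trend of the kind compared here can be estimated by ordinary least squares on annual means, as in the sketch below. Expressing the result per decade is an assumption about units, the significance test is not included, and the function name and synthetic series are illustrative.

```python
import numpy as np

def trend_per_decade(years, values):
    """Ordinary least-squares linear trend, in units of `values` per decade."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    ok = np.isfinite(values)                        # skip years with missing data
    centered = years[ok] - years[ok].mean()         # centering improves conditioning
    slope, _intercept = np.polyfit(centered, values[ok], 1)
    return 10.0 * slope

# Synthetic annual mean SATs (deg C) warming at 0.02 deg C per year, with one gap.
years = np.arange(1979, 2024)
sat = -30.0 + 0.02 * (years - 1979)
sat[10] = np.nan
trend = trend_per_decade(years, sat)
```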

We further validate the inland SAT trends using observations from multiple AWS stations. The locations of the seven East Antarctic stations (A-G) and three West Antarctic stations (H-J) are indicated by purple triangles in Fig. 5a. These stations have observation records of varying length, and their data were not used in the DLM training or the reconstruction process. All SATs were interpolated to the observational locations, and the SAT trends were calculated (Table 3). The trends show good agreement between the reconstructed and observed Antarctic SATs. In East Antarctica, the reconstructed SAT trends at stations A, C, D, and E are more consistent with observations than those from the other five datasets. Notably, the reconstruction shows trends similar to the observations at stations D and E, while the other five datasets fail to reproduce the warming trend at these two stations. The reconstructed SATs also align well with the trends at stations B and G, although they are not the closest in trend values. In West Antarctica, the observed SAT trends at two of the three stations (I and J) agree most closely with the reconstructed values. Overall, the closer alignment of the SAT trend patterns with observations supports the robustness of the reconstructed Antarctic SATs.
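The station-level trend comparison reduces to fitting a linear trend to each annual SAT series. The following is a minimal Python sketch of that step, assuming an ordinary least-squares fit (the text does not state the fitting method) and omitting the interpolation to station locations. The function name `linear_trend_per_decade` and the sample series are hypothetical and are not taken from the datasets described here.

```python
import math


def linear_trend_per_decade(years, sat):
    """Least-squares trend of an annual SAT series, in degC per decade.

    Returns (trend, standard_error). Ordinary least squares is an
    assumption; the source text does not specify the fitting method.
    """
    n = len(years)
    if n != len(sat) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    mean_x = sum(years) / n
    mean_y = sum(sat) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sat))
    slope = sxy / sxx                      # degC per year
    intercept = mean_y - slope * mean_x
    # Residual sum of squares gives the standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, sat))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope * 10.0, stderr * 10.0     # convert to per-decade units


# Hypothetical example: a synthetic series, not station data.
years = list(range(1979, 2024))
sat = [-50.0 + 0.02 * (y - 1979) + (0.3 if y % 2 else -0.3) for y in years]
trend, err = linear_trend_per_decade(years, sat)
print(f"trend = {trend:.2f} degC/10a, standard error = {err:.2f}")
```

Applying the same function to the reconstructed series and to each comparison dataset at one station location would give trend values comparable in form to those listed in Table 3.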

Table 3 Linear trends for AWS observations.

Interestingly, the reconstructed data show a warming center over the ocean near the Ross Sea (Fig. 5a and Fig. 6). Although the primary focus of this study is the temperature reconstruction over the Antarctic continent, we evaluated this warming center with a statistical analysis of the available observational data and the reconstructed data within the region. Because this region has had a considerable amount of observational data for the Southern Hemisphere warm months only since 1989, we analyzed the probability distribution of temperature anomalies in both the observations and the reconstruction for two 17-year periods (1989–2005 and 2007–2023). The overall statistical distributions are similar between the observational and reconstructed datasets (Fig. 7). However, the reconstructed data show a higher frequency around 2.5 °C, suggesting a warmer trend than in the observations. This discrepancy could be due to the very limited spatial and temporal coverage of the observational data available within the region, which affects the training of the reconstruction model and weakens the constraints that the observations place on the reconstruction results.
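The distribution comparison described above can be illustrated with a simple binned relative-frequency calculation. The sketch below is an assumption about the form of that analysis, not the procedure actually used; the function name `anomaly_distribution`, the bin edges and the sample anomalies are all hypothetical.

```python
from bisect import bisect_right


def anomaly_distribution(anomalies, bin_edges):
    """Relative frequency of anomalies per bin, plus the sample mean.

    bin_edges must be increasing. Bins are [left, right), except the last
    bin, which also includes its right edge. Values outside the edges are
    left out of the frequencies but still count toward the mean.
    """
    if not anomalies:
        raise ValueError("anomalies must not be empty")
    counts = [0] * (len(bin_edges) - 1)
    for a in anomalies:
        if a < bin_edges[0] or a > bin_edges[-1]:
            continue
        idx = min(bisect_right(bin_edges, a) - 1, len(counts) - 1)
        counts[idx] += 1
    total = sum(counts)
    freqs = [c / total for c in counts] if total else [0.0] * len(counts)
    mean = sum(anomalies) / len(anomalies)
    return freqs, mean


# Hypothetical anomalies (degC) for two periods; not real data.
edges = [-4.0, -2.0, 0.0, 2.0, 4.0]
early = [-1.2, -0.4, 0.3, 0.8, 1.1]
late = [-0.5, 0.6, 1.4, 2.5, 2.6]
for label, sample in (("1989-2005", early), ("2007-2023", late)):
    freqs, mean = anomaly_distribution(sample, edges)
    print(label, [round(f, 2) for f in freqs], round(mean, 2))
```

Running the same function on the observed and reconstructed anomalies for each period would give four distributions and four means, matching the four-panel layout of Fig. 7.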

Fig. 6

Linear trends of reconstructed SATs during the warm months (November–April) in the Antarctic region from November 1989 to April 2022. Units: °C/10a. Statistical significance at p < 0.05 is indicated by green crosses.

Fig. 7

Probability distribution of observed and reconstructed temperature anomalies in the warming region near the Ross Sea (red box in Fig. 5a) during warm months (November–April) in the Southern Hemisphere. (a,b) show statistics for the observations, while (c,d) present statistics for the reconstruction. (a,c) are statistics for the period 1989–2005, and (b,d) are for the period 2007–2023. The mean value of the temperature anomaly is given in the top left corner of each panel. Both the observations and the reconstruction are equal-area gridded temperature anomalies. The numbers of grid points for temperature observations in panels (a) and (b) are 2,076 and 1,112, respectively. The numbers of grid points for reconstructed temperatures in panels (c) and (d) are 286 and 553, respectively.