Return levels of temperature extremes in southern Pakistan

Abstract. Southern Pakistan (Sindh) is one of the hottest regions in the world and is highly vulnerable to temperature extremes. In order to improve rural and urban planning, it is useful to gather information about the recurrence of temperature extremes. In this work, return levels of the daily maximum temperature Tmax are estimated, as well as the daily maximum wet-bulb temperature TWmax extremes. We adopt the peaks over threshold (POT) method, which has not yet been used for similar studies in this region. Two main datasets are analyzed: temperatures observed at nine meteorological stations in southern Pakistan from 1980 to 2013, and the ERA-Interim (ECMWF reanalysis) data for the nearest corresponding locations. The analysis provides the 2-, 5-, 10-, 25-, 50-, and 100-year return levels (RLs) of temperature extremes. The 90 % quantile is found to be a suitable threshold for all stations. We find that the RLs of the observed Tmax are above 50 °C at northern stations and above 45 °C at the southern stations. The RLs of the observed TWmax exceed 35 °C in the region, which is considered as a limit of survivability. The RLs estimated from the ERA-Interim data are lower by 3 to 5 °C than the RLs assessed for the nine meteorological stations. A simple bias correction applied to ERA-Interim data improves the RLs remarkably, yet discrepancies are still present. The results have potential implications for the risk assessment of extreme temperatures in Sindh.


Introduction
Extreme maximum temperature events have received much attention in recent years because of the associated dangerous impact on the increased risk of mortality (IPCC, 2012).Additionally, climate change scenarios suggest that in most regions the probability of occurrence of extremely high temperature is very likely to increase in the future (Sheridan and Allen, 2015).An example of the potential impact of raising maximum temperatures is the recent heat wave in southern Pakistan (Sindh), which occurred between 17 and 24 June 2015 and broke all the records with a death toll of 1400 people and over 14 000 people hospitalized.The temperatures in different cities of the Sindh region were in the range of 45.5-49 • C during the event (Imtiaz and Rehman, 2015).Karachi had the highest number of fatalities (1200 people approximately).The Pakistan Meteorological Department (PMD) issued a technical report stating a very high heat index (measuring the heat stress on humans due to high temperature and relative humidity) during this heat wave (Chaudhry et al., 2015).
In summer, Sindh becomes very hot and with the arrival of a monsoon the humidity increases in the region (Chaudhry and Rasul, 2004).The extremely hot and humid conditions can have lethal effects and can impact the overall human habitability of a region (Pal and Eltahir 2015).The human body generally maintains a temperature of around 37 • C.However, the human skin regulates at or below 35 • C to release heat (Sherwood and Huber, 2010).Under combined high temperatures and high levels of moisture content in the atmosphere, the human body cannot maintain the skin temperature below 35 • C and can develop ailments like hyperthermia, heat strokes, and cardiovascular problems.Hyperthermia is a con-dition in which extremely high body temperature is reached, resulting from the inability of the body to release the excess heat.Hyperthermia can occur even in the fittest human beings, if exposed for at least 6 h to an environment in which the wet-bulb temperature is greater than 35 • C.
This study devotes special attention to Sindh (23.5-28.5 • N and 66.5-71.1 • E) because of its recent exposure to the intense temperature extremes (Zahid and Rasul, 2012).It is bounded in the west by the Kirthar Mountains, to the north by the Punjab Plains, in the east by the Thar Desert, and to the south by the Arabian Sea (Indian Ocean), while in the center there is fertile land around the Indus River.Cotton, wheat, sugar cane, rice, wheat, and gram crops are cultivated near the banks of the Indus River (Chaudhry and Rasul, 2004).Cotton is the cash crop of the country.High population density, limited resources, poor infrastructure, and high dependence of the local agriculture on climatic factors mark this region as highly vulnerable to the impacts of climate change.The Intergovernmental Panel on Climate Change (IPCC) scenarios estimate an increase in the temperature of the order of 4 • C for this region by the end of 2100.This may significantly reduce crop yields and cause huge economic losses to the country (Islam et al., 2009;Rasul et al., 2012;IPCC, 2012IPCC, , 2014)).Furthermore, the risks of heat strokes, cardiac arrest, high fever, diarrhea, cholera, and vector-borne diseases might increase.
Extreme value theory (EVT) provides the statistical basis for increasingly widespread quantitative investigations of extremes in climate studies (Coles, 2001;Zhang et al., 2004;Brown et al., 2008;Faranda et al., 2011;Acero et al., 2014).The peaks over threshold (POT) approach aims to describe the distribution of the exceedances of the stochastic variable of interest above a threshold.Under very general conditions, the exceedances are asymptotically distributed according to the generalized Pareto distribution (GPD).The GPD has remarkable properties of universality when the asymptotic behavior is considered (Lucarini et al., 2016), while one can expect that the threshold level above which the asymptotic behavior is achieved depends on the characteristics of the analyzed time series.In particular, when looking at spatial fields, the threshold level depends on the geographical location.
In this study, we have chosen to analyze the temperature extremes in the Sindh region taking the point of view of threshold exceedances associated with the GPD family of distributions.We have done this because the statistical inference provided by the POT method provides a more efficient use of data and has better properties of convergence when finite datasets are considered with respect to alternative methods for the analysis of extremes, such as the block maxima method, which is used to fit the observed data to the generalized extreme value (GEV) distribution (Lucarini et al., 2016).Additionally, here we are interested in investigating the actual tails of the distributions and not the statistics of yearly maxima, for example.Thus, the POT approach is indeed more appropriate.While the POT method has been applied for studying temperature extremes in different regions of the world (Burgueño et al., 2002;Nogaj et al., 2006;Coelho et al., 2007;Ghil et al., 2011), to our knowledge, it has never been used to analyze the statistics of temperature extremes in Sindh.Thanks to the properties of universality of the GPD distribution (Lucarini et al., 2016), the POT approach can in principle provide reliable estimates of return periods and also the return levels (RLs) for time ranges longer than what is actually observed.This information and this predictive power can be beneficial for policy-makers and other stakeholders since it is exactly the kind of information planners need when designing infrastructures that must last a very long time.Note that commonly used, more empirical approaches to the study of extremes, such as those used more for assessing the "moderate extremes" (IPCC, 2012), do not have any property of universality and might have weak predictive power.
It is useful to consider two indicators of extremely hot conditions: (1) temperature extremes T max and (2) wet-bulb temperature extremes TW max .Therefore, we estimate the return levels of T max and TW max over different return periods during summer (May-September) in Sindh.We apply the POT method to the observational data of the nine weather stations provided by the PMD and the ERA-Interim reanalysis data of European Center for Medium-Range Weather Forecasts (ECMWF) for the corresponding grid points from 1980 to 2013.ERA-Interim reanalysis data are generally very good at also replicating trends in temperature percentile (Cornes and Jones, 2013).Nonetheless, it is, in principle, not obvious that ERA-Interim data can simulate meteorological extremes well, as reanalyses are constructed in such a way that typical conditions are well reproduced.This is why we look at how well ERA-Interim data perform in the target area against observations.If the ERA-Interim dataset characterizes the extremes well, it could be an option for the regions within Sindh where no observational data are available.Furthermore, a standard bias correction is applied to the ERA-Interim data to assess whether removing the bias in the bulk of the statistics substantially improves representation of the return levels of extremes.Given the shortness of the datasets, as we will show later, it is appropriate to analyze the extremes without taking into consideration possible long-term trends (Frei and Schär, 2001); see also the discussion in Felici et al. (2007).The provision of POT-based information on stationary extremes is already quite relevant in terms of impacts for the public and private sectors as it fills a big data gap in Sindh.A possibility for investigating time dependency in the temperature extremes is considering the centennial NCEP reanalysis (Compo et al., 2011) and using suitable bias correction procedures.Such an analysis is not performed at this stage as we focus on observational data.
The paper is organized as follows.In Sect. 2 we present the datasets we study and the statistical methods we use for assessing the properties of extremes.In Sect. 3 we show and discuss the main results.In Sect. 4 we make a summary of the main findings and present our conclusions and perspectives for future investigations.
2 Data and methodology

Meteorological station data
The daily maximum temperature and relative humidity data recorded at nine meteorological stations in Sindh from 1980 to 2013 are provided by the PMD (see Table 1).We select nine stations, which contain a negligible amount of missing values after 1980 and are suitable for the POT analysis (Fig. 1).An additional criterion is that only those stations at which no changes occurred in measuring instruments during the last 33 years were chosen (Brunetti et al., 2006).None of the station data show gaps with a duration longer than 2 days, which are treated by replacing the missing value with the average of the two previous values.
The temperature data are discretized unevenly with intervals up to 1 • C. Deidda and Puliga (2006) proposed a Monte Carlo approach for addressing this issue.They showed that finite resolution in precipitation data affects the convergence of parameter estimation in the extreme value analysis.They suggested generating many synthetic datasets by adding numerical noise to the original data and then providing the best estimate of the parameters of the extreme value distributions by averaging over all the best fits obtained in each synthetic dataset.Following their suggestion, we produce highresolution data to compensate for the effect of discretization and thus to improve the convergence of the estimator.In order to convert the temperature readings to higher resolution, we add a uniform random variable in the interval [−0.5, 0.5].The main property of this noise is round (T + r) = T , where T is the temperature with 1 • resolution and "round" is the numerical function, which maps the interval [T − 0.5, T + 0.5] to T .Thus, adding the noise does not perturb the information content of the observations.This procedure is applied to all temperature data, irrespective of the actual resolution, and replicated 100 times using a Monte Carlo approach.For each synthetic dataset, we perform the statistical best fit described later in the paper and then average the results.We check the influence of this noise parameterization and find no significant bias in the RL estimates.The advantage of adding noise is to avoid the spurious statistical effects associated with the presence of discrete values assigned to the temperature readings.Using the described bootstrap method, we reduce such problems without biasing the data.

ERA-Interim reanalysis data
The gridded daily maximum temperature and relative humidity data of ERA-Interim reanalysis are obtained from the ECMWF Public Datasets web interface (http://apps.ecmwf.int/datasets/).ERA-Interim is generated by the ECMWF model with a resolution of 0.75 • × 0.75 • (Dee et al., 2011).The gridded data are then extracted at the closest grid points of all stations for the period 1980-2013 (Fig. 1).The latitude www.earth-syst-dynam.net/8/1263/2017/Earth Syst.Dynam., 8, 1263-1278, 2017 and longitude of the ERA-Interim stations are displayed in Table 1.
The extreme temperature analysis is restricted to the summer season (May-September) over a period of 33 years.We have tested the datasets by applying the Mann-Kendall test; the results show that trends are not significant in such a short time interval.One of the main requirements for performing the POT analysis is assuming the stationarity of the time series.Therefore, as in Bramati et al. (2014), the augmented Dickey-Fuller (ADF) test of stationarity is performed on all time series (Dickey and Fuller, 1979).In all cases we find no sign of long-term correlations in the data.Short-term correlations (daily timescale) typically lead to clusters of extreme values and are studied by computing the extremal index θ in all time series and treated using the associated standard de-clustering technique (see more details in Sect.2.4).

Wet-bulb temperature calculations
The wet-bulb temperature measures the heat stress better than other existing heat indices because it establishes the clear thermodynamic limit on heat transfer that cannot be overcome by adaptations like clothing, activity, and acclimatization (Pal and Eltahir, 2015;Sherwood and Huber, 2010).Here, we use an empirical equation developed by Stull (2011) to measure the wet-bulb temperature.

Peaks over threshold
In order to determine the return levels of extreme maximum temperatures and maximum wet-bulb temperatures, the POT approach is applied to the data obtained from the meteorological stations in Sindh and from the ERA-Interim archive.
Multiple occurrences are an important characteristic of extreme climatic events and are referred to as clustering.Clusters are consecutive occurrences of above-threshold events.It is important to post-process the clustered extremes in order to take into account the assumption of a weak, short time correlation between extreme events, which is crucial for our statistical analysis.We have treated the clusters using the concept of the extremal index (EI) (see Newell, 1964;Loynes, 1965;O'Brien, 1974;Leadbetter, 1983;Smith, 1989;Davison and Smith, 1990).The EI θ measures the degree of clustering of extremes.It ranges between 0 and 1 (θ = 0 means strong clustering and dependence, θ = 1 absence of clusters and independence).Leadbetter (1983) interprets 1/θ as the mean number of exceedances in a cluster.
The EI θ can be estimated in two different ways.Here, we apply the "intervals estimator" automatic de-clustering by Ferro and Segers (2003).A positive aspect of this method is that it avoids the subjective choice of cluster parameters.The main ingredient is the use of an asymptotic result for the times between threshold exceedances.The exceedance times are split into two types, a set of vanishing intra-exceedance times within the clusters and an exponentially distributed set of inter-exceedance times between clusters.The method is iterative, starting with the largest return times and stopping when a limit for the inter-exceedance times is reached.The standard errors of the estimated parameters are obtained using a bootstrap procedure.In this study, once we select an appropriate value for the threshold (see below) the EI value is ≤ 0.5 in all the considered time series.Therefore, it is necessary to de-cluster the extremes by choosing the largest event in each cluster before fitting it to the GPD.
As mentioned before, for the exceedances over threshold, we use the GPD, which is characterized by two parameters, the shape ξ and the scale σ .The GPD for exceedances x − u of a random variable x reads as where u is the threshold.The shape parameter ξ determines the tail behavior while the scale parameter σ measures the variability.For a negative shape parameter, ξ < 0, the distribution is bounded (Weibull distribution).For vanishing shape parameter, ξ = 0, the distribution is exponential, and for a positive shape parameter, ξ > 0, the distribution has no upper bound (Pareto distribution).
In particular, for a negative shape parameter ξ < 0 the GPD has the upper bound where A max is an absolute maximum (Lucarini et al., 2014).
In general, the best estimate for the two parameters' shape ξ and scale σ depend on the threshold u (Coles, 2001).The choice of the optimal threshold for performing statistical inference from a time series is crucial.Choosing a very large value for u reduces the number of exceedances to a few values, inflating the variance of the estimators, so that the analysis is unlikely to yield any useful results.Conversely, choosing a too-small value for u would violate the asymptotic nature of the model, with a possibly biased estimation and wrong model selection (Coles, 2001); see details later in Sect.3.1.The shape ξ , the scale σ , and the RLs are estimated using the maximum likelihood estimator (MLE) using the R software (R Development core team, 2015), which also provides an estimate of the standard error of the estimates.Additionally, we wish to investigate the N -years RLs x N , which are exceeded on the timescale of N years (Coles, 2001) and can be expressed as where N represents the return period in years, n y is the number of observations per year, ζ u is the probability of an individual observation exceeding the threshold u, the shape parameter is ξ , and the scale parameter is σ .

Bias correction method
A simple bias correction is applied to each ERA-Interim time series through a rescaling that adjusts the first two moments (mean and variance) to the sample moments calculated for the corresponding observations (Acharya et al., 2013).Therefore, the bias correction is applied to the entire time series and it is not tailored to the extreme events only.The idea is to check whether by adjusting the properties of the bulk of the statistics we improve the skill of the ERA-Interim dataset considerably in describing extreme events.The biascorrected ERA-Interim time series x is expressed as where y ERA is the ERA-Interim time series, y and σ y its mean and standard deviation, and z and σ z are the mean and standard deviation of the meteorological station temperatures.The properties of extremes are commonly assumed to be closely controlled by the first two moments of the underlying distribution -for example, the IPCC (2012) relates changes in the properties of extremes to changes in the mean and in the standard deviation of the underlying distributions.EVT clarifies that, in fact, only a loose link exists between true extremes and the bulk of the events.Note that the proposed method of bias corrections has no impact on the estimates of the shape parameter, while it affects the scale and location parameters, thus impacting the RLs.

Threshold selection
The threshold selection is the first step in a POT analysis.
One needs to test whether the asymptotic regime is reached, i.e., whether one is choosing true extremes.It must be noted that EVT does not predict where (in terms of quantiles) one should expect the asymptotic regime to start.This can be investigated by checking whether the best fits of the shape parameter ξ and the modified scale parameter σ * = σ u − ξ u are stable with respect to increases in the chosen value of u (Scarrott and MacDonald, 2012).The optimal threshold u is selected as the lowest value at which the two parameters are invariant in order to reach the asymptotic limit (Coles, 2001;Furrer et al., 2010).This choice allows for having as many data as possible for performing the statistical inference, thus having lower variance for the estimators of the parameters.Figure 2 shows the parameter stability plots of the T max reading for Karachi, as an example to explain the threshold selection procedure.
In addition to diagnostic plots of the modified scale parameter σ * and the shape parameter ξ , the mean residual life plot is used to select the appropriate threshold for the POT analysis (Davison and Smith, 1990).The idea is to select the lowest value of the threshold when the plot is approximately linear.In the case of the Karachi data for T max , the plot appears to be linear and stable for u = 36 • C, indicating u = 36 as the most suitable threshold for the POT analysis (Fig. 3).We observe that the 90 % quantile is an appropriate threshold for all the station data, as well as the ERA-Interim datasets, and for both T max and TW max .

GPD fit
The goodness of fit is evaluated using quantile-quantile (Q-Q) plots and hypothesis testing.The Q-Q plot analysis is performed for the stations observed, the ERA-Interim, the bias-corrected ERA-Interim daily T max , and TW max .The Q-Q plots of the observed T max show that the GPD fits well in most stations.However, in a few stations like Jacobabad, Mohenjo Daro, Padidan, and Chor the empirical values show slight deviation from the modeled values.In spite of minor deviations at some stations, most of the exceedances are still fitted well by the model.The Q-Q plots of the observed TW max also fits well to the model at all stations.The Q-Q plots of the empirical ERA-Interim T max and TW max data reveal substantial differences with respect to the corresponding GPD fits.The empirical values of the higher quantiles deviate from the theoretical quantiles in all stations.However, if the higher quantiles are disregarded, then stations like Jacobabad, Mohenjo Daro, Rohri, Padidan, Nawabshah, Chor, and Badin fit very well with the model.The Q-Q plots of the bias-corrected ERA-Interim T max and TW max show better results than the ERA-Interim.We notice that the T max of the ERA-Interim and bias-corrected ERA-Interim fits better than the TW max if the highest quantiles are ignored, indicating that the bias procedure is, as expected, unable to correctly treat the statistics of the largest events.
In order to assess the goodness of fit, we apply the Kolmogorov-Smirnov test and Anderson-Darling test to the data of meteorological stations, ERA-Interim, bias-corrected ERA-Interim T max , and TW max .The p values indicate a good performance of the fit procedure.Table 2 shows the results of the Kolmogorov-Smirnov and Anderson-Darling statistics of the T max and TW max in all the datasets.

Parameter estimates
Here, we analyze the shape parameter ξ , the scale parameter σ , and threshold u for all considered datasets.The standard errors of the shape ξ and the scale σ parameters are given in Table 3.The spatial distribution of the shape parameter ξ and the scale parameter σ of the GPD in Sindh is shown in Fig. 4. The shape parameters ξ are negative in all datasets at all stations.This is hardly surprising, as meteorological and physical processes make sure that the temperature cannot grow locally without control.One finds a certain degree of variability across stations in the estimated value of the shape parameter.In the case of the observed T max , one obtains ξ estimates ranging between −0.418 and −0.223, while for TW max the range is between −0.323 and −0.177, so that values slightly closer to zero are found, thus allowing for larger excursions towards very high values with respect to the case of the extremes of the actual temperature.When looking at the biascorrected ERA-Interim data, the range of values for the shape parameter of T max (TW max ) is between −0.305 and −0.002 (−0.18 and −0.01).While there is a good match in the spatial patterns of the estimates for the observation vs. ERA-Interim datasets, the presence of values much closer to zero in the second case suggests the presence of some inadequacies in the representation of extremes in the reanalysis.This is not entirely unexpected, as reanalyses are constructed in such a way that typical conditions are well reproduced.Note that our simple bias correction procedure, while not impacting the estimates of the shape parameters, allows for the improvement of the estimates of the return levels, as discussed below.
The scale parameters σ measure the variability in the GPD distributions.The highest values of the scale parameters σ of T max and TW max are observed at stations such as Jacobabad, Padidan, Karachi, Hyderabad, and Chor in all datasets.This indicates that the variability in temperature extremes is higher at these stations, and one can expect higher return values of T max and TW max here with a similar shape parameter and the same threshold according to Eq. ( 4).The scale pa-  rameters σ of the observed T max range from 2.08 to 2.76, and the TW max values from 1.86 to 2.76.In the ERA-Interim analysis, the scale parameter σ of T max is between 1.00 and 1.95, and TW max between 0.74 and 1.75.We observe a difference in the scale parameters of both the observed, ERA-Interim T max , and TW max .We find that, unsurprisingly, the scale parameters of the bias-corrected ERA-Interim data are much closer to those estimated for T max and TW max using the station data.In the bias corrected ERA-Interim T max , the scale parameters σ are in the range of 1.50-2.75,while for TW max they are in the range of 1.40-2.40(Fig. 4).All the temperature scale parameters are in degrees Celsius.

Absolute maxima
Once the shape parameters ξ , the scale parameters σ , and the thresholds u are determined, it is possible to compute the theoretical absolute maxima using Eq.(3) (Sect.2.4).Theoretical absolute maxima can be compared with the observed ones for each station to better understand whether our fits are in agreement with the observed data.The daily maxi-mum temperature T max and the maximum wet-bulb temperature TW max (station data, the ERA-Interim, and the bias corrected ERA-Interim) have negative shape parameters ξ at all stations.This means that according to Eq. (2) in Sect.2.4, the probability distribution function (pdf) is bounded by the maximum values.These maximum values are the theoretical upper limits predicted with the GPD fit.The analysis shows that the observed absolute maxima T max and TW max at all stations of the three datasets are below the theoretical absolute maximum, as expected (Fig. 5).This gives us confidence on the quality of our fit.The following piece of information can also be derived: assume that in the future one observes an extreme event larger than the maximum inferred in the present dataset; this may suggest some non-stationarity in the most recent portion of the dataset.

Return levels
The RLs are computed considering various return periods (2, 5, 10, 20, 50, 100 years).As stated above, using a statistical approach based on the universality of EVT, we are Earth Syst.Dynam., 8, 1263-1278, 2017 www.earth-syst-dynam.net/8/1263/2017/able to extrapolate the results for time horizons longer than the one for which observations are taken.Clearly, uncertainties grow when longer time horizons are considered.The RL plots of the stations observed, the ERA-Interim, the biascorrected ERA-Interim daily maximum temperature T max , and the daily maximum wet-bulb temperature TW max are displayed in Figs. 6 and 7.The values of the RLs follow the north-south gradient of the climatic mean temperatures.The northern part of Sindh (Jacobabad, Mohenjo Daro, Rohri, Padidan, and Nawabshah) is hotter than the southern part (Hyderabad, Chor, Karachi, and Badin).The 2-, 5-, 10-, 20-, 50-, and 100-year RLs estimated in Sindh for station observed T max reach over 50 • C in Jacobabad, Mohenjo Daro, Padidan, and Nawabshah, and over 45 • C in Rohri, Hyderabad, Chor, Karachi, and Badin.The corresponding ERA-Interim T max RLs are at least 3 to 5 • C lower in all stations, while having correct representation of the geographical variability in the field.For example, the RL of 42 • C at Badin has a 3-year return period in the observations T max but a 30-year return period in ERA-Interim (Fig. 6).
The RLs of TW max are above 35 • C at all meteorological stations.As for ERA-Interim, the RLs of TW max are greater than 30 • C for all the stations except Karachi, which has RLs less than 30 • C. Here, we see again that the RLs of the ERA-Interim TW max are lower than the RLs of station TW max .Going again to the Badin stations, the 4-year return period observed for TW max is 38 • C, while the ERA-Interim dataset shows the same RL in a 15-year return period (Fig. 7).
The bias-corrected ERA-Interim T max and TW max show some improvements in the RLs at all stations.When looking at the Nawabshah, Hyderabad, Karachi, and Badin stations, the RLs agree with those obtained from the station data in the range of 5-100 years, while disagreements exist in the range of 2-5 years.At the rest of the stations, the bias-corrected data RLs are closer to those of the station data, yet not statistically compatible with them.When looking at the wet-bulb temperature TW max analysis, the RLs of the bias-corrected ERA-Interim show some overlap with those derived from station observations in Mohenjo Daro, Hyderabad, and Chor, while no overlap is found at the other stations.One understands that the simple bias correction methods proposed im- prove the quality of the representation of extremes by ERA-Interim, but many discrepancies remain (Figs. 6 and 7).
We also spatially plot the station and bias-corrected ERA-Interim T max and TW max RLs for the 5-, 10-, 25-, and 50-year return periods (Figs. 8 and 9), as a detailed spatial overview of the temperature extremes in Sindh might be of interest to policy-makers.The spatial RLs of the station and biascorrected ERA-Interim T max show differences in temperature; the hottest stations have the highest RLs.We notice that for Jacobabad, Mohenjo Daro, Padidan, and Nawabshah the RLs are between 50 and 53.6 • C and for Rohri, Hyderabad, Chor, Karachi, and Badin they are between 45.5 and 50 • C in a 5-to 50-year return period (Fig. 8).These extreme temperatures can impact the yields because crops are very sensitive to temperature variations, and even a rise of 1 • Celsius can cause detrimental changes in the phenological stages of the crops (Hatfield and Preuger, 2015).Every crop has a certain limit to tolerance of temperature.When temperature exceeds this limit, the crop yield is drastically reduced.Abbas et al. (2017) notice a 33 % decrease in major crops of Sindh due to warmer and drier weather.Karachi and Badin are expected to have decreased rice cultivation, hatching of fisheries, and mangrove forests surrounding these cities.Furthermore, temperature extremes can pose a serious threat to cotton, wheat, and rice yields in the Rohri and Mohenjo Daro areas due to increased crop water requirements.
In summer, the temperature and humidity increase to an extent that there are high chances of pests rapidly spreading in the crops.Temperature extremes not just directly impact the quantity and quality of grains but can also be a reason for urban flooding affecting agricultural lands (Luo et al., 2015a).Sindh produces cotton, wheat, rice, mango, banana, and dates; thus, a correct estimate of temperature extremes is very important.
The spatial RLs of station and bias-corrected ERA-Interim TW max for the 5-, 10-, 25-, and 50-year return periods show the highest RL as greater than 35 • C at all stations (Fig. 9).This is very serious for human health due to the working-day hours of populations in agriculture, building construction, and port activities.Karachi and Badin, being closest to the coast, are at the highest risk of temperature extremes.Thus, an immediate plan for adaptations is needed in Sindh to deal with such a hazard.The high values of TW max also indicate high levels of humidity in the region during summer, which is also proved by Kalim and Shouting (2012) and Freychet et al. (2015).

Summary and conclusion
The main objective of this study is the assessment of the return levels of the extreme daily maximum temperatures T max and wet-bulb temperatures TW max in southern Pakistan (Sindh).In addition, the performance of the ERA-Interim TW max is compared to the weather station TW max to assess its ability to estimate temperature extremes in Sindh.Moreover, a simple bias correction is applied to the ERA-Interim data to see whether correcting the first two moments of its statistics helps in improving its performance in representing temperature extremes.
The POT method is applied to the daily maximum temperature (T max ) and wet-bulb temperature (TW max ) data of nine stations and to the corresponding nearest ERA-Interim tem-perature data.After testing the asymptotic statistical properties, the 90 % quantile is found to be an appropriate threshold choice for all datasets.The Q-Q plots are used to assess the GPD fit, which is acceptable for both T max and TW max station data for all three datasets.However, the bias-corrected ERA-Interim data show more improved GPD fits than the ERA-Interim data.The shape parameters ξ are, in general, negative at all stations.The scale parameters σ show high values in Jacobabad, Padidan, Karachi, Hyderabad, and Chor, indicating higher variability in temperature extremes in these regions.The RLs of T max and TW max are estimated for the 2-, 5-, 10-, 25-, 50-, and 100-year return periods in all datasets.The RLs of T max estimated using the meteorological station temperatures are greater than 50 • C in Jacobabad, Mohenjo Daro, Padidan, and Nawabshah and greater than 45 • C in Rohri, Hy- Our results predict extremely high values of T max and TW max in the region.The T max extremes contribute to an increased rate of evaporation, which in turn may intensify the hydrological cycle, causing precipitation events and flooding (Cheema et al., 2012;Luo et al., 2015b).Additionally, crop varieties need to be changed under such a hot climate to avoid the risks of temperature extremes.The extremes of daily maximum wet-bulb temperature TW max are estimated as above the human survivability threshold of 35 • C throughout the region; thus, the risk of hyperthermia is very high here.The most vulnerable people are those who are involved in everyday outdoor activities like farming, fishing, and building construction and athletes, the elderly, and infants can suffer from heat strokes, dehydration, etc.The human habitability in such a warm region is already at risk and one can expect that these issues will be worse in future climate conditions.
We found that the RLs from the stations and ERA-Interim showed differences between 3 and 5 • C for both shorter and longer return periods due to the minor variations in the shape and scale parameters.Although the ERA-Interim dataset does not capture the magnitude of the extremes well, it still provides a good representation of their spatial fields.The biases between the station and the ERA-Interim data are relevant when one wishes to address the impact of hot cli-matic extremes on human life and on active crop production in the region.It is of primary importance to understand the physical reasons behind such inconsistencies, which make it hard to reasonably use ERA without bias correction.Clearly, such inconsistencies might result either from a misrepresentation of local processes dominated by near-surface processes (namely, heat and water fluxes) or from an inadequacy of the reanalysis in reproducing synoptic and sub-synoptic conditions responsible for extremely hot and humid conditions.This matter is surely worth investigating but is well beyond the scope of this paper.
We applied a simple bias correction, i.e., adjusting the mean and standard deviation for ERA-Interim T max and TW max data, to check the improvements in return levels.We noticed that the bias-corrected ERA-Interim T max and TW max give the return levels closer to the ones observed by the meteorological stations than the original ERA-Interim return levels at all stations.Although the bias-corrected ERA-Interim shows a good correspondence with the meteorological station data, statistically significant differences remain in most cases.Therefore, one must use more advanced bias correction methods for precisely analyzing extremes.We propose repeating this analysis in global climate models (GCMs) (CMIP5, CMIP6) and regional climate models (RCMs) (CORDEX) to study the properties of extremes.All models use reanalysis as input and generate information of extremes, which involves biases that if not corrected can lead to significant errors in prediction of present and future extremes.Therefore, in order to reduce the uncertainties in impact assessment, it is necessary to improve the reanalysis before using it in GCMs and RCMs.
The results have practical implications for assessing the risk of extreme temperature events in Sindh.All the results are placed in the web tool SindheX (www.sindhex.org),which will be freely available online soon after the publication of this paper.The maps and graphs are prepared to guide the local administrations in prioritizing the regions in terms of adaptations like preparation of baseline contingency plans for dealing with strong heat waves based on the current climatology.Such measures are not yet present in the territory and lead to many casualties each year.Our results will not only contribute to the regional planning but can also be useful for ongoing EU projects (SUCCESS, CSCCC), the World Bank project (Sindh Resilience Project), and megaconstruction projects like the China-Pakistan Economic Corridor.

Figure 2 .
Figure 2. Modified scale (σ * ) and shape parameter (ξ ) of the observed T max ( • C) at Karachi.The red vertical lines represent the selected threshold according to the station quantiles.

Figure 3 .
Figure 3. Mean residual life plot of the station-observed T max ( • C) at Karachi.

Figure 4 .
Figure 4. Spatial distribution of the shape parameters ξ and scale parameters σ of the station-observed data, ERA-Interim, and bias-corrected ERA-Interim T max (a-c) and TW max (d-f) ( • C).

Figure 5 .
Figure 5. Absolute maxima A max in degrees Celsius.(a) Station-observed T max .(b) ERA-Interim and bias-corrected ERA-Interim T max .(c) Station-observed TW max .(d) ERA-Interim and bias-corrected ERA-Interim TW max .

Figure 6 .
Figure 6.Return level plots of the station-observed T max (black), ERA-Interim T max (red), and bias-corrected ERA-Interim T max (green) in degrees Celsius.The blue line is to show a difference in the observed and ERA-Interim RLs.

Figure 7 .
Figure 7.Return level plots of the station-observed TW max (blue), ERA-Interim T max (pink), and bias-corrected ERA-Interim T max (green) in degrees Celsius.The black line is to show a difference in the observed and ERA-Interim RLs.

Figure 8 .
Figure 8. Spatial distribution of the station-observed T max (red) and bias-corrected ERA-Interim T max (blue) return levels in degrees Celsius corresponding to return periods of 5, 10, 25, and 50 years in southern Pakistan.

Figure 9 .
Figure 9. Spatial distribution of the station-observed TW max (brown) and bias-corrected ERA-Interim TW max (orange) return levels in degrees Celsius corresponding to return periods of 5, 10, 25, and 50 years in southern Pakistan.

Table 1 .
Code, name, geographic coordinates, and altitude of the stations.

Table 2 .
Results of the Kolmogorov-Smirnov goodness of fit test and Anderson-Darling test between empirical and GPD fits.

Table 3 .
Estimated parameters shape ξ , scale σ , and standard error ξ , σ of all the datasets.