Tuesday, 12 November 2013

Has COST HOME (2007-2011) passed without true impact on practical homogenisation?

Guest post by Peter Domonkos, one of the leading figures in the homogenization of climate data and developer of the homogenization method ACMANT, which is probably the most accurate method currently available.

A recent investigation at the Centre for Climate Change of the University Rovira i Virgili (Spain) showed that HOME-recommended monthly homogenisation methods are rarely used in practice: they account for only 8.4% of the homogenisation exercises in studies published or accepted for publication in six leading climate journals in the first half of 2013.

The six journals examined are the Bulletin of the American Meteorological Society, Climate of the Past, Climatic Change, International Journal of Climatology, Journal of Climate and Theoretical and Applied Climatology. We found 74 studies in which one or more statistical homogenisation methods were applied to monthly temperature or precipitation datasets; together they contain 119 homogenisation exercises. A large variety of homogenisation methods was applied: 34 different methods were used, even without distinguishing between different methods that carry the same name (as is the case with the SNHT and RHTest procedures). HOME-recommended methods were applied in only 10 cases (8.4%), and the use of objective or semi-objective multiple break methods was even rarer, only 3.4%.
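The percentages above follow directly from the reported counts. The short sketch below reproduces the arithmetic. The count behind the 3.4% figure is not stated in the text, so the code only infers which whole number of exercises would round to that share. The function and variable names are illustrative.

```python
# Counts reported in the survey: 119 homogenisation exercises,
# 10 of which used HOME-recommended methods.
total_exercises = 119
home_recommended = 10


def share(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


home_share = share(home_recommended, total_exercises)

# Assumption: the 3.4% figure is a share of the same 119 exercises.
# The underlying count is not given in the text, so we search for the
# whole numbers whose share rounds to 3.4%.
multiple_break_candidates = [
    n for n in range(total_exercises + 1)
    if share(n, total_exercises) == 3.4
]

print(home_share)                 # 8.4
print(multiple_break_candidates)  # [4]
```

Under that assumption, the 3.4% corresponds to 4 of the 119 exercises.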

In the international blind test experiments of HOME, the participating multiple break methods achieved the highest efficiency in terms of the residual RMSE and trend bias of the homogenised time series. (Note that only methods that directly detect and correct the structure of multiple breaks are considered multiple break methods.) The success of multiple break methods was predictable, since their mathematical structure is better suited to the multiple break problem than a hierarchical organisation of single break detection and correction.
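To make the two efficiency measures concrete, here is a minimal sketch of how a residual RMSE and a trend bias could be computed when the true homogeneous series is known, as it is in a blind benchmark. The definitions are generic textbook ones and the toy series are invented for illustration. The exact metric definitions used in HOME (for example any centring or time aggregation) are not given in this post and may differ.

```python
import math


def linear_trend(values):
    """Ordinary least-squares slope of values against time steps 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def residual_rmse(homogenised, truth):
    """Root mean squared difference between homogenised and true series."""
    diffs = [h - t for h, t in zip(homogenised, truth)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def trend_bias(homogenised, truth):
    """Difference between the linear trends of homogenised and true series."""
    return linear_trend(homogenised) - linear_trend(truth)


# Hypothetical toy data: a flat true series and a homogenised series
# in which half of a break remains uncorrected.
truth = [0.0] * 8
homogenised = [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]

rmse = residual_rmse(homogenised, truth)
bias = trend_bias(homogenised, truth)
print(rmse, bias)
```

In this toy case the residual step leaves both a non-zero RMSE and a spurious positive trend, which is why the two measures are reported together.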

Thus the closing study of HOME (in Climate of the Past) recommended all of the participating multiple break methods (MASH, PRODIGE, ACMANT, Craddock-test) for practical use. One hierarchic method (USHCN) approached the efficiency of the multiple break methods and showed some other good features, so it is also included among the HOME-recommended methods. The difference between the well-performing and the poorer methods turned out to be large, so no other homogenisation method was recommended by HOME.

Only a short time has passed since the end of HOME, so we might hope that its positive impact will become visible later. However, the authors of the present report were surprised to find that the HOME recommendations are never cited or discussed in the studies of 2013. So the question arises whether or not we are on the right track towards progress in time series homogenisation.

The full report analyses the possible causes of the delay in the expected progress and makes recommendations for the future. It was presented at the 13th Annual Meeting of the European Meteorological Society (Reading, Sept. 2013).

Peter Domonkos
University Rovira i Virgili
Centre for Climate Change (C3)

UPDATE by VV: It may be that this post was published too early and that we were a little impatient. At the EUMETNET Data Management Workshop 2013, a large number of climatologists used modern methods for homogenization; HOMER in particular was applied a lot.


Gregor Vertacnik said...

The low number of HOME-recommended methods is quite shocking. I would say this partly reflects the "my method is the best method" inertia and partly the higher number of references for some methods (e.g. SNHT) in the past.

My colleagues and I have used HOMER to homogenise monthly climate data in Slovenia and presented some of our results at the ICAM conference in July this year (http://meteo.fmf.uni-lj.si/sites/default/files/ICAM2013_Book_of_abstracts.pdf, page 24). We are planning to submit a paper on the topic. Our experience with HOMER is quite good, despite some problems, which have since been solved.

Victor Venema said...

Yes, Gregor, I think inertia is important, in two ways: people keep on using the method they are familiar with, and many papers that appear now still describe studies that were started before the HOME paper.

What worries me a little is that the paper is well cited, but that in the papers I read, there was no discussion of the results. The citations were typically of the form that homogenization methods have been well validated.

However, they never wrote that multiple-breakpoint methods were better and that you have to take the inhomogeneous reference problem into account. That makes me wonder whether people actually read the rather long paper. Maybe we need a shorter summary. Or maybe it is just a matter of time and we are being impatient.

PeterThorne said...

I suspect that part of the problem is beyond your control in that there is a reluctance among funders to redo work they see as 'done'. That is to say funders say "we have three global series / 2 regional series / 2 national series* so why do we need another?". This then begets the need to educate about the need for improved estimates with better characterization of uncertainties if the data is going to be robust and useful for applications.

*- delete as appropriate for funder x

In this context a paper may be useful, but I don't think the target is necessarily scientific peers in the niche of homogenization. Indeed, that is probably the one constituency who are best 'sold' on the need to improve techniques / apply multiple techniques and understand them. Rather it is data users and policy makers who need to understand the implications of HOME and this requires a very different paper targeted in a very different way that highlights the cost of retaining the status quo or using that data unthinkingly as some sort of truth. Talking again to ourselves does little to further that.

Beyond that there are issues around data availability (keep an eye out Monday ...), code availability, applicability of HOME results more globally given the great heterogeneities in data quality and density etc. etc. Many of these can and will be addressed by initiatives such as the International Surface Temperature Initiative (if you will permit a single solitary plug here Victor?).

Indeed, one could argue that uptake of HOME recommended methods is an unduly narrow assessment criterion for success. Asking a broader question about whether HOME type approaches to validation, uncertainty quantification etc. have been adopted, and whether HOME may have helped spur / inform other efforts, would likely give you a more meaningful and rosy set of impact assessment metrics here?

Peter Domonkos said...

Replies to P. Thorne's two statements:

1) "Data users and policy makers who need to understand the implication of HOME"
In my opinion, it is primarily the climatologists who apply homogenisation methods who should understand what the differences between the various homogenisation options are. Policy makers need to understand only the generalities (why all this research is important).

2) "HOME recommended methods is not an unduly narrow assessment criteria for success"
Although the HOME experiments are not sufficient to determine the value of specific methods once and for all, their scientific value is significant. Moreover, we also have other arguments.
2.1.) In the report of Domonkos and Efthymiadis we emphasize that the HOME results confirmed our previous, math-based knowledge about the advantages of multiple break methods. I remember well that in a homogenisation seminar held in Budapest around 2000 there was a sharp debate between Tamas Szentimrey and Hans Alexandersson. The only serious argument of Hans was that the advantage of higher math should be proved by tests... Hans did not live to see the time when well organised tests proved that his method is not bad, but that multiple break methods function even better.
2.2.) Although we are all proud of the seriousness of the HOME tests, P. Thorne is right when he thinks that the representativeness of the obtained results has important limitations. First, the HOME Benchmark is based on a purely European observed dataset. Second, the break frequency and other parameters of the inserted inhomogeneities are debatable and can be very different for the different observed datasets that exist in the world. Even more importantly, we used only 15 networks for the evaluation, and the within-network residual errors are statistically strongly dependent. Therefore, what we really miss is not a fast change of the methodology, but more precise reference to the HOME results, e.g. "according to present knowledge multiple break methods perform significantly better than more traditional methods" or something similar. At present we cannot refer to more reliable test results than those achieved under HOME, thus we should refer to their main conclusions regularly and with high scientific clarity.

2.3.) The 3-4% ratio of multiple break method applications would still be too low even if we had never done the HOME tests. The development of multiple break methods started in the 1990s, based on well-established mathematical theory and profound knowledge of the characteristics of observational datasets. Since then, all experience with multiple break methods has been positive.

PeterThorne said...


I think we are talking at cross-purposes here. Firstly, my comment was largely a response to Victor's and not to the main post. Apologies if that was not clear. My points were in no sense a critique of HOME or the Venema et al. analysis or a defense of using crummy methods which have been superseded.

So, to try to restate my points more elegantly:

1. I believe the primary constituency that you need to reach out to in order to make your results have impact is funders / policy makers / users and not homogenizers. If you can convince funders of the need to reanalyse old datasets and to fund only homogenization efforts that use modern techniques then the rest will follow. We are all on either soft or directed hard funds and he who pays the piper calls the tune. Ipso facto you should be educating the payers of the pipers here and not just the pipers. I can want to play a Mozart concerto with every single strand of my being but if the guy paying me says they want me to play Bee bee busy bee busy busy buzzy bee well, frankly, I'm going to play that. This requires reaching out in a substantively different way to a substantively different constituency.

2. Surely the assessment of the success of HOME is more than simply whether certain homogenization techniques are taken up? By concentrating and arbitrarily assigning success on such a narrow metric you fail to capture other positive impacts. For example, how many analyses have undertaken or plan to undertake rigorous benchmarking along the lines of HOME? I can think of several. If that is not a 'success' of HOME in the eyes of HOME participants then maybe it should be. So, I was saying that whilst uptake of methods is a key metric for evaluating the impact of HOME I believe it would be a mistake to make it the sole such metric and it would actually do you and all your colleagues a disservice to do so.

Peter Domonkos said...

Hi Peter T.,
I think that the connection between funders and scientists is more complex. I do not believe that funders (generally) give money more easily for homogenising with one method (say, SNHT) than with another (a multiple break method). The very fact that we, climatologists, hardly use the up-to-date methodology is a bad message for funders / politicians.
How elegant it would be if, in our applications for funds, we could point to an ongoing revolution in the practice of time series homogenisation, citing a large number of studies...

I agree with you on the rest of your last comments.