Playing with Data

Personal Views Expressed in Data

NWS Verification: A Lesson in Gaming the System

As I write this quick post, a strong bow echo is racing across northern Minnesota and northern Wisconsin. The National Weather Service Forecast Office in Duluth, MN issued several large warnings (the 3rd largest NWS warning and 9th largest NWS warning) for the areas along and ahead of the approaching bow echo. I’ll leave the debate over whether issuing such large warnings is good for service to a later date.

Instead, I want to take a quick moment to highlight what I perceive to be a deficiency in how the NWS does its verification. This deficiency actually rewards forecasters for issuing larger warnings, such as those issued tonight by NWS Duluth. Consider the three warnings below, all of which were valid at the same time.

First, let’s assume that a severe weather report occurs somewhere in the domain shown at the time this image was taken. If the report occurs at any of the areas denoted with a “1”, that single report verifies the single warning that contains it. However, if the report occurs at any of the areas denoted with a “2”, that single report verifies both warnings that contain it. Thus, a single report verifies two warnings! Now, consider the scenario in which a single severe weather report occurs at the location identified with a “3”. In this case, the single report would verify all three warnings that contain it!

So, what does this mean? The larger a warning, the more area in which a report can occur to verify the entire warning. Furthermore, if warnings overlap each other, a single report can verify multiple warnings. Thus, in terms of determining the NWS’s/office’s/forecaster’s verification scores, each is actually rewarded for engaging in this practice. Now, I’m not an NWS forecaster, nor have I ever been. I cannot say (and highly doubt) that gaming the verification system is consciously thought of in the heat of the warning process. However, it highlights what I consider a shortcoming of the NWS’ verification process; one that rewards larger warnings at a time when storm-based warnings were designed to promote smaller warnings.
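To make the double-counting concrete, here is a minimal Python sketch of the scenario above. The warnings are simplified to made-up, axis-aligned lat/lon rectangles (real NWS warnings are arbitrary polygons), and the coordinates are entirely hypothetical; the point is only that a single report falls inside, and therefore verifies, every warning that contains it.

```python
# Hypothetical sketch: one storm report verifying multiple overlapping
# warnings. Warnings are simplified to axis-aligned lat/lon boxes.

def report_verifies(report, warning):
    """Return True if the (lat, lon) report falls inside the warning box."""
    lat, lon = report
    return (warning["lat_min"] <= lat <= warning["lat_max"]
            and warning["lon_min"] <= lon <= warning["lon_max"])

# Three overlapping warnings, like the "1"/"2"/"3" areas in the image.
warnings = [
    {"id": "A", "lat_min": 46.0, "lat_max": 48.0, "lon_min": -93.0, "lon_max": -90.0},
    {"id": "B", "lat_min": 46.5, "lat_max": 47.5, "lon_min": -92.0, "lon_max": -89.0},
    {"id": "C", "lat_min": 46.8, "lat_max": 47.2, "lon_min": -91.5, "lon_max": -90.5},
]

report = (47.0, -91.0)  # a single severe weather report at "location 3"
verified = [w["id"] for w in warnings if report_verifies(report, w)]
print(verified)  # the one report counts toward every warning containing it
```

Under current verification rules, each warning in `verified` would independently count as verified, which is exactly the reward for size and overlap described above.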

Comments on “Attempts at Assessing Chaser Contributions to the Warning Process”

If you’ve visited this website in the past week, chances are you were here to read and/or comment on the blog post Attempts at Assessing Chaser Contributions to the Warning Process. Wow. Talk about a passionate response — on all sides! That entry has prompted the biggest response, in terms of comments, in quite some time. I have read every comment posted, but am too busy preparing for the Hazardous Weather Testbed’s Experimental Forecast Program to post a long response to every one. Instead I thought I would act as my own ombudsman and address some of the recurring themes in the aforementioned post’s comments.

When reading below, one may wonder why I haven’t removed the offending post. The reason is multi-faceted. First, as soon as I do that I subject myself to (in my opinion) a more severe criticism: writing somewhat of a hit piece and then taking it down when I didn’t like the response. I personally do not think that is right, and so in that regard, the post must stay. If I’m going to write something that I know is going to be controversial, I must be prepared to accept the negative comments that result. My philosophy is that as long as a comment is not spam, not profane, and not a personal attack on me or others, I’ll never remove it. No matter how much I may disagree, it is only in hearing all sides of an issue that I can expand my horizons. Secondly, there have been some good discussions that have resulted from the post, both for my position and against it. As long as these conversations refrain from snark, I see no reason why they should not be allowed to be seen.

Now, for my thoughts on the post…

Let me begin by saying that this certainly was not my best post in terms of scientific content. I made a fundamental flaw in the post that quite a few commenters picked up on, and I’ll admit that I did it. No matter what these data presented suggest, they cannot prove one way or another the intention of a person. At best, with better data than presented, one might be able to assess chaser impact, but not motives. I tried to be cognizant of this fact in some aspects of the blog post (such as not titling the post some variant on “Are Chasers Chasing to Save Lives”), but failed miserably in others (using the data shown to justify the line “Please don’t insult my intelligence by claiming to chase to ‘save lives’.”). This is something I must make sure I do not do again, and if I do, I trust you will hold me accountable.

Next, some commenters accused me of doing “bad science”. In response to at least one of these comments I responded that I never claimed to be doing science. However, after a couple days of thinking about this issue I believe that the original comment and my response both miss the point. This isn’t “bad science”, nor is it “not science”. It’s “unfinished science”. If I left things as is, said that the matter was closed, and closed my mind to differing points of view, then it most certainly would be “bad science”. Instead, I tried to go out of my way to imply that my viewpoints were far from definitive. I wrote things such as “circumstantial evidence”, “Attempts at Assessing”, and “To summarize, I believe…”. I posted these ideas on a blog, not in a scientific journal, for a reason: they are initial ideas and certainly would not hold up in a court of law or a scientific journal.

What the post tried to do, and admittedly failed miserably at doing, was attempt to objectively assess the contributions chasers have to the warning process. I put forth an idea, people attacked it and poked holes in it. If I am to act like the scientist I would like to think I am, I should not take these criticisms personally, but rather use them to continue to evaluate my idea(s), refine the idea(s), and try again. This is how science is supposed to work! At the end of the process the final idea(s) will be stronger and more refined than anything initially proposed.

Assessing chaser impact on the warning process is an extremely complex problem as there are many variables and many signals. As several commenters suggest, I did allow myself to fall into the “confirmation bias” trap — I saw what I wanted from irrelevant and/or inconclusive data. But, by putting my thoughts out in the open, people were quick to point out the idea’s flaws, which will allow me (in time) to do better analyses with differing datasets and strengthen my position. Again, this is how I believe science should work. Putting these data and ideas on the website wasn’t the mistake, but intertwining my personal opinions so strongly was. And for that, I do have regrets; I’ll be better about that moving forward.

However, removing my personal beliefs, this is the first attempt, to my knowledge, to objectively assess the contributions chasers have on the warning process. Due to the complex nature of the problem, and the fact I did this as sort of a “back-of-the-envelope” calculation, I merely looked at aggregate measures using simple NWS performance metrics. Possible follow-up analyses were suggested in the comments, and when I have time, I’ll certainly try to investigate some of these. (Aside: if a reader would like to do this, I’m more than happy to share my datasets.) There are a lot of other potential impacts, both negative and positive, that need to be assessed as well. As it stands now, a lot of anecdotal stories are offered by those on either side of this issue, but do we really have any idea what the actual impact is? From a chaser point of view, being able to demonstrate a positive impact on the warning process could help counter the negative perceptions currently circulating in several news outlets and improve interactions with emergency response officials. From an emergency response official perspective, knowing chaser impacts might lead to a new respect for chasers, or to more clout in trying to regulate them. But then again, maybe both sides would rather not know…

None of the comments have changed my underlying assumptions that most chasers chase for personal reasons, not the noble reasons of saving lives and doing it for the NWS often offered by chasers when interviewed by the media. However, we are (I am?) a long ways off from being able to assess this objectively. My previous post was a first attempt at this. I’m sure it won’t be my last. And I’m sure that there will always be someone out there challenging my views. That’s the way it is supposed to be.

Tornado Warning Seminar

FIG 1: Yearly mean tornado warnings on a 1-km grid, derived from the 10-year period 2002 through 2011. Only polygon coordinates were used. (Note: This figure is not shown in the presentation. The figures shown in the presentation are divided based on the Storm-Based Warning switch date: 01 October 2007.)

Today I gave a version of my presentation on Tornado Warnings. This presentation was originally given earlier this month at the University of Alabama at Huntsville. We recorded today’s presentation so that others could see it, but I will warn you that my delivering the presentation this time did not go as smoothly as it did in Huntsville. (I stumbled over my words a couple of times and missed a few points I wanted to make.) But as a good sport, and someone who wants to see the conversation continue, I’m posting the link to the recording so that others may watch it and contribute feedback on what they thought.

When watching the presentation, a couple of questions I would love for you to keep in mind:

  • I am not an operational forecaster. No matter what I may say; no matter what I may think, I have never been in the position of actually having to issue a warning. Until I am in that position, everything I say should be considered my opinion. This seminar is in no way an attack on operational forecasters. They do a tremendous job under extremely stressful situations. This seminar is aimed at fostering a discussion on policy, not on specific actions a forecaster should or should not take.
  • Current Tornado Warning metrics center around Probability of Detection, False Alarm Ratio, and other contingency table measures. However, not every detection and not every false alarm are created equal. Are there better metrics that could be used to measure tornado warning performance? If so, what would they look like?
  • As mentioned above, not all false alarms are created equal. Furthermore, issues such as areas within the warning not being impacted by a severe event and broadcast meteorologists interrupting regular programming to cover warnings within demographic areas all give rise to the notion of perceived false alarm ratio. How can we adequately measure this, and maybe more importantly, is there anything we as a community can do to address issues arising from this?
  • Warning philosophies (severe and tornado) vary from office to office, leading to the sometimes-asked question, “Do we have a single National Weather Service or 122 Local Weather Services?” Are these differing warning philosophies a good thing or a bad thing? If a good thing, how can we better communicate the different philosophies to users, or is that even necessary? If a bad thing, how do we determine which philosophy(ies) to standardize around? Or is there a third option here that we’re (I’m) missing?
  • Should warnings be meteorology-centric or people-centric? Although population centers appear to show up in the datasets, is this a reflection of warnings being people-centric, or merely a reflection that radar locations tend to be co-located with population centers and our understanding of thunderstorm phenomena is inherently tied to radars?
  • Instead of moving toward an Impact Based Warning paradigm, or a tiered warning paradigm, is it time to consider including probabilities or other means of communicating certainty/uncertainty information in the warning process? If so, how do we go about doing this in a manner that does not leave the users of these products behind? In other words, how do we move toward an uncertainty paradigm that average citizens can understand?
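For readers unfamiliar with the contingency-table measures the questions above refer to, here is a minimal Python sketch of how Probability of Detection and False Alarm Ratio are computed. The counts are purely illustrative, not real verification data.

```python
# Illustrative sketch of the standard 2x2 contingency-table metrics.
# "Hits" are warned events, "misses" are unwarned events, and
# "false alarms" are warnings with no event.

def pod(hits, misses):
    """Probability of Detection: fraction of events that were warned."""
    return hits / (hits + misses)

def far(hits, false_alarms):
    """False Alarm Ratio: fraction of warnings that had no event."""
    return false_alarms / (hits + false_alarms)

# Made-up counts, chosen to roughly mirror the numbers discussed
# elsewhere on this blog (POD near 0.7, FAR near 0.75).
hits, misses, false_alarms = 70, 30, 210
print(round(pod(hits, misses), 2))        # 0.7
print(round(far(hits, false_alarms), 2))  # 0.75
```

Note that these aggregate measures treat every hit and every false alarm identically, which is precisely the limitation the second and third questions above are getting at.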

I firmly believe that the warning system in place has undoubtedly saved thousands of lives throughout its history. At the same time, I believe it has problems and stands to be improved, even though I cannot yet put into words what the problem(s) is (are). I believe that it will require community efforts to address these problems. This includes all of the severe weather community: research meteorologists, operational meteorologists, NWS management, emergency managers, broadcast meteorologists, and, maybe the most overlooked piece, social scientists.

Lastly, I must apologize to Greg Blumberg for coming across much harsher than I intended to when addressing a comment he made during the presentation. My response was intended in jest since I know Greg, but that didn’t come across to everyone in the audience, which tells me I shouldn’t have said it. Greg, my sincerest apologies, and I hope you understand that my response was entirely in jest.

With that said, I hope you enjoy the presentation, and I look forward to hearing your ideas!

Attempts at Assessing Chaser Contributions to the Warning Process

I am going to begin this post by saying this upfront, in bold font:

This post is not an “anti-chasing” post. I have no problem with people wanting to chase.

I merely have a problem with people justifying their chase activities by saying they do it to “help the NWS” or to “save lives”. I have no problem with people who chase; just be honest about why you do it. In fact, since I know I’m going to take a lot of heat from certain groups within the chaser community, I’m going to repeat it.

This post is not an “anti-chasing” post.

OK, now that the disclaimer is out of the way, what is the purpose of this post?

In the wake of last weekend’s tornado outbreak, several news agencies (LA Times, USA Today, and Detroit Free Press just to name a few) have written stories on the (exponential?) increase in the number of storm chasers on the roadways. It is not my intention to discuss the complaints of emergency managers and other local government officials. (That will have to be left for a future post.) Nor do I intend to discuss whether chasing is morally wrong and/or should it be outlawed. Instead, I want to stick to the all-too-often used storm chaser justification for chasing, “Storm chasers save lives.” I’ve seen this justification used in the numerous stories I’ve read this week. I’ve heard storm chasers complain about the “bad rap” they are getting; that people are focusing on the actions of a few and forgetting that “chasers save lives”.

Storm chasing really began to take off in the mid-1990s with the advent of the first Verification of the Origin of Rotation in Tornadoes Experiment (VORTEX) and the subsequent movie Twister. Since these two events, the number of people who identify themselves as chasers has been on the rise (at least according to my perceptions). In recent years, television shows on storm chasing, software such as GRLevel3, ThreatNet, and Spotter Network, as well as increased cell-phone bandwidth, among many other things, have resulted in what I perceive as an almost exponential growth in the number of storm chasers. If these storm chasers really were chasing to “save lives” as many or most claim, I would hope to see some sort of reflection of this in the fatality counts from tornadoes. After all, more people out saving lives should result in more lives being saved.

Below is a figure courtesy of Dr. Harold Brooks. It shows the annual number of tornado fatalities in the United States in terms of deaths per million people. Examining tornado fatalities in this context attempts to account for the fact that the population is increasing, and thus there are more people susceptible to losing their life in a tornado.

What should be very obvious is that from about 1950 through about 2000 the trend is decisively downward. However, since around 2000, the trend is approximately flat, meaning that the odds of dying from a tornado are roughly the same now as they were back in 2000. This figure is one of my absolute favorites as it contains a lot of information and leads to a lot of tough questions for the severe weather community. One question that is often asked is, why does the trend seem to flatten out in the 2000s? There could be a lot of reasons, and we’ll leave those to another post. My point here is that with the explosion in the number of chasers, I would expect to see some reflection of this in the number of fatalities resulting from tornadoes. However, the data do not seem to suggest that storm chasers have had that much influence in saving lives.

“But, Patrick, storm chasers provide a valuable service to the National Weather Service by providing real-time information to aid in the warning process. You can’t use a single figure to negate all the contributions of chasers to the warning process!” Fair enough. If chasers do provide a significant impact in the warning process, we should see some reflection of their contributions in the various tornado and tornado warning metrics, so let’s take a look.

Above is a figure that breaks down the number of reported tornadoes by year and by F/EF rating. As with the figure before it, there is a lot of information behind the data going into this figure, but we’ll leave that for another post as well. For the current purpose, you’ll notice that for the most part, the trends have remained unchanged. The exception is the number of EF-0 tornadoes, and to a lesser extent the number of EF-1 tornadoes, which increases dramatically starting around 1990. This increase coincides with the advent and widespread adoption of Doppler radars. With the increased information Doppler radar provided meteorologists, weaker tornadoes were more easily detected, and thus the number of reported weaker tornadoes has increased. If chasers had a significant impact on the number of tornadoes observed, I would expect to see some sort of change in the trend lines beginning in the mid-1990s as the increased number of chasers saw more of the tornadoes that meteorologists missed. The truth is, the impact of chasing is, circumstantially, less than the impact of the adoption of Doppler radar.

To illustrate this even further, let’s consider the probability that a tornado that is occurring has been or will be warned. (In verification parlance, this is known as Probability of Detection, or POD.) As the figure below indicates, the Probability of Detection has increased consistently, albeit slowly, since around 1990, which is roughly when the increase in weaker tornadoes began. Taken together, these two pieces of information suggest that the Probability of Detection has increased as a result of detecting the weaker (F/EF-0 and F/EF-1) tornadoes, and not as a result of chasers making reports.

Considering the Probability of Detection aspect of the problem led me to wonder whether chasers have an impact on tornado warning lead time. With chasers calling in those “rotating wall cloud” reports, maybe the lead time has increased. The mean lead time for all tornado warnings is shown below.

Now, this figure requires a bit of explanation. The National Weather Service defines lead time in a quirky way. Lead time is merely the amount of time elapsed from the issuance of a tornado warning to the first report of a tornado. Sounds simple enough, right? Well, it gets tricky when you consider a tornado that occurs before a tornado warning. In this case, one would expect a negative lead time, one that would approach negative infinity in the case of a tornado that is never warned. However, this is not how the National Weather Service reports lead time. The NWS assigns a lead time of 0 to all tornadoes that occur before, or without, the issuance of a tornado warning. Thus, if a tornado is not warned, or a warning is not issued until after the tornado is reported, it contributes a zero to the average, and an office with a low Probability of Detection would be expected to have a low mean lead time because of all the zero lead times averaged in.
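The convention just described can be sketched in a few lines of Python. The lead times below are hypothetical, chosen only to show how clamping unwarned and late-warned tornadoes to zero drags down the mean.

```python
# Sketch of the NWS lead-time convention: tornadoes warned in advance
# get positive lead times, while tornadoes warned late (negative lead
# time) or never warned are assigned 0 minutes rather than a negative
# or undefined value.

def mean_lead_time(lead_times_min):
    """Mean lead time after clamping late/unwarned events to zero."""
    clamped = [max(t, 0) for t in lead_times_min]
    return sum(clamped) / len(clamped)

# Five hypothetical tornadoes: three warned in advance, one warned
# 6 minutes after touchdown, one never warned at all.
lead_times = [18, 12, 25, -6, -float("inf")]
print(mean_lead_time(lead_times))  # (18 + 12 + 25 + 0 + 0) / 5 = 11.0
```

Without the clamping, the never-warned tornado would make the mean undefined (or arbitrarily negative); with it, every miss simply pulls the reported mean lead time toward zero, which is why lead time and POD move together.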

What we see in the figure above is that tornado warning lead time has increased fairly consistently since around 1990, from roughly 5 minutes to roughly 15 minutes. Once again, however, this increase in lead time does not appear to be related to storm chasers. In fact, it appears to be directly related to the increase in Probability of Detection.

Plotting the Probability of Detection (multiplied by 20) on the same plot as Tornado Warning Lead Time (above), it becomes pretty obvious that the two are highly correlated. In fact, if we were to plot the linear trend lines of the two metrics (below), we would find that the slopes are almost exactly the same!

For completeness, I’ve plotted both the raw values and the trend lines on the same plot above. Once again, I do not see any significant impact from storm chasers on these metrics. I firmly believe that they are all attributable to better forecaster training and widespread adoption of Doppler Radar.
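The slope comparison above can be sketched as follows. The two series here are synthetic stand-ins, built to resemble the real POD-times-20 and lead-time curves only in shape; the actual verification data would be substituted in practice.

```python
# Sketch: comparing the least-squares trend slopes of two metrics,
# as done above with POD (scaled by 20) and tornado warning lead time.
# Series values are synthetic stand-ins, not real verification data.

def trend_slope(years, values):
    """Least-squares slope of values regressed against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    den = sum((x - mean_x) ** 2 for x in years)
    return num / den

years = list(range(1990, 2000))
lead_time = [5 + 0.5 * (y - 1990) for y in years]    # minutes
pod_scaled = [10 + 0.5 * (y - 1990) for y in years]  # POD * 20

print(trend_slope(years, lead_time))   # 0.5
print(trend_slope(years, pod_scaled))  # 0.5
```

Identical slopes on two scaled series are consistent with the two metrics being driven by the same underlying factor, which is the argument being made here about Doppler radar adoption.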

Maybe chasers have an impact on the False Alarm Ratio, the fraction of tornado warnings in which a tornado fails to develop. Certainly chasers have had an impact here, right? After all, with the number of chasers out there reporting what they see in real time back to the NWS, the NWS should be issuing fewer tornado warnings where no tornado occurs. Unfortunately, as the figure below indicates, the False Alarm Ratio, or FAR, has not changed since 1986. It has remained fairly constant at around 75%.

Yes, I admit that all of this “evidence” is circumstantial; maybe the improvements since 1990 are the result of chasers and not Doppler radar. (Maybe we should ask NWS forecasters whether they would rather have Doppler radar or chasers?) I also admit that there could be some chasers who really do chase with the intention of “saving lives” or “aiding the warning process”, but I think these chasers are few and far between. Instead, what I really see, and believe, are chasers who are interested in chasing for their own reasons (adrenaline, fame, money, etc.) and try to pass it off as a noble cause. If people really did chase for the sole reason of helping others, there would be no need for still and video cameras and no need for chasers to get as close as possible. Please don’t insult my intelligence by claiming to chase to “save lives”. I do not see paramedics and firefighters setting up tripods at the scenes of fires and accidents, so why do chasers, unless there is another motivation? At least be honest with yourselves and with those around you: say that you, like most other chasers, chase for personal reasons, and that if you can help the NWS out while doing it, you try to do so.

To summarize, I believe chasers have a limited impact on the warning process, and don’t appear to directly “save lives”. With that said, I have no problem with people who want to chase. My problem lies with people who chase and then are dishonest about, or at least misrepresent, their intentions.

Lastly, I want to make a distinction between “spotters” and “chasers”. Spotters have been around for a long time, dating back to the days of World War II (and possibly longer). The role of spotters used to be to warn military bases about approaching thunderstorms so that people could be removed from munition depots on the off-chance that lightning struck the munitions. Spotters tend to be tied to local communities and have relationships with the local officials involved in the decision-making process. I suspect that spotters have a much greater role in the warning process than chasers, although it appears (at least circumstantially) that this contribution is still at the margins.

Again, these are my interpretations of the data; I’m sure others will interpret the data in other ways, and I have no doubt that those of you who disagree with my interpretation will let me know.

Please read my response titled Comments on “Attempts at Assessing Chaser Contributions to the Warning Process” before posting comments. Thanks for understanding.

Tornado Emergency Success or Failure?

Clarification/Reiteration: This post is not meant to criticize the decisions of the NWS Wichita office. Based on the protocol they are expected to follow, they did exactly what they were supposed to do in this circumstance. They did an amazing job. This post is one in a series designed to keep the discussion going regarding the use of enhanced wording. It is designed to focus on the policy, not the forecast.

On Saturday evening, the National Weather Service in Wichita was receiving reports of a large, violent tornado moving toward the city of Conway Springs, KS. Based on the protocol of the Impacts Based Warnings pilot program, the local NWS office issued a Tornado Emergency for Conway Springs. (The text of this warning, including the line

...TORNADO EMERGENCY FOR CONWAY SPRINGS...

can be found below.) Fortunately for the residents of Conway Springs, a devastating tornado failed to occur. The original tornado that prompted the tornado emergency weakened and dissipated to the west-southwest of Conway Springs. A second tornado developed to the southeast of the original tornado, moved to the east-northeast, and missed Conway Springs to the east.

Above is the image from the NWS Wichita, KS homepage. (The original image can be found on this page.) It clearly shows that Conway Springs was spared from damage from the tornadoes. Based on how I have defined tornado emergencies in my research, this (potentially) would be considered a successful Tornado Emergency, as a tornado occurred within the bounds of the tornado polygon. (Its “potential” status depends on what tornado intensity threshold you want to use for a tornado emergency.)

However, since the Severe Weather Statement that announced the Tornado Emergency specifically stated,

...TORNADO EMERGENCY FOR CONWAY SPRINGS...

should this be considered a successful Tornado Emergency, even though the city was not hit? What if you lived in Conway Springs, heard from the local TV station that this was a Tornado Emergency and a “Catastrophic Warning”, as at least one Wichita television station said, and decided that your best bet was to get into your car and leave town? If you drove west, south, or east, chances are you might have driven into one of the two tornadoes, whereas staying at home would have spared you. Would that person consider this a successful Tornado Emergency? I would speculate that he or she would not, as I know I would not. All of these are scenarios and questions that we do not have answers for. This is why the Impacts Based Warnings pilot program should have engaged social scientists and conducted research before being tested operationally.

I’ll reiterate my position:

The current warning system has problems, but works remarkably well for a vast majority of the population, despite these problems. At the same time, no one in the severe weather community understands, let alone can articulate, what these problems actually are. Before attempting to implement solutions to address what the problems are perceived to be, the severe weather community needs to seek to understand what the problems are. Only then, once the problems are known and understood, can solutions be proposed, tested, revised, and tested again before being implemented into a warning system that has undoubtedly saved tens of thousands of lives throughout the course of its use.

The text of the Severe Weather Statement issuing a Tornado Emergency for Conway Springs, KS.


815
WWUS53 KICT 150236
SVSICT

SEVERE WEATHER STATEMENT
NATIONAL WEATHER SERVICE WICHITA KS
936 PM CDT SAT APR 14 2012

KSC191-150300-
/O.CON.KICT.TO.W.0029.000000T0000Z-120415T0300Z/
SUMNER KS-
936 PM CDT SAT APR 14 2012

...A TORNADO WARNING REMAINS IN EFFECT FOR NORTHERN SUMNER COUNTY
UNTIL 1000 PM CDT...

...TORNADO EMERGENCY FOR CONWAY SPRINGS...

AT 932 PM CDT...A CONFIRMED LARGE...VIOLENT AND EXTREMELY DANGEROUS
TORNADO WAS LOCATED 5 MILES SOUTHWEST OF CONWAY SPRINGS...AND MOVING
NORTHEAST AT 35 MPH.

THIS IS A PARTICULARLY DANGEROUS SITUATION.

HAZARD...DEADLY TORNADO.

SOURCE...SPOTTER CONFIRMED TORNADO.

IMPACT...THIS IS A LIFE THREATENING SITUATION. YOU COULD BE KILLED IF
         NOT UNDERGROUND OR IN A TORNADO SHELTER. COMPLETE
         DESTRUCTION OF ENTIRE NEIGHBORHOODS IS LIKELY. MANY WELL
         BUILT HOMES AND BUSINESSES WILL BE COMPLETELY SWEPT FROM
         THEIR FOUNDATIONS. DEBRIS WILL BLOCK MOST ROADWAYS. MASS
         DEVASTATION IS HIGHLY LIKELY MAKING THE AREA UNRECOGNIZABLE
         TO SURVIVORS.

LOCATIONS IMPACTED INCLUDE...
WELLINGTON...CONWAY SPRINGS...BELLE PLAINE...WELLINGTON AIRPORT AND
RIVERDALE.

PRECAUTIONARY/PREPAREDNESS ACTIONS...

TO REPEAT...A LARGE...EXTREMELY DANGEROUS AND POTENTIALLY DEADLY
TORNADO IS ON THE GROUND. TO PROTECT YOUR LIFE...TAKE COVER NOW. MOVE
TO AN INTERIOR ROOM ON THE LOWEST FLOOR OF A STURDY BUILDING. AVOID
WINDOWS. IF IN A MOBILE HOME...A VEHICLE OR OUTDOORS...MOVE TO THE
CLOSEST SUBSTANTIAL SHELTER AND PROTECT YOURSELF FROM FLYING DEBRIS.

TORNADOES ARE DIFFICULT TO SEE AND CONFIRM AT NIGHT. TAKE COVER NOW.

&&

LAT...LON 3748 9781 3747 9715 3730 9715 3706 9780
TIME...MOT...LOC 0235Z 213DEG 30KT 3735 9768

TORNADO...OBSERVED
TORNADO DAMAGE THREAT...CATASTROPHIC
HAIL...2.50IN

$$