# Problem 5: As unpredictable as the weather

Whether or not you intend to be an SQL programmer (ever, or for the rest of your life), you will spend a large amount of your time in IT modeling data. For now, throw the MVC meaning of model out the window, or if you happen to be on the 17th floor, just put it on the sill, and restrict yourself to the abstract problem of transforming real world data into a something you can read, search, and display.

Credit where credit is due: The idea for this problem comes to you from Martin Schweitzer, a computer scientist at Australia’s Bureau of Meteorology. Martin is one of the quietly competent people at the top end of computer science, and he has done many more favors for Scott and me than just send us a few ideas.

### Problem narrative:

Let’s say we are running a weather service and we store weather observations in a “database.” We have several thousand stations around the country recording observations. Some stations record only temperature, some record only rainfall, and some record a combination of both. There are currently about 15-20 variables that we may be interested in, and any station can record any combination of these variables … but not reliably, meaning that the rain gauge may fail without the thermocouple’s simultaneous demise.

Some stations submit daily records (example: daily rainfall), some submit every six hours (i.e., 4/day). Some record every three hours, and some record every minute (example: the automatic weather stations collecting temperature, wind speed, and wind direction). For convenience and reference, we also need to keep some daily values that are submitted by the stations or calculated after the fact, such as max and min temperature each day.

And then there are the derived data: monthly rainfall, as an example.

There are a few obvious options about how we may store these data in tables. We may want to organize by time:

• tables for 1-minute data
• tables for hourly data
• tables for daily data

Or we could go at it by data type:

• tables for temperature
• tables for rainfall
• tables for windspeed and direction

We could also just bail and store everything as a name-value pair (NVP).

### Essential questions:

Describe the aspects of the data that you would take into consideration when designing a storage repository.

Yes, we know this is not an easy problem, but the question is not to present your design in fifteen minutes or less, but to catalog the things you would think about; the questions that need an answer; the boundaries that constrain your choice.