Some SARS data (epi data)

(This is reposted from my web page, where it has been since the indicated date.)


As part of my work on the spatio-temporal spread of infectious diseases, I have had to use the Severe Acute Respiratory Syndrome (SARS) data available on the WHO website (here). This data comes in a challenging form: each page contains the report for a single day and the information presented evolves through time. The latter is to be expected: as the epidemic unfolded, the type of information that was required to deal with the problem evolved. However, in order to be able to use the data, some processing is necessary.

Formatting the data to make it easier to use was a somewhat long and rather annoying process. Since this is publicly available data, I thought I would save others interested in this type of problem the time it took me, by making the processed data public as well.


The WHO data was obtained during the course of the SARS epidemic. Some of the data was revised, but only within a few days of being initially posted. Therefore, compared to more thorough post-epidemic investigation and data gathering, the data here cannot be considered the most up to date.

Epochs in the data

There are three main phases (epochs) in the data, characterized by different variables being reported. The processed data contains all of these variables; a variable is left empty for records from an epoch in which it was not reported.
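To illustrate what "all variables present, empty when not reported" means in practice, here is a minimal Python sketch. The variable names (`cases`, `cumulative_cases`, `recovered`, and so on) and the values are invented for illustration only; they are not the actual column names or counts in the processed files, and this is not necessarily how the files were produced.

```python
import csv
import io

# Hypothetical records from two different epochs. Names and values are
# made up for illustration; they are NOT real SARS counts or real columns.
EPOCH_RECORDS = [
    {"date": "2003-03-17", "country": "A", "cases": "1", "deaths": "0"},
    {"date": "2003-04-15", "country": "A", "cumulative_cases": "5",
     "deaths": "1", "recovered": "2"},
]


def unify(records):
    """Return (fieldnames, rows) where every row has every field.

    Fields are listed in order of first appearance; a field that a
    record does not report is filled with the empty string.
    """
    fieldnames = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)
    rows = [{name: rec.get(name, "") for name in fieldnames} for rec in records]
    return fieldnames, rows


def to_csv(records):
    """Serialize the unified records as CSV text with a header row."""
    fieldnames, rows = unify(records)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(EPOCH_RECORDS))
```

When reading such a file back, an empty field should be treated as "not reported in this epoch" rather than as zero.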

Tidying up the data

As much of the information as possible was retained, with a few exceptions; see below. Dates were formatted consistently. The times at which the reports were posted are also available. Notes that relate to a given country on a given day are included as notes. The only notes omitted are those concerning the general handling of the data; they are still available in the full file.
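For readers who want to do a similar clean-up on their own scraped reports, consistent date formatting can be sketched roughly as follows in Python. The list of input formats is an assumption made for illustration (I am not claiming these are exactly the formats found in the WHO pages), and `normalize_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed input formats, for illustration only. Month names are parsed
# according to the current locale (English in the default C locale).
CANDIDATE_FORMATS = (
    "%d %B %Y",   # e.g. "17 March 2003"
    "%B %d, %Y",  # e.g. "March 17, 2003"
    "%Y-%m-%d",   # already in ISO form
)


def normalize_date(text):
    """Return the date in ISO 8601 form (YYYY-MM-DD).

    Each candidate format is tried in turn; a ValueError is raised if
    none of them matches, so unexpected formats are not silently lost.
    """
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")
```

Raising on an unknown format, rather than guessing, makes it easier to spot report pages whose layout changed during the epidemic.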

Data files

Several data files are available.