CDC WONDER API for Data Query Web Service
Access data in the WONDER online databases immediately with automated data queries in XML format over HTTP, for use in your own web pages or widgets.
Send a POST request to http://wonder.cdc.gov/controller/datarequest/[database ID] with one parameter named request_xml, whose value is the contents of the XML document containing the request. The result of the request is returned as an XML document. The results include the data table and applicable caveats and footnotes. All data use restrictions are enforced and still apply to these data.
The XML document that specifies the request has a specific format, and the valid parameters are specific to each online database. Examples and more information are shown below. Please contact us to learn more.
CDC WONDER API Examples and Description
- U.S. national cancer deaths (ICD-10 codes C00-D48) by year and by race, for the 5-year time period 2009-2013. Number of deaths, population estimates, crude death rates and age-adjusted death rates per 100,000 persons, 95% confidence intervals and standard errors for age-adjusted death rates.
- Example 1 Request
- Example 1 Response
- U.S. national injury deaths for persons age 18 and under, by Injury Intent and Injury Mechanism, for years 1999-2013. Number of deaths, population estimates, crude death rates.
- Example 2 Request
- Example 2 Response
The examples are simple text files, a matched pair for the request (“req.xml”) and response (“resp.xml”) for the D76 online database for Detailed Mortality 1999-2013, at http://wonder.cdc.gov/ucd-icd10.html.
The first example ("Example1") data query groups results by year (D76.V1-level1) and by race (D76.V8), where cause of death (O_ucd) is limited to ICD-10 codes (D76.V2) for cancer (C00-D48), and years (D76.V1) are limited to 2009-2013. It requests the default statistical measures for the number of deaths (D76.M1), population estimates (D76.M2) and crude death rates (D76.M3), plus optional measures for age-adjusted rates (D76.M4), standard errors for age-adjusted rates (D76.M41), and 95% confidence intervals for age-adjusted rates (D76.M42). The query asks to show suppressed values ("O_show_suppressed" is true), zero-value rows ("O_show_zeros" is true) and totals ("O_show_totals").
The "Example1" response file includes the results table under the "data-table" section. Each "r" defines a row and each "c" is a cell holding one column's value for that row. Columns, left to right: first the year, then the race, then the number of deaths, the population, the crude death rate, then the age-adjusted death rate, with a subordinate level "l" value that shows the confidence interval range inside and below the age-adjusted death rate, and finally the standard error for the age-adjusted rate. The 2009 row spans 5 subordinate rows (c r="5"): one row for each of the 4 race categories plus the sub-total for the year 2009. The sub-total for each year comes last within that year (c c="1"). The very last row in the table is the summary total for the rows in the table (c c="2"). The "data-table" is set up by year and by race, because those are the request parameters for "B_1" and "B_2" in the request file.
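The row and cell layout described above can be walked with a few lines of Python. This is only a sketch: the fragment below is made up for illustration (placeholder labels and values, not real results), and the attribute names "l" for a label and "v" for a value are assumptions that should be checked against the actual example response file. It treats c="1" as a sub-total row and c="2" as the summary total, following the two-level grouping used in the examples.

```python
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only. The values are placeholders, and the
# attribute names "l" (label) and "v" (value) are assumptions.
SAMPLE = """
<data-table>
  <r><c l="2009" r="2"/><c l="Category A"/><c v="10"/></r>
  <r><c l="Total" c="1"/><c v="10"/></r>
  <r><c l="Total" c="2"/><c v="10"/></r>
</data-table>
"""

def classify_rows(data_table_xml):
    """Return (kind, cell_texts) for every "r" row in a "data-table"."""
    table = ET.fromstring(data_table_xml)
    rows = []
    for r in table.iter("r"):
        cells = r.findall("c")
        # Collect the "c" attribute of each cell; it marks total rows.
        markers = {c.get("c") for c in cells}
        if "2" in markers:
            kind = "total"      # summary total for the whole table
        elif "1" in markers:
            kind = "subtotal"   # sub-total for one first-level category
        else:
            kind = "detail"
        texts = [c.get("l") or c.get("v") for c in cells]
        rows.append((kind, texts))
    return rows

for kind, texts in classify_rows(SAMPLE):
    print(kind, texts)
```

Separating detail rows from sub-totals and totals up front makes it harder to double-count when the rows are later summed or loaded into another tool.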
The second example ("Example2") groups results by Injury Intent (D76.V22) and by Injury Mechanism (D76.V23), and requests statistical summary measures for the number of deaths (D76.M1), the population (D76.M2) and the crude death rate (D76.M3), where age is limited to single-year age groups (the O_age value is D76.V52) for ages < 18, where cause of death is set to Injury Intent (the O_ucd value is D76.V22), and where D76.V22 is limited to the 5 injury categories only.
In the response for "Example2," the "data-table" section shows rows ("r") defined by 5 cells, one for each column, going from left to right. The left-most column shows the Injury Intent category, then the Injury Mechanism category, then the number of deaths, the population figure, and the crude death rate. The sub-total for each Injury Intent category (c c="1") is shown as the last row under that category. The very last row in the table is the summary total for the rows in the table (c c="2"). The "data-table" is set up by Injury Intent and by Injury Mechanism, because those are the request parameters for "B_1" and "B_2" in the request file.
These 2 example queries are for mortality data. The vital statistics online databases are open to API queries as of February 23, 2015. However, in keeping with the vital statistics policy for public data sharing, only national data are available for query by the API. Queries for mortality and births statistics from the National Vital Statistics System cannot limit or group results by any location field, such as Region, Division, State or County, or by Urbanization (urbanization categories map to specific geographic counties). For example, in the D76 online database for Detailed Mortality 1999-2013, the location fields are D76.V9, D76.V10 and D76.V27, and the urbanization fields are D76.V11 and D76.V19. These "sub-national" data fields cannot be grouped by or limited via the API, although these fields are available in the web application.
The request must be a POST which contains one parameter with the name "request_xml" (without the quotes) whose value is the contents of the XML request document, sent to http://wonder.cdc.gov/controller/datarequest/[database ID]. Each online database has a unique database ID, and a unique set of parameters and measures. In the examples shown here, the database ID is D76. D76 is the online database for Detailed Mortality 1999-2013, at http://wonder.cdc.gov/ucd-icd10.html. The data response page includes more information about the possible request parameters for the specific online database. The response file contains information that supports both drawing the results table and also re-populating the request form for WONDER, including labels, validations and "caveat" messages that may be displayed to the end-user. Thus the response file may give you a broader sense of the metadata that supports possible variable/value pairs in forming a potential data request. Currently, this XML response file doesn't have charts or maps. WONDER uses the data in this response file to generate the input to the charting and mapping functions, after the user clicks on the chart or map tabs.
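As a rough sketch of the request described above, the following Python uses only the standard library to POST a request document to the URL for one database ID. It assumes the "request_xml" parameter is sent as a URL-encoded form body and that the response decodes as UTF-8; neither detail is stated on this page. The helper names (build_request, send_query, run_in_series) are hypothetical. The optional pause reflects the suggestion in the FAQ to send automated queries one at a time, about 2 minutes apart.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://wonder.cdc.gov/controller/datarequest"

def build_request(database_id, request_xml):
    """Build (but do not send) the POST for one database ID."""
    # Assumption: the single parameter is sent as a URL-encoded form body.
    body = urlencode({"request_xml": request_xml}).encode("utf-8")
    return Request(f"{BASE_URL}/{database_id}", data=body, method="POST")

def send_query(database_id, request_xml, timeout=60):
    """Send one query and return the XML response as text."""
    request = build_request(database_id, request_xml)
    with urlopen(request, timeout=timeout) as resp:
        # Assumption: the response body decodes as UTF-8.
        return resp.read().decode("utf-8")

def run_in_series(database_id, request_docs, pause_seconds=120,
                  send=send_query, sleep=time.sleep):
    """Send queries one at a time, pausing between consecutive queries."""
    results = []
    for i, doc in enumerate(request_docs):
        if i > 0:
            sleep(pause_seconds)  # give the server time to recover
        results.append(send(database_id, doc))
    return results
```

For a single query against the Detailed Mortality database, the call would be send_query("D76", request_text), where request_text holds the contents of a request file such as the Example 1 request. The send and sleep arguments of run_in_series exist only so the pacing logic can be exercised without touching the network.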
Frequently asked questions for the API:
1) How to send a query? The URL to send the request to is the server name (http://wonder.cdc.gov/controller/datarequest) followed by the dataset code or ID for the specific online database referenced in your query. For the Detailed Mortality 1999-2013 online database, the dataset ID is "D76", as in "http://wonder.cdc.gov/controller/datarequest/D76". The request must be a POST which contains one parameter with the name "request_xml" (without the quotes), whose value is the contents of the XML request document. The WONDER server will then reply with an XML document that contains the results.
2) How to form valid queries with the XML? The valid metadata describing fields in WONDER varies per dataset. If you "view source" for any single "Request Form" page, you will see those data items in the select boxes in the form. However, our Finder control makes calls back to the server to get data for the more complex hierarchical code sets, such as ICD chapter/sub-chapter/causes and Regions/States/Counties. You may wish to run a query (via the user interface) that "groups by" the desired fields, and then export the data in order to see the mapping of code value to label for each category in that field. This approach is particularly helpful for hierarchical lists, such as the International Classification of Diseases (ICD), Regions/States/Counties, Vaccine Type/Vaccine, or Year/Month lists. We haven't yet built a system for external partners to record or generate desired XML queries. If you have a specific focus, describe the desired queries and we may record some queries for you and send the XML to you for re-use.
3) What is the key to understanding the query parameters? In the request file:
- B_ are "by-variables," the parameters selected in the "Group Results By" and the "And By" drop-down lists in the "Request Form." These "by-variables" are the cross-tabulations, stratifications or indexes to the query results. Expect the results data table to show a row for each category in the by-variables, and a column for each measure. For example, if you wish to compare data by sex, then "group results by" gender, to get a row for females and a row for males in the output.
- M_ are measures to return: the default measures plus any optional measures.
- V_ are variable values to limit in the "where" clause of the query, found in multiple select list boxes and advanced finder text entry boxes in the "Request Form."
- F_ are values highlighted in a "Finder" control for hierarchical lists, such as the "Regions/Divisions/States/Counties" hierarchical list.
- I_ are the contents of the "Currently selected" information areas next to "Finder" controls in the "Request Form."
- VM_ are values for non-standard age-adjusted rates (see the mortality online databases).
- O_ are other parameters, such as radio buttons, checkboxes, and lists that are not data categories (for example, "Calculate Rates Per"). Among the things an "O_" may specify:
  - "O_age" sets which field is used for age groups in the Detailed Mortality 1999-2013 (D76) online database example, such as whether the radio button is set to use the single-year age groups;
  - "O_ucd" sets which field to use for underlying cause of death in the Detailed Mortality 1999-2013 (D76) online database example, such as whether the radio button is set to use the ICD chapters, the ICD 113 selected causes, or the ICD 130 selected causes for infants;
  - "O_V1_fmode" in the example files sets the "mode" for "Finder" control selections that specify values under the specified data variable field. In the examples, the "V1" field refers to the D76.V1 "Year/Month" field. The "fmode" value specifies whether to pull selections from the Finder's select list or from the "Advanced Finder" text entries. When "freg" is the value, the "Finder" control is in "regular" mode, and the "F_" values are used as query criteria. Change "freg" to "fadv" to use the "V_" values, per an advanced mode "Finder" control. In the example files, the "O_V1_fmode" value therefore controls whether the values specified under "F_D76.V1" or under "V_D76.V1" are used as query criteria. When a "Finder" is in regular mode, the values under "F_D76.V1" specify each year separately. When a "Finder" is in advanced mode, the value selections are pulled from a text area in which each parameter value starts on a new line; for example, a request for data in years 2009-2011 is entered under V_D76.V1 as a text blob.
4) How often to send automated queries via the API? If you are running a single robot to "data mine" the system, please post the queries in a series, one at a time. Please don't run multiple instances simultaneously. Firing a query every 2 minutes gives our system good recovery time. If queries are embedded in your own web pages and called by (human) users as needed, then don't worry about timing.
5) How often are data updated? Most of the data in the WONDER online databases are updated annually. VAERS case reports update monthly. Some data updates result in a new dataset ID, new data fields, or new measures for output. For many online databases, the database ID will change when the next "vintage" or series of data spanning several years is released. For example, the case reports for previous years and the population estimates may be updated with revisions when the most recent data are added. To stay abreast of new releases, contact us and ask to be put on our notification list for data updates.
6) How to get line-listed or record-level data? Queries to WONDER do not pull a copy of the entire record-level database. Instead, the results are summary statistics that meet the query criteria. Some of the datasets in WONDER contain one case per record, such as the TB or the VAERS case reports. Other data are summarized before being loaded into WONDER. All of the "security" issues, such as data query validation and data suppression rules, are applied internally by the WONDER system; the XML request cannot access secure data. If you wish to get more details or record-level data for a specific data collection, please contact us and we'll refer you to a contact for the desired data source.
7) What else to keep in mind? "Fine print:"
- Credit: If you are running the queries to pull the data into another data analysis system, when the data are "published" in any form, please be sure to include WONDER in the credit ("powered by CDC WONDER" is fine) and also give the official data source citations in the "Suggested Citation" data that comes packaged with the result sets. Some of those citations mention WONDER; in that case, you need not mention it twice.
- Notes: Please be sure to include the footnotes and the "caveats" or explanatory notes that appear below the results set. Those messages are important to the original data providers as part of public data release.
- Suppressions: Also, please be sure to observe the suppression constraints if you re-assemble the data. Suppression constraints vary per online database and are determined by the data provider; we merely implement these rules in WONDER. The online help for any specific online database describes the suppression constraints imposed, or feel free to ask us for any details. All data on CDC WONDER are covered by the data restrictions described at http://wonder.cdc.gov/datause.html. For more information on the Detailed Mortality confidentiality constraints, see the "About" tab on the user interface or the "About" section in the response XML file, and also see Assurance of Confidentiality for Vital Statistics. Please note that only national data are accessible to API queries for data from the National Vital Statistics System.
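To make the parameter prefixes from question 3 more concrete, here is a hedged Python sketch that assembles a request document from name/value pairs. The element names ("request-parameters", "parameter", "name", "value"), the use of several "value" elements for a multi-valued parameter, and the newline-separated form of the advanced mode text are all assumptions here; compare them with the example request files before relying on them. The helper names are hypothetical, and the dictionary shown is not a complete valid query.

```python
import xml.etree.ElementTree as ET

def build_request_xml(params):
    """Serialize {name: value or list of values} as a request document.

    Assumption: each parameter is a "parameter" element holding one
    "name" and one or more "value" children.
    """
    root = ET.Element("request-parameters")
    for name, values in params.items():
        parameter = ET.SubElement(root, "parameter")
        ET.SubElement(parameter, "name").text = name
        for value in ([values] if isinstance(values, str) else values):
            ET.SubElement(parameter, "value").text = value
    return ET.tostring(root, encoding="unicode")

def year_params(years, advanced=False):
    """Year criteria for D76.V1 in regular (freg) or advanced (fadv) mode."""
    years = [str(y) for y in years]
    if advanced:
        # Advanced mode: one text value, each year starting on a new line.
        return {"O_V1_fmode": "fadv", "V_D76.V1": "\n".join(years)}
    # Regular mode: each year is specified separately under F_D76.V1.
    return {"O_V1_fmode": "freg", "F_D76.V1": years}

params = {
    "B_1": "D76.V1-level1",   # group results by year
    "B_2": "D76.V8",          # and by race
    "O_show_totals": "true",  # other (non-data) option
    **year_params(range(2009, 2014)),
}
print(build_request_xml(params))
```

Keeping the regular and advanced year criteria in one helper means "O_V1_fmode" always matches whichever of "F_D76.V1" or "V_D76.V1" is actually populated.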