Monday, August 11, 2008

A fundamental requirement

If we're going to have a database of health care data, then the codes we use to refer to diseases ought to be in a form that we can import into a database.

In other words, we require a file that associates each code with a disease in a form from which a computer can reliably extract and work with the codes.

For example, a common machine-readable format is a comma- or tab-delimited file:

12345,Coronary artery disease
23456,Diabetes mellitus
34567,Asthma
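
To show how little work such a file demands, here is a minimal sketch in Python that loads the three example lines above into a database table. The table name disease_codes, its column names, and the load_codes helper are my own assumptions for illustration, and the codes are the made-up ones from the example, not real ICD codes.

```python
import csv
import io
import sqlite3

# Sample rows copied from the example above; illustrative codes, not real ICD codes.
SAMPLE = "12345,Coronary artery disease\n23456,Diabetes mellitus\n34567,Asthma\n"


def load_codes(conn, text):
    """Create a hypothetical disease_codes table and load code,description rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS disease_codes ("
        "code TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    # Codes are stored as text so that any leading zeros or letters would survive.
    rows = [(code.strip(), desc.strip()) for code, desc in csv.reader(io.StringIO(text))]
    conn.executemany(
        "INSERT INTO disease_codes (code, description) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
loaded = load_codes(conn, SAMPLE)
lookup = conn.execute(
    "SELECT description FROM disease_codes WHERE code = ?", ("23456",)
).fetchone()[0]
print(loaded, lookup)
```

With a delimited file, the whole import is a few lines, and each code can then be looked up with an ordinary query.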

However, neither ICD-9-CM nor ICD-10-CM is available in such a format. In fact, both disease classifications are released in file formats that make it impossible to import codes directly into database tables.

The National Center for Health Statistics maintains the disease classifications in both ICD-9-CM and ICD-10-CM. (ICD-9-CM also includes a classification of procedures, which is a long story for another post.) It releases ICD-9-CM in rich text format (RTF) and ICD-10-CM in portable document format (PDF).

Here is a portion of ICD-10-CM from its 2,392-page, 23 MB PDF file:



The upshot is that you cannot get either ICD-9-CM or ICD-10-CM from its source in a format that you can import into a database table.

Thus, neither ICD-9-CM nor ICD-10-CM meets the most basic requirement for moving health care information into the modern era: we need disease codes in a format for use in computers, not for printing!
