ICD-10-CM and Word Processing

We noted in one of our first posts that the National Center for Health Statistics (NCHS) releases ICD-10-CM as a 23 MB Portable Document Format (PDF) document. And we noted that ICD-10-CM therefore fails to meet a fundamental requirement for a modern diagnosis coding system, namely that we can use it in our computer systems directly. That would require at the very least a machine-processable text file, such as comma-separated values or tab-delimited text, instead of a file format meant for humans to read or print.

We have learned that in fact, the NCHS uses a word processor to create and maintain ICD-10-CM. The following quote is from a presentation that Dr. Chris Chute gave as part of a seminar series of the National Center for Biomedical Ontology:

...the American 10 clinical modification will migrate to the tools that we're using to build ICD-11, benefiting from a better environment. They're using a word processor now...kind of pathetic actually...

We agree: that is pathetic. To hear it yourself, go to a point approximately 32 minutes into the presentation and listen from there.

So, if modern tools exist now for creation and ongoing maintenance of the next version of ICD, why is NCHS still using a word processor?

Dr. Chute does go on to say that NCHS will migrate to these tools and ICD-10-CM will "evolve to become identical to ICD-11". But not until after 2015, when ICD-11 is finalized.

So, for the next 7 years at least, NCHS will continue to maintain ICD-10-CM with a word processor, and release it as a giant text blob from which one cannot automatically and reliably extract the set of codes it contains for use in a database or spreadsheet.
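To make the contrast concrete, here is a minimal sketch of what a machine-processable release would allow. It assumes a tab-delimited file with `code` and `description` columns; NCHS publishes no such file, so the layout and the sample text below are hypothetical. The three rows reuse codes and titles quoted in our posts.

```python
import csv
import io

# Hypothetical tab-delimited release (an assumption for illustration;
# NCHS does not publish ICD-10-CM in this form). The rows reuse codes
# and titles quoted in these posts.
SAMPLE_RELEASE = (
    "code\tdescription\n"
    "A01.02\tTyphoid fever with heart involvement\n"
    "G40.3\tGeneralized idiopathic epilepsy and epileptic syndromes\n"
    "M48.4\tFatigue fracture of vertebra\n"
)


def load_codes(text):
    """Return a dict that maps each code to its description."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["code"]: row["description"] for row in reader}


codes = load_codes(SAMPLE_RELEASE)
print(len(codes), "codes loaded")
print(codes["A01.02"])
```

With a file like this, loading the full code set into a database or spreadsheet would take a few lines. With a PDF, there is no comparably reliable route.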

Thus, we have additional evidence that ICD-10-CM is based on archaic practices and technology. And $11 billion or more to upgrade to something archaic is a waste of money.

An Even More Costly Prerequisite

In our last post, we mentioned that a standard called 5010 must be in place before ICD-10-CM. The reason is that 5010 replaces a standard that cannot accommodate ICD-10-CM.

5010 is a standard for submitting health care insurance claims.

First, a word about health care insurance. If you have health insurance and receive care from a doctor and/or a hospital, they submit a claim on your behalf to your insurance company. The insurance company pays the doctor and/or hospital directly for the services they provided. You are spared the hassle of receiving a bill, submitting a claim to the insurance company yourself, receiving the check in the mail, then sending it to the doctor and/or hospital to pay the bill. The doctor and hospital benefit as well, since they receive their payment in a more direct, timely, and reliable manner.

This description simplifies things quite a bit. But it, together with the fact that doctors and hospitals must put billing diagnoses on the claim form, is sufficient to explain the need for 5010.

The whole process is even more efficient if doctors and hospitals submit claims electronically from their computer to the insurance company's computer. Because there are over 100,000 physician practices and hundreds of insurance companies--all of whom use computer systems from hundreds of software companies--the process of submitting claims electronically is made even simpler if all these entities use a standard electronic claim form. Any doctor or hospital using any standard-compliant computer system can submit a claim to any insurance company also using a standard-compliant system.

Today, this standard is 4010A. A law passed by the U.S. Congress in 1996 (called the Health Insurance Portability and Accountability Act) gave the Department of Health and Human Services (HHS) the power to mandate that all claims submitted electronically by organizations "covered" by this law (and nearly every doctor and hospital is "covered") use this standard. And HHS did so. And the health care system had to comply.

All told, implementation of 4010A cost the health care industry an estimated $28 billion. Yes, billion with a 'b'. And that's not our estimate; it's the estimate of HHS, which has a bias towards underestimating the impact of its regulations on the industry so it can keep imposing more regulations. In its impact analysis on the rule to adopt 4010, HHS states:

...Although we cannot determine the specific economic impact of the standards being proposed in this rule (and individually each standard may not have a significant impact), the overall impact analysis makes clear that, collectively, all the standards will have a significant impact of over $100 million on the economy.

$100 million?

Well, 5010 is an 'upgrade' to 4010A. And to use ICD-10-CM as a coding system for billing diagnoses on claim forms, it is a requirement to upgrade to 5010.

Why can't we use ICD-10-CM codes on 4010A?

Because the 4010A form has a limited-length field for diagnosis codes. It limits the length of diagnosis codes to a maximum of 5 characters (warning: pdf, and see page 10 for the limit), the maximum length of an ICD-9-CM code. Why didn't the designers of 4010A allow for longer field lengths, knowing that HHS and others were anticipating an upgrade to ICD-9-CM? We don't know.

However, the maximum length of an ICD-10-CM code is 7 characters. So, there must be a change to the standard electronic claims form or we can't use ICD-10-CM. And that change is 5010, which fixes a number of deficiencies of 4010 in addition to the limit on diagnosis codes.
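The length problem is easy to illustrate. The sketch below uses the two limits stated above (5 for 4010A, 7 for ICD-10-CM) and assumes that the decimal point is dropped before a code goes into the claim field. That assumption, the constant names, and the function name are ours and are not taken from either standard.

```python
# Field limits as stated in this post.
MAX_LEN_4010A = 5      # longest diagnosis code the 4010A field accepts
MAX_LEN_ICD10CM = 7    # longest possible ICD-10-CM code


def fits(code, max_len):
    """True if the code fits the field once its decimal point is removed.

    Dropping the decimal point is an assumption made for this sketch.
    """
    return len(code.replace(".", "")) <= max_len


for code in ["A01.02", "E11.321"]:
    print(code, fits(code, MAX_LEN_4010A), fits(code, MAX_LEN_ICD10CM))
```

Under these assumptions A01.02 still fits in a 5-character field, but E11.321 already needs six characters, so any field sized for ICD-9-CM rejects it.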

Which finally brings us to the cost. How much will it cost the industry to upgrade from 4010A to 5010?

By HHS' own estimate in the Notice of Proposed Rulemaking (NPRM) for 5010 (a different NPRM from the one mandating the upgrade to ICD-10-CM), it will cost anywhere from $5.6 to $11.2 billion (yes, with a 'b' again).

Here is a breakdown of the costs to the industry of adopting 4010A and HHS' estimates of the costs for upgrading to 5010 (numbers represent millions of dollars):

*Includes conversion to 5010 and another standard called D.0

We agree that it is reasonable to conclude that, because 4010A was the first time the industry implemented a standard electronic claims form, the cost of an upgrade to 5010 will be lower than the costs of adopting 4010A in the first place.

However, is it reasonable to assume a 60-80% reduction in costs?

Well, the Blue Cross and Blue Shield Association has identified approximately 850 complex changes that 5010 makes to 4010A. They also note (warning: ppt) that 5010 is a suite of standards for nine types of electronic claims transactions, and that a 5010 implementation guide for just one of the nine transactions is 700 pages long, with at least one modification made on every single page.

A reasonable estimate for a more modest, first upgrade is probably a 50% reduction.

But a 60-80% reduction for an aggressive, complex upgrade? We don't think so.
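For readers who want to check the percentages, here is the arithmetic as a short sketch. The dollar figures are the HHS estimates quoted above, in millions; the function names are ours.

```python
# HHS estimates quoted in this post, in millions of dollars.
COST_4010A = 28_000
COST_5010_LOW = 5_600
COST_5010_HIGH = 11_200


def percent_reduction(new_cost, old_cost):
    """Percentage by which new_cost falls below old_cost."""
    return 100 * (old_cost - new_cost) / old_cost


def cost_after_reduction(old_cost, percent):
    """Cost that remains after reducing old_cost by the given percentage."""
    return old_cost * (100 - percent) / 100


print(percent_reduction(COST_5010_HIGH, COST_4010A))  # 60.0
print(percent_reduction(COST_5010_LOW, COST_4010A))   # 80.0
print(cost_after_reduction(COST_4010A, 50))           # 14000.0
```

So the HHS range corresponds to a 60-80% reduction, while a 50% reduction from the 4010A figure would put the cost of 5010 at about $14 billion.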

The update to 5010 will most likely cost the industry well over $10 billion. Even if the cost of an electronic medical record (EMR) were $100,000 per physician, $10 billion is enough to equip 100,000 physicians with one.

Thus, the prerequisite to ICD-10-CM is over $10 billion and ICD-10-CM itself will cost approximately $1 billion or more to implement, for a total of >=$11 billion to upgrade our diagnosis coding system in the United States.

If we're going to spend that much money upgrading our diagnosis coding system, shouldn't it be state of the art?

HHS Ignores Advice It Asked For

In the Notice of Proposed Rulemaking or NPRM (warning: pdf) to mandate a switch to ICD-10-CM from ICD-9-CM for classifying diagnoses, the Department of Health and Human Services (HHS) mentions, on page number 49802 (the rule is in the Federal Register), that the Workgroup on Electronic Data Interchange (WEDI) sent the Secretary of HHS a letter on May 31, 2006.

The mention of this letter is significant because:

1. HHS is required by law to consult with WEDI on adoption of new code sets.
2. WEDI held a forum in April of 2006 to determine when and how to adopt ICD-10-CM.
3. The rule makes no mention of the recommendations of this letter.
4. The rule makes recommendations that directly conflict with the recommendations in the letter.

Perhaps Congress requires HHS to consult with WEDI because it recognizes that bureaucrats are wont to run roughshod over industry. If so, the NPRM is a good example of just such bureaucratic tendencies.

The official letter that WEDI sent to the Secretary of HHS is not available publicly: one must have a login to the WEDI web site to access it. Nevertheless, there are two publicly available documents that summarize the recommendations:

1. Co-Chair Report on ICD-10 Forum Discussion (warning: pdf)
2. WEDI ICD 10 Forum Recommendation to HHS Final Draft (warning: pdf)

We don't know if the latter truly represents the version that WEDI sent to the Secretary. For one thing, it does not even have a date.

However, the key recommendations from both documents are the same, and they are clear.

One recommendation that HHS blatantly ignores in its NPRM (it does not even mention the recommendation, let alone try to rebut it) is that implementation of another standard--known as 5010--should occur first. The NPRM in effect requires the industry to adopt 5010 and ICD-10-CM concurrently, even though it sets the deadline for 5010 at April 1, 2010 and the deadline for ICD-10-CM at October 1, 2011.

Now it may seem that 5010 precedes ICD-10-CM. However, to meet those deadlines, the industry will have to start working on both standards now, and thus work on them concurrently.

The WEDI recommendation clearly states: This upgrade [to 5010] is too significant to be done in conjunction with ICD-10-CM and ICD-10-PCS conversion.

No wonder HHS doesn't mention this recommendation in the NPRM. It is too inconvenient. And it is too compelling to confront directly.

In a story about the effect of implementing 5010, the Blue Cross and Blue Shield Association notes that 5010 makes 850 complex changes to its predecessor standard.

Also, in 2007 WEDI and the North Carolina Healthcare Information and Communications Alliance (NCHICA) developed a detailed project plan that outlines all the steps the industry must take and milestones it must meet to adopt 5010. They derived a date of 2014 for final implementation of 5010 without ICD-10.

Yet HHS wants to adopt 5010 and ICD-10-CM by 2011?

WEDI is holding a policy advisory group forum from September 9-11 (just after this post) to address the ramifications of the NPRM on 5010 and ICD-10-CM. Let's hope they take HHS to task for ignoring the advice they gave it--advice that HHS by law is required to take into account.

Do We Need 290 Codes for Diabetes Mellitus?

Despite the emerging genomics revolution that promises to identify the genetic and molecular basis of disease with unprecedented precision, the state-of-the-art science on the nature of diabetes mellitus has identified fewer than 50 subtypes of diabetes mellitus (for example, see the paper Diagnosis and Classification of Diabetes Mellitus).

Nevertheless, ICD-10-CM has approximately 290 codes for diabetes mellitus, not counting diabetes mellitus that arises during the course of pregnancy (also known as gestational diabetes mellitus). We say approximately because again, ICD-10-CM comes as a text document in a pdf. We counted twice and got 289 codes the first time, and 291 codes the second time. These codes span a full 21 pages of the ICD-10-CM document.

So, if we know there aren't 290 types of diabetes mellitus, how does ICD-10-CM derive 290 codes for it?

Combination codes.

A combination code allows the medical records coder to assign several diagnoses (or, more properly, classes of diagnoses) to a patient in one fell swoop. (An entire profession has evolved to review the medical record, apply the rules for assigning billing codes, and create the final set of billing codes submitted to the third-party payer for payment.) In addition, a combination code helps to avoid the problem of choosing one diagnosis category as the "primary diagnosis". The coder may assign a combination code as the primary diagnosis, and voila, multiple diagnosis categories are all at once the primary diagnosis, with no messy decisions about which one was the most important or proximate cause of the medical care provided to the patient.

Here is an example of the combination codes created under the heading of diabetes mellitus:

Code E11.321 is a combination of two diagnoses, a level of severity, and a physical manifestation of one of the diagnoses: type 2 diabetes mellitus, nonproliferative diabetic retinopathy, mild, and macular edema, respectively. All the possible combinations of types of diabetic retinopathy, severity, and presence/absence of macular edema are present under E11.3 Type 2 diabetes mellitus with ophthalmic complications.

Now, suppose you are a researcher who studies diabetic retinopathy to develop new treatments for this disease, which is the leading cause of blindness in the United States. Suppose further that for a particular study, you were interested in finding all the patients in your data set with nonproliferative diabetic retinopathy.

Instead of searching for all patients with just a single code that represents nonproliferative diabetic retinopathy, you have to locate in the ICD-10-CM pdf all the ICD-10-CM codes that include nonproliferative diabetic retinopathy. Then, you must search on all the codes you locate in this manner.

Nonproliferative diabetic retinopathy is combined with other diagnoses in approximately 50 ICD-10-CM codes. If you miss one, you'll fail to find patients who are potentially eligible for your research study. And since ICD-10-CM is a giant text blob, you cannot rely on the computer to find all 50 codes automatically for you. You have to search the pdf manually.
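If the code set were available as a table, the search would look something like the sketch below. The table is hypothetical: E11.321 is described above, while the other two rows and all three description strings are our assumptions for illustration and should be checked against the official document.

```python
# Hypothetical machine-readable code table. Only E11.321 is discussed in
# this post; the other rows and the exact wording of every description
# are assumptions made for illustration.
CODE_TABLE = {
    "E11.321": "Type 2 diabetes mellitus with mild nonproliferative "
               "diabetic retinopathy with macular edema",
    "E11.329": "Type 2 diabetes mellitus with mild nonproliferative "
               "diabetic retinopathy without macular edema",
    "E11.351": "Type 2 diabetes mellitus with proliferative "
               "diabetic retinopathy with macular edema",
}


def codes_mentioning(term, table):
    """Return, in sorted order, the codes whose description contains term."""
    term = term.lower()
    return sorted(code for code, desc in table.items() if term in desc.lower())


npdr_codes = codes_mentioning("nonproliferative diabetic retinopathy", CODE_TABLE)
print(npdr_codes)
```

Even this is only a text match on descriptions, so it depends on consistent wording across all the combination codes. It is still far better than paging through a PDF by hand.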

Combination codes make it hard to use ICD-10-CM encoded data for epidemiology, clinical research, decision support, and any number of other so-called "secondary" uses of medical records data (called secondary uses because the primary use is for the actual care of the patient).

Wouldn't it be better to have one code for one diagnosis? And to assign as many codes as the patient has diagnoses?
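Here is a rough sketch of what the one-code-per-diagnosis approach would mean for the researcher in our example. The identifiers (`T2DM`, `NPDR`, and so on) and the patient records are invented for illustration and do not come from any real coding system.

```python
# Hypothetical records: each patient carries a set of single-diagnosis
# codes. The identifiers are invented for this sketch.
patients = {
    "patient-1": {"T2DM", "NPDR", "MACULAR_EDEMA"},
    "patient-2": {"T2DM", "PDR"},
    "patient-3": {"T1DM", "NPDR"},
}


def with_diagnosis(dx, records):
    """Return the sorted ids of patients whose code set includes dx."""
    return sorted(pid for pid, dx_codes in records.items() if dx in dx_codes)


print(with_diagnosis("NPDR", patients))
```

The query names one code, and it finds every patient with that diagnosis regardless of the type of diabetes or the other conditions recorded alongside it.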

Is ICD-10-CM really a Diagnosis Coding System?

The answer, perhaps surprising, is no, it is not. ICD-10-CM, like its predecessor, ICD-9-CM, provides codes for categories or classes of diagnoses, but not individual diagnoses.

For example, on page 3 of the 23MB pdf (warning: pdf) that represents ICD-10-CM in its official release format, we find A01.02 Typhoid fever with heart involvement. In the class represented by this code, the file lists two diagnoses:

1. Typhoid endocarditis
2. Typhoid myocarditis

The two diagnoses of typhoid endocarditis and typhoid myocarditis do NOT have their own code in ICD-10-CM. The code A01.02 represents a class of diagnoses, into which at least two diagnoses fall that have no code themselves.

Thus, we see that ICD-10-CM, true to its name, is a classification system. It does not purport to provide codes for individual diagnoses.
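The consequence for data is that coding is a many-to-one mapping: once a diagnosis is coded, the specific diagnosis cannot be recovered from the code. A small sketch using the A01.02 example above (the dictionary and function names are ours):

```python
# The A01.02 class and its two listed diagnoses, as described in this post.
CLASS_MEMBERS = {
    "A01.02": ["Typhoid endocarditis", "Typhoid myocarditis"],
}

# Invert the mapping: each diagnosis points to the class that contains it.
DIAGNOSIS_TO_CODE = {
    dx: code for code, members in CLASS_MEMBERS.items() for dx in members
}


def encode(diagnosis):
    """Return the class code under which a diagnosis is recorded."""
    return DIAGNOSIS_TO_CODE[diagnosis]


def decode(code):
    """Return every diagnosis that the code could stand for."""
    return CLASS_MEMBERS[code]


code = encode("Typhoid endocarditis")
print(code, "->", decode(code))
```

Encoding typhoid endocarditis yields A01.02, but decoding A01.02 returns both diagnoses, so a record coded this way no longer says which one the patient had.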

A more extreme example is G40.3 Generalized idiopathic epilepsy and epileptic syndromes. Here is a snapshot taken from the ICD-10-CM pdf:

Thus, G40.3 is a class of diagnoses that contains no fewer than 13 individual diagnoses.

Because ICD-10-CM tries to provide a class for every possible diagnosis, present or future, it creates a partition of the diagnosis space. As a result, it requires complex inclusion and exclusion criteria to determine into which class or “pigeonhole” each diagnosis falls. These criteria often make it difficult to assign the correct code to a particular patient.

For example, C49 Malignant neoplasm of other connective and soft tissue—and its 15 subclasses—all have the following list of inclusion and exclusion criteria, which span the page break:

Note that, like the rest of ICD-10-CM, none of these inclusion and exclusion criteria are available in a format we can import into a database. Thus, before we can write programs that manipulate these criteria to ensure correct coding, we have to manually type them into our database tables, an error-prone and time-consuming process.

Because of the complexity of assigning a diagnosis to the correct ICD-9-CM category (a situation not ameliorated by ICD-10-CM), the accuracy of data coded with ICD-9-CM suffers. For example, one study found that up to 15-20% of patients classified as having acute stroke did not in fact have a stroke.

Another artifact of the partitional nature of ICD-9-CM and ICD-10-CM is that they both contain wastebasket categories, into which ‘everything else’ under a particular heading goes. For example,

The problem with these types of classes is that their semantics change over time.

A real-world example of such a change occurred in ICD-9-CM with respect to coding of viral hepatitis. The following chart shows a decline in the incidence of Hepatitis, unspecified beginning about 1981.

This decline coincided with the introduction of a code for the class of diagnoses of Hepatitis, Non-A, Non-B. Thus, the true incidence of diseases classified as Hepatitis, unspecified did not change. Rather, the definition of the class itself changed.

These types of wastebasket categories wreak havoc with accurate disease statistics over time. The history of ICD-9-CM is that important diseases such as AIDS and Hepatitis C initially get captured by wastebasket categories, then receive their own codes as they are defined by medical science. The statistics of the incidence and prevalence of these diseases subsequently become quite distorted and difficult to manage.
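A toy simulation shows the effect. The case counts below are hypothetical and are held constant from one year to the next; the only thing that changes is which classes have their own code. The function names and the "Other hepatitis" label are ours.

```python
from collections import Counter

WASTEBASKET = "Hepatitis, unspecified"


def code_for(diagnosis, specific_codes):
    """Use the diagnosis's own class if one exists, else the wastebasket."""
    return diagnosis if diagnosis in specific_codes else WASTEBASKET


def yearly_counts(true_cases, specific_codes):
    """Tally reported cases per class for one year."""
    counts = Counter()
    for diagnosis, n in true_cases.items():
        counts[code_for(diagnosis, specific_codes)] += n
    return counts


# Hypothetical case mix, identical in both years.
TRUE_CASES = {
    "Hepatitis A": 300,
    "Hepatitis B": 200,
    "Hepatitis, Non-A, Non-B": 100,
    "Other hepatitis": 50,
}

before = yearly_counts(TRUE_CASES, {"Hepatitis A", "Hepatitis B"})
after = yearly_counts(
    TRUE_CASES, {"Hepatitis A", "Hepatitis B", "Hepatitis, Non-A, Non-B"}
)
print(before[WASTEBASKET], after[WASTEBASKET])
```

In this made-up example the reported count for Hepatitis, unspecified drops from 150 to 50 even though not a single patient's disease changed.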

Yet another problem with ICD-10-CM classes or categories is that they often have criteria that have nothing to do with diagnoses or disease, but instead concern the timing and nature of the treatment of disease. For example, under the class M48.4 Fatigue fracture of vertebra, we find a requirement to add a 7th character to the code based on (1) whether it is the patient’s first visit to the health care system for such fractures, or a subsequent visit; (2) the rapidity with which the fractures have healed; and (3) whether any complications of such fractures are present:

Wouldn’t it be simpler to switch to a diagnosis coding system where each diagnosis receives its own code?