Three design principles for electronic medical records

Design principles for electronic medical records have, to date, focused on sociotechnical issues, such as privacy and interoperability, and on human factors issues, including usability principles (e.g., the HIMSS usability guidelines). In this document, I propose several architectural principles that future systems should respect. These principles are motivated by the need for systems to scale with the expected growth of the medical record: a patient's electronic medical record will grow over time as more information is recorded. The paper world avoided this problem because all the information relevant to a patient's care rarely appeared in the same place at the same time, and was ignored or glossed over if somehow it did.

EMR growth is already an issue in large systems such as the VA's CPRS/VistA and will become more prevalent as health care systems and patients move to electronic representations of medical data. There will be too much raw data, slowing searches and presenting a bewildering array of information to the clinician. Because the EMR serves as a permanent memory of clinical information, the chart will never get shorter, but the clinician's time per patient will remain roughly constant; this implies that the amount of work per unit time will have to increase. Some kind of organization and summarization of information will be needed.

How to do this? Record historical data once, use simple summarization heuristics to shorten clinical encounter information (labs, meds, hospitalizations, progress notes), and allow disease and treatment hypotheses to become more accurate over time. The principles in greater detail:

1. Record historical information only once

All information should be stored only once, and historical information about the patient should grow very slowly, if at all. (I'm referring to the user's view of the database; internally there may be a need for redundant representation. My point has nothing to do with the database-theory concepts of normalization and de-normalization.) Examples include: where the patient was born and raised, family structure while growing up, school performance and highest level of education, and current living situation and financial support. For the parts that do change, it will be necessary to revoke information without deleting it. It should also be possible to annotate previously recorded information as of suspect accuracy. In general, this kind of information will be sorted by date.
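One way to realize "revoke without deleting" is an append-only store in which facts carry a revocation date and an accuracy note rather than ever being removed. The sketch below is a minimal, hypothetical data model (the class and field names are my own, not part of any existing EMR):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class HistoricalFact:
    """One piece of historical information about a patient."""
    recorded_on: date
    content: str
    revoked_on: Optional[date] = None      # set instead of deleting
    accuracy_note: Optional[str] = None    # e.g., "patient later contradicted this"

class PatientHistory:
    """Append-only store: facts are revoked or annotated, never removed."""
    def __init__(self) -> None:
        self._facts: list[HistoricalFact] = []

    def record(self, fact: HistoricalFact) -> None:
        self._facts.append(fact)

    def revoke(self, index: int, when: date) -> None:
        self._facts[index].revoked_on = when   # the fact stays in the record

    def annotate(self, index: int, note: str) -> None:
        self._facts[index].accuracy_note = note

    def current_view(self) -> list[HistoricalFact]:
        """The user's view: unrevoked facts, sorted by date."""
        return sorted((f for f in self._facts if f.revoked_on is None),
                      key=lambda f: f.recorded_on)
```

The full history (including revoked facts) remains available for audit; only the default view shrinks.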

2. Use summarization to get sub-linear growth

Work should grow at a rate less than linear in the size of the database. This is most important for the parts of the database that grow most rapidly: records of encounters (progress notes), medications, lab values, and problems on the problem list. For medico-legal reasons, it seems unwise to adopt fully automatic ("artificial intelligence") methods; instead, focus on simpler heuristics for information summarization. Some examples:

  • When the list of prior hospitalizations gets too long, show the first and last, report the length of the list, and include information abstracted from the intervening hospitalizations: the discharge diagnoses, the classes of those diagnoses (e.g., psychotic disorder), the discharge medications, and the classes of those medications.
  • When the list of prior medications gets too long, list them by class and by frequency of use (with some cutoff), and list those "orphan" medications that are prescribed for disorders the patient does not appear to have.
  • In general, try to sort the information by problem or diagnosis to put it in context.
  • Graph medication intervals and lab values with problem intervals.
  • Try to make the "medication trial" an actual entity. The prescribing clinician would associate the drug with the clinical problem.
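The first heuristic above can be sketched directly. This is an illustrative function under assumed inputs (each hospitalization as a dict with a `date` and a `discharge_dx`), not a prescribed interface:

```python
def summarize_hospitalizations(hospitalizations, max_full=2):
    """Heuristic summary of a hospitalization list.

    If the list is short, show everything in full; otherwise show the
    first and last stays, report the total count, and abstract the
    discharge diagnoses from the intervening stays.
    """
    if len(hospitalizations) <= max_full:
        return {"full": hospitalizations}
    middle = hospitalizations[1:-1]
    return {
        "first": hospitalizations[0],
        "last": hospitalizations[-1],
        "count": len(hospitalizations),
        # Deduplicate the intervening diagnoses into a sorted list.
        "intervening_diagnoses": sorted({h["discharge_dx"] for h in middle}),
    }
```

The display then grows with the number of distinct diagnoses rather than the number of hospitalizations, which is what gives the sub-linear behavior.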

3. Design to get monotonic increases in accuracy

Adding information should increase the accuracy of the system's model of the patient. This is certainly the behavior we would expect from a clinician as they learn more about a given patient. However, adding facts to a database may only serve to complicate or slow information retrieval. There are several examples in statistics where adding more data improves accuracy. The Law of Large Numbers says that increasing the number of observations makes the sample mean converge on the expected value. Similarly, in meta-analysis, adding studies tightens confidence intervals and can show evidence of convergence. Bayes' rule combines prior probabilities with the conditional probabilities of new evidence to produce posterior probabilities; typically the prior is diffuse, while the posterior is more narrowly distributed. It may not be an accident that these examples all arise in statistics, which provides the theoretical structure necessary to infer convergence. In contrast, collecting a random sample of newspaper articles would not likely converge to anything. In general, convergence requires some entity that is the target of convergence, along with some method of combination to put the pieces of data together.
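The Bayesian example can be made concrete with a standard conjugate-prior calculation (a textbook illustration, not something specific to EMRs): starting from a diffuse Beta(1, 1) prior over the probability of some binary finding, each observation updates the parameters, and the standard deviation of the posterior shrinks monotonically as evidence accumulates.

```python
def beta_update(alpha, beta, successes, failures):
    """Bayes' rule for a Beta prior with binomial evidence:
    the posterior is Beta(alpha + successes, beta + failures)."""
    return alpha + successes, beta + failures

def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    n = alpha + beta
    variance = alpha * beta / (n * n * (n + 1))
    return variance ** 0.5

# Diffuse prior: Beta(1, 1) is uniform on [0, 1].
prior = (1, 1)
# After observing the finding in 12 of 20 encounters:
posterior = beta_update(*prior, successes=12, failures=8)
# The posterior is more narrowly distributed than the prior.
assert beta_sd(*posterior) < beta_sd(*prior)
```

Here the "target of convergence" is the true probability of the finding, and Bayes' rule is the method of combination; each added observation narrows the distribution rather than merely lengthening the record.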

Author: Steven Bagley

Date: 2013-09-27 Fri