Personal Health Assistant
Conversational Personal Medical Assistants
James Allen and George Ferguson
Department of Computer Science
Center for Future Health
University of Rochester
How can automated systems best help people solve their problems and get things done? Answering this question requires multi-disciplinary basic research in areas ranging from linguistics and psychology to computer science and systems engineering.
The standard approach is to build a custom system for every problem. Unfortunately, this leads to a proliferation of unique systems (think of the proliferation of remote controls associated with individual devices in your home). Each system requires training to use, if they are usable at all (think of VCRs). But most importantly, this approach isolates problems rather than integrating solutions. Most real problems are solved by combining expertise from various relevant areas and working through the solution incrementally and interactively.
Rather than design an individual tool for every problem, our approach is to design and build a conversational assistant. Such a system interacts with people in natural, spoken language, thereby requiring no training to use. It understands not only the specifics of the problem at hand but also how to work together to develop solutions. And, like a good human assistant, it tries to do what it can to help solve the problem. When designed properly, most of the system is generic, and thus readily applicable to new domains and problems.
There are many health care problems where the conversational assistant approach would be effective. Conversational interaction requires no training, and so is uniquely appropriate for systems to be used by patients in their homes. For example, a medication advisor could know about a patient's prescription regimen, have background knowledge about drugs and drug interactions, and be able to combine these to help a patient take their medications properly. The conversational paradigm naturally supports answering patients' follow-up questions (e.g., "why am I taking Celebrex?") as well as letting the system give advice when appropriate (e.g., "you should take your Prinivil now"). Helping to manage chronic non-critical conditions such as diabetes could be handled similarly. The system's ability to gather and use longitudinal data about the patient (in conjunction with other devices) could be a significant advantage over infrequent visits to the clinic or doctor's office. Beyond patient-centered systems, the conversational assistant paradigm can also help experts such as doctors, nurses, or pharmacists solve problems. The system's underlying capabilities allow it to process large volumes of data and answer a broad range of questions. The conversational paradigm provides access to this power without programming or learning special graphical interfaces.
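As a rough illustration of the kind of knowledge such an assistant combines, the sketch below pairs a patient's regimen with simple routines for answering "why" questions and volunteering reminders. The drug names, schedule, and indications are invented for illustration and are not medical advice or the project's actual knowledge base.

```python
from datetime import datetime, time

# Hypothetical regimen knowledge: what each drug is for and when it is due.
# All entries here are illustrative only.
REGIMEN = {
    "Celebrex": {"indication": "arthritis pain", "times": [time(8, 0)]},
    "Prinivil": {"indication": "high blood pressure",
                 "times": [time(8, 0), time(20, 0)]},
}

def why_taking(drug):
    """Answer a follow-up question like 'Why am I taking Celebrex?'"""
    entry = REGIMEN.get(drug)
    if entry is None:
        return f"I have no record of a prescription for {drug}."
    return f"You take {drug} for {entry['indication']}."

def due_now(now, window_minutes=30):
    """Return drugs scheduled within the window, so the assistant can
    volunteer advice like 'You should take your Prinivil now'."""
    due = []
    for drug, entry in REGIMEN.items():
        for t in entry["times"]:
            scheduled = now.replace(hour=t.hour, minute=t.minute,
                                    second=0, microsecond=0)
            if abs((now - scheduled).total_seconds()) <= window_minutes * 60:
                due.append(drug)
                break
    return due
```

A real advisor would of course layer language understanding and drug-interaction reasoning on top of such a store; the point here is only that follow-up answers and proactive reminders both fall out of the same underlying knowledge.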
Our research has produced technologies in the following areas:
- Spoken language recognition and synthesis: We believe that speech is the only truly natural interface, requiring no training to use and affording very efficient two-way communication.
- Natural language understanding: It is not enough to simply recognize the words that a person says. Rather, the meaning of the words must be understood at a higher level in order to support reasoning about them.
- Intention recognition: In particular, for effective collaboration, we need to determine why the person said something, i.e., the intention behind their utterance. To do this, a system must combine knowledge about language with knowledge about the task(s) at hand.
- Collaborative problem solving: Collaboration occurs when the user and the system adopt each other's intentions, resulting in a shared commitment to getting the job done. The system's behavior is then driven by the desire to do what it can towards those commitments.
- Mixed-initiative systems: Unlike an automated call center or a web-based help desk, neither the system nor the user is always in charge. Rather, since the person and the system have different skills and different needs, a mixed-initiative model allows whoever is best qualified to be in charge.
- Knowledge-based decision support: Underlying the system's behavior is a rich body of knowledge about the task at hand. The system understands and can reason about this knowledge, providing answers to questions, hypotheses about user intention, and background knowledge necessary for solving the problems.
- Multimodal presentation: Although speech is the primary modality, some information is best communicated visually if circumstances permit. Integrating such displays with the ongoing conversation is crucial to support phenomena such as follow-up or drill-down questions.
- System architectures: Systems such as these are complex, made up of numerous specialized components interacting and sharing information in complex ways. We use an agent-oriented architecture and a common knowledge representation to pull the pieces together.
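To make the agent-oriented idea above concrete, the sketch below shows specialized components exchanging typed messages through a shared hub rather than calling one another directly. The component names, message types, and payloads are invented for this example and are not the project's actual protocol.

```python
from collections import defaultdict

class Hub:
    """Minimal message hub: components subscribe to message types and
    publish messages for others to consume, so no component needs to
    know about the others directly."""
    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # record of all traffic, in order

    def subscribe(self, msg_type, handler):
        self.subscribers[msg_type].append(handler)

    def publish(self, msg_type, payload):
        self.log.append((msg_type, payload))
        for handler in self.subscribers[msg_type]:
            handler(payload)

hub = Hub()

# A toy "parser" agent: turns recognized words into a meaning representation.
hub.subscribe("words",
              lambda text: hub.publish("meaning",
                                       {"act": "request", "text": text}))
# A toy "intention recognizer" agent: interprets the meaning against the task.
hub.subscribe("meaning",
              lambda m: hub.publish("intention",
                                    {"goal": "take-medication", **m}))

# Simulate the speech recognizer emitting a user utterance.
hub.publish("words", "I need to take my pills")
```

Each message cascades through the pipeline: the utterance produces a meaning, which in turn produces a recognized intention, all via the hub's log-and-dispatch loop. A common knowledge representation for the payloads is what lets the pieces interoperate.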
There are open research questions in all of these areas. However, we have also found it very useful to situate the basic research within a small number of challenge domains or problems. These are real-world situations where a conversational assistant can be effective. Applying the basic theories and system architectures to the real-world problems provides a reality check that helps avoid over-simplified solutions. More importantly, it allows us to evaluate the system and measure our progress. It also provides proof-of-concept demonstrations of the utility of the conversational assistant paradigm.
To develop a specific conversational assistant system, we work with task and domain experts to develop the expertise that the system will need in order to be useful. This knowledge engineering is a crucial part of any knowledge-based decision support system. We also use focus groups of users to determine typical problems they might need to solve, how they go about solving them, how an assistant can help (using a human assistant as a surrogate), and the specific language they use to discuss their problems. From these data, we can tailor the speech and language understanding components to the domain and refine the collaborative problem-solving model for the task at hand. Typically, both sources of input, expert and user, lead to new requirements at the basic research level, which motivate our future work.
Ferguson, G., J. F. Allen, N. J. Blaylock, D. K. Byron, N. W. Chambers, M. O. Dzikovska, L. Galescu, X. Shen, R. S. Swier, and M. D. Swift, "The Medication Advisor Project: Preliminary Report," Technical Report 776, Computer Science Dept., University of Rochester, May 2002.
Allen, J., G. Ferguson, and A. Stent, "An architecture for more realistic conversational systems," in Proceedings of Intelligent User Interfaces 2001 (IUI-01), pp. 1-8, Santa Fe, NM, January 14-17, 2001.
Allen, J., D. Byron, M. Dzikovska, G. Ferguson, L. Galescu, and A. Stent, "Towards conversational human-computer interaction," AI Magazine, 22(4), pp. 27-37, Winter 2001.