Enovation: Despite earlier hopes of a brave new world for speech-recognition systems, technical difficulties have led to a slow take-up in business and the health service. Is the pace about to get faster?

Voice-driven systems hold a lot of promise for the health service, in areas as diverse as treating emergency cases and routine surgery.

US firm Talk Technology, recently bought by Belgium's Agfa, is developing a software package called TalkStationEM, designed to allow clinicians to dictate medical charts directly, using both speech recognition and intelligent templates.

The system automatically codes the aural input and turns it into text. The company is also working with Marconi Medical Systems on adding voice recognition to Marconi's radiology picture archiving and communication system (PACS).
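The article does not say how TalkStationEM's coding step works internally. As a purely illustrative sketch, assuming a simple phrase-to-code lookup and a fixed template (the phrases, codes and field names below are invented for illustration, not taken from the product), turning recognised dictation into a coded chart entry might look something like this:

```python
# Hypothetical sketch: turning recognised dictation into a coded chart entry
# using a simple template. The phrase-to-code table and template fields are
# invented; real products use far richer clinical vocabularies and templates.

PHRASE_CODES = {
    "chest pain": "R07.4",
    "shortness of breath": "R06.0",
    "acute myocardial infarction": "I21.9",
}

CHART_TEMPLATE = "Presenting complaint: {complaint}\nCodes: {codes}\n"

def code_dictation(recognised_text: str) -> str:
    """Match known phrases in the recognised text and fill the chart template."""
    text = recognised_text.lower()
    found = {phrase: code for phrase, code in PHRASE_CODES.items() if phrase in text}
    return CHART_TEMPLATE.format(
        complaint=recognised_text,
        codes=", ".join(found.values()) or "none matched",
    )

if __name__ == "__main__":
    print(code_dictation("Patient reports chest pain and shortness of breath"))
```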

Another US firm, Stryker Corporation, has developed a voice-based command system for use in operating theatres, so a surgeon can control such things as lights, television pictures, and positions of cameras recording an operation. But though the potential is considerable, there is still a relatively long way to go before that potential is realised. At the moment, this is a technology that is only being used in a few isolated areas of the health service.
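The article does not describe the actual command set of such a system, but as a rough, hypothetical sketch of how a theatre voice-command dispatcher might map recognised phrases to device actions (the commands and device functions below are invented for illustration):

```python
# Hypothetical sketch of a theatre voice-command dispatcher: a recognised
# phrase is looked up and mapped to a device action. Command names and
# device functions are invented; this is not any vendor's command set.

from typing import Callable, Dict

def lights_on() -> str:
    return "theatre lights switched on"

def camera_position(preset: str) -> str:
    return f"camera moved to preset {preset}"

COMMANDS: Dict[str, Callable[[], str]] = {
    "lights on": lights_on,
    "camera position two": lambda: camera_position("2"),
}

def dispatch(recognised_phrase: str) -> str:
    """Look up the recognised phrase and run the matching device action."""
    action = COMMANDS.get(recognised_phrase.strip().lower())
    return action() if action else "command not recognised"

if __name__ == "__main__":
    print(dispatch("Lights on"))
    print(dispatch("Camera position two"))
```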

Stryker's Hermes system, for instance, is in use in only two UK hospitals. It has been installed at North Hampshire Hospital for almost two years, but until Leeds Nuffield Hospital recently announced that it would also install Hermes, no other UK hospital had followed that lead.

Just for once, this slow take-up of technology is not a case of the NHS lagging behind commercial counterparts. Voice-driven systems as a whole have been hyped for some years, but have failed to meet users' expectations. As a result, the market is seen as a niche and the companies that expected to do well in this area are having a tough time.

Belgian speech-recognition specialist Lernout & Hauspie, for instance, which last year bought out one of its rivals, Dragon, was then optimistic about the prospects for voice-driven systems in general, and for healthcare voice systems in particular.

It ran a series of roadshows to demonstrate its VoiceXpress systems in three main areas: radiology, cardiology and pathology. Now, however, the company is in bankruptcy protection.

Part of the problem afflicting the development of voice-driven systems is the huge technical demand of processing voice commands.

There are two types of voice-based software systems. The first, probably better known in the consumer market, is the kind where the user dictates into a microphone on their PC and the words appear on the screen; the user can then correct the text on the screen. The problem with these packages is that it is hard to achieve high levels of accuracy, and the systems take a long time to set up and learn to use.

A different approach is to build on a concept very familiar to many clinicians: dictation. Many professionals dictate reports which are then typed up by medical secretaries. By using intelligent recognition systems to go over the dictated reports and documents, these newer speech-to-text systems achieve higher accuracy. The back-end system, which carries out the checking, can sit on a server on the user's local system or can even be accessed over the Internet.
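As a minimal sketch of this back-end, deferred-recognition approach, assuming dictations are queued as audio files and processed by a recognition server (the transcribe() stub and all names below are placeholders, not any vendor's API):

```python
# Hypothetical sketch of deferred, server-side recognition: the author
# dictates to an audio file, the file is queued for a recognition server,
# and a transcript comes back later for correction. transcribe() is a
# placeholder for a real speech-recognition engine.

from dataclasses import dataclass
from queue import Queue
from typing import List, Optional

@dataclass
class DictationJob:
    author: str
    audio_path: str                  # path to the recorded dictation
    transcript: Optional[str] = None

def transcribe(audio_path: str) -> str:
    # Stand-in for the server-side recognition engine.
    return f"[transcript of {audio_path}]"

def run_recognition_server(jobs: "Queue[DictationJob]") -> List[DictationJob]:
    """Process queued dictations and attach transcripts for later correction."""
    done = []
    while not jobs.empty():
        job = jobs.get()
        job.transcript = transcribe(job.audio_path)
        done.append(job)
    return done

if __name__ == "__main__":
    queue: "Queue[DictationJob]" = Queue()
    queue.put(DictationJob(author="Dr Example", audio_path="report_001.wav"))
    for job in run_recognition_server(queue):
        print(job.author, "->", job.transcript)
```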

Marcel Wassink, European sales and marketing director of Philips Speech Processing, an Austrian-based firm that sells the SpeechMagic voice recognition system, says companies like Dragon and IBM, which have developed the direct-to-screen voice systems, have suffered because they focused on trying to provide low-cost software.

'I think these companies went in too early,' he says.

Mr Wassink believes companies like Philips Speech Processing, which have a background in selling dictation systems, are now finding very little competition and growing demand in the UK healthcare market, mainly because of an increasing shortage of medical secretaries.

'We have defined our initial target as users used to dictation. That is, professional users in specific niche areas, using specific vocabularies,' he adds. 'That provides the system with very high recognition rates.

'We see speech recognition as a tool to make the document creation process more effective.'

The Philips system is based on the concept of users dictating reports into a digital dictation system, creating an audio file that is sent to a medical secretary to type up. If, however, no secretary is available, there is an option for the user to see the file on their own screen and do their own corrections.
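A hypothetical sketch of that routing choice, assuming the only inputs are the audio file and the number of secretaries available (the function and its behaviour are invented for illustration, not a description of the Philips product):

```python
# Hypothetical sketch of the routing decision described above: a finished
# dictation normally goes to a medical secretary for typing up, but if no
# secretary is available it is returned to the author's own screen for
# self-correction.

def route_dictation(audio_file: str, secretaries_free: int) -> str:
    """Decide where a finished dictation goes next."""
    if secretaries_free > 0:
        return f"{audio_file}: sent to secretary for typing"
    return f"{audio_file}: shown to author for on-screen correction"

if __name__ == "__main__":
    print(route_dictation("clinic_letter_17.wav", secretaries_free=2))
    print(route_dictation("clinic_letter_18.wav", secretaries_free=0))
```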

The company is positioning its software as part of the general trend to implement electronic patient records; it points out that there is already a shortage of medical secretaries and that speech-recognition software could be a key enabling tool to help create and maintain documents online.

Philips claims its SpeechMagic software is being used in more than 400 hospitals across Europe.

In the UK, however, take-up has been relatively slow, partly because Philips has been working with software vendor Amersham to integrate the Philips software into Amersham's Radiology Information System (RIS), rather than selling the voice system as a stand-alone piece of software. This has led to some hold-ups, as Dr Richard Harries, a consultant radiologist at Princess Diana Hospital in north-east Lincolnshire, has discovered.

Dr Harries has used Philips' SpeechMagic software to dictate his reports, having begun looking at the possible use of voice-recognition systems two years ago. 'It was too early,' he comments. Initially, the radiologists saw speech recognition as a way of becoming more efficient.

More recently, though, the goalposts have moved.

'We now have a real shortage of medical secretarial staff, as a result of retirement and so on,' says Dr Harries. 'There have been times when my reports have waited for two weeks to be typed.'

The trust always intended to include speech recognition software in the specifications of its new RIS, but the shortage of medical secretaries gave greater impetus to the search for a usable system.

With the installation of an Amersham RIS, Dr Harries and his colleagues spoke to Philips and tried out a stand-alone pilot of the company's SpeechMagic software. On the back of that trial, Dr Harries is now using a full version of the software - but the going hasn't been easy.

'The software was first installed three months ago, but was not fully integrated with the Amersham RIS, so further development work had to be carried out before trying again.

'Amersham has rewritten it so the software is now fully integrated and it works very well. I dictate my report, using a digital dictation system, just as I have done for the past 15 years. When I've finished, the file is sent to the recognition server, which carries out the recognition process and then sends the file back to the Amersham RIS. That takes about five or 10 minutes and the file is then available to access and check for correction.'

At the moment, Dr Harries is checking the reports himself, but this is a precaution; once he is satisfied that the software has bedded down properly, he sees the checking becoming part of the typist's function. The clear advantage of this approach is speed. Obviously, it is more efficient for Dr Harries to have his reports back more quickly, not least because specific cases are still fresh in his mind, making the correction process more accurate than it would be after waiting up to a fortnight for the report.

The system builds up a database of common associations, so it becomes more accurate in spotting the context of words in the radiological reports.
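One plausible way to picture such a 'database of common associations' is a simple word-pair (bigram) count built from previously corrected reports, used to choose between alternative recognition candidates for a word given the word before it. Real engines use far more sophisticated statistical language models, so the sketch below, with invented example data, is only a toy illustration:

```python
# Toy sketch of an adaptive "association" store: count which words follow
# which in corrected reports, then prefer the candidate most often seen in
# that context. Invented data; not a description of SpeechMagic internals.

from collections import defaultdict
from typing import Dict, List

def build_bigrams(corrected_reports: List[str]) -> Dict[str, Dict[str, int]]:
    """Count how often each word follows each preceding word."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for report in corrected_reports:
        words = report.lower().split()
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts

def pick_candidate(prev_word: str, candidates: List[str],
                   counts: Dict[str, Dict[str, int]]) -> str:
    """Choose the candidate most often seen after prev_word in past reports."""
    return max(candidates, key=lambda w: counts.get(prev_word, {}).get(w, 0))

if __name__ == "__main__":
    counts = build_bigrams([
        "no pleural effusion seen",
        "small pleural effusion on the left",
    ])
    # disambiguate between two acoustically similar candidates
    print(pick_candidate("pleural", ["effusion", "infusion"], counts))
```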

One point emphasised by Dr Harries is that this type of software is not intended to make secretaries redundant - it has been the shortage of secretaries that has driven forward its use.

'These are still early days,' says Dr Harries. 'There have been high hopes for speech recognition systems, but it has taken a long time for them to become usable. I think we are getting there now.'

Reach for the skies: e-medics

The technology that enables Eurofighter pilots to talk to their planes and ascertain the status of their complex computer systems, without diverting their attention from flying, has long been the envy of more mundane systems developers.

There have been promises over the years that this type of advanced, speech-driven technology could be used in other settings, and one use was outlined earlier this year at the Healthcare Computing 2001 conference in Harrogate. It has been developed by QinetiQ, formerly the Defence Evaluation and Research Agency, and is being trialled by Surrey Ambulance Service.

The system, called e-medics, uses voice recognition technology developed by UK company 20/20 Speech, and is designed to help paramedics treat injuries faster and more accurately.

The system uses a portable computer, enabling paramedics to communicate over a radio data system with hospital-based accident and emergency experts. The paramedic uses a remote head-mike to feed in and access information about the patient's condition, so information can be provided quickly to the A&E staff and they, in turn, can provide any necessary instructions.

Speech recognition is just one part of the overall system, but it is important because it lets the paramedic access information in the database while leaving them free to work on the patient, rather than having to stop and key in information.