FinalResult is an extension to the Result
interface that provides information about a result that has been
finalized - that is, recognition is complete. A finalized result
is a Result that has received either a
RESULT_ACCEPTED or RESULT_REJECTED
ResultEvent that puts it in either the ACCEPTED
or REJECTED state (indicated by the getResultState
method of the Result interface).
The FinalResult interface provides information for
finalized results that match either a DictationGrammar
or a RuleGrammar.
Any result object provided by a recognizer implements both the
FinalRuleResult and FinalDictationResult
interfaces. Because both these interfaces extend the FinalResult
interface, which in turn extends the Result interface,
all results implement FinalResult.
The methods of the FinalResult interface provide information
about a finalized result (ACCEPTED or
REJECTED state). If any method of the FinalResult
interface is called on a result in the UNFINALIZED state, a
ResultStateError is thrown.
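As a minimal sketch of avoiding that error, an application might check the result state before calling any FinalResult method. The Result and FinalResult declarations below are trimmed stand-ins so the example compiles on its own, and their constant values are placeholders rather than the real JSAPI values. FinalizedGuard and its methods are hypothetical helper names.

```java
// Stand-ins for the JSAPI types named above, trimmed to what this sketch needs.
// With a JSAPI implementation on the classpath, use javax.speech.recognition instead.
interface Result {
    int UNFINALIZED = 0; // placeholder values, not the real JSAPI constants
    int ACCEPTED = 1;
    int REJECTED = 2;

    int getResultState();
}

interface FinalResult extends Result {
    boolean isTrainingInfoAvailable();
}

// Hypothetical helper: not part of JSAPI.
final class FinalizedGuard {
    private FinalizedGuard() {}

    /** Returns true once the result has reached the ACCEPTED or REJECTED state. */
    static boolean isFinalized(Result result) {
        int state = result.getResultState();
        return state == Result.ACCEPTED || state == Result.REJECTED;
    }

    /** Queries training info only when FinalResult methods are safe to call. */
    static boolean trainingInfoAvailable(FinalResult result) {
        // The short-circuit keeps the FinalResult call from running on an
        // UNFINALIZED result, where it would throw ResultStateError.
        return isFinalized(result) && result.isTrainingInfoAvailable();
    }
}
```

The state check costs one call and lets the application treat "not finalized yet" as an ordinary condition rather than an error.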
Three capabilities can be provided by a finalized result:
training/correction, access to audio data, and access to alternative guesses.
All three capabilities are optional because they are not all relevant
to all results or all recognition environments, and they are
not universally supported by speech recognizers.
Training and access to audio data are provided by the
FinalResult interface. Access to alternative guesses
is provided by the FinalDictationResult and
FinalRuleResult interfaces (depending upon the type
of grammar matched).
Training / Correction
Because speech recognizers are not always correct, applications need
to consider the possibility that a recognition error has occurred.
When an application detects an error (e.g. a user updates a result),
the application should inform the recognizer so that it can learn
from the mistake and try to improve future performance.
The tokenCorrection method lets an application pass
feedback from user corrections to the recognizer.
Sometimes, but certainly not always, the correct result is
selected by a user from amongst the N-best alternatives for a
result obtained through either the FinalRuleResult
or FinalDictationResult interfaces. In other cases,
a user may type the correct result or the application may infer
a correction from subsequent user input.
Recognizers must store considerable information to support
training from results. Applications need to be involved in
the management of that information so that it is not stored
unnecessarily. The isTrainingInfoAvailable method
tests whether training information is available for a finalized result.
When an application/user has finished correction/training for a result
it should call releaseTrainingInfo to free up
system resources. Also, a recognizer may choose at any time to free
up training information. In both cases, the application is
notified of the release with a TRAINING_INFO_RELEASED
event to ResultListeners.
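The correct-then-release sequence described above might be sketched as follows. The ResultToken and FinalResult declarations are trimmed stand-ins so the example compiles alone, and the constant values are placeholders. CorrectionFlow and correctAndRelease are hypothetical names, not part of JSAPI.

```java
// Stand-ins for the JSAPI types, reduced to the methods this sketch uses.
interface ResultToken {}

interface FinalResult {
    int MISRECOGNITION = 0; // placeholder values, not the real JSAPI constants
    int USER_CHANGE = 1;
    int DONT_KNOW = 2;

    boolean isTrainingInfoAvailable();

    void tokenCorrection(String[] correctTokens, ResultToken fromToken,
                         ResultToken toToken, int correctionType);

    void releaseTrainingInfo();
}

// Hypothetical helper: not part of JSAPI.
final class CorrectionFlow {
    private CorrectionFlow() {}

    /**
     * Reports a correction if the recognizer still holds training information,
     * then releases that information. Returns true if the correction was sent.
     */
    static boolean correctAndRelease(FinalResult result, String[] correctTokens,
                                     ResultToken from, ResultToken to, int type) {
        boolean sent = false;
        if (result.isTrainingInfoAvailable()) {
            result.tokenCorrection(correctTokens, from, to, type);
            sent = true;
        }
        // Release once correction is finished so the recognizer can free resources.
        result.releaseTrainingInfo();
        return sent;
    }
}
```

An application that expects further corrections to the same result would postpone the release; this sketch assumes a single correction per result.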
Audio Data
Audio data for a finalized result is optionally provided by recognizers. In dictation systems, audio feedback to users can remind them of what they said and is useful in correcting and proof-reading documents. Audio data can be stored for future use by an application or user and in certain circumstances can be provided by one recognizer to another.
Since storing audio requires substantial system resources,
audio data requires special treatment. If an application wants to
use audio data, it should set the ResultAudioProvided property of the
RecognizerProperties to true by calling setResultAudioProvided.
Not all recognizers provide access to audio data. For those
recognizers, setResultAudioProvided has no effect,
FinalResult.isAudioAvailable always returns
false, and the getAudio
methods always return null.
Recognizers that provide access to audio data cannot always provide
audio for every result. Applications should test audio availability
for every FinalResult and should always test for
null on the getAudio methods.
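A minimal sketch of the two checks recommended above follows. AudioClip and FinalResult are trimmed stand-ins so the example compiles alone. AudioPlayback and playIfAvailable are hypothetical names.

```java
// Stand-ins for the types named above, reduced to what this sketch needs.
interface AudioClip {
    void play();
}

interface FinalResult {
    boolean isAudioAvailable();

    AudioClip getAudio();
}

// Hypothetical helper: not part of JSAPI.
final class AudioPlayback {
    private AudioPlayback() {}

    /** Plays the utterance audio if the recognizer can supply it; returns whether it played. */
    static boolean playIfAvailable(FinalResult result) {
        if (!result.isAudioAvailable()) {
            return false; // recognizer has no audio for this result
        }
        AudioClip clip = result.getAudio();
        if (clip == null) {
            return false; // availability does not guarantee a clip
        }
        clip.play();
        return true;
    }
}
```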
| Field Summary | |
| static int | DONT_KNOW: The DONT_KNOW flag is used in a call to tokenCorrection to indicate that the application does not know whether a change to a result is because of MISRECOGNITION or USER_CHANGE. |
| static int | MISRECOGNITION: The MISRECOGNITION flag is used in a call to tokenCorrection to indicate that the change is a correction of an error made by the recognizer. |
| static int | USER_CHANGE: The USER_CHANGE flag is used in a call to tokenCorrection to indicate that the user has modified the text that was returned by the recognizer to something different from what they actually said. |
| Method Summary | |
| AudioClip | getAudio(): Get the result audio for the complete utterance of a FinalResult. |
| AudioClip | getAudio(ResultToken fromToken, ResultToken toToken): Get the audio for a token or sequence of tokens. |
| boolean | isAudioAvailable(): Test whether result audio data is available for this result. |
| boolean | isTrainingInfoAvailable(): Returns true if the Recognizer has training information available for this result. |
| void | releaseAudio(): Release the result audio for the result. |
| void | releaseTrainingInfo(): Release training information for this FinalResult. |
| void | tokenCorrection(String[] correctTokens, ResultToken fromToken, ResultToken toToken, int correctionType): Inform the recognizer of a correction to one or more tokens in a finalized result so that the recognizer can re-train itself. |
| Field Detail |
public static final int MISRECOGNITION
The MISRECOGNITION flag is used in a call to
tokenCorrection to indicate that the change is
a correction of an error made by the recognizer.
public static final int USER_CHANGE
The USER_CHANGE flag is used in a call to
tokenCorrection to indicate that the user has
modified the text that was returned by the recognizer to
something different from what they actually said.
public static final int DONT_KNOW
The DONT_KNOW flag is used in a call to tokenCorrection
to indicate that the application does not know whether a
change to a result is because of MISRECOGNITION
or USER_CHANGE.
| Method Detail |
public boolean isTrainingInfoAvailable()
throws ResultStateError
Returns true if the Recognizer
has training information available for this result.
Training is available if the following conditions are met:
- The isTrainingProvided property of the
  RecognizerProperties is set to true.
- The training information has not been released (the
  TRAINING_INFO_RELEASED event has not been issued).
Calls to tokenCorrection have no effect if the training
information is not available.
public void releaseTrainingInfo()
throws ResultStateError
Release training information for this FinalResult.
The release frees memory used for the training information --
this information can be substantial.
It is not an error to call the method when training information is not available or has already been released.
This method is asynchronous - the training info is not
necessarily released when the call returns.
A TRAINING_INFO_RELEASED event is issued to
the ResultListener once the information is released.
The TRAINING_INFO_RELEASED event is also issued if the
recognizer releases the training information for any other reason
(e.g. to reclaim memory).
public void tokenCorrection(String[] correctTokens,
ResultToken fromToken,
ResultToken toToken,
int correctionType)
throws ResultStateError,
IllegalArgumentException
Inform the recognizer of a correction to one or more tokens in a
finalized result so that the recognizer can re-train itself.
The fromToken and toToken parameters
indicate the inclusive sequence of best-guess or alternative
tokens that are being trained or corrected. If toToken is
null or if fromToken and toToken
are the same, the training applies to a single recognized token.
The correctTokens token sequence may have the
same or a different length than the token sequence being corrected.
Setting correctTokens to null indicates
the deletion of tokens.
The correctionType parameter must be one of MISRECOGNITION,
USER_CHANGE, or DONT_KNOW.
Note: tokenCorrection does not change the result object.
So, future calls to the getBestToken, getBestTokens
and getAlternativeTokens methods return exactly the same values as
before the call to tokenCorrection.
Parameters:
correctTokens - sequence of correct tokens to replace fromToken to toToken
fromToken - first token in the sequence being corrected
toToken - last token in the sequence being corrected
correctionType - type of correction: MISRECOGNITION, USER_CHANGE, DONT_KNOW
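To illustrate how a correctTokens argument could be prepared, the sketch below converts text typed by a user into a token array and maps an empty replacement to null, which tokenCorrection treats as a deletion. Splitting on whitespace is an assumption made for illustration, since a recognizer's own token boundaries may differ. CorrectionTokens and fromTypedText are hypothetical names.

```java
// Hypothetical helper: not part of JSAPI.
final class CorrectionTokens {
    private CorrectionTokens() {}

    /**
     * Builds a correctTokens array from typed replacement text.
     * Returns null for blank input, signalling deletion of the selected tokens.
     */
    static String[] fromTypedText(String typed) {
        if (typed == null || typed.trim().isEmpty()) {
            return null; // nothing typed: the tokens are being deleted
        }
        // Assumption: one token per whitespace-separated word.
        return typed.trim().split("\\s+");
    }
}
```

The returned array may be longer or shorter than the token sequence it replaces, which the method permits.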
public boolean isAudioAvailable()
throws ResultStateError
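Test whether result audio data is available for this result.
Result audio can be available only when the following conditions are met:
- The ResultAudioProvided property of
  RecognizerProperties was set to true
  when the result was recognized.
- The Recognizer was able to collect result audio for
  the current type of FinalResult
  (FinalRuleResult or FinalDictationResult).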
The availability of audio for a result does not mean that all
getAudio calls will return an AudioClip.
For example, some recognizers might provide audio data only for
the entire result or only for individual tokens, or not for
sequences of more than one token.
public void releaseAudio()
throws ResultStateError
Release the result audio for the result. After the audio is released,
isAudioAvailable will return false.
This call is ignored if result audio is not available or
has already been released.
This method is asynchronous - audio data is not necessarily
released immediately. An AUDIO_RELEASED event
is issued to the ResultListener when the audio is released
by a call to this method. An AUDIO_RELEASED event is also
issued if the recognizer releases the audio for some other reason
(e.g. to reclaim memory).
public AudioClip getAudio()
throws ResultStateError
Get the result audio for the complete utterance of a FinalResult.
Returns null if result audio is not available or if it has been released.
public AudioClip getAudio(ResultToken fromToken,
ResultToken toToken)
throws IllegalArgumentException,
ResultStateError
Get the audio for a token or sequence of tokens.
Returns null if result audio is not available or
if it cannot be obtained for the specified sequence of tokens.
If toToken is null or if
fromToken and toToken are the same,
the method returns audio for fromToken.
If both fromToken and
toToken are null, it returns the audio
for the entire result (same as getAudio()).
Not all recognizers can provide per-token audio, even if they can provide audio for a complete utterance.
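Because per-token audio may be missing even when utterance audio exists, an application might fall back to the whole utterance. The sketch below assumes that fallback is acceptable to the user. AudioClip, ResultToken and FinalResult are trimmed stand-ins so the example compiles alone, and TokenAudio and clipFor are hypothetical names.

```java
// Stand-ins for the types named above, reduced to what this sketch needs.
interface AudioClip {
    void play();
}

interface ResultToken {}

interface FinalResult {
    boolean isAudioAvailable();

    AudioClip getAudio();

    AudioClip getAudio(ResultToken fromToken, ResultToken toToken);
}

// Hypothetical helper: not part of JSAPI.
final class TokenAudio {
    private TokenAudio() {}

    /**
     * Returns audio for the token range when the recognizer can supply it,
     * otherwise the audio for the complete utterance, otherwise null.
     */
    static AudioClip clipFor(FinalResult result, ResultToken from, ResultToken to) {
        if (!result.isAudioAvailable()) {
            return null;
        }
        AudioClip clip = result.getAudio(from, to);
        if (clip != null) {
            return clip;
        }
        // Per-token audio unavailable: fall back to the whole utterance,
        // which may itself be null.
        return result.getAudio();
    }
}
```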
Java™ Speech API
Copyright 1997-1998 Sun Microsystems, Inc. All rights reserved
Send comments to javaspeech-comments@sun.com