Page Summary
- LanguageIdentifier is used for identifying the language of text.
- You can create a LanguageIdentifier client with default or custom options.
- This class can be used from any thread.
- It provides methods to identify the most likely language or a list of possible languages.
A LanguageIdentification client for identifying the language of a piece of text.
A LanguageIdentifier is created via LanguageIdentification.getClient(LanguageIdentificationOptions) or LanguageIdentification.getClient() if you wish to use the default options. For example, the code below creates a LanguageIdentifier with default options.
Example:
LanguageIdentifier languageIdentifier = LanguageIdentification.getClient();
This class can be used from any thread.
Constant Summary
| Type | Constant | Description |
|---|---|---|
| float | DEFAULT_IDENTIFY_LANGUAGE_CONFIDENCE_THRESHOLD | The default confidence threshold for the identifyLanguage(String) call. |
| float | DEFAULT_IDENTIFY_POSSIBLE_LANGUAGES_CONFIDENCE_THRESHOLD | The default confidence threshold for the identifyPossibleLanguages(String) call. |
| String | UNDETERMINED_LANGUAGE_TAG | The BCP 47 language tag for "undetermined language". |
Public Method Summary
| Type | Method | Description |
|---|---|---|
| abstract void | close() | |
| abstract Task<String> | identifyLanguage(String text) | Identifies the language in a supplied String and returns the most likely language. |
| abstract Task<List<IdentifiedLanguage>> | identifyPossibleLanguages(String text) | Identifies the language in a supplied String and returns a list of possible languages, cutting off any languages whose confidence score falls below the threshold which is set in LanguageIdentificationOptions.Builder.setConfidenceThreshold(float). |
Constants
public static final float DEFAULT_IDENTIFY_LANGUAGE_CONFIDENCE_THRESHOLD
The default confidence threshold for the identifyLanguage(String) call.
public static final float DEFAULT_IDENTIFY_POSSIBLE_LANGUAGES_CONFIDENCE_THRESHOLD
The default confidence threshold for the identifyPossibleLanguages(String) call.
public static final String UNDETERMINED_LANGUAGE_TAG
The BCP 47 language tag for "undetermined language" ("und").
Public Methods
public abstract void close ()
public abstract Task<String> identifyLanguage (String text)
Identifies the language in a supplied String and returns the most likely language.
Parameters
| Parameter | Description |
|---|---|
| text | the text for which to identify the language. Inputs longer than 200 characters are truncated to 200 characters, as longer input does not improve the detection accuracy. |
Returns
- a Task that returns a String with the BCP 47 language tag of the most likely language, or UNDETERMINED_LANGUAGE_TAG if the confidence was below the threshold specified in LanguageIdentificationOptions.
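The contract above (truncate the input, pick the most likely language, fall back to the undetermined tag when confidence is below the threshold) can be modelled locally without the ML Kit dependency. The sketch below is not the library's implementation: the `Candidate` class, the `truncate` and `mostLikely` helpers, and the sample scores are all hypothetical, and measuring the 200-character limit with `String.length()` (UTF-16 code units) is an assumption about what "characters" means here.

```java
import java.util.Arrays;
import java.util.List;

/** Local model of the documented identifyLanguage contract; not ML Kit code. */
final class MostLikelyLanguageSketch {
    /** BCP 47 tag for "undetermined language". */
    static final String UNDETERMINED_LANGUAGE_TAG = "und";
    /** Documented input limit: longer input is truncated to this length. */
    static final int MAX_INPUT_LENGTH = 200;

    /** Hypothetical stand-in for one scored language guess. */
    static final class Candidate {
        final String languageTag;
        final float confidence;

        Candidate(String languageTag, float confidence) {
            this.languageTag = languageTag;
            this.confidence = confidence;
        }
    }

    /** Assumption: length is counted in UTF-16 code units, so a surrogate pair may be split. */
    static String truncate(String text) {
        return text.length() <= MAX_INPUT_LENGTH ? text : text.substring(0, MAX_INPUT_LENGTH);
    }

    /** Returns the tag of the highest-scoring candidate, or "und" if it scores below the threshold. */
    static String mostLikely(List<Candidate> candidates, float threshold) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null || c.confidence > best.confidence) {
                best = c;
            }
        }
        if (best == null || best.confidence < threshold) {
            return UNDETERMINED_LANGUAGE_TAG;
        }
        return best.languageTag;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        List<Candidate> scores = Arrays.asList(
                new Candidate("en", 0.82f),
                new Candidate("nl", 0.11f));
        System.out.println(mostLikely(scores, 0.5f)); // prints "en"
        System.out.println(mostLikely(scores, 0.9f)); // prints "und"
    }
}
```

Callers of the real API would typically compare the returned tag against UNDETERMINED_LANGUAGE_TAG before acting on it, as the second call in `main` illustrates.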
public abstract Task<List<IdentifiedLanguage>> identifyPossibleLanguages (String text)
Identifies the language in a supplied String and returns a list of possible languages, omitting any language whose confidence score falls below the threshold set via LanguageIdentificationOptions.Builder.setConfidenceThreshold(float).
Note that this API assumes the text is in a single language; the returned list contains all estimations for what that language could be, along with a confidence score for each possible language. The API does not detect multiple languages in a single text.
Parameters
| Parameter | Description |
|---|---|
| text | the text for which to identify the language. Inputs longer than 200 characters are truncated to 200 characters, as longer input does not improve the detection accuracy. |
Returns
- a Task that returns a List of IdentifiedLanguages. The returned list will never be empty; if all languages have lower confidence scores than the threshold, the list will contain a single item with the UNDETERMINED_LANGUAGE_TAG.
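The cut-off and the "never empty" guarantee described above can be illustrated with a small local model. This is a sketch under stated assumptions, not the library's code: the `Scored` class and the `cutOff` helper are hypothetical, the sample scores are made up, and the confidence value attached to the fallback "und" entry (0 here) is a placeholder, since the text above does not say what score that entry carries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Local model of the documented identifyPossibleLanguages contract; not ML Kit code. */
final class PossibleLanguagesSketch {
    /** BCP 47 tag for "undetermined language". */
    static final String UNDETERMINED_LANGUAGE_TAG = "und";

    /** Hypothetical stand-in for a language tag paired with its confidence score. */
    static final class Scored {
        final String languageTag;
        final float confidence;

        Scored(String languageTag, float confidence) {
            this.languageTag = languageTag;
            this.confidence = confidence;
        }
    }

    /** Keeps entries at or above the threshold; never returns an empty list. */
    static List<Scored> cutOff(List<Scored> all, float threshold) {
        List<Scored> kept = new ArrayList<>();
        for (Scored s : all) {
            if (s.confidence >= threshold) {
                kept.add(s);
            }
        }
        if (kept.isEmpty()) {
            // Placeholder confidence: the value used for the fallback entry is an assumption.
            kept.add(new Scored(UNDETERMINED_LANGUAGE_TAG, 0f));
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        List<Scored> scores = Arrays.asList(
                new Scored("en", 0.6f),
                new Scored("de", 0.3f),
                new Scored("fr", 0.005f));
        for (Scored s : cutOff(scores, 0.01f)) {
            System.out.println(s.languageTag + " " + s.confidence);
        }
        // With a threshold no candidate reaches, only the undetermined entry remains.
        System.out.println(cutOff(scores, 0.9f).get(0).languageTag); // prints "und"
    }
}
```

Because the list is never empty, callers can read the first element without a size check, but they still need to test for the undetermined tag.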