Enum RecognitionConfig.AudioEncoding

  • All Implemented Interfaces:
    com.google.protobuf.Internal.EnumLite, com.google.protobuf.ProtocolMessageEnum, Serializable, Comparable<RecognitionConfig.AudioEncoding>
    Enclosing class:
    RecognitionConfig

    public static enum RecognitionConfig.AudioEncoding
    extends Enum<RecognitionConfig.AudioEncoding>
    implements com.google.protobuf.ProtocolMessageEnum
     The encoding of the audio data sent in the request.
    
     All encodings support only 1 channel (mono) audio, unless the
     `audio_channel_count` and `enable_separate_recognition_per_channel` fields
     are set.
    
     For best results, the audio source should be captured and transmitted using
     a lossless encoding (`FLAC` or `LINEAR16`). The accuracy of the speech
     recognition can be reduced if lossy codecs are used to capture or transmit
     audio, particularly if background noise is present. Lossy codecs include
     `MULAW`, `AMR`, `AMR_WB`, `OGG_OPUS`, `SPEEX_WITH_HEADER_BYTE`, `MP3`,
     and `WEBM_OPUS`.
    
     The `FLAC` and `WAV` audio file formats include a header that describes the
     included audio content. You can request recognition for `WAV` files that
     contain either `LINEAR16` or `MULAW` encoded audio.
     If you send audio in the `FLAC` or `WAV` file format in
     your request, you do not need to specify an `AudioEncoding`; the audio
     encoding format is determined from the file header. If you specify
     an `AudioEncoding` when you send `FLAC` or `WAV` audio, the
     encoding configuration must match the encoding described in the audio
     header; otherwise the request returns a
     `google.rpc.Code.INVALID_ARGUMENT` error code.
     
    Protobuf enum google.cloud.speech.v1p1beta1.RecognitionConfig.AudioEncoding
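
    The header rule above can be illustrated on the client side. The following is a minimal sketch, not the service's own detection logic: it checks the leading bytes of a payload for the `fLaC` stream marker or a `RIFF`/`WAVE` header, which is one way a caller might decide whether an `AudioEncoding` can be left unset. The class `AudioHeaderSniffer` and its `Container` enum are hypothetical names introduced only for this example.

    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    // Hypothetical helper, not part of the Cloud Speech client library.
    final class AudioHeaderSniffer {
        /** Container formats whose header describes the audio they carry. */
        enum Container { FLAC, WAV, UNKNOWN }

        /** Guesses the container from the leading bytes of an audio payload. */
        static Container sniff(byte[] audio) {
            // A native FLAC stream begins with the 4-byte marker "fLaC".
            if (startsWith(audio, 0, "fLaC")) {
                return Container.FLAC;
            }
            // A WAV file is a RIFF container: "RIFF", a 4-byte size, then "WAVE".
            if (startsWith(audio, 0, "RIFF") && startsWith(audio, 8, "WAVE")) {
                return Container.WAV;
            }
            return Container.UNKNOWN;
        }

        /** Returns true if the ASCII marker appears in data at the given offset. */
        private static boolean startsWith(byte[] data, int offset, String marker) {
            byte[] expected = marker.getBytes(StandardCharsets.US_ASCII);
            if (data.length < offset + expected.length) {
                return false;
            }
            return Arrays.equals(
                Arrays.copyOfRange(data, offset, offset + expected.length), expected);
        }

        public static void main(String[] args) {
            byte[] flac = "fLaC1234".getBytes(StandardCharsets.US_ASCII);
            byte[] wav = "RIFF\0\0\0\0WAVEfmt ".getBytes(StandardCharsets.US_ASCII);
            System.out.println(sniff(flac));
            System.out.println(sniff(wav));
            System.out.println(sniff(new byte[] {1, 2, 3}));
        }
    }
    ```

    A payload reported as `UNKNOWN` here is headerless as far as this sketch can tell, so the caller would need to set the encoding explicitly.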
    • Enum Constant Detail

      • FLAC

        public static final RecognitionConfig.AudioEncoding FLAC
         `FLAC` (Free Lossless Audio
         Codec) is the recommended encoding because it is
         lossless--therefore recognition is not compromised--and
         requires only about half the bandwidth of `LINEAR16`. `FLAC` stream
         encoding supports 16-bit and 24-bit samples; however, not all fields in
         `STREAMINFO` are supported.
         
        FLAC = 2;
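
        To put the bandwidth comparison in numbers, the sketch below computes the raw bit rate of `LINEAR16` audio and halves it as a rough `FLAC` estimate. The 16 kHz mono input is an assumed example, and "about half" is the approximation given above; the actual FLAC ratio depends on the audio content. The class name `Linear16Bandwidth` is hypothetical.

        ```java
        // Hypothetical helper for illustration only.
        final class Linear16Bandwidth {
            /** Raw bit rate of uncompressed 16-bit PCM, in bits per second. */
            static long linear16BitsPerSecond(int sampleRateHertz, int channels) {
                return (long) sampleRateHertz * 16 * channels;
            }

            public static void main(String[] args) {
                long raw = linear16BitsPerSecond(16000, 1);
                System.out.println("LINEAR16: " + raw + " bit/s");              // 256000 bit/s
                System.out.println("FLAC (about half): " + raw / 2 + " bit/s"); // 128000 bit/s
            }
        }
        ```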
      • OGG_OPUS

        public static final RecognitionConfig.AudioEncoding OGG_OPUS
         Opus encoded audio frames in Ogg container
         ([OggOpus](https://wiki.xiph.org/OggOpus)).
         `sample_rate_hertz` must be one of 8000, 12000, 16000, 24000, or 48000.
         
        OGG_OPUS = 6;
      • SPEEX_WITH_HEADER_BYTE

        public static final RecognitionConfig.AudioEncoding SPEEX_WITH_HEADER_BYTE
         Although the use of lossy encodings is not recommended, if a very low
         bitrate encoding is required, `OGG_OPUS` is highly preferred over
         Speex encoding. The [Speex](https://speex.org/) encoding supported by
         Cloud Speech API has a header byte in each block, as in MIME type
         `audio/x-speex-with-header-byte`.
         It is a variant of the RTP Speex encoding defined in
         [RFC 5574](https://tools.ietf.org/html/rfc5574).
         The stream is a sequence of blocks, one block per RTP packet. Each block
         starts with a byte containing the length of the block, in bytes, followed
         by one or more frames of Speex data, padded to an integral number of
         bytes (octets) as specified in RFC 5574. In other words, each RTP header
         is replaced with a single byte containing the block length. Only Speex
         wideband is supported. `sample_rate_hertz` must be 16000.
         
        SPEEX_WITH_HEADER_BYTE = 7;
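
        The block layout described above can be sketched as a small splitter. This is an illustration under one stated assumption: the length byte is taken to count only the Speex payload bytes that follow it, not the header byte itself, which the description does not spell out. The class `SpeexBlockSplitter` is a hypothetical name, and the sketch does not decode or validate the Speex frames.

        ```java
        import java.util.ArrayList;
        import java.util.List;

        // Hypothetical helper, not part of the Cloud Speech client library.
        final class SpeexBlockSplitter {
            /**
             * Splits a stream into block payloads.
             * Assumption: each length byte counts the payload bytes after it.
             */
            static List<byte[]> split(byte[] stream) {
                List<byte[]> blocks = new ArrayList<>();
                int pos = 0;
                while (pos < stream.length) {
                    // Java bytes are signed; mask to read the length as 0..255.
                    int length = stream[pos] & 0xFF;
                    pos++;
                    if (pos + length > stream.length) {
                        throw new IllegalArgumentException(
                            "Truncated block at offset " + (pos - 1));
                    }
                    byte[] payload = new byte[length];
                    System.arraycopy(stream, pos, payload, 0, length);
                    blocks.add(payload);
                    pos += length;
                }
                return blocks;
            }

            public static void main(String[] args) {
                byte[] stream = {2, 0x10, 0x11, 3, 0x20, 0x21, 0x22};
                for (byte[] block : split(stream)) {
                    System.out.println("block of " + block.length + " bytes");
                }
            }
        }
        ```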
      • MP3

        public static final RecognitionConfig.AudioEncoding MP3
         MP3 audio. MP3 encoding is a Beta feature and only available in
         v1p1beta1. Supports all standard MP3 bitrates (which range from 32 to
         320 kbps). When using this encoding, `sample_rate_hertz` has to match the
         sample rate of the file being used.
         
        MP3 = 8;
      • WEBM_OPUS

        public static final RecognitionConfig.AudioEncoding WEBM_OPUS
         Opus encoded audio frames in WebM container
         ([OggOpus](https://wiki.xiph.org/OggOpus)). `sample_rate_hertz` must be
         one of 8000, 12000, 16000, 24000, or 48000.
         
        WEBM_OPUS = 9;
    • Field Detail

      • ENCODING_UNSPECIFIED_VALUE

        public static final int ENCODING_UNSPECIFIED_VALUE
         Not specified.
         
        ENCODING_UNSPECIFIED = 0;
        See Also:
        Constant Field Values
      • LINEAR16_VALUE

        public static final int LINEAR16_VALUE
         Uncompressed 16-bit signed little-endian samples (Linear PCM).
         
        LINEAR16 = 1;
        See Also:
        Constant Field Values
      • FLAC_VALUE

        public static final int FLAC_VALUE
         `FLAC` (Free Lossless Audio
         Codec) is the recommended encoding because it is
         lossless--therefore recognition is not compromised--and
         requires only about half the bandwidth of `LINEAR16`. `FLAC` stream
         encoding supports 16-bit and 24-bit samples; however, not all fields in
         `STREAMINFO` are supported.
         
        FLAC = 2;
        See Also:
        Constant Field Values
      • MULAW_VALUE

        public static final int MULAW_VALUE
         8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
         
        MULAW = 3;
        See Also:
        Constant Field Values
      • AMR_VALUE

        public static final int AMR_VALUE
         Adaptive Multi-Rate Narrowband codec. `sample_rate_hertz` must be 8000.
         
        AMR = 4;
        See Also:
        Constant Field Values
      • AMR_WB_VALUE

        public static final int AMR_WB_VALUE
         Adaptive Multi-Rate Wideband codec. `sample_rate_hertz` must be 16000.
         
        AMR_WB = 5;
        See Also:
        Constant Field Values
      • OGG_OPUS_VALUE

        public static final int OGG_OPUS_VALUE
         Opus encoded audio frames in Ogg container
         ([OggOpus](https://wiki.xiph.org/OggOpus)).
         `sample_rate_hertz` must be one of 8000, 12000, 16000, 24000, or 48000.
         
        OGG_OPUS = 6;
        See Also:
        Constant Field Values
      • SPEEX_WITH_HEADER_BYTE_VALUE

        public static final int SPEEX_WITH_HEADER_BYTE_VALUE
         Although the use of lossy encodings is not recommended, if a very low
         bitrate encoding is required, `OGG_OPUS` is highly preferred over
         Speex encoding. The [Speex](https://speex.org/) encoding supported by
         Cloud Speech API has a header byte in each block, as in MIME type
         `audio/x-speex-with-header-byte`.
         It is a variant of the RTP Speex encoding defined in
         [RFC 5574](https://tools.ietf.org/html/rfc5574).
         The stream is a sequence of blocks, one block per RTP packet. Each block
         starts with a byte containing the length of the block, in bytes, followed
         by one or more frames of Speex data, padded to an integral number of
         bytes (octets) as specified in RFC 5574. In other words, each RTP header
         is replaced with a single byte containing the block length. Only Speex
         wideband is supported. `sample_rate_hertz` must be 16000.
         
        SPEEX_WITH_HEADER_BYTE = 7;
        See Also:
        Constant Field Values
      • MP3_VALUE

        public static final int MP3_VALUE
         MP3 audio. MP3 encoding is a Beta feature and only available in
         v1p1beta1. Supports all standard MP3 bitrates (which range from 32 to
         320 kbps). When using this encoding, `sample_rate_hertz` has to match the
         sample rate of the file being used.
         
        MP3 = 8;
        See Also:
        Constant Field Values
      • WEBM_OPUS_VALUE

        public static final int WEBM_OPUS_VALUE
         Opus encoded audio frames in WebM container
         ([OggOpus](https://wiki.xiph.org/OggOpus)). `sample_rate_hertz` must be
         one of 8000, 12000, 16000, 24000, or 48000.
         
        WEBM_OPUS = 9;
        See Also:
        Constant Field Values
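
    The `sample_rate_hertz` constraints listed for the individual encodings can be gathered into a single client-side check. The sketch below is an assumption-laden illustration, not library code: `SampleRateRules` is a hypothetical class, encodings are keyed by name to keep the example free of dependencies, and encodings with no fixed rate stated on this page (including `MP3`, whose rate must match the file) are simply treated as unrestricted.

    ```java
    import java.util.Map;
    import java.util.Set;

    // Hypothetical helper, not part of the Cloud Speech client library.
    final class SampleRateRules {
        private static final Set<Integer> OPUS_RATES =
            Set.of(8000, 12000, 16000, 24000, 48000);

        // Fixed sample-rate constraints as stated in the encoding descriptions.
        private static final Map<String, Set<Integer>> ALLOWED = Map.of(
            "AMR", Set.of(8000),
            "AMR_WB", Set.of(16000),
            "OGG_OPUS", OPUS_RATES,
            "SPEEX_WITH_HEADER_BYTE", Set.of(16000),
            "WEBM_OPUS", OPUS_RATES);

        /** Returns false only when the rate violates a constraint listed above. */
        static boolean isAllowed(String encodingName, int sampleRateHertz) {
            Set<Integer> allowed = ALLOWED.get(encodingName);
            return allowed == null || allowed.contains(sampleRateHertz);
        }

        public static void main(String[] args) {
            System.out.println(isAllowed("AMR", 8000));        // true
            System.out.println(isAllowed("AMR_WB", 8000));     // false
            System.out.println(isAllowed("OGG_OPUS", 44100));  // false
            System.out.println(isAllowed("LINEAR16", 44100));  // true
        }
    }
    ```

    Checking before sending avoids a round trip for requests that would be rejected anyway, though the service remains the authority on what it accepts.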
    • Method Detail

      • values

        public static RecognitionConfig.AudioEncoding[] values()
        Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
        for (RecognitionConfig.AudioEncoding c : RecognitionConfig.AudioEncoding.values())
            System.out.println(c);
        
        Returns:
        an array containing the constants of this enum type, in the order they are declared
      • valueOf

        public static RecognitionConfig.AudioEncoding valueOf(String name)
        Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
        Parameters:
        name - the name of the enum constant to be returned.
        Returns:
        the enum constant with the specified name
        Throws:
        IllegalArgumentException - if this enum type has no constant with the specified name
        NullPointerException - if the argument is null
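
        Because the name must match exactly, callers that read encoding names from configuration or user input often wrap the lookup. The sketch below shows one such wrapper using standard `Enum.valueOf` behavior; the nested `Encoding` enum is a stand-in with a few constant names copied from this page, and `EncodingNameParser` is a hypothetical class, so the example runs without the client library.

        ```java
        import java.util.Optional;

        // Hypothetical helper; Encoding is a stand-in for the real enum.
        final class EncodingNameParser {
            enum Encoding { LINEAR16, FLAC, MULAW }

            /** Looks up a constant by name, returning empty instead of throwing. */
            static Optional<Encoding> parse(String name) {
                if (name == null) {
                    return Optional.empty();
                }
                try {
                    // valueOf rejects surrounding whitespace, so trim it first.
                    return Optional.of(Encoding.valueOf(name.trim()));
                } catch (IllegalArgumentException e) {
                    // No constant with this exact (case-sensitive) name.
                    return Optional.empty();
                }
            }

            public static void main(String[] args) {
                System.out.println(parse("FLAC"));
                System.out.println(parse(" FLAC "));
                System.out.println(parse("flac"));
            }
        }
        ```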
      • getNumber

        public final int getNumber()
        Specified by:
        getNumber in interface com.google.protobuf.Internal.EnumLite
        Specified by:
        getNumber in interface com.google.protobuf.ProtocolMessageEnum
      • valueOf

        @Deprecated
        public static RecognitionConfig.AudioEncoding valueOf(int value)
        Deprecated. Use forNumber(int) instead.
        Parameters:
        value - The numeric wire value of the corresponding enum entry.
        Returns:
        The enum associated with the given numeric wire value.
      • forNumber

        public static RecognitionConfig.AudioEncoding forNumber(int value)
        Parameters:
        value - The numeric wire value of the corresponding enum entry.
        Returns:
        The enum associated with the given numeric wire value.
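
        The relationship between `getNumber()` and `forNumber(int)` can be sketched with a stand-in enum that reuses the wire numbers listed on this page. `WireNumberLookup` and its nested `Encoding` enum are hypothetical. Returning `null` for an unmapped number is an assumption modeled on how protobuf-generated Java enums usually behave; this page does not state what the real method returns in that case.

        ```java
        // Hypothetical stand-in; the real enum is generated by protoc.
        final class WireNumberLookup {
            enum Encoding {
                ENCODING_UNSPECIFIED(0), LINEAR16(1), FLAC(2), MULAW(3), AMR(4),
                AMR_WB(5), OGG_OPUS(6), SPEEX_WITH_HEADER_BYTE(7), MP3(8), WEBM_OPUS(9);

                private final int number;

                Encoding(int number) {
                    this.number = number;
                }

                /** The numeric wire value of this constant. */
                int getNumber() {
                    return number;
                }

                /** Maps a wire value back to its constant; null if unmapped (assumed). */
                static Encoding forNumber(int value) {
                    for (Encoding e : values()) {
                        if (e.number == value) {
                            return e;
                        }
                    }
                    return null;
                }
            }

            public static void main(String[] args) {
                System.out.println(Encoding.forNumber(2));        // FLAC
                System.out.println(Encoding.FLAC.getNumber());    // 2
                System.out.println(Encoding.forNumber(42));       // null
            }
        }
        ```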
      • getValueDescriptor

        public final com.google.protobuf.Descriptors.EnumValueDescriptor getValueDescriptor()
        Specified by:
        getValueDescriptor in interface com.google.protobuf.ProtocolMessageEnum
      • getDescriptorForType

        public final com.google.protobuf.Descriptors.EnumDescriptor getDescriptorForType()
        Specified by:
        getDescriptorForType in interface com.google.protobuf.ProtocolMessageEnum
      • getDescriptor

        public static final com.google.protobuf.Descriptors.EnumDescriptor getDescriptor()
      • valueOf

        public static RecognitionConfig.AudioEncoding valueOf(com.google.protobuf.Descriptors.EnumValueDescriptor desc)
        Returns the enum constant of this type that corresponds to the specified value descriptor.
        Parameters:
        desc - the descriptor of the enum value to be returned.
        Returns:
        the enum constant described by the specified descriptor
        Throws:
        IllegalArgumentException - if the descriptor does not belong to this enum type
        NullPointerException - if the argument is null