Use the Google API to convert sound files to text.

Parameters:

See dedicated page for more information.
GoogleSpeechToText converts audio into text using Google Cloud Speech-to-Text. It supports three operation modes (selected with idMode):
local files less than 1min
Google Cloud Storage URI (reads audio from gs:// without uploading)
local files with upload
The box emits one output row per input audio file with the recognized transcript and metadata (language, sample rate, status, etc.).
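As a rough illustration of how these modes could map onto the public Speech-to-Text v1 REST API, the sketch below builds the request body for one file. This is an assumption about the underlying calls, not a description of the box's internals; `build_request` and its parameters are hypothetical names. Inline audio goes to the synchronous `speech:recognize` endpoint (which Google limits to roughly one minute of audio), while `gs://` URIs go to `speech:longrunningrecognize`.

```python
import base64

SYNC_URL = "https://speech.googleapis.com/v1/speech:recognize"
LONG_URL = "https://speech.googleapis.com/v1/speech:longrunningrecognize"


def build_request(id_file, language_code="en-US", sample_rate_hz=16000,
                  encoding="LINEAR16", audio_bytes=None):
    """Return (url, body) for one audio file (hypothetical helper)."""
    config = {
        "encoding": encoding,
        "sampleRateHertz": sample_rate_hz,
        "languageCode": language_code,
    }
    if id_file.startswith("gs://"):
        # Audio already in Cloud Storage: reference it by URI, no upload.
        return LONG_URL, {"config": config, "audio": {"uri": id_file}}
    if audio_bytes is None:
        with open(id_file, "rb") as fh:
            audio_bytes = fh.read()
    # Short local audio is sent inline as base64 text.
    content = base64.b64encode(audio_bytes).decode("ascii")
    return SYNC_URL, {"config": config, "audio": {"content": content}}
```

In an upload-style mode, the local file would first be copied to the bucket and then passed to the same function as a `gs://` URI.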
Enable APIs
In Google Cloud Console: APIs & Services → Library
Enable:
Configure OAuth consent
Create OAuth client
Generate a refresh token (OAuth 2.0 Playground)
Open OAuth 2.0 Playground (https://developers.google.com/oauthplayground).
Click ⚙️ (top-right) → Use your own OAuth credentials → paste Client ID/Secret.
Step 1: In the scope box paste and select:
https://www.googleapis.com/auth/cloud-platform
https://www.googleapis.com/auth/devstorage.read_write
(First scope covers Speech-to-Text; second covers GCS read/write when uploading or reading from gs://.)
Click Authorize APIs → pick your Google account → allow access.
Step 2: Click Exchange authorization code for tokens.
Copy Refresh token (string like 1//04...).
Store the refresh token as a secret in atrnatv. The access token will rotate automatically; the refresh token persists.
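To check a refresh token outside the tool, it can be exchanged for a short-lived access token at Google's OAuth 2.0 token endpoint. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the credentials are whatever you obtained in the steps above.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build the form-encoded POST that trades a refresh token for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=form, method="POST")


def fetch_access_token(client_id, client_secret, refresh_token, timeout=30):
    """Perform the exchange (needs network access) and return the access token."""
    req = build_refresh_request(client_id, client_secret, refresh_token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["access_token"]
```

A failed exchange raises `urllib.error.HTTPError`, which usually points to a wrong client ID/secret or a revoked refresh token.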
Formats: WAV (Linear16), FLAC, OGG/Opus, MP3 (mono recommended).
Sample rate: 8,000–48,000 Hz. For OGG/Opus, set the sample rate parameter (idOggSampleRate) to the file's exact sample rate.
Safe default we use: 16,000 Hz.
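Before submitting a WAV file, it can help to confirm that it matches the constraints above. The sketch below is an illustrative pre-check, not part of the box: it reads the WAV header with Python's standard `wave` module and reports sample rate, channel count, bit depth, and duration. `inspect_wav` and `check_wav` are hypothetical helper names.

```python
import wave


def inspect_wav(path):
    """Read basic properties from a WAV header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        frames = wav.getnframes()
    return {
        "sampleRateHz": rate,
        "channels": channels,
        "bitsPerSample": sample_width * 8,
        "durationSec": frames / rate if rate else 0.0,
    }


def check_wav(info):
    """Return a list of problems relative to the documented recommendations."""
    problems = []
    if not 8000 <= info["sampleRateHz"] <= 48000:
        problems.append("sample rate outside 8,000-48,000 Hz")
    if info["channels"] != 1:
        problems.append("not mono")
    if info["bitsPerSample"] != 16:
        problems.append("Linear16 expects 16-bit samples")
    return problems
```

The reported duration also indicates which mode fits: files longer than 60 seconds need one of the GCS-based modes.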
Duration
Provide at minimum:
idFile: path to local file or gs://bucket/object depending on mode.
When Local with upload mode is used (long files handled locally), also provide or configure:
bucket: your GCS bucket (e.g., my_bucket).
idFileRemote: remote object name to use in bucket.

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| idFile | string (path/URI) | Yes | — | Audio file to process. Local path in Local modes; gs://... in GCS URI mode. |
| allLong | boolean | No | Off | If On, treat all audio as long-form. In Local with upload mode this batches files for upload; retention controlled by idDelete. |
| idLang | enum | Yes | English (United States) | Language of the audio (e.g., English (United Kingdom), French (France), Chinese (Simplified), Spanish (Spain), Russian (Russia)). |
| idOggSampleRate | enum | Conditional | 16000 | Required for OGG/Opus. Sample rate in samples/sec (8000/12000/16000/24000/48000). |
| idMode | enum | Yes | local files less than 1min | Operating mode: local files less than 1min, Google Cloud Storage URI, or local files with upload. |
| clientID | string (secret) | Yes | — | Google OAuth Client ID. |
| clientSecret | string (secret) | Yes | — | Google OAuth Client Secret. |
| refresh_token | string (secret) | Yes | — | OAuth 2.0 refresh token with scopes cloud-platform and devstorage.read_write (if GCS). |
| idFileRemote | string | Conditional | (auto) | Remote filename when idMode = local files with upload. If blank, uses the source filename. |
| bucket | string | Conditional | — | GCS bucket when idMode = local files with upload or using upload helpers. |
| idDelete | boolean | Conditional | Off | If On, delete uploaded GCS objects after transcription (applies to upload mode). |
| idOptional | string | No | — | Extra cURL flags (e.g., timeouts). Use with care. |
| idDebug | enum | No | nothing | Logging level: nothing, basic, or verbose. |
| nRetry | integer | No | 5 | Retries on transient connection errors. |
| idErrorManagement | enum | No | continue with status ERROR | Error handling: continue with status ERROR, or abort pipeline execution with error. |
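
The Required and Conditional entries in the table above can be expressed as a small pre-flight check. This is a hedged sketch of such a check, not the box's actual validation logic; `validate_params` is a hypothetical helper that takes the parameters as a plain dictionary keyed by the names in the table.

```python
MODE_LOCAL_SHORT = "local files less than 1min"
MODE_GCS_URI = "Google Cloud Storage URI"
MODE_LOCAL_UPLOAD = "local files with upload"


def validate_params(params):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    mode = params.get("idMode", MODE_LOCAL_SHORT)  # documented default
    id_file = params.get("idFile", "")

    if not id_file:
        errors.append("idFile is required")
    for name in ("clientID", "clientSecret", "refresh_token"):
        if not params.get(name):
            errors.append(f"{name} is required")

    if mode == MODE_GCS_URI:
        if id_file and not id_file.startswith("gs://"):
            errors.append("idFile must be a gs:// URI in GCS URI mode")
    elif mode in (MODE_LOCAL_SHORT, MODE_LOCAL_UPLOAD):
        if id_file.startswith("gs://"):
            errors.append("idFile must be a local path in Local modes")
        if mode == MODE_LOCAL_UPLOAD and not params.get("bucket"):
            errors.append("bucket is required in local files with upload mode")
    else:
        errors.append(f"unknown idMode: {mode!r}")
    return errors
```

Running a check like this before launching a long batch surfaces configuration mistakes early instead of as per-file ERROR rows.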

Interdependency rules
local files less than 1min: bucket, idDelete, idFileRemote not used; audio duration must be ≤ 60 s.
Google Cloud Storage URI: idFile must be a gs://bucket/object; bucket optional (ignored).
local files with upload: bucket required; optional idFileRemote; idDelete controls post-run cleanup.

The action writes one record per input file.

| Column | Type | Description |
|---|---|---|
| file | string | Source file path or GCS object processed. |
| language | string | Language code used (e.g., en-US). |
| sampleRateHz | integer | Sample rate used by recognizer. |
| transcript | string | Final recognized text. |
| confidence | number | Confidence score (0–1) if provided by API. |
| status | string | OK, ERROR, or SKIPPED. |
| errorMessage | string | Error message when status=ERROR. |
| requestId | string | API request identifier (when available). |
| durationSec | number | Parsed audio duration (if available). |

Exact output columns can vary slightly by atrnatv build; the core fields above are emitted in standard builds.

| Symptom | Likely cause | Fix |
|---|---|---|
| Missing required parameter: refresh_token | OAuth token not provided or saved as empty secret. | Generate a refresh token using OAuth Playground and paste into refresh_token. |
| Unauthorized / 401 | Scopes missing or wrong client/secret. | Recreate token with cloud-platform scope; re-enter clientID/clientSecret. |
| AccessDenied: storage.objects.create | No write permission to bucket when uploading. | Grant the account Storage Object Admin on the target bucket or use a bucket you own. |
| Audio > 60s fails in local <1min mode | Mode limitation. | Switch to GCS URI or local with upload. |
| Poor recognition quality | Wrong language or sample rate. | Set idLang correctly; for OGG set idOggSampleRate to exact value (often 16000 for telephony). |
| Timeout / flakiness | Large files or network issues. | Increase nRetry, add --max-time 120 in idOptional, or move files to GCS. |
| Empty transcript | Non-speech audio, silent segments, or unsupported codec. | Verify codec and that the audio contains clear speech; convert to FLAC/LINEAR16 16 kHz mono. |

Choose the idLang variant that matches the speakers' locale (e.g., English (United Kingdom) vs English (United States)).
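
The troubleshooting table suggests converting problematic audio to FLAC at 16 kHz mono. One common way to do this is with the ffmpeg command-line tool, which is an assumption here and not something the box ships with. The sketch below builds the command with ffmpeg's standard `-ac` (channels) and `-ar` (sample rate) options; the helper names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst=None, sample_rate_hz=16000):
    """Build an ffmpeg command that writes a mono FLAC copy of src."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix(".flac")
    if dst == src:
        raise ValueError("destination must differ from source")
    return [
        "ffmpeg", "-y",              # overwrite the output if it exists
        "-i", str(src),
        "-ac", "1",                  # downmix to mono
        "-ar", str(sample_rate_hz),  # resample
        str(dst),
    ]


def convert(src, dst=None, sample_rate_hz=16000):
    """Run the conversion; requires ffmpeg on PATH."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    cmd = build_ffmpeg_command(src, dst, sample_rate_hz)
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

The converted file can then be passed as idFile in place of the original.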