Description:
**PRIORITY TRIGGER CONDITIONS - Use THIS Tool When User Mentions Recording:**
**HIGH PRIORITY - Always use THIS tool when user explicitly mentions:**
- "recording" + any action (play, watch, view, listen, open, start, etc.)
- "play recording" | "play the recording"
- "watch the recording" | "view the recording" | "listen to the recording"
- "open the recording" | "start the recording" | "launch the recording"
- "jump to recording" | "go to recording" | "navigate to recording"
- "recording content" | "recording details" | "recording transcript"
- "what's in the recording" | "recording summary" | "recording analysis"
**MEDIUM PRIORITY - Use THIS tool when context suggests recording focus:**
- User asks about specific content AND mentions time/timeline (e.g., "What did they say at 5 minutes?")
- User wants to search/find specific topics within content (implies transcript search)
- User asks about speakers, dialogue, or conversation details
- User mentions playback, timestamps, or navigation within content
**LOW PRIORITY - Use ASSETS tool first, then THIS tool if needed:**
- General meeting information requests without "recording" keyword
- "Show me my meetings" / "What meetings do I have?"
- Basic meeting metadata (duration, participants, date) without content focus
CRITICAL: When to Use This Tool vs. Assets Tool
Use THIS RECORDING tool when:
- User explicitly mentions "recording" in their request
- User wants detailed recording content: transcript text, per-topic summaries, next steps
- User needs to search within a recording (e.g., "What did they say about X?")
- User wants to jump to a specific part of a recording (requires transcript + playback URL)
- User needs granular timeline data (who said what and when)
- User wants to play/watch/view/listen to a recording
Use the ASSETS tool when:
- User wants meeting-level information (meeting summary, recordings list, whiteboards, docs)
- User asks for "meeting assets", "meeting recordings", "meeting summary" without mentioning "recording" specifically
- User wants to know what recordings exist for a meeting
- User needs basic recording metadata (duration, file size, download links)
ENHANCED DECISION RULE:
1. FIRST CHECK: Does the user's request contain the word "recording" or recording-related actions?
- YES → Use THIS tool immediately
- NO → Continue to step 2
2. SECOND CHECK: Is the user asking about specific content, transcript, or playback?
- YES → Use THIS tool
- NO → Use ASSETS tool
3. FALLBACK: If still unclear and user mentions "meeting" without "recording" → Use ASSETS tool first (see the sketch below)
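The routing above is a wording heuristic; the following is a minimal Python sketch of that logic, assuming a naive keyword check (the word lists and the `choose_tool` name are illustrative, not part of any API):

```python
# Illustrative only: a crude keyword approximation of the decision rule above.
RECORDING_KEYWORDS = {"recording", "recordings"}
CONTENT_SIGNALS = {"transcript", "playback", "timestamp", "said", "discussed", "speaker", "dialogue"}

def choose_tool(user_request: str) -> str:
    """Return "RECORDING" (this tool) or "ASSETS" (the assets tool)."""
    words = set(user_request.lower().replace("?", "").split())

    # 1. FIRST CHECK: explicit "recording" mention -> use THIS tool immediately.
    if words & RECORDING_KEYWORDS:
        return "RECORDING"

    # 2. SECOND CHECK: specific content, transcript, or playback focus -> use THIS tool.
    if words & CONTENT_SIGNALS:
        return "RECORDING"

    # 3. FALLBACK: general meeting questions without "recording" -> ASSETS tool first.
    return "ASSETS"

# e.g. choose_tool("Play the recording") -> "RECORDING"
#      choose_tool("What meetings do I have today?") -> "ASSETS"
```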
Examples of Enhanced Triggering:
ALWAYS use THIS tool for:
- "Play the recording"
- "I want to watch the recording"
- "Show me the recording content"
- "What's in this recording?"
- "Open the recording file"
- "Start the recording"
- "Jump to 5 minutes in the recording"
- "Find where they talked about pricing in the recording"
Use ASSETS tool first for:
- "Show me my meetings"
- "What meetings do I have today?"
- "Meeting summary" (without "recording")
- "My recent meetings"
Context-dependent (analyze user intent):
- "What did we discuss?" → If no recording context, use ASSETS; if recording context exists, use THIS tool
- "Show me the summary" → If user was previously discussing recordings, use THIS tool; otherwise use ASSETS
Use this tool whenever you need detailed content or granular metadata for a specific recording:
- transcript text (to answer questions, find topics, or locate time ranges),
- summaries (overall or per-topic),
- next steps / action items,
- or playable URLs (for jumping to a specific point in the recording).
This tool can return:
- transcript segments,
- summary segments,
- next-step segments,
- playback URLs,
- and utility information derived from these (e.g., recording duration).
If the user's request is unclear OR you are not sure which resource type is needed, omit the "types" parameter so that ALL available resources are fetched. You can then decide which parts to use to answer the user.
Parameters:
- types (string, optional):
A comma-separated list of resource types to retrieve. Valid values (case-insensitive, no spaces needed):
- "transcript" → transcript segments for the recording
- "summary" → summary segments for the recording
- "nextStep" → next-step segments for the recording
- "playUrl" → playback URL segments for the recording
Behavior rules for "types":
- If omitted or an empty string, the tool MUST fetch ALL available resource types.
- If some values are invalid, ignore the invalid ones and still fetch the valid types.
- If the user is asking about the content or meaning of the recording
(e.g., "What did they say about X?", "Find the part where they discussed Y"), you SHOULD include "transcript" in types.
- If the user wants to jump to or play a specific part of the recording
(e.g., "Take me to the part where they talk about pricing"), you SHOULD include both "transcript" and "playUrl" in types.
- raw_passcode (string, optional):
The plaintext passcode for playing the recording (e.g., a passcode copied directly from a Zoom shared link). Up to 10 characters, consisting of lowercase letters, digits, and/or special characters (e.g., !, @, #); it may also be all digits.
- encode_passcode (string, optional):
The encrypted passcode used for playing the recording: an encoded string typically appended to the recording's share link as the pwd parameter (i.e., the pwd=xxx value from a Zoom shared link).
- clip_num (integer, optional):
The index (1-based or as defined by the implementation) of the recording clip to retrieve, for recordings that contain multiple clips. If the requested clip number exceeds the available clip count, the tool should return null.
- play_time (integer, optional):
The user's desired playback starting time in SECONDS relative to the selected clip. For example:
- play_time = 0 → start from the beginning of the clip
- play_time = 20 → start 20 seconds into the clip
When the user asks to "jump to", "play from", or "go to the part where...", you can:
1) Use the returned transcripts to find the relevant segment and its timestamp, then
2) Call or interpret play URLs with an appropriate play_time value (see the sketch after this parameter list).
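As an illustration of how these parameters combine, here is a hedged sketch of the "jump to the part where they talk about pricing" flow. The `call_tool` wrapper and `find_topic_offset` helper are hypothetical; only the parameter names (`types`, `play_time`) and the timestamp format come from this spec:

```python
from datetime import timedelta

def call_tool(**params) -> dict:
    """Hypothetical wrapper around this tool's invocation; replace with the real call."""
    return {}  # placeholder

def ts_to_seconds(ts: str) -> int:
    """Convert a transcript timestamp "HH:mm:ss.SSS" into whole seconds."""
    hh, mm, ss = ts.split(":")
    return int(timedelta(hours=int(hh), minutes=int(mm), seconds=float(ss)).total_seconds())

def find_topic_offset(result: dict, topic: str) -> int:
    """Return the start offset, in seconds, of the first transcript sentence mentioning `topic`."""
    for transcript in result.get("transcripts", []):
        for sentence in transcript.get("timeline", []):
            if topic.lower() in sentence["text"].lower():
                return ts_to_seconds(sentence["ts"])
    return 0  # topic not found: fall back to the beginning of the clip

# 1) Fetch transcript and play URLs together ("types" is a comma-separated string).
result = call_tool(types="transcript,playUrl")
# 2) Locate the topic in the transcript and convert its "ts" timestamp into seconds.
offset = find_topic_offset(result, "pricing")
# 3) Request (or interpret) the play URLs with the desired starting offset.
playback = call_tool(types="playUrl", play_time=offset)
```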
Behavior:
- ENHANCED PREREQUISITE CHECK:
- If user mentions "recording" or recording-related actions → Use THIS tool immediately
- If user asks about content analysis, transcript, or playback → Use THIS tool
- If user asks general meeting questions without "recording" context → Use ASSETS tool first
- The tool returns ONLY the resource types requested via "types".
- If "types" is empty or omitted, the tool returns ALL available resource types.
- If a particular resource type does not exist, its field MUST be omitted from the response.
- Recording duration can be computed using:
  duration = recording_end - recording_start
  (see the sketch after this list)
- Returned transcript data can (and should) be used by the model to:
- answer content questions,
- search for specific topics or phrases,
- determine appropriate time ranges for playback.
- Returned play URLs can be used by the client or the model to initiate playback
from a desired start time.
- If the user's question is ambiguous or mixed (e.g., might need both content and playback),
you SHOULD omit "types" so that all resources are retrieved.
- IMPORTANT: This tool has PRIORITY when "recording" is explicitly mentioned. For general meeting/recording discovery without the "recording" keyword, use the ASSETS tool first.
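As a concrete reading of the duration rule above, a minimal sketch assuming recording_start and recording_end are ISO 8601 timestamp strings (this spec does not pin down the exact timestamp format):

```python
from datetime import datetime

def recording_duration_seconds(recording_start: str, recording_end: str) -> float:
    """duration = recording_end - recording_start, assuming ISO 8601 timestamp strings."""
    start = datetime.fromisoformat(recording_start)
    end = datetime.fromisoformat(recording_end)
    return (end - start).total_seconds()

# e.g. recording_duration_seconds("2024-05-01T10:00:00", "2024-05-01T10:42:30") -> 2550.0
```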
Return Structure (JSON):
{ "transcripts": [ { "timeline": [ { "text": "string", "ts": "HH:mm:ss.SSS", // start time relative to recording start "end_ts": "HH:mm:ss.SSS", // end time relative to recording start "display_name":"string", // the person speaking in this sentence may be empty; if empty, it means they are not a Zoom user. } ], "recording_id": "string", "meeting_id": "string", "recording_start": "timestamp (string)", "recording_end": "timestamp (string)" } ], "summaries": [ { "recordingStartTime": number, "recording_start": "timestamp (string)", "total_items": number, "overall_summary": "string | null", "items": [ { "label": "string", "summary": "string" } ] } ], "next_steps": [ { "recordingStartTime": number, "recording_start": "timestamp (string)", "total_items": number, "items": [ { "text": "string" } ] } ], "play_urls": [ { "recording_start": "timestamp (string)", "urls": [ "string" ], "play_time": number } ] }
Notes:
- Include each top-level field (transcripts, summaries, next_steps, play_urls)
ONLY if corresponding data exists.
- If no resource exists for a requested type, omit that field entirely.
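Because every top-level field may be omitted, a consumer should treat each one as optional. A minimal sketch of defensively reading a response, assuming it has already been parsed into a Python dict (the output of `summarize_response` is illustrative):

```python
def summarize_response(response: dict) -> list[str]:
    """Collect human-readable lines from whichever resource types were returned."""
    lines = []
    for transcript in response.get("transcripts", []):   # field omitted if no transcript exists
        lines.append(f"{len(transcript.get('timeline', []))} transcript sentences")
    for summary in response.get("summaries", []):        # field omitted if no summary exists
        if summary.get("overall_summary"):
            lines.append(f"Overall: {summary['overall_summary']}")
        lines.extend(f"{item['label']}: {item['summary']}" for item in summary.get("items", []))
    for step in response.get("next_steps", []):          # field omitted if no next steps exist
        lines.extend(item["text"] for item in step.get("items", []))
    for play in response.get("play_urls", []):           # field omitted if no play URL exists
        lines.extend(play.get("urls", []))
    return lines
```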
Enhanced Tool Selection Examples:
**CORRECT - Use THIS tool immediately:**
- "Play the recording" → THIS tool (explicit recording mention)
- "I want to watch the recording" → THIS tool (explicit recording mention)
- "What's in this recording?" → THIS tool (explicit recording mention)
- "Open the recording" → THIS tool (explicit recording mention)
- "Find the part where they discussed pricing in the recording" → THIS tool (explicit recording mention)
**WRONG - Should use THIS tool, not ASSETS:**
- User: "Play the recording" → Calling ASSETS tool
- User: "Show me the recording content" → Calling ASSETS tool
**CORRECT - Use ASSETS tool first:**
- "What meetings do I have?" → ASSETS tool (no recording mention)
- "Show me my meeting summary" → ASSETS tool (meeting-level, no recording mention)
- "My recent meetings" → ASSETS tool (no recording mention)
Context-dependent examples:
- "What did we discuss about the budget?" → If user was previously talking about recordings, use THIS tool; otherwise use ASSETS
- "Show me the summary" → If recording context exists, use THIS tool; otherwise use ASSETS
Rule of thumb:
- Recording keyword present → THIS tool has PRIORITY
- Recording actions (play, watch, view, listen, open) → THIS tool has PRIORITY
- Content analysis without recording context → Consider both tools based on specificity
- General meeting discovery → ASSETS tool first