Execute a data query using the dataset endpoint for more complex queries.
- You want to query data to form an ad hoc analysis within a project that does not already exist as a chart, metric, or experiment.
- You want to tweak an existing chart to glean more information from it (e.g. add a group by, change the time range, etc.)
1. Use the 'search' tool to find existing charts that relate to the data you want to query. 2. Use the 'get_charts' tool to understand the events, properties, and dataset schema generally. 3. Optionally use the 'search' tool again to find additional events, user properties, etc. needed for the query. 4. Optionally use the 'get_event_properties' tool to get properties on individual events. 3. Use this tool to query the ad hoc analysis.
- Don't assume or guess properties, events, or schema. Use the tools provided to you to understand the data before running a dataset query.
- When running into query failures, try searching for existing charts to understand the data taxonomy and dataset schema.
- When you receive a 400 error response the schema is likely incorrect or the events/properties do not exist.
Special system events available for analysis. Events are passed in the "event_type" field:
- "_active": Any active event useful for tracking 'active users' like DAU, MAU(events not marked as inactive)
- "_all": Any event being tracked in Amplitude
- "_new": Events triggered by new users within the time interval. Useful for tracking 'new users'.
- "_any_revenue_event": Any revenue-generating event. Useful for tracking 'revenue'.
- "$popularEvents": Top events by volume (dynamically computed). Useful for more meta taxonomy analyses like 'what are the most common events'.
- AMPLITUDE CORE PROPERTIES: Built-in properties automatically tracked by Amplitude
- Have source: "AMPLITUDE"
- Use standard naming (lowercase, underscores)
- Available for all events and users
- CUSTOM PROPERTIES: Organization-defined properties
- Have source: "CUSTOMER"
- ALWAYS prefixed with "gp:" (e.g., "gp:email", "gp:full name")
- Include experiment flags like "gp:[Experiment] feature-name"
- user_id: Primary user identifier
- amplitude_id: Amplitude's internal user ID
- device_id: Unique device identifier
- session_id: Session identifier
- event_id: Unique event identifier
- ip_address: User's IP address
- country: Country (from GeoIP)
- city: City (from GeoIP)
- region: State/region (from GeoIP)
- dma: Designated Market Area
- $postal_code: Postal/ZIP code
- location_lat: Latitude coordinate
- location_lng: Longitude coordinate
- platform: Device platform (iOS, Android, Web, etc.)
- device: Device family name
- device_type: Type of device (mobile, tablet, desktop)
- os: Operating system
- language: Device language setting
- carrier: Mobile carrier
- version: Current app version
- start_version: Version when user first used the app
- library: SDK library used to send data
- server_upload_time: When data was received by Amplitude
- paying: Whether user is a paying customer
RESPONSE FORMAT: Returns {isCsvResponse: bool, csvResponse or jsonResponse, definition}. Only ONE response type present. Check the isCsvResponse flag to determine which response format to parse
CSV Response Structure (when isCsvResponse is true):
- Header rows: The top rows contain metadata including chart name, description, events, formulas, and other chart configuration details
- Data header row: A single row containing column labels for the data points below (typically includes dates or time periods)
- Data rows: Each row contains:
- Label columns: First few columns contain row labels identifying the data series
- Value columns: Numerical data organized under the corresponding date/time columns from the data header row
- Parse by: Skip metadata rows, identify the data header row, then extract labels from first columns and values from remaining columns
- Cells in the CSV response are delimited by commas and may be prepended with a character
Example below measures uniques of custom event "Valuable Tweaking" over 3 days (2025-08-23, 2025-08-24, 2025-08-25) for all users. The data points are 614, 1769, and 4132 for the 3 days respectively. data: " Example chart name" " Formula"," UNIQUES(A)" " A:"," [Custom] 'Valuable Tweaking'"
" Segment"," 2025-08-23"," 2025-08-24"," 2025-08-25" " All Non-Amplitude Users","614","1769","4132" definition: { "app": "APP_ID", "params": { "countGroup": "User", "end": 1756166399, "events": [ { "event_type": "ce:'Valuable Tweaking'", "filters": [], "group_by": [] } ], "groupBy": [], "interval": 1, "metric": "uniques", "segments": [], "start": 1755907200, }, "type": "eventsSegmentation", }
JSON Response Structure (when isCsvResponse is false):
- Parse using the following structure:
- timeSeries: Array of arrays, each containing data point for a given time period with a "value" property
- overallSeries: Array of arrays, each containing data the overall data point (across the entire range) under the "value" property
- seriesMetadata: Array of objects containing metadata for each series
- xValuesForTimeSeries: Array of strings representing the x-axis values (dates) for the time series
- Use the dataset definition to be able to parse referenced events, properties, and segments.
Example below is a JSON response is for the same query as the CSV example above. { "timeSeries": [[{"value": 614}, {"value": 1769}, {"value": 4132}]], "overallSeries": [[{"value": 5642}]], "seriesMetadata": [{"segmentIndex": 0, "formulaIndex": 0, "formula": "UNIQUES(A)"}], "xValuesForTimeSeries": ["2025-08-23T00:00:00", "2025-08-24T00:00:00", "2025-08-25T00:00:00"] }