Downloading Data

Export and download data from TitanRDM as pandas DataFrames. The SDK supports full and incremental export patterns with built-in polling for export completion.

Note: If you are using Databricks or another Spark-based environment, consider using the SparkSync class instead of creating downloads manually. SparkSync automatically discovers tables and creates downloads for them.

Note: SparkSync uses the ConventionSync class internally. If you prefer more control over the download process, you can use ConventionSync directly with pandas DataFrames.


Concepts

Term               Description
Download (Export)  An asynchronous operation that prepares data for download from TitanRDM.
Pattern            'full' exports the complete dataset; 'incremental' exports only rows changed since a given timestamp.
High Water Mark    An ISO 8601 timestamp used with incremental exports to fetch only newer data.

Full Download Workflow

from titan_rdm_sdk import TitanRDMClient

client = TitanRDMClient(
    url="https://your-tenant.titanrdm.com",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

# Step 1: Create an export
download = client.get_download(
    branch_id=174,
    table_definition_key=50,
    pattern="full",
    correlation_code="daily-export-2025-01-22",
)
print(f"Export created: ID={download.id}, Status={download.status}")

# Step 2: Wait for the export to complete
final_status = download.wait_until_ready(poll_interval=2.0, max_wait=300.0)
print(f"Export ready: status={final_status}")

# Step 3: Download the data
df = download.receive()
print(f"Downloaded {len(df)} rows, {len(df.columns)} columns")
print(f"Duration: {download.duration_secs} seconds")

Incremental Download

Use pattern="incremental" with a high_water_mark to export only rows modified after a given timestamp:

download = client.get_download(
    branch_id=174,
    table_definition_key=50,
    pattern="incremental",
    high_water_mark="2025-01-21T00:00:00Z",
    correlation_code="incremental-sync",
)

download.wait_until_ready()
df = download.receive()
print(f"Downloaded {len(df)} changed rows since 2025-01-21")

The high_water_mark must be in ISO 8601 format (e.g. 2025-01-21T00:00:00Z).
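
If the timestamp comes from a Python datetime rather than a literal string, it has to be converted to that format first. The following is a minimal sketch using only the standard library; the helper name to_high_water_mark is hypothetical and not part of the SDK, and it assumes that naive datetimes are already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_high_water_mark(moment: datetime) -> str:
    """Format a datetime as an ISO 8601 UTC string with a trailing 'Z'."""
    if moment.tzinfo is None:
        # Assumption: a naive datetime is already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A naive datetime is treated as UTC.
naive_mark = to_high_water_mark(datetime(2025, 1, 21))
print(naive_mark)  # 2025-01-21T00:00:00Z

# An aware datetime is converted to UTC before formatting.
plus_eleven = timezone(timedelta(hours=11))
aware_mark = to_high_water_mark(datetime(2025, 1, 21, 11, 0, tzinfo=plus_eleven))
print(aware_mark)  # 2025-01-21T00:00:00Z
```

The resulting string can then be passed as the high_water_mark argument of client.get_download().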


Polling Behaviour

Exports are asynchronous: the server prepares the data in the background. The wait_until_ready() method polls the export status every poll_interval seconds and waits at most max_wait seconds:

download.wait_until_ready(
    poll_interval=2.0,   # Check every 2 seconds
    max_wait=300.0,      # Time out after 5 minutes
)
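
To make the two parameters concrete, here is a rough sketch of a generic polling loop with the same knobs. It is not the SDK's implementation: the function wait_for_status is hypothetical, and raising TimeoutError when max_wait is exceeded is an assumption made for illustration, since the behaviour of wait_until_ready() on timeout is not described here.

```python
import time
from typing import Callable


def wait_for_status(
    check_status: Callable[[], str],
    poll_interval: float = 2.0,
    max_wait: float = 300.0,
) -> str:
    """Poll check_status() until it returns 'ready' or 'failed', or time runs out."""
    deadline = time.monotonic() + max_wait
    while True:
        status = check_status()
        if status in ("ready", "failed"):
            return status
        # Stop rather than sleep past the deadline (assumed timeout behaviour).
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"Export still '{status}' after {max_wait} seconds")
        time.sleep(poll_interval)


# Stand-in for download.check_status: reports 'started' twice, then 'ready'.
responses = iter(["started", "started", "ready"])
result = wait_for_status(lambda: next(responses), poll_interval=0.01, max_wait=1.0)
print(result)  # ready
```

A shorter poll_interval notices completion sooner at the cost of more status requests; max_wait bounds how long the caller blocks.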

Manual Status Checking

For more control, poll the status yourself:

import time

download = client.get_download(
    branch_id=174,
    table_definition_key=50,
    pattern="full",
)

while True:
    status = download.check_status()
    print(f"Status: {status}")

    if status == "ready":
        df = download.receive()
        break
    elif status == "failed":
        print(f"Export failed: {download.message}")
        break

    time.sleep(2)

Export Statuses

Status   Description
started  Export is being prepared
ready    Export is complete and available for download
failed   Export failed (check download.message for details)

Download Object Properties

Property              Type  Description
id                    int   Export ID
status                str   'started', 'ready', or 'failed'
branch_id             int   Branch ID
table_definition_key  int   Table definition key
pattern               str   Export pattern
correlation_code      str   User-defined tracking ID
rows_exported         int   Number of rows exported
files_exported        int   Number of files generated
bytes_exported        int   Total bytes exported
message               str   Status/error message
duration_secs         int   Export duration in seconds
started_datetime      str   When the export started
completed_datetime    str   When the export completed
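
These properties are convenient for logging. As an illustrative sketch, the snippet below builds a one-line summary from rows_exported, bytes_exported, and duration_secs. ExportStats is a hypothetical stand-in so the example runs without a server, and the summarize helper is not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class ExportStats:
    """Hypothetical stand-in holding a few of the properties listed above."""
    rows_exported: int
    bytes_exported: int
    duration_secs: int


def summarize(stats) -> str:
    """Build a one-line summary with throughput, guarding against a zero duration."""
    mib = stats.bytes_exported / (1024 * 1024)
    if stats.duration_secs > 0:
        rate = f"{stats.rows_exported / stats.duration_secs:.0f} rows/s"
    else:
        rate = "n/a rows/s"
    return f"{stats.rows_exported} rows, {mib:.1f} MiB in {stats.duration_secs}s ({rate})"


summary = summarize(ExportStats(rows_exported=120_000, bytes_exported=52_428_800, duration_secs=40))
print(summary)  # 120000 rows, 50.0 MiB in 40s (3000 rows/s)
```

Because summarize only reads attributes, any object exposing those three attribute names with numeric values should work in place of ExportStats.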

Parameters Reference

client.get_download()

Parameter             Type  Required     Description
branch_id             int   Yes          Target branch ID
table_definition_key  int   Yes          Table definition key
pattern               str   No           'full' or 'incremental' (default: 'full')
correlation_code      str   No           User-defined tracking identifier
high_water_mark       str   Conditional  ISO 8601 timestamp (required for 'incremental')
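
The conditional requirement on high_water_mark can be checked on the client side before an export is requested. The sketch below is one possible pre-flight check, not SDK behaviour: validate_download_args is a hypothetical helper, and datetime.fromisoformat() is used as an approximate ISO 8601 check because it accepts only a subset of the standard on older Python versions.

```python
from datetime import datetime
from typing import Optional


def validate_download_args(pattern: str = "full", high_water_mark: Optional[str] = None) -> None:
    """Raise ValueError if the arguments break the rules in the table above."""
    if pattern not in ("full", "incremental"):
        raise ValueError(f"pattern must be 'full' or 'incremental', got {pattern!r}")
    if pattern == "incremental":
        if high_water_mark is None:
            raise ValueError("high_water_mark is required when pattern='incremental'")
        # fromisoformat() only accepts a trailing 'Z' from Python 3.11, so normalise it.
        # It raises ValueError itself if the string cannot be parsed.
        datetime.fromisoformat(high_water_mark.replace("Z", "+00:00"))


# A valid incremental request passes silently.
validate_download_args("incremental", "2025-01-21T00:00:00Z")

# A missing high_water_mark is rejected.
try:
    validate_download_args("incremental")
except ValueError as exc:
    error_text = str(exc)
    print(error_text)  # high_water_mark is required when pattern='incremental'
```

Failing early this way avoids creating an export that the server would presumably reject.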

download.wait_until_ready()

Parameter      Type   Default  Description
poll_interval  float  2.0      Seconds between status checks
max_wait       float  300.0    Maximum seconds to wait

Next Steps