Platform Integrations

The TitanRDM SDK provides platform-specific sync classes for BigQuery and Snowflake, in addition to the Spark/Databricks integration. Each extends the same ConventionSync base class and follows the same naming conventions.


BigQuery — BigQuerySync

Convention-based sync between Google BigQuery and TitanRDM.

Naming Convention

{project}.{dataset}.{domain_abbreviation}_{database_table_name}
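
As a minimal sketch of how the convention composes a table ID, the helper below joins the four parts. The function name and the example abbreviation `cln` and table name `clinic_locations` are hypothetical, chosen only for illustration; the SDK may build names differently internally.

```python
def bigquery_table_id(project: str, dataset: str,
                      domain_abbreviation: str, table_name: str) -> str:
    """Compose {project}.{dataset}.{domain_abbreviation}_{database_table_name}."""
    return f"{project}.{dataset}.{domain_abbreviation}_{table_name}"


# Hypothetical domain abbreviation and table name, for illustration only.
print(bigquery_table_id("my-gcp-project", "rdmout", "cln", "clinic_locations"))
# prints: my-gcp-project.rdmout.cln_clinic_locations
```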

Setup

from titan_rdm_sdk import TitanRDMClient, BigQuerySync

# URL, ID, and SECRET are placeholders for your TitanRDM URL and client credentials.
client = TitanRDMClient(url=URL, client_id=ID, client_secret=SECRET)

# Option A: Use application default credentials
sync = BigQuerySync(client=client, project="my-gcp-project")

# Option B: Pass an existing BigQuery client
from google.cloud import bigquery
bq_client = bigquery.Client(project="my-gcp-project")
sync = BigQuerySync(client=client, bq_client=bq_client)

Requirements: google-cloud-bigquery and pandas-gbq packages.
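
The setup snippet leaves URL, ID, and SECRET undefined. One possible way to supply them is from environment variables, sketched below. The variable names (TITAN_RDM_URL and so on) and the helper itself are assumptions made for this example and are not defined by the SDK.

```python
import os


def load_titan_settings(env=os.environ) -> dict:
    """Collect TitanRDMClient keyword arguments from environment variables.

    The variable names are hypothetical; rename them to match your deployment.
    """
    names = {
        "url": "TITAN_RDM_URL",
        "client_id": "TITAN_RDM_CLIENT_ID",
        "client_secret": "TITAN_RDM_CLIENT_SECRET",
    }
    # Fail early with a single message listing everything that is missing.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned dictionary can then be unpacked into the constructor shown above, for example `TitanRDMClient(**load_titan_settings())`.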

Upload: BigQuery → TitanRDM

branch = client.get_branch_by_name("prod")

results = sync.upload_sync_by_convention(
    branch_id=branch.id,
    source_dataset="rdmout",
    target_domain_name="Clinics",
)

Upload Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| branch_id | int | Yes | Target branch ID |
| source_dataset | str | Yes | Source BigQuery dataset (e.g. 'rdmout') |
| target_domain_name | str | Yes | Exact domain name in TitanRDM |
| target_table_names | list[str] | No | Filter to specific tables |
| source_project | str | No | GCP project (defaults to init project) |
| description | str | No | Import batch description |
| correlation_code | str | No | Tracking identifier |

Download: TitanRDM → BigQuery

results = sync.download_sync_by_convention(
    branch_id=branch.id,
    target_dataset="rdmin",
    source_domain_name="Clinics",
)

Download Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| branch_id | int | Yes | Target branch ID |
| target_dataset | str | Yes | Destination BigQuery dataset (e.g. 'rdmin') |
| source_domain_name | str | Yes | Exact domain name in TitanRDM |
| source_table_names | list[str] | No | Filter to specific tables |
| target_project | str | No | GCP project (defaults to init project) |
| correlation_code | str | No | Tracking identifier prefix |
| poll_interval | float | No | Seconds between checks (default: 2.0) |
| max_wait | float | No | Max wait per export (default: 300.0) |
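
To illustrate how poll_interval and max_wait interact, here is a generic polling loop using the same defaults. It is a sketch of the general pattern only, not the SDK's implementation; the `check` callable stands in for whatever export status check the SDK performs, and whether the SDK raises or returns on timeout is not stated here.

```python
import time


def wait_until_done(check, poll_interval: float = 2.0, max_wait: float = 300.0) -> bool:
    """Call check() every poll_interval seconds until it returns True.

    Returns False if the next sleep would run past max_wait seconds.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if check():
            return True
        # Stop rather than sleep past the deadline.
        if time.monotonic() + poll_interval > deadline:
            return False
        time.sleep(poll_interval)
```

With the defaults, an export that never completes is checked roughly every 2 seconds for up to 300 seconds. Lowering poll_interval means more status checks; raising max_wait allows for larger exports.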

BigQuery Prerequisites

  1. Create datasets for sync:

     CREATE SCHEMA IF NOT EXISTS `my-project.rdmin`;
     CREATE SCHEMA IF NOT EXISTS `my-project.rdmout`;

  2. Ensure your service account has BigQuery Data Editor permissions on the target datasets.
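
If you prefer to create the datasets from Python rather than SQL, the step can be sketched with the BigQuery client's `create_dataset` method. The helper name is hypothetical, and the dataset names simply mirror the SQL above.

```python
def ensure_sync_datasets(bq_client, project: str, datasets=("rdmin", "rdmout")) -> list:
    """Create each dataset if it does not exist and return the dataset IDs used."""
    dataset_ids = [f"{project}.{name}" for name in datasets]
    for dataset_id in dataset_ids:
        # exists_ok=True suppresses the error when the dataset already exists.
        bq_client.create_dataset(dataset_id, exists_ok=True)
    return dataset_ids


# Usage, assuming google-cloud-bigquery is installed and credentials are configured:
#   from google.cloud import bigquery
#   ensure_sync_datasets(bigquery.Client(project="my-gcp-project"), "my-gcp-project")
```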


Snowflake — SnowparkSync

Convention-based sync between Snowflake and TitanRDM via Snowpark.

Naming Convention

{database}.{schema}.{domain_abbreviation}_{database_table_name}
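
Read in the other direction, a fully qualified name can be split back into its parts. The sketch below assumes the domain abbreviation contains no underscore, so the first underscore separates it from the table name, and that identifiers are unquoted and contain no dots. The helper and the example name are hypothetical.

```python
from typing import NamedTuple


class ConventionName(NamedTuple):
    database: str
    schema: str
    domain_abbreviation: str
    table_name: str


def split_convention_name(qualified_name: str) -> ConventionName:
    """Split database.schema.abbreviation_table into its four parts."""
    # Raises ValueError if the name does not have exactly three dot-separated parts.
    database, schema, key = qualified_name.split(".")
    # Assumption: the abbreviation has no underscore, so the first one is the separator.
    abbreviation, sep, table_name = key.partition("_")
    if not sep or not abbreviation or not table_name:
        raise ValueError(f"{key!r} does not look like abbreviation_table")
    return ConventionName(database, schema, abbreviation, table_name)


# Hypothetical table name, for illustration only.
parts = split_convention_name("ANALYTICS.RDMOUT.CLN_CLINIC_LOCATIONS")
print(parts.domain_abbreviation, parts.table_name)
```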

Setup

from titan_rdm_sdk import TitanRDMClient, SnowparkSync

# URL, ID, and SECRET are placeholders for your TitanRDM URL and client credentials.
client = TitanRDMClient(url=URL, client_id=ID, client_secret=SECRET)

# Option A: In a Snowflake notebook (session auto-detected)
sync = SnowparkSync(client=client)

# Option B: Pass an existing Snowpark session
from snowflake.snowpark import Session
session = Session.builder.configs(connection_params).create()
sync = SnowparkSync(client=client, session=session)

Requirements: snowflake-snowpark-python package.

Upload: Snowflake → TitanRDM

branch = client.get_branch_by_name("prod")

results = sync.upload_sync_by_convention(
    branch_id=branch.id,
    source_database="ANALYTICS",
    source_schema="RDMOUT",
    target_domain_name="Clinics",
)

Upload Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| branch_id | int | Yes | Target branch ID |
| source_database | str | Yes | Source Snowflake database |
| source_schema | str | Yes | Source schema (e.g. 'RDMOUT') |
| target_domain_name | str | Yes | Exact domain name in TitanRDM |
| target_table_names | list[str] | No | Filter to specific tables |
| description | str | No | Import batch description |
| correlation_code | str | No | Tracking identifier |

Download: TitanRDM → Snowflake

results = sync.download_sync_by_convention(
    branch_id=branch.id,
    target_database="ANALYTICS",
    target_schema="RDMIN",
    source_domain_name="Clinics",
)

Download Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| branch_id | int | Yes | Target branch ID |
| target_database | str | Yes | Destination Snowflake database |
| target_schema | str | Yes | Destination schema (e.g. 'RDMIN') |
| source_domain_name | str | Yes | Exact domain name in TitanRDM |
| source_table_names | list[str] | No | Filter to specific tables |
| correlation_code | str | No | Tracking identifier prefix |
| poll_interval | float | No | Seconds between checks (default: 2.0) |
| max_wait | float | No | Max wait per export (default: 300.0) |

Snowflake Prerequisites

  1. Create schemas for sync:

     CREATE SCHEMA IF NOT EXISTS ANALYTICS.RDMIN;
     CREATE SCHEMA IF NOT EXISTS ANALYTICS.RDMOUT;

  2. Ensure your Snowflake role has USAGE on the database and CREATE TABLE / INSERT on the target schemas.
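
The grants in step 2 could be generated along the following lines. This is a sketch under assumptions: the role name TITAN_SYNC is hypothetical, the schema list mirrors step 1, and USAGE on each schema is added because Snowflake requires it to reach objects inside a schema. Review the statements against your own access policy before running them.

```python
def grant_statements(role: str, database: str, schemas=("RDMIN", "RDMOUT")) -> list:
    """Build GRANT statements for the privileges listed in the prerequisites."""
    statements = [f"GRANT USAGE ON DATABASE {database} TO ROLE {role};"]
    for schema in schemas:
        target = f"{database}.{schema}"
        statements += [
            # Assumption: schema-level USAGE is also needed to reach the tables.
            f"GRANT USAGE ON SCHEMA {target} TO ROLE {role};",
            f"GRANT CREATE TABLE ON SCHEMA {target} TO ROLE {role};",
            # INSERT is a table-level privilege, so it is granted on existing tables.
            f"GRANT INSERT ON ALL TABLES IN SCHEMA {target} TO ROLE {role};",
        ]
    return statements


# TITAN_SYNC is a hypothetical role name.
for statement in grant_statements("TITAN_SYNC", "ANALYTICS"):
    print(statement)
```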


Platform Comparison

| Feature | SparkSync | BigQuerySync | SnowparkSync |
|---|---|---|---|
| Platform | Databricks / Spark | Google BigQuery | Snowflake |
| Convention | catalog.schema.key | project.dataset.key | database.schema.key |
| Session | SparkSession | BigQuery Client | Snowpark Session |
| Write mode | overwrite + overwriteSchema | WRITE_TRUNCATE | overwrite |
| Package | pyspark | google-cloud-bigquery | snowflake-snowpark-python |

Next Steps