Stage 1: Snowflake Data Transformation & De-Identification

Developed a Snowflake stored procedure and orchestrated daily automation to process upstream claims transactions for a major pharma analytics client. Worked cross-functionally to integrate data science-derived views, enforce NCPDP standards, and ensure accurate, rights-based record selection.

Project Overview

  • Collaborated with data science and business stakeholders to identify, validate, and select the upstream datasets essential for claims processing.
  • Worked alongside the data science team as they built the B1, B2, and B3 data views; integrated these into the downstream transformation pipeline for nightly and ad hoc processing.
  • Created, maintained, and parameterized a secure Snowflake stored procedure that de-identified and transformed claims according to NCPDP standards and project privacy requirements.
  • Set up a Snowflake task to automate daily processing for claims with external sharing rights, loading qualifying records into a project master table for analytics and delivery.
  • Supported manual, on-demand historical runs as needed for multi-year data backfills to meet client requirements.
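The rights-based selection and de-identification steps above can be sketched roughly as follows. The project itself used a Snowflake stored procedure; this Python version only illustrates the logic. Every field name (`patient_id`, `patient_name`, `street_address`, `date_of_birth`, `external_sharing_allowed`) and the keyed-hash approach are assumptions made for illustration, not the actual NCPDP field layout or the rules applied in the project.

```python
import hashlib
import hmac

# Hypothetical direct identifiers; the real claim layout is not given in the source.
DIRECT_IDENTIFIERS = {"patient_name", "street_address"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest in place of the raw identifier."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify_claim(claim: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers, pseudonymize the patient ID, keep only birth year."""
    out = {k: v for k, v in claim.items() if k not in DIRECT_IDENTIFIERS}
    out["patient_id"] = pseudonymize(claim["patient_id"], secret_key)
    # Assumes an ISO "YYYY-MM-DD" date string; only the year is retained.
    out["birth_year"] = int(out.pop("date_of_birth")[:4])
    return out


def select_shareable(claims: list[dict], secret_key: bytes) -> list[dict]:
    """Keep only claims flagged for external sharing, then de-identify them."""
    return [
        deidentify_claim(c, secret_key)
        for c in claims
        if c.get("external_sharing_allowed") is True
    ]
```

Filtering before transforming means records without sharing rights are never written to the output, and the keyed hash gives the same pseudonym for the same patient across nightly and backfill runs as long as the key is unchanged.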

Quality, Auditability & Compliance

  • Tracked all runs and transformations with comprehensive metadata and logging for transparency, audit support, and troubleshooting.
  • Ensured consistent, documented, and compliant application of de-identification rules with clear separation of creation, validation, and execution responsibilities.
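As an illustration of the run tracking described above, a minimal sketch of one audit record per run might look like this. The record fields, the `run_type` labels, and the `finish_run` helper are hypothetical; the source does not describe the actual metadata schema.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class RunRecord:
    run_type: str                      # e.g. "scheduled" or "backfill" (assumed labels)
    rows_read: int = 0
    rows_loaded: int = 0
    status: str = "running"
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    started_at: str = field(default_factory=_utc_now)
    finished_at: Optional[str] = None


def finish_run(record: RunRecord, rows_read: int, rows_loaded: int,
               status: str = "succeeded") -> dict:
    """Close out a run and return a flat dict ready to insert into an audit table."""
    record.rows_read = rows_read
    record.rows_loaded = rows_loaded
    record.status = status
    record.finished_at = _utc_now()
    return asdict(record)
```

Recording rows read alongside rows loaded makes it easy to see how many records were excluded by the sharing-rights filter in any given run.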

Results & Value Delivered

  • Provided secure, NCPDP-compliant, de-identified datasets ready for onward tokenization and client analytics integration.
  • Enabled automated and on-demand delivery of claims for external use, with documented lineage and consistent application of the project's privacy requirements.

Tech Stack: Snowflake, SQL, AWS, ETL, Orchestration