Stage 1: Snowflake Data Transformation & De-Identification
Developed a Snowflake stored procedure and a daily automated task to process upstream claims transactions for a major pharma analytics client. Worked cross-functionally to integrate views built by the data science team, enforce NCPDP standards, and select only records with external sharing rights.
Project Overview
- Collaborated with data science and business stakeholders to identify, validate, and select the upstream datasets essential for claims processing.
- Worked alongside the data science team as they built the B1, B2, and B3 data views; integrated these into the downstream transformation pipeline for nightly and ad hoc processing.
- Created, maintained, and parameterized a secure Snowflake stored procedure that de-identified and transformed claims according to NCPDP standards and project privacy requirements.
- Set up a Snowflake task to automate daily processing for claims with external sharing rights, loading qualifying records into a project master table for analytics and delivery.
- Supported manual, on-demand historical runs for multi-year data backfills when the client required them.
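
The selection and de-identification logic described above can be sketched roughly as follows. This is a minimal Python illustration, not the Snowflake stored procedure itself: the field names (`patient_id`, `patient_name`, `patient_address`, `service_date`, `external_sharing_allowed`), the salted SHA-256 hashing, and the list of dropped identifiers are all assumptions made for the example, and the project's actual de-identification rules are not reproduced here.

```python
import hashlib
from datetime import date

# Hypothetical direct identifiers to drop; the project's real rule set is not shown here.
DIRECT_IDENTIFIERS = {"patient_name", "patient_address"}


def deidentify(claim: dict, salt: str) -> dict:
    """Return a copy of the claim with direct identifiers dropped and the patient ID hashed."""
    out = {k: v for k, v in claim.items() if k not in DIRECT_IDENTIFIERS}
    # Assumed approach: replace the patient ID with a salted SHA-256 hex digest.
    digest = hashlib.sha256((salt + str(claim["patient_id"])).encode("utf-8")).hexdigest()
    out["patient_id"] = digest
    return out


def select_and_transform(claims, start: date, end: date, salt: str):
    """Keep claims in [start, end] that carry external sharing rights, then de-identify them."""
    return [
        deidentify(c, salt)
        for c in claims
        if c["external_sharing_allowed"] and start <= c["service_date"] <= end
    ]


claims = [
    {"claim_id": 1, "patient_id": "P001", "patient_name": "A. Example",
     "service_date": date(2024, 1, 5), "external_sharing_allowed": True},
    {"claim_id": 2, "patient_id": "P002", "patient_name": "B. Example",
     "service_date": date(2024, 1, 5), "external_sharing_allowed": False},
]

# Daily run: a one-day window. A backfill would pass a multi-year window instead.
daily = select_and_transform(claims, date(2024, 1, 5), date(2024, 1, 5), salt="demo-salt")
```

Passing the date window as parameters is what lets one routine serve both the scheduled daily run and an on-demand historical backfill; only the window changes.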
Quality, Auditability & Compliance
- Tracked all runs and transformations with comprehensive metadata and logging for transparency, audit support, and troubleshooting.
- Ensured consistent, documented, and compliant application of de-identification rules with clear separation of creation, validation, and execution responsibilities.
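
As a rough illustration of per-run tracking, the sketch below wraps a transformation so that every run, successful or not, leaves one audit record. It is written in Python for illustration only; the record fields (`run_id`, `run_type`, `rows_in`, `rows_out`, `status`) and the in-memory `audit_log` list are hypothetical stand-ins for whatever metadata tables the project actually used.

```python
import uuid
from datetime import datetime, timezone


def run_with_audit(run_type: str, rows_in: list, transform, audit_log: list) -> list:
    """Apply transform to rows_in and append one audit record describing the run."""
    record = {
        "run_id": str(uuid.uuid4()),
        "run_type": run_type,  # e.g. "daily" or "backfill"
        "started_at": datetime.now(timezone.utc),
        "rows_in": len(rows_in),
    }
    try:
        rows_out = transform(rows_in)
        record.update(status="SUCCESS", rows_out=len(rows_out))
        return rows_out
    except Exception as exc:
        # Record the failure, then re-raise so the caller still sees the error.
        record.update(status="FAILED", rows_out=0, error=str(exc))
        raise
    finally:
        # Runs on both paths, so failed runs are logged as well.
        record["finished_at"] = datetime.now(timezone.utc)
        audit_log.append(record)


audit_log = []
kept = run_with_audit(
    "daily", [1, 2, 3, 4], lambda rows: [r for r in rows if r % 2 == 0], audit_log
)
```

Writing the record in a `finally` block means a failed run is still visible in the log, which is what makes the log useful for troubleshooting as well as audit.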
Results & Value Delivered
- Provided secure, NCPDP-compliant de-identified datasets, ready for onward tokenization and client analytics integration.
- Enabled automated and on-demand delivery of claims for external use, with documented lineage and consistent application of the project's de-identification rules.
Tech Stack: Snowflake, SQL, AWS, ETL, Orchestration