Clinical Data Engineer (Associate Director) - Alta Petens
This position is currently classified as "hybrid" in accordance with Takeda's Hybrid and Remote Work policy.
Key to Takeda's success, the Clinical Data Sciences (CDS) team provides strategic planning, integration, execution, build, and oversight of clinical trial deliverables. The CDS group is responsible for integrating structured and unstructured data across the various data sources, for setup and data transfer/review, and for supporting downstream transformation and analysis.
Clinical Data Sciences comprises Clinical Data Engineering (CDE) and Clinical Data Standards. While Clinical Data Standards provides the standards for clinical operations and data flow, the Clinical Data Engineering team drives the data architecture for clinical data. CDS also supports exploratory and specialty data for the purposes of data modelling, simulation, and analysis.
Clinical Data Sciences (CDS):
Key to Takeda's success, the Clinical Data Engineering (CDE) team provides strategic planning, integration, execution, build, and oversight of clinical trial deliverables. CDE leads the integration, design, development, and execution of data pipelines for the ingestion of clinical data from all sources at an enterprise level for use by the clinical data configuration specialist at the study level. The CDE is an enterprise-level role and is primarily responsible for ensuring smooth end-to-end processes for data collection/ingestion from all data collection sources, providing an output into a data lake that is fit for use by downstream end users. The CDE is also responsible for developing and tracking KPIs and other measures across the business and providing continuous improvement for both processes and tools. The CDE will also develop and maintain libraries, tools, and reports to increase reuse and overall efficiency for study-level roles. The CDE should have a strong understanding of end-to-end clinical data collection and extraction processes as well as strong project management and technical experience. The CDE will work with cross-functional stakeholders to ensure alignment on processes and requirements and will often be required to convert these requirements into technical specifications. The CDE may also need to develop tools and visualizations as part of the continuous improvement process.
Under the guidance of Clinical Data Sciences, the AD, Clinical Data Engineer provides strategic guidance and leadership at the enterprise level for end-to-end data extraction, transformation, and construction of data pipelines that conform to the harmonized data model, ensuring data ingestion for all clinical data capture technologies and other related vendors and/or applications (e.g., EDC, IRT, ePRO, eCOA), as well as other data models that may be required by end users. Understands and ensures proper data formats for all downstream users of the data lake. Able to articulate and provide insights to cross-functional stakeholders on data engineering strategy with regard to data ingestion, transformation, and downstream consumption.
Provides technical and operational expertise to the data engineering team around developing and maintaining a library of reusable mapping and transformation functions to be used across studies. The CDE contributes to the successful conduct of Takeda's clinical trials and to the timely delivery of high-quality data, which is eventually used for statistical analysis and submitted to regulatory authorities for the approval of Takeda products. The CDE also monitors end-to-end performance and KPIs and provides continuous improvement to processes and tools. Further, CDE efforts enable valid secondary use of clinical trial data throughout Takeda research groups to maximize value and achieve company objectives.
Experience building data pipelines for various heterogeneous data sources
Identifying, designing and implementing scalable data delivery pipelines and automating manual processes
Building the required infrastructure for optimal extraction, transformation, and loading of data using cloud technologies such as AWS and Azure
Experience leading and managing teams in developing end-to-end processes at the enterprise level for use by the clinical data configuration specialist to prepare data extractions and transformations of raw data quickly and efficiently from various sources at the study level
Experience working with cross-functional teams and able to articulate data engineering's capabilities and insights for ingesting and managing data for downstream consumption
People management experience with data engineers and/or technical team members
Able to provide guidance and technical expertise to the internal team in building reusable ELT and ETL processes to ingest data into data warehouses and data lakes
Experience creating reusable data pipelines for heterogeneous data ingestion
Experience developing processes and best practices to manage and maintain pipelines and troubleshoot data in the data lake or warehouse
Hands-on knowledge of developing visualizations and analyses of data stored in the data lake
Technical and/or business knowledge in defining and tracking KPIs and providing continuous improvement
Out-of-the-box thinker and strategist who guides and empowers the team to develop and maintain tools, libraries, and reusable templates of data pipelines and standards for study-level consumption by the data configuration specialist
Collaborate with various vendors and cross-functional teams to build and align on data transfer specifications and ensure a streamlined process of data integration
Ability to forecast resource needs for the team
Able to manage study deliverable timelines and deliver on time with high quality
Ability to create and own Data Engineering roadmaps for future development
Able to work with ambiguities and open to interact with technical and non-technical teams
Work with the team to ensure automation and continuous validation of data pipelines
Participate in the development and maintenance of, and the training provided by standards and other functions on, transfer specs and best practices used by the business
Collaborate with the system architecture team in designing and developing data pipelines as per business needs
Network with key business stakeholders on refining and enhancing the integration of structured and non-structured data.
Well organized, able to communicate and collaborate with cross-functional teams across different disciplines
Provide expertise for structured and non-structured data ingestion
Develop organizational knowledge of key data sources and systems, and be a valuable resource to people in the company on how best to integrate data to pursue company objectives
Provides technical leadership on various aspects of clinical data flow, including assisting with the definition, build, and validation of application programming interfaces (APIs), data streams, and data staging to various systems for data extraction and integration
Experience in creating data integrity and data quality checks for data ingestion
Coordinates with database builders, clinical data configuration specialists, and data management (DM) programmers to ensure accuracy of data integration per SOPs
Provide technical support / consultancy and end-user support, work with Information Technology (IT) in troubleshooting, reporting, and resolving system issues
Develop and deliver training programs to internal and external teams, and ensure timely communication of new and/or revised data transfer specs
Continuous Improvement/Continuous Development
Understand end to end requirements for stakeholders and contribute to process and conventions for clinical data ingestion and data transfer agreements
Adhere to SOPs for computer system validation and all GCP (Good Clinical Practice) regulations
Ensure compliance with own Learning Curricula, corporate and/or GxP requirements
Assists with quality review of above activities performed by a vendor, as needed
Assess and enable clinical data visualization software in the data flows
Performs other duties as assigned within timelines
Performs clinical data engineering tasks according to applicable SOPs (standard operating procedures) and processes.
Bachelor's degree in computer science, statistics, biostatistics, mathematics, biology or other health related field or equivalent experience that provides the skills and knowledge necessary to perform the job.
BS with ~15+ years of experience. Minimum of 3 years' experience managing data engineering teams and around 5+ years of hands-on experience building data pipelines to manage heterogeneous data ingestion, or similar experience in data integration across multiple sources, including collected data.
Experience with Python/R, SQL, NoSQL
Cloud experience (e.g., AWS, Azure, or GCP)
Experience with GitLab, GitHub
Experience with Jenkins, GitLab
Experience deploying data pipelines in the cloud
Experience with Apache Spark (Databricks)
Experience setting up and working with data warehouses and data lakes (e.g., Snowflake, Amazon Redshift)
Experience setting up ELT and ETL
Experience with unstructured data processing and transformation
Experience developing and maintaining data pipelines for large amounts of data efficiently
Must understand database concepts. Knowledge of XML, JSON, APIs.
Demonstrated ability to lead projects and work groups. Strong project management skills. Proven ability to resolve problems independently and collaboratively.
Must be able to work in a fast-paced environment with demonstrated ability to juggle and prioritize multiple competing tasks and demands.
Ability to work independently, take initiative and complete tasks to deadlines.
Strong attention to detail and organizational skills
Strong understanding of end-to-end processes for data collection, extraction, and analysis needs of end users
Strong ability to communicate with cross-functional stakeholders
Good knowledge of office software (Microsoft Office).
Ability to visualize large datasets
Experience with Agile development methods
Experience working in pharma and/or healthcare industry
Is comfortable with ambiguity.
Excellent teamwork, organizational, interpersonal, conflict resolution and problem-solving skills.
Medium-High complexity project.
Base Salary Range: $143,500.00 to $205,000.00. Employees may also be eligible for Short-Term and Long-Term Incentive benefits. Employees are eligible to participate in Medical, Dental, Vision, Life Insurance, 401(k), Charitable Contribution Match, Holidays, Personal Days & Vacation, Tuition Reimbursement Program and Paid Volunteer Time Off.
The final salary offered for this position may take into account a number of factors including, but not limited to, location, skills, education, and experience.
Takeda is proud of its commitment to creating a diverse workforce and providing equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, gender expression, parental status, national origin, age, disability, citizenship status, genetic information or characteristics, marital status, status as a Vietnam era veteran, special disabled veteran, or other protected veteran in accordance with applicable federal, state and local laws, and any other characteristic protected by law.