
Data Scientist Indianapolis - IN - Indiana

Jobs via Dice
Indianapolis, IN Full Time
POSTED ON 6/15/2026
AVAILABLE BEFORE 7/12/2026
We are seeking a Databricks Data Scientist with strong experience in Databricks Lakehouse, advanced analytics, and Genie (AI/BI) to design, build, and deploy scalable data science and AI solutions. This role will focus on transforming enterprise data into actionable insights using machine learning, natural language analytics, and self-service BI powered by Databricks Genie.

You will work closely with medical, commercial, and R&D teams across the pharma and life sciences industry to build intelligent solutions that drive scientific and business impact, from drug discovery to commercial analytics to patient outcomes.

Python, PySpark, Databricks, MLflow, Spark ML, Delta Lake

Genie (AI/BI), Unity Catalog, SQL, NLP, GenAI, HIPAA, GxP

Key Responsibilities

Data Science & Machine Learning

Design, develop, and deploy machine learning models using Databricks (MLflow, Spark ML, Python) for pharma and life sciences use cases

Implement end-to-end ML pipelines covering data ingestion, feature engineering, model training, deployment, and monitoring

Build predictive models for patient identification, HCP segmentation, market access analytics, pharmacovigilance, and safety signal detection

Apply NLP and generative AI techniques (LLMs, RAG pipelines) to extract insights from medical literature, clinical notes, and regulatory documents

Conduct A/B testing, model validation, and statistical analysis to evaluate model performance and business impact

Collaborate with data engineers to ensure reliable, high-quality, production-ready datasets in the Lakehouse

Databricks & Lakehouse Architecture

Leverage Databricks Lakehouse (Delta Lake, Unity Catalog) for scalable, governed, and high-performance analytics

Design and optimize Spark jobs for performance and cost efficiency across large-scale pharma datasets

Apply best practices for data governance, data lineage, and security within Unity Catalog

Build and maintain a Bronze/Silver/Gold Medallion architecture for clinical, claims, and commercial data

Implement Delta Live Tables (DLT) pipelines with data quality checks for real-time and batch processing

Configure and manage Databricks Workflows, Repos, and cluster policies for production ML workloads

Genie (AI/BI & Natural Language Analytics)

Configure and enable Databricks Genie for self-service analytics across business and scientific teams

Design semantic layers and curated Gold datasets optimized for natural language queries via Genie

Define certified questions, trusted assets, and business glossary terms to improve Genie response quality

Partner with business stakeholders to translate complex pharma questions into Genie-enabled insights

Monitor and iterate on Genie Spaces based on user feedback, query accuracy, and adoption metrics

Enable non-technical users across Medical Affairs, Commercial, and R&D to self-serve data insights

Real-World & Clinical Data Analysis

Analyze real-world data (RWD), electronic health records (EHR), claims data, and clinical trial datasets to generate actionable insights

Build scalable data pipelines for pharma-specific sources including IQVIA, Symphony Health, Komodo, and specialty pharmacy data

Apply survival analysis, mixed models, and Bayesian methods for epidemiology and health economics (HEOR) studies

Ensure all models and data processes comply with HIPAA, GxP, and 21 CFR Part 11 regulations

Business Enablement & Stakeholder Collaboration

Work closely with product owners, analysts, and business leaders to identify and prioritize high-value data science use cases

Communicate complex analytical results and model outputs in a clear, business-friendly manner to non-technical audiences

Produce analytical documentation: model cards, design specs, performance reports, and executive summaries

Lead sprint ceremonies as analytics owner: architecture reviews, estimation sessions, and release planning

Required Qualifications

Experience: 4 years of professional experience in data science or advanced analytics, preferably in pharma, biotech, or life sciences

Education: Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Engineering, or a related field

Databricks: Hands-on experience with Databricks and Apache Spark for large-scale data processing and ML workloads

Python: Strong programming skills in Python (PySpark, Pandas, NumPy, Scikit-learn) for data science and ML development

MLflow: Experience building and deploying ML models in production using MLflow for experiment tracking and model lifecycle management

SQL: Solid understanding of SQL and data modeling for analytical and reporting workloads on large datasets

Delta Lake: Experience with Delta Lake, Unity Catalog, and Medallion architecture (Bronze/Silver/Gold) for Lakehouse analytics

Genie (AI/BI): Familiarity with Databricks Genie or AI/BI tools for natural language querying and self-service analytics

Healthcare Data: Experience working with clinical, claims, or real-world healthcare data (EHR, RWD, specialty pharmacy)

Compliance: Familiarity with HIPAA compliance and handling of sensitive patient data in regulated environments

Communication: Strong communication skills and the ability to translate complex models and analysis into clear, actionable business insights

Preferred Qualifications

Experience with Databricks Genie Spaces configuration, semantic layer design, and certified question management

Hands-on experience with Delta Live Tables (DLT) for streaming and batch data quality pipelines

Familiarity with LLMs, RAG pipelines, or generative AI for medical and scientific use cases

Knowledge of GxP validation and 21 CFR Part 11 compliance for production ML models

Experience with IQVIA, Symphony Health, Komodo Health, or similar pharma data vendors

Familiarity with clinical trial data standards: CDISC, SDTM, ADaM

Experience with pharmacovigilance, drug safety signal detection, or regulatory analytics

Knowledge of AWS or Azure cloud services for ML deployment: SageMaker, Azure ML, Lambda, or equivalent

Databricks certifications: Databricks Certified Machine Learning Professional or Data Engineer Associate

PhD in a quantitative or life sciences field is a plus

Prior experience in large-scale IT consulting or services delivery (TCS, Infosys, Accenture, Wipro, or similar)

Skills: Digital: Databricks

Experience Required: 4-6 years

Salary.com Estimation for Data Scientist Indianapolis - IN - Indiana in Indianapolis, IN
$116,812 to $143,790




