
Summer Internship - Data Science / Data Engineering

MKThink
San Francisco, CA Full Time
POSTED ON 4/14/2026
AVAILABLE BEFORE 5/13/2026
We are seeking a Data Science or Data Engineering intern (graduate student preferred, advanced undergraduate considered) to support development of an unstructured data extraction pipeline. The intern will build systems that ingest heterogeneous documents, identify relevant information, map extracted content to a target schema, and improve output accuracy through iterative user feedback.

About MKThink

MKThink is a future-forward design firm grounded in spatial intelligence and dedicated to “build less, solve more.” Our data-informed solutions improve human performance at lower operational, environmental, and capital cost than conventional approaches. Founded in 2000, MKThink practices from the Pacific Edge of San Francisco to the Oceanic Edge of O'ahu. At MKThink, we believe we can help create a better and more sustainable future by creating intelligent spaces that improve quality of life. Our greatest resource is our staff and their ability to contribute fully as teammates and individuals. We bring together thinkers from various disciplines to solve problems at the nexus of architecture, culture, and the environment. Our people have the interdisciplinary skills to contribute to this mission within and across the domains of architecture, strategies, and innovation.

Overview

Build an end-to-end pipeline to extract and structure data from heterogeneous, unstructured documents (e.g., PDFs with high format variance). Work includes document parsing, ML/NLP-based extraction, schema alignment, and confidence scoring. Implement a human-in-the-loop feedback system to iteratively improve accuracy (target: ≥90% extraction and mapping accuracy, converging in at most 3 iterations). Requires strong Python and experience with data pipelines, machine learning, or unstructured data processing. Graduate students preferred.

Key Responsibilities

  • Build data pipelines for ingesting and processing unstructured documents, including PDFs with inconsistent structure and content
  • Develop extraction workflows that combine document parsing, feature engineering, ML/NLP methods, and rule-based logic to identify relevant fields
  • Design methods to evaluate what content is useful, discard irrelevant content, and align extracted information to a predefined schema
  • Implement confidence scoring, validation, and error-handling logic to improve extraction accuracy and reliability
  • Build a human-in-the-loop feedback workflow where users can confirm, reject, or correct extracted fields and trigger reruns toward improved output

Preferred Background

  • Graduate student in Data Science, Computer Science, Engineering, or a related field preferred
  • Strong Python and experience with data pipelines, machine learning, or NLP
  • Experience working with unstructured data, document intelligence, information extraction, or schema mapping
  • Comfortable working on applied modeling and data engineering problems with ambiguous inputs and variable document formats
  • Proactive and highly self-motivated, able to operate independently with minimal guidance and supervision

How To Apply

Please submit the following:

  • Resume
  • Cover letter describing your fit for the role.
  • Work samples demonstrating your visual and writing skills.
  • Include "DS Summer Internship – [Your Name]" in the subject line.



