What are the responsibilities and job description for the NLP Research Scientist position at Harnham?
NLP Research Scientist
$180,000 – $200,000
Washington, DC Metro Area - Hybrid (2-3 days per week)
THE COMPANY
Our client is a global technology organization at the forefront of AI innovation, known for applying cutting-edge research to real-world challenges across legal, tax, government, and media domains. Their Labs division operates as a fast-moving, collaborative innovation hub focused on foundational research, model development, and applied AI.
With one of the world’s richest proprietary data sources, they are uniquely positioned to train next-generation language models and build intelligent systems with measurable, real-world impact. The Labs team partners with leading universities, domain experts, and internal product teams to develop, test, and scale transformative technologies.
THE ROLE
You will play a key role in advancing the company’s internal machine learning capabilities, especially in natural language processing (NLP). You'll work on large-scale data pipelines that support training and evaluation for a range of NLP models, not limited to LLMs, with a focus on real-world performance and scalable deployment.
In this role, you will:
- Design and optimize robust, scalable data pipelines for processing and curating high-quality text data for NLP model training
- Build systems that connect data collection, filtering, and annotation workflows with training and evaluation pipelines
- Collaborate with applied researchers to implement performance-driven data selection strategies and model feedback loops
- Work closely with legal and domain experts to ensure training data aligns with real-world use cases
- Support version control, testing, and backup systems to ensure data integrity and reproducibility
- Contribute to foundational research projects in NLP, with opportunities to publish or support publication in top-tier ML/NLP venues
- Partner with academic collaborators and internal cross-functional teams on long-term innovation projects
- Help integrate the latest advances in NLP into high-growth, user-facing products across multiple business units
WHO YOU ARE
You’re a curious and technically skilled engineer with a passion for data-driven AI research. You’re comfortable working in complex environments and bring a strong foundation in systems engineering, machine learning, and data architecture.
Required Qualifications
- Degree in a technical field (e.g., computer science, data engineering, ML)
- Experience in applied ML (e.g., few-shot learning, NLP classifiers)
- Strong software engineering skills with experience in debugging and system design
- Proficiency with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB)
- Experience with orchestration tools and cloud data platforms (AWS, GCP, or Azure)
- Excellent written and verbal communication skills
- Ability to work independently and collaborate with cross-functional teams
Preferred Qualifications
- Familiarity with the legal domain or interest in legal tech
- Hands-on experience with Spark, Hadoop, or other big data tools
- Experience in research or academic publishing in ML
- Open-source code contributions or infrastructure development background
WHAT WE OFFER
- $180,000 – $200,000 USD/year, based on experience
- Comprehensive benefits including health, dental, vision, 401(k), tuition reimbursement, and wellness support
HOW TO APPLY
Please express your interest by submitting your resume via the Apply link on this page.
KEYWORDS
Machine Learning | Research Engineering | LLM | NLP | Data Pipelines | Cloud Platforms | AWS | Azure | GCP | Spark | NoSQL | Legal Tech | AI Research | Python | SQL | Orchestration Tools | Big Data | Foundational Models