What are the responsibilities and job description for the Data Scientist position at SoTalent?
Job Title: Data Scientist
Location: Philadelphia, Pennsylvania, United States
Type: Full Time
Our client is looking for a talented and passionate Data Scientist to design, build, and deploy advanced AI solutions that enhance knowledge discovery across a global research ecosystem.
If you’re excited about working with cutting-edge AI technologies and making a meaningful impact in how scientific knowledge is accessed and used, this role is for you.
About the Role
In this role, you will work with large and diverse scientific data sources, including publications, datasets, citations, and knowledge graphs. You will help build intelligent systems that make knowledge more discoverable, connected, and actionable for researchers worldwide.
Key Responsibilities
- Design and deploy machine learning, NLP, and generative AI solutions
- Build intelligent systems for search, recommendation, ranking, and question-answering
- Develop solutions that connect datasets, publications, and knowledge graphs
- Fine-tune and deploy large language models (LLMs) and retrieval-augmented generation (RAG) systems
- Create evaluation frameworks to measure quality, reliability, and impact
- Build scalable data pipelines and ML workflows for continuous improvement
- Apply a mix of classical ML, deep learning, and generative AI techniques
- Collaborate with cross-functional teams to deliver practical AI solutions
- Write clean, production-quality Python code and reusable components
- Continuously enhance system performance and real-world value
Required Qualifications
- Degree in Data Science, AI, Machine Learning, Computer Science, or a related field
- Strong Python programming experience in production environments
- Solid foundation in machine learning (modeling, evaluation, optimization)
- Experience working with large-scale structured and unstructured datasets
- Hands-on experience with LLMs, embeddings, retrieval systems, and generative AI
- Familiarity with frameworks such as scikit-learn, PyTorch, TensorFlow, or Hugging Face
- Strong problem-solving skills with a data-driven approach