What are the responsibilities and job description for the Data Scientist position at Santcore Technologies?
- Data Scientist (ML & AI Modeling) – (Focus - ML)
- Ideally, has a PhD (must have master's at a minimum)
- Any Status
- Hybrid: 3-days/week onsite
- Position will run thru Fiscal Year End (April 2026) w/ a possible extension
- ONE 60-minute MS Teams interview
- Need one photo ID, please
Responsibilities:
As a data scientist, you will use expertise in modeling, data engineering, and building applications with LLMs. Your responsibilities will span a wide range of data science activities, such as those listed below:
- Collaborate with users to deliver big data and machine learning models and AI/LLM solutions.
- Use AI/ML to empower business with novel capabilities such as automating workflows by utilizing machine learning and LLM systems.
- Research and analyze complex data sets, combining different sources and types of data to develop machine learning and deep learning models that extract value from data.
- Design and implement efficient, adaptable, scalable, and reliable pipelines and algorithms to process unstructured and structured data.
- Implement new statistical or other mathematical methodologies as needed for specific models or analysis.
- Leverage or build cloud-based technologies and solutions to deliver optimized ML models at scale.
- Prototype solutions and conduct experiments highlighting results and lessons learned.
- Provide technical expertise and guidance to users on AI/ML and LLM best practices.
- Help the users evaluate the AI/ML solution from a technical perspective.
- Leverage industry knowledge and stay close to the latest technology trends.
- Contribute to the development of sample applications, tutorials, presentations and training material for big data and machine learning technologies.
Qualifications:
- Master's or Ph.D. in Computer Science, Data Science, or related field, plus a minimum of four years of relevant professional experience or a master's degree.
- A strong background in machine learning algorithms, natural language processing and LLMs.
- Proficiency in programming languages, including SQL, R, Python, Java, and MATLAB.
- Demonstrated work experience with prompt engineering, retrieval-augmented generation architectures, and LangChain/LangGraph or similar tools and frameworks.
- Demonstrated work experience with extract, transform, and load (ETL) for large-scale, complex data sets.
- Demonstrated work experience with structured/unstructured data and parallel/distributed computation (PySpark).
- Knowledge of cloud technologies and services (Microsoft Azure, AWS, etc.).
- Knowledge of network configuration, information risk and security guidelines.
- Knowledge of version control systems, software configuration management and source code lifecycle management tools.
- Knowledge of data architecture and application architecture, target state design and strategy.
- Ability to research and identify innovative approaches for data acquisition, as well as new uses for existing datasets.
- Ability to effectively leverage knowledge, skills, tools, and techniques in managing complex programs and projects, ensuring alignment with defined business, technical, and security requirements.
- Ability to research, design, develop, implement and manage applications based on specific business needs.