What are the responsibilities and job description for the Senior Machine Learning Engineer position at TetraMem - Accelerate The World?
Responsibilities
- Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.
- Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.
- Work closely with hardware and software teams to integrate ML models into production systems.
- Research and implement state-of-the-art ML techniques to improve model efficiency and reduce latency and power consumption in embedded AI applications.
- Improve inference efficiency and model compression techniques, including quantization, pruning, and knowledge distillation.
- Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.
- Provide technical leadership and mentorship to junior engineers.
- Publish research findings, present at conferences, and contribute to open-source projects when applicable.
Qualifications
- 5 years of relevant industry experience (or a PhD) in Computer Science, Electrical Engineering, Machine Learning, or related fields.
- Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices.
- Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX.
- Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment.
- Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference.
- Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar).
- Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption).
- Understanding of ML compiler and runtime design.
- Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or CoreML.
- Familiarity with hardware acceleration techniques.
- Experience in embedded system development.
Salary: $200,000 - $280,000