Machine Learning Engineer at Pond

Full Time
1 month ago
Employment Information
Job Description
We are looking for an experienced distributed deep learning engineer to drive cutting-edge decentralized artificial intelligence and machine learning projects. The ideal candidate will play a key role in developing innovative solutions that leverage advanced distributed computing techniques to solve complex problems in AI and ML.
Key Responsibilities
  • Design and implement large-scale model training using distributed deep learning frameworks such as PyTorch, TensorFlow, and Ray.
  • Manage and optimize model training and inference processes to ensure high performance and efficiency.
  • Containerize deep learning applications using Docker and orchestrate them using Kubernetes and Kubeflow.
  • Deploy and manage deep learning workloads on major cloud platforms including AWS, Google Cloud, and Azure.
  • Apply model compression and inference acceleration techniques to optimize performance.
  • Implement stream and batch data inference techniques, including real-time processing.
  • Collaborate with cross-functional teams to develop and execute technical strategies for distributed computing and deep learning solutions.
Job Requirements
  • Extensive experience in deep learning frameworks (PyTorch, TensorFlow, etc.) and model training/optimization.
  • Strong expertise in containerization (Docker) and orchestration techniques (Kubernetes, Kubeflow).
  • Proven experience with cloud computing platforms (AWS, Google Cloud, Azure).
  • Experience in CUDA programming and multi-GPU communication optimization (preferred).
  • Knowledge of stream and batch data processing techniques.
  • Ability to work collaboratively in a team environment and contribute to technical strategy development.
  • Strong problem-solving skills and ability to work on cutting-edge AI/ML projects.
Preferred Qualifications
  • Experience with Ray or other distributed computing frameworks.
  • Background in decentralized AI/ML systems.
  • Publications or contributions to open-source projects in relevant fields.
MyJob.one - Remote work. Real impact
