M+E Technology Job Board

Data Infrastructure Engineer for Machine Learning

Apple

At Apple, we are constantly pushing the boundaries of technology to create products that enrich people’s lives. To achieve this, we rely on innovative and dedicated engineers who are passionate about solving complex problems and delivering high-quality solutions. We are looking for a Data Infrastructure Engineer for Large Scale Machine Learning to join our team and help us build data infrastructure that will enable us to extract insights and perform failure analysis on vast amounts of data and a wide variety of machine learning models.

Key Qualifications

Design, implement, and maintain data infrastructure using technologies such as Spark, Jupyter Notebooks, EMR, and AWS.
Collaborate with data scientists, machine learning engineers, software engineers, and product managers to understand their requirements and translate them into scalable, reliable, and efficient data pipelines, data processing workflows, and machine learning pipelines.
Build and maintain data ingestion pipelines from various sources including structured and unstructured data.
Design and implement data storage solutions that are scalable, secure, and cost-effective.
Develop data processing workflows that are optimized for performance and can handle large volumes of data.
Build and maintain pipelines that can evaluate large-scale machine learning models efficiently.
Write and maintain high-quality code using standard methodologies such as code reviews, unit testing, and continuous integration.
Stay up to date with the latest trends and technologies in data infrastructure, big data analytics, and data-centric machine learning, and apply them to improve the system.

Description

As a Data Infrastructure Engineer for Machine Learning, you will be responsible for designing, implementing, and maintaining data infrastructure using technologies such as Spark, Kubernetes, and EMR, among others. You will work closely with data scientists, machine learning engineers, and product managers to understand their requirements and translate them into scalable, reliable, and efficient data processing workflows and machine learning pipelines. You will work with a multitude of Machine Learning teams, and you will be challenged to design an infrastructure that can support their scale and wide variety of models. You will also be responsible for monitoring the performance of the system, optimizing it for cost and efficiency, and solving any issues that arise.

Education & Experience

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field.

Additional Requirements

At least 3-5 years of experience in building data infrastructure for large-scale machine learning using technologies such as Apache Spark, Kubernetes, and AWS services.
Strong proficiency in programming languages such as Python, Java, or Scala.
Experience with data storage solutions such as HDFS, S3, and NoSQL databases.
Familiarity with distributed computing frameworks such as Apache Spark.
Experience with machine learning frameworks such as TensorFlow, PyTorch, or Keras.
Experience with data and model visualization tools such as Jupyter Notebooks, Tableau, or Plotly is a plus.