Building blocks for ML infra

Yelp
capKirk

Jan 22, 2019 7 Comments

The building blocks and challenges of a distributed service are clear from papers such as Amazon's Dynamo and Google's Bigtable, and from several Apache projects.

I'm interested in the challenges of machine learning infrastructure. What is a good place to start? What are the main challenges? Are there any good papers or architectures for reference?

Thanks!

comments

  • Amazon
    vdcb40

    "Hidden Technical Debt in Machine Learning Systems" is a classic paper.

    Look at the papers and blog posts about the ML platforms that have sprung up: Uber’s Michelangelo, Twitter’s DeepBird, Apple’s Alchemist, and Google’s TFX.

    After that, you can look at the available ML platforms and toolkits to understand what today’s problems are and how they’re being addressed: Amazon SageMaker, Kubeflow, TensorRT, and Amazon Elastic Inference.

    ML infrastructure is a very broad domain, but these are good reference points for today’s infrastructure challenges.
    Jan 22, 2019 6
    • Thanks @vdcb40 for such a detailed explanation!
      Jan 24, 2019
    • Yelp
      capKirk

      OP
      Thanks for the detailed responses.
      So if I want to grow and build expertise in ML infra, should I be aware of the hardware aspects?
      Big data infra has abstracted most of that away and seems to work well on commodity hardware, so I never needed to delve deeper there.
      Jan 24, 2019