Will custom processing units subvert Nvidia’s GPUs?

Microsoft fNEH83
Feb 8, 2018 16 Comments

The GPU market is buoyed by gaming, AI, and crypto. I am bearish on crypto. However, I am neutral/mildly bullish on gaming (VR/AR is the next wave of gaming), and I am extremely bullish on AI (computer vision, autonomous vehicles).

Assuming level 5 vehicular autonomy is reached first by big/unicorn tech (Waymo, Uber, Lyft, etc.), these companies seem to have the leverage to poach talent and vertically integrate. Recent attempts to subvert Nvidia’s dominance in the AI GPU market include Google’s Tensor Processing Unit (TPU).

My point is that the majority of deep learning efforts take place in the cloud (AWS, Azure, GCP). It seems Amazon, Microsoft, and Google are highly incentivized to develop their own processing units for AI. Upon doing so, they could potentially push Nvidia out of the cloud by undercutting compute costs using their own hardware.

Thoughts?

* Disclaimer: I own Nvidia shares.

Comments
  • Netflix CsMk18
    Good analysis. It depends on the adoption. If AI/deep learning sees wide adoption (that is, your average company is investing in deep learning), then Nvidia will benefit. If adoption is limited to a few tech companies and a few verticals, Nvidia won’t benefit much, because of vertical integration by tech companies like you mentioned.
    Feb 8, 2018 5
    • Northrop Grumman / Eng Kolmogorov
      True, smaller shops could still benefit from cheap options, especially if their use cases are much smaller in scope than, say, Google’s.
      Feb 8, 2018
    • Apple b00
      Can you think of any sample use cases? Why would smaller shops invest in deep learning?
      Feb 8, 2018
    • Northrop Grumman / Eng Kolmogorov
      Hmm, I guess less production and more just applying the techniques to some problems they’re interested in. Maybe the cloud is still more appropriate there.
      Feb 8, 2018
    • Netflix CsMk18
      All businesses can benefit from personalization, for example, and AI/machine learning can play a huge role in that.
      Feb 8, 2018
    • Microsoft fNEH83
      OP
      I think we’ve only begun to see the impending explosion of startups leveraging computer vision. This is kind of the point I’m making. If a startup uses CV, it could either leverage pre-trained models from a machine-learning-as-a-service offering, or train its own models on the cloud platform. And even with processing units half as performant as Nvidia’s GPUs, the tradeoff is acceptable: the bulk of deep learning compute goes into training, which isn’t latency-sensitive, not into prediction/classification. Rough numbers below.
      Feb 8, 2018
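      A back-of-envelope sketch of that training-vs-inference point. Every number here is hypothetical, picked only to show the scale of the gap:

      ```python
      # Hypothetical sizes for a mid-sized computer-vision model.
      forward_flops = 1e9        # FLOPs for one forward pass (assumed)
      examples      = 1_000_000  # training-set size (assumed)
      epochs        = 90         # training budget (assumed)
      backward_mult = 3          # forward + backward pass ~ 3x a forward pass

      training_flops  = forward_flops * backward_mult * examples * epochs
      inference_flops = forward_flops  # one prediction = one forward pass

      print(f"training run:   {training_flops:.1e} FLOPs")   # ~2.7e17
      print(f"one prediction: {inference_flops:.1e} FLOPs")  # 1.0e9
      print(f"ratio:          {training_flops / inference_flops:.1e}x")
      ```

      With numbers in that ballpark, a training run consumes hundreds of millions of times the compute of a single prediction, so halving chip throughput stretches training but barely touches serving.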
  • Amazon FreeHat
    But who would be best positioned to create that specialized hardware? I mean, GPUs are specialized for visual computation. I’m from the semiconductor industry, and it’s not trivial to just decide to become a large-scale, high-tech hardware manufacturer. As a technology, there’s tons of tribal knowledge that came from years and years of smart people optimizing lines that have unique problems. Just ask a semiconductor company that tries to spin up a new plant to make the same stuff. Almost impossible, hence Intel’s super-extreme culture around process and detail. When you’re dealing with flatness specs that have a standard deviation of 2 atoms across a surface a foot in diameter, everything matters. To think that, e.g., Google could just manufacture at scale is to underestimate the challenges.

    Also, semiconductor manufacturing is a cost-driven business with razor-thin margins, and these mega tech companies have no experience in that environment. The entire culture would need to be different, and in that case, what’s the advantage of vertical integration? I think it’s more likely that the cloud businesses will just have enormous buying power and strongly influence a variety of competitors, since having all your eggs in one basket is too risky.

    But to answer your question: I think it’s inevitable that ML-specific hardware replaces GPUs, but Nvidia is the best positioned to develop that hardware.
    Feb 8, 2018 3
    • Microsoft fNEH83
      OP
      Also, what you have said about big tech leveraging its position to negotiate prices makes a lot of sense. We have already seen examples of this with medical insurance companies and medical providers/pharmaceutical companies.
      Feb 8, 2018
    • Cisco workerbeee
      Brilliant points! The strategy is to differentiate their cloud from the others. No one cares about infrastructure per se. More companies will be interested in switching to your cloud if you help them compete in their field using the ML you provide.
      Feb 8, 2018
    • Intuit 1600club
      Amazon already has custom fab in their networking stack with silicon photonics. Is that in-house, or via a manufacturing partnership?
      Feb 8, 2018
  • Qualcomm / Eng zHIy06
    Even for average companies, a lot of them are starting to use cloud solutions from Google Cloud, Amazon AWS, IBM, and Microsoft Azure rather than building their own infra.
    I think these few powerful companies have massive influence over what gets used in their cloud solutions for ML/DL/AI, and they will definitely try to minimize their dependence on NVDA GPUs by building customized NPUs/TPUs.
    Feb 8, 2018 0
  • Northrop Grumman / Eng Kolmogorov
    Short answer: yes. A GPU is much more general than a TPU, which I think is just an ASIC, and a specialized solution will always trump a more general one (rough sketch of why below the replies). Didn’t AlphaGo Zero run on just 4 TPUs, versus the ~176 GPUs the original distributed AlphaGo used? Maybe there will be some kind of Moore’s law for AI chips now. Nvidia kinda lucked out that GPUs were efficient for this stuff.
    Feb 8, 2018 1
    • Microsoft fNEH83
      OP
      I agree but not from the specialization perspective. Just from the fact that a cloud provider could develop their own chips and offer a cheaper service.
      Feb 8, 2018
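    A minimal NumPy sketch of why a narrow matmul ASIC can still cover so much of deep learning: the heart of a neural-net layer is a matrix multiply, which is roughly the one operation a TPU’s systolic array hardwires. Shapes and values here are arbitrary:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    x = rng.standard_normal((32, 784))   # batch of 32 flattened inputs
    W = rng.standard_normal((784, 128))  # layer weights
    b = rng.standard_normal(128)         # layer bias

    # Dense layer forward pass: matmul + bias + ReLU.
    # Accelerate the matmul and you've accelerated most of the network.
    h = np.maximum(x @ W + b, 0.0)
    print(h.shape)  # (32, 128)
    ```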
  • Amazon Psychopath
    Depends on the market; making ASICs costs serious money, and you need huge demand to justify it.

    The good news is that it’s easy for Nvidia and AMD to do it if the market is there.
    Feb 9, 2018 0
  • Oracle / Eng ClouDev
    Isn’t your concern analogous to the one about smartphone CPUs?

    Millions of devices sold to date, and Snapdragon is still the leading chipset.

    Macs use Intel’s chipsets, etc.
    Feb 8, 2018 0
  • Microsoft fNEH83
    OP
    You’ve raised some pretty good points. I suspected something about the profit margins while posting. One thing to point out: I’m not necessarily saying that in-house solutions need to beat Nvidia on performance, just that a good-enough in-house solution could be price-competitive with Nvidia. Two weeks of model training instead of one seems acceptable; rough numbers below.
    Feb 8, 2018 0
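    A sketch of the break-even math behind that. All prices and throughput figures are hypothetical:

    ```python
    # OP's scenario: an in-house chip at half a GPU's throughput.
    gpu_price_per_hr    = 3.00    # assumed cloud price for a high-end GPU
    gpu_training_hrs    = 7 * 24  # the "1 week" GPU training run

    chip_relative_speed = 0.5     # in-house chip at half the throughput
    chip_price_per_hr   = 1.00    # provider prices its own silicon low

    chip_training_hrs = gpu_training_hrs / chip_relative_speed  # "2 weeks"

    gpu_cost  = gpu_price_per_hr  * gpu_training_hrs   # $504
    chip_cost = chip_price_per_hr * chip_training_hrs  # $336

    print(f"GPU:  {gpu_training_hrs:.0f} h -> ${gpu_cost:.0f}")
    print(f"chip: {chip_training_hrs:.0f} h -> ${chip_cost:.0f}")

    # The slower chip undercuts the GPU whenever its hourly price is
    # below chip_relative_speed * gpu_price_per_hr ($1.50 here).
    ```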
