Research Now dapaps

Scope for R data scientists

My wife has been a data analytics person throughout her career. Earlier she worked on Oracle data analytics. She lost her job this year and has taken a career break to re-skill in machine learning and data science. I asked her to learn Python, but she didn’t like the language. She likes R. She is also participating in Kaggle competitions on machine learning using R. However, I feel that R is not used much and that she will have a tough time securing employment because it’s too niche. I would like to know whether major companies are doing machine learning in R. My impression is that all ML projects are in Python or Java.

Zennely Apr 21, 2018

Following

Salesforce MauiWowie Apr 21, 2018

Python by a long shot. R is useful for analytics and stats, but for ML she needs Python at a minimum.

Facebook Blink-182 Apr 21, 2018

We use R heavily at FB. However, it’s unclear to me what you mean by “re-skill on machine learning”. Does she want to be an ML engineer? Or a research scientist?

Research Now dapaps OP Apr 22, 2018

She is learning data analytics using R, and she can also do predictions on data using R. For example, she participated in the “restaurant visitor forecasting” competition on Kaggle and secured a rank in the top 150. I think she used deep learning with R.

Facebook Blink-182 Apr 22, 2018

That doesn’t really answer my question.

Amazon Dominoz Apr 22, 2018

I have worked four years in BI/data science/data engineering. Here’s the brutal truth: most of the cool ML you do as a beginner is useless (building random forest classifiers to predict restaurant visitors, building logistic regression to predict how many people will watch the latest Marvel movie, etc.). This is simply because when you enter the industry there’s demand for three kinds of folks:

1. People who can use Python to just hack any large data sets, build dashboards, and do data warehousing and big data analysis (Spark, EMR, Hadoop). These are BI and data engineers.
2. Researchers who can build state-of-the-art deep learning/AI solutions. They will usually need a PhD. These are the true “data scientists” who can potentially earn millions. There are very few such people in the world, and they usually come out of the top CS PhD programs.
3. SDEs who work on deploying built ML solutions: ML engineers. They take the existing ML solutions that researchers propose and scale them.

It sounds like your wife will be the best fit for category 1. R will not help her one bit. She must be really, really good in Python and possibly Scala. ML engineers usually use Java and maybe Python.

Oracle of·Nada Apr 22, 2018

First, yes, she needs Python like air if she wants lots of options. It's more popular, end of story. R jobs exist, but to get them she'll more often need to sell herself as knowing everything, including Python. I'll elaborate.

She is looking for #1, but I disagree with the classification. #1 are not BI people and data engineers. BI people generally aren't trained for ML, or really for much past SQL. R is already a stretch. Python or Spark? Maybe the very, very basics, but not more. Next, the data engineers. These don't "hack any large datasets," but are more often the DB-oriented part of a model deployment pipeline. There is heavy, heavy overlap with #3, and usually little to do with intelligence/analytics, though large teams may also have some on standby to support analysts on advanced DB queries.

#1 is what in most places is called a business analyst, or a data scientist in a consulting role, depending on the extent to which daily tasks overlap into #2 and #3. A BA will know some ML and use it to support analyses. This includes the "useless" things like building random forest classifiers and logistic regression. Deep learning is usually beyond a BA's scope. A DS will do all that, plus explore new approaches (including but not limited to deep learning), read research, create pipelines that enable quickly repeating these analyses, and develop the team's infrastructure. The large overlap with production is usually caused by budgetary constraints requiring just one role to fill ad hoc needs, research, and deployment. Such DSs often enjoy autonomy and can use analysts as support. They also have considerable ability to determine their own stack, even if the company generally likes everyone to use the same thing.

R fits nicely here, but OP's wife would usually need to sell herself as a jack of all trades to land such a job, which means being competent with all the buzzwords. This includes Python, Spark, deep learning architectures, and so on.

But not all is lost. R is not Julia: there are a fair number of teams that have R as a major part of their stack. At larger companies, these often run on Microsoft's and Oracle's enterprise R offerings (probably why she went with R). There are also outliers like Facebook, already mentioned several times on this thread. It may be possible for her to find a job at these. Still, this limits the options, as Python is the standard, especially in SWE-focused Silicon Valley.

Amazon Pokebowl Apr 22, 2018

Dominoz's advice is generally correct; however, I wouldn’t separate the roles so finely. It’s more like a spectrum, as long as she is willing to learn. Start with #1, then #3, and then #2, or stop developing skills whenever comfortable. There are lots of opportunities in #1, but it does take some business sense to find them. You’d be surprised how many startups and new business units in large companies would benefit from something simple, like an email campaign that assigns content based on segmentation from a simple decision tree or logistic regression.

Square anon42 Apr 22, 2018

If she gets good at R, just let her be! Jeeeeez, at FB a lot of data scientists are heavy on R.