Why is R such a popular data science tool?

I recommend that you learn R as your first “data science programming language”. There are exceptions (for example, if a specific project calls for something else), but I think R is the best choice when you are starting out.

Here’s why:

R is becoming the “lingua franca” of data science

R is becoming the lingua franca of data science. This does not mean that it is the only language, or that it is the best tool for every job. It is, however, the most widely used, and its popularity is increasing.

As I have already noted, O’Reilly Media conducted a survey in 2014 to understand the tools used by data scientists. They found that R is the most popular programming language (if you do not count SQL as a “proper” programming language).

More broadly, there are other rankings that measure the popularity of programming languages in general (not only among data scientists). For example, Redmonk measures the popularity of programming languages by examining discussion (on Stack Overflow) and usage (on GitHub). In their latest ranking, R placed 13th, the highest of any statistical programming language. Redmonk also noted that R has grown in popularity over time.

A similar ranking by TIOBE (which ranks programming languages by the number of searches on search engines) indicates a large year-over-year increase for R.

Companies using R

R is widely used at several of the best companies that hire data scientists. Google and Facebook, which I consider to be two of the best companies to work for in our modern economy, both have data scientists who use R.

As noted recently by Revolution Analytics, “R is also the tool of choice for data scientists at Microsoft, who apply machine learning to data from Bing, Azure, Office, and the Sales, Marketing and Finance departments.”

Beyond technology giants like Google, Facebook and Microsoft, R is used at a wide range of companies, including Bank of America, Ford, TechCrunch, Uber and Trulia.

R is popular in academia

R is not only a tool for industry. It is also very popular among academic scientists and researchers, a fact attested by a recent profile of the R programming language in the prestigious journal Nature.

R’s popularity in academia is important because it creates a pool of talent that feeds into industry.

Put differently, if the best and brightest are trained in R at university, this will increase the importance of R in industry. The supply of academics, doctoral students and researchers who leave the academic world for business will create its own demand for people with R skills.

In addition, as data science matures, data scientists in the business world will need to communicate more with academic scientists. We will have to borrow techniques and share ideas. As we instrument the planet and turn the world into a stream of data, the lines between academic science and business-oriented data science are likely to blur.

