R vs Python: What do Data Scientists prefer?

R and Python are the most common programming languages in the data science world, but what exactly is the difference between the two?

This remains a common topic of debate within the data science community. Both programming languages have their own strengths and limitations.

If you are a professional who is looking to start a career in this field, here are some key takeaways for both R and Python, along with trends that we are seeing in the Singapore and Hong Kong markets.

 

History of the two programming languages

R

R is a language and environment for statistical computing and graphics. According to the R Project, it is a GNU project, similar to the S language and environment that was developed at Bell Laboratories. Like S, R provides a wide variety of statistical and graphical techniques.

Its functionalities include but are not limited to:

  • Linear and nonlinear modelling
  • Classical statistical tests
  • Time-series analysis
  • Classification and clustering.

According to the R Project, R’s strengths include:

  • Free software that runs on a wide variety of UNIX platforms and similar systems (including Linux), as well as Windows and macOS
  • The ease with which well-designed, publication-quality plots can be produced
  • Sensible default design choices in graphics, with the user retaining full control
  • An integrated suite of facilities for data manipulation, calculation and graphical display
  • An effective data handling and storage facility

Overall, it is a simple and effective programming language that gives data scientists conditionals, loops, user-defined recursive functions, and input and output facilities.

Python

Python is a widely used, general-purpose, high-level programming language. Created by Guido van Rossum and now stewarded by the Python Software Foundation, it was designed with an emphasis on code readability, allowing programmers to express concepts in fewer lines of code than in languages such as Java, C++ and C. The objective is readable code and improved developer productivity.

Its functionalities include:

  • Developing and scripting code
  • Generation of code and software testing

Due to its elegance and simplicity, leading technology-driven organisations such as Dropbox, Google, Quora, Mozilla, Hewlett-Packard, Qualcomm, IBM and Cisco have adopted Python. Python is also cited as an influence on many other programming languages, such as Ruby, Cobra, Boo, CoffeeScript, ECMAScript, Groovy, Swift, Go, OCaml and Julia.

 

R vs Python: which is the preferred choice?

Dr Norm Matloff, Professor of Computer Science at the University of California, Davis, wrote a paper on the key differences between the two languages. He compared R and Python across the following domains to determine which programming language was the better choice:

Elegance

Winner: Python

While this is subjective, Python greatly reduces the use of parentheses and braces when coding, which makes the code sleeker, according to Matloff.
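
To illustrate the point about parentheses and braces, here is a minimal sketch. The function name describe_sample and the sample values are made up for this illustration and do not come from Matloff's paper.

```python
# Hypothetical example: blocks are delimited by indentation, so no braces are
# needed, and conditions need no surrounding parentheses.
def describe_sample(values):
    """Return a short label for a list of numbers."""
    if not values:
        return "empty"
    mean = sum(values) / len(values)
    if mean > 10:
        return "high"
    return "low"


print(describe_sample([12, 15, 9]))
```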

Machine Learning

Winner: Python (but not by much)

Python's massive growth in recent years is partially fuelled by the rise of machine learning and artificial intelligence (AI). Python offers a number of finely tuned libraries for image recognition.

In Matloff’s words, the Python libraries' power comes from setting certain image-smoothing operations.

Learning curve

Winner: R

According to Matloff, data scientists working with Python must learn a lot of material to get started, including NumPy, pandas and matplotlib. In contrast, matrix types and basic graphics are already built into base R, so novices can be running simple data analyses within minutes without installing anything extra.
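
As a rough illustration of that extra material, the sketch below shows two of the libraries a Python newcomer typically has to import before doing a basic analysis. The data values and variable names are invented for this example; matplotlib would be a third import for plotting and is left out here to keep the sketch short.

```python
import numpy as np
import pandas as pd

# NumPy supplies the matrix type that base Python lacks.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
product = matrix @ matrix  # matrix multiplication

# pandas supplies the data-frame type used for tabular analysis.
scores = pd.DataFrame(
    {"group": ["a", "a", "b", "b"], "score": [1.0, 3.0, 5.0, 7.0]}
)
group_means = scores.groupby("group")["score"].mean()

print(product)
print(group_means)
```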

Statistical correctness

Winner: R (by far)

Advocates for Python, namely professionals working within machine learning, sometimes have a poor understanding of the statistical issues involved in their analyses, Matloff argues. R, on the other hand, was written by statisticians, for statisticians. This suggests that subject matter experts working in R are better placed to ensure that the math behind an analysis is as accurate as possible.

Parallel computation

Winner: It’s a draw

Matloff suggests that the base versions of R and Python do not have strong support for multicore computation. In his view, neither R’s parallel package nor Python's multiprocessing package is a good workaround for this limitation. Nevertheless, external libraries supporting cluster computation are good in both languages, while Python has better interfaces to GPUs.
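
For readers who have not seen it, here is a minimal sketch of Python's standard-library multiprocessing package. The function names sum_of_squares and run_in_parallel are hypothetical and the workload is a toy one; this is an illustration of the package, not an example taken from Matloff's paper.

```python
from multiprocessing import Pool


def sum_of_squares(n):
    """CPU-bound toy task: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def run_in_parallel(inputs, workers=2):
    """Apply sum_of_squares to each input using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    # Each input is handled in a separate process, not a thread.
    print(run_in_parallel([10, 100, 1000]))
```

Because the workers are separate processes, inputs and results are serialised and copied between them, which adds overhead for large data.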

Libraries

Winner: Python

Python’s machine learning library, scikit-learn, is widely regarded as the ‘gold standard’. It provides a wide selection of supervised and unsupervised learning algorithms. Towards Data Science describes it as “by far the easiest and cleanest ML library”. Scikit-learn was created with a software engineering mindset. Its core API design revolves around being easy to use, yet powerful, while still maintaining flexibility for research endeavours. This robustness makes it well suited to end-to-end ML projects, from the research phase right down to production deployments.
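
To give a sense of that API design, here is a minimal sketch of a typical scikit-learn workflow. The choice of the bundled iris dataset, logistic regression, and the split parameters are assumptions made for this illustration rather than recommendations from the sources quoted above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small bundled dataset and hold out a quarter of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Supervised estimators share the same fit / predict / score pattern.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)

print(f"Held-out accuracy: {accuracy:.2f}")
```

Swapping in a different classifier usually means changing only the line that constructs the model, since the fit, predict and score calls stay the same.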

 

What are the trends in Singapore and Hong Kong markets?

According to Donnie Maclary, Principal Consultant at Huxley Singapore, around 90% of the jobs that he is filling in Data Science and Analytics are looking for candidates who are well versed in Python. This is because Python offers more flexibility than R.

If you are looking to grow your career in this field, it is therefore best to focus on becoming familiar with Python and its ecosystem of libraries. Other in-demand skills for data professionals include SQL, Spark, Hadoop, Java, Amazon Web Services (AWS), Scala and Kafka.

 

Huxley can help!

If you are a Data Science and Analytics professional who is looking to add top-tier talent to your team, please reach out to us via our contact form. Do keep your eyes peeled for more updates within this space on our LinkedIn page.

 


