One really underappreciated aspect of R is that it's a lisp at heart. This enables the user (and enterprising package writer) to build really clean abstractions for the task at hand.
The tidyverse suite of Hadley Wickham is a great example of this, notably with the pipe operator %>% (similar to |> in F#), which is not part of the base language and yet could be implemented very easily. Julia's macros probably enable the same kind of implementation, but I don't see how one would achieve it as easily in Python, for example. Non-standard evaluation is another example of R's lispiness in action.
Also, consider how easy it is to walk R's S-expressions. An expression in R can only be one of four things: an atomic value, a name, a call or a pairlist. Wickham's Advanced R has a great intro on this.
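To make the "could be very easily implemented" point concrete, here is a minimal sketch of a pipe-like operator in plain R. The name %then% is made up for illustration, and this is a simplification: it only accepts a bare function on the right-hand side, whereas magrittr's real %>% also handles calls such as x %>% f(y).

```r
# Hypothetical operator name; not part of base R or magrittr.
`%then%` <- function(lhs, rhs) {
  # Apply the function on the right to the value on the left.
  rhs(lhs)
}

# Custom %...% operators associate left to right, so this is
# sum(sqrt(c(1, 4, 9))).
result <- c(1, 4, 9) %then% sqrt %then% sum
print(result)  # 6
```

Any function name wrapped in percent signs and defined with backticks becomes an infix operator, which is all the language support this needs.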
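As a rough illustration of how little code such a walk takes, the sketch below recursively collects every name used in a quoted expression. The helper collect_names is hypothetical (it is not from Advanced R), and it only covers the four cases listed above.

```r
# Hypothetical helper: gather all symbols that appear in an expression.
collect_names <- function(x) {
  if (is.name(x)) {
    as.character(x)
  } else if (is.call(x) || is.pairlist(x)) {
    # Calls and pairlists are containers: recurse into each element.
    unique(unlist(lapply(as.list(x), collect_names)))
  } else {
    # Atomic constants such as 2 carry no names.
    character(0)
  }
}

found <- collect_names(quote(f(x, y + 2 * x)))
print(found)  # "f" "x" "+" "y" "*"
```

Note that operators like + and * show up as names too, because in R they are ordinary function calls.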
I believe Wickham's amazing work with tidyverse (which really changes the way you code in R) is just the beginning of a rediscovery of R's inner lisp power, a kind of "R: the good parts" moment.
I have seen the HN crowd hate R much the way it hates JS. Without getting into those details, I'd like to list a few reasons why I like R:
- RStudio is simply great. I know Python has got Jupyter notebook but RStudio makes a good IDE for anyone (even beginners).
- Python is called a good beginner's language because beginners can start doing magic without getting frustrated. That argument fits R even better: for anyone who wants to begin with data analytics, R is a lot easier, with no struggle to figure out how to install a new package, load a new package, make a plot or anything of that kind. Hence the dropout rate would be lower.
- Tidyverse. Undeniably, it's a better universe than Marvel's cinematic universe. Not a single day in my job goes by without using dplyr.
- While I've cited the tidyverse in general, ggplot2 deserves its own mention: by embracing the grammar of graphics it has set a very nice standard for visualization libraries, where matplotlib (the go-to library in Python) doesn't offer much.
- Pandas is nothing but a library built on NumPy to offer R-like data wrangling functions, hence I consider dplyr and R's built-in data manipulation functions superior.
There is no doubt that Python has its own advantages, with a single library like scikit-learn and with web services, but R in no way deserves to be hated.
Even millennial companies have taken an interest in R: https://medium.com/airbnb-engineering/using-r-packages-and-e...
I forgot to mention R Shiny, which lets you create a web app simply (unlike in Python, where you start a Flask server and then write everything on top of it).
I know many people think otherwise, but I hate R for many reasons. Here are some of them:
- You can use '=' and '<-' to assign values to variables and both do the same, except in a few edge-cases where you now spend one week finding the error
- It confuses and mixes functional programming and OOP, not only per entity but also in how you use them. Want to get a value of entity X? Use x.getValue(). Want to get a value of entity Y? Use Y.getValue(y).
- The IDE crashes once an hour and does not detect file changes, which forces you to restart it manually.
- People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
Disclaimer: My big-data profs forced me to use R even for tasks where R should not be used.
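One of the edge cases behind the '=' versus '<-' complaint, as a small sketch: inside a function call, = only names an argument, while <- performs an assignment in the caller's environment as a side effect. The function double_it and the argument name n_value are made up for illustration.

```r
# Hypothetical function and argument name, used only for illustration.
double_it <- function(n_value) n_value * 2

r1 <- double_it(n_value = 21)    # `=` just names the argument
after_eq <- exists("n_value", inherits = FALSE)

r2 <- double_it(n_value <- 21)   # `<-` also creates n_value in the caller
after_arrow <- exists("n_value", inherits = FALSE)

print(c(r1, r2))                 # 42 42
print(c(after_eq, after_arrow))  # FALSE TRUE in a fresh session
```

Both calls return the same value, which is exactly why the stray variable can go unnoticed for a long time.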
A few weeks ago I had to do some data transformation (just a few thousand lines of data). Because I have some history with Excel, I started LibreOffice and wrote some formulas. After a few days I reached the point where LibreOffice required one and a half hours to recalculate the formulas.
That was the moment when I asked a friend of mine who has some R experience to help me with the basics (yes, the syntax is kinda weird at the beginning). After 4 hours of learning by doing we had the same result as what I had reached in a few days of work with LibreOffice, and it calculated everything in about 17 seconds. Yes, this time I knew exactly what I wanted, and R can do much more efficient transformations than you could ever do with a spreadsheet calculator. Nevertheless I was quite happy with the result.
As I normally code with vim and tmux, I use R just like a (bash) script with the following shebang:
That way I can throw it into a watch myScript.R while I write it in vim in a different tmux pane. That might have some disadvantages compared to RStudio (e.g. can't view graphics in a terminal), but as it fits very nicely into my normal workflow and performs very well, I am very happy with that solution.
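A script set up this way might look like the sketch below. The #!/usr/bin/env Rscript line is an assumption on my part (it is the usual way to make an R file directly executable after chmod +x), not necessarily the exact shebang used above, and the data is made up.

```r
#!/usr/bin/env Rscript
# Assumed shebang: asks the system to run this file with Rscript.
# R treats the line as a comment, so the file still works with source().

values <- c(3, 1, 2)          # placeholder data for illustration
total <- sum(values)
cat("total:", total, "\n")    # prints "total: 6"
```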
The book "R for Data Science" by Garrett Grolemund and Hadley Wickham (O'Reilly, 2017) provides a comprehensive introduction to modern R and a set of packages known as the tidyverse. Highly recommended.
To save others some of the head-banging sessions I've had with R:
R has an integer division operator, %/%. R gives you the ability to define your own infix operators, as long as you give them symbols that start and end with %. Here's the kicker--all such operators have a higher precedence than multiply and divide, which can lead to unexpected results.
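A quick sketch of the precedence trap. The %+% operator is defined here purely for illustration; it is not a base R operator.

```r
# %/% binds tighter than *, so the division happens first.
a <- 2 * 7 %/% 2      # parsed as 2 * (7 %/% 2)
b <- (2 * 7) %/% 2    # what you probably meant
print(c(a, b))        # 6 7

# The same applies to user-defined infix operators.
`%+%` <- function(x, y) x + y   # hypothetical operator
c1 <- 2 * 3 %+% 4     # parsed as 2 * (3 %+% 4)
c2 <- 2 * 3 + 4       # ordinary + has lower precedence than *
print(c(c1, c2))      # 14 10
```

When in doubt, parenthesize anything that mixes a %...% operator with arithmetic.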
R as a programming language can be frustrating. It has scalar values; you just can't store one in a variable (it becomes a vector of length one). Some functions and operators will work with vectors of arbitrary length... but some require a vector of length one.
(Speaking of which, binary operations on vectors are done element by element, e.g. by adding corresponding elements, BUT if one operand runs out first, R starts picking its elements off from the beginning again, with a warning if the length of the longer one isn't a multiple of the length of the shorter one. This may be surprising.)
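The recycling behaviour, in a small sketch. The second half captures the warning with a handler only so that the example can record that it happened.

```r
# Lengths 6 and 2: the shorter vector is silently reused three times.
even <- c(1, 2, 3, 4, 5, 6) + c(10, 20)
print(even)    # 11 22 13 24 15 26

# Lengths 3 and 2: still recycled, but R emits a warning.
warned <- FALSE
uneven <- withCallingHandlers(
  c(1, 2, 3) + c(10, 20),
  warning = function(w) {
    warned <<- TRUE                 # remember that a warning was raised
    invokeRestart("muffleWarning")  # keep the output quiet
  }
)
print(uneven)  # 11 22 13
print(warned)  # TRUE
```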
The wonky list notation takes time to get used to: foo[i] gives you a sublist; chances are you want foo[[i]].
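The difference between single and double brackets, in a small sketch with a made-up list:

```r
foo <- list(a = 1:3, b = "text")

sub <- foo[1]      # single brackets: still a list, of length 1
elem <- foo[[1]]   # double brackets: the element itself

print(class(sub))   # "list"
print(class(elem))  # "integer"

# sum(sub) would fail because sub is a list; the element works fine.
print(sum(elem))    # 6
```

foo$a and foo[["a"]] are the name-based equivalents of foo[[1]] here.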
Deciding which of the *apply() functions you want can be a pain. What passes for lambda expressions in R is clunky.
m:n gives you a vector of m, m + 1, ..., n... unless m > n, in which case it assumes you want m, m - 1, ..., n, so 1:0 won't give you an empty vector. This makes for clumsy special case code.
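A sketch of the problem and of the usual workaround, seq_len(), which does return an empty vector for zero:

```r
n <- 0

down <- 1:n          # counts down instead of being empty
print(down)          # 1 0

empty <- seq_len(n)  # integer(0)
print(length(empty)) # 0

# A loop over 1:n runs twice here; a loop over seq_len(n) runs zero times.
runs_colon <- 0
for (i in 1:n) runs_colon <- runs_colon + 1
runs_seq <- 0
for (i in seq_len(n)) runs_seq <- runs_seq + 1
print(c(runs_colon, runs_seq))  # 2 0
```

seq_along(x) plays the same role when you are iterating over the indices of an existing vector.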
Man I dislike R for its syntax. It does a terrible disservice to people who start coding in R and then think that they "know programming" while they have missed most of the basic programming paradigms any "normal" programming language has.
I think R shares a lot of ideology with PHP, and, well, everyone has their own opinion about PHP.
Also, I found the tutorial seriously lacking. I mean, no data.frames, matrices, vectors, tables or factors? How to iterate over a data.frame might be the biggest thing a beginner needs to know before shooting themselves in the head. apply, lapply, sapply or vapply: which one do I need? Well, IMO lapply is the best one to start with, as it's the basis of the others. sapply is almost the same, but it simplifies the result into a vector or matrix where it can.
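A quick side-by-side of the apply family on made-up data may help with the "which one do I need?" question. This is only a sketch of the most common cases, not a full tour.

```r
nums <- list(a = 1:3, b = 4:6)

as_list <- lapply(nums, mean)               # always returns a list
as_vec <- sapply(nums, mean)                # simplifies to a named vector
checked <- vapply(nums, mean, numeric(1))   # like sapply, but the result
                                            # type is declared up front

print(as_vec)    # a = 2, b = 5

# apply() is the odd one out: it works over the margins of a matrix.
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)                # 1 = rows, 2 = columns
print(row_sums)  # 9 12

# An anonymous function ("lambda") computing the range width of each element.
widths <- sapply(nums, function(v) max(v) - min(v))
print(widths)    # a = 2, b = 2
```

vapply is the safer choice inside functions, because it fails loudly when a result does not have the declared shape.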
I'll probably get downvoted for this, but let me tell you - Please don't use R in production. Please don't use R for any serious work.
Over the years, I've come to appreciate the fact that languages are just tools. You simply use the right tool for the job. If you let your personal bias, love or hate, get in the way, it will cause you a lot of pain in the long run. By the same token, R is one of the most fucked up languages to work with if you use it simply because you assume it's good for all analytics-related projects. It's not.
In one of my previous companies, we had a hipster, always used everything that's on trend. Against all advice, he decided to use R for many of our internal and client facing projects.
For what would have taken a week if Rails were used, he'd write everything in R Shiny. Yes, he used a statistical programming language to write a web application and serve APIs(!). Performance was terrible. There were a lot of breakdowns. Development dragged on, and even his own team members lost morale. I unfortunately had the ill luck of having to maintain some of his codebases, and those days were the worst of my life. Worse yet, he didn't have a formal software engineering background, so he loved the idea that you are able to code everything inside of this black box called RStudio. Fuck tests; there were no tests written because he didn't understand the importance of tests. The projects he worked on lasted for nearly 1.5 years without completion. Almost every project had an instance on the cloud running an R server, and it also cost a LOT simply because it was eating a lot of memory. Even our Ruby projects didn't consume as much.
Eventually most of the projects failed and we lost a lot of customers. Many team members quit. All because of the single mistake of choosing a language that's not right for the job. Eventually, one of our competitors came up with a working prototype in production using Python and Flask, with much better analytic capability at scale, in less than 3 months. Python can do a LOT that R can do and cannot do, and the code is much, much easier to read.
For example, string concatenation:
"hello" + "world"
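For comparison, a small sketch of what the same thing looks like on the R side: + is not defined for character vectors in base R, so concatenation goes through paste0() or paste().

```r
joined <- paste0("hello", "world")   # no separator
spaced <- paste("hello", "world")    # default separator is a space
print(joined)   # "helloworld"
print(spaced)   # "hello world"

# Using + on strings raises an error; try() captures it for inspection.
attempt <- try("hello" + "world", silent = TRUE)
plus_fails <- inherits(attempt, "try-error")
print(plus_fails)   # TRUE
```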
If you're really interested in data science and/or analytics, I sincerely urge you to start with Python and Pandas together rather than R. It is much, much more performant, easier to reason about, and much, much easier to maintain and scale. Please consider this as heartfelt advice based on my mistakes rather than a rant. Thank you.
I have started really enjoying R (with tidyverse) because it allows me to present complicated topics in a very simple manner. I can easily embed short R snippets and LaTeX equations in an Emacs Org mode document, and then export it as a very nice-looking easy-to-read HTML or PDF document with basically no effort other than coming up with the text itself.
It is incredibly liberating.
As the other comments on this submission imply, if you’re learning R from scratch, start with tidyverse.
You can use base R, but when people talk about how much they hate R, it’s usually because of base R, not tools like dplyr/ggplot2. (I had learned R and used it in college, and nearly quit R entirely until dplyr was released)
And over the last summer, I started using forcats/lubridate, and I am kicking myself for wasting my time not using them sooner and using ugly hacks for the appropriate functionality instead.
> For someone like me, who has only had some programming experience in Python, the syntax of R feels alienating initially. However, I believe it’s just a matter of time before adapting to the unique logicality of a new language.
I preferred R to Python right from the start. However, R is anything but logical, and its syntax is the least of its problems.
> And indeed, the grammar of R flows more naturally to me after having to practice for a while, and I began to grasp its kind of remarkable beauty, that has captivated the heart of countless statisticians throughout the years.
Wow, statisticians care about beauty? This is a shocking scientific discovery! (In the social sciences, but don't let this detract from your achievement.) What data do you use to support your theory?
R can be an annoying programming language, but for some reason I've found it easier to use for prototyping than even Python. I think it's because I can sloppily copy and paste between notepad and repl without much issue, whereas in Python I have to be concerned about the whitespace and things are a bit more verbose. I also get more out of the graphing capability of R, but that's probably because I don't understand Python's graphing well enough. Be that as it may, R just seems to have what I need to get things done as sloppily as I need. My workflow tends to be a combination of Python or Java spitting out numbers, and then using R to analyze and graph those numbers, all glued together with Bash scripts.
I use R (or want to use it) whenever I find myself using Excel or Google Sheets. If I were more fluent in R I would use it far more often. I found that using it instead of a standard spreadsheet was much more robust. Spreadsheets have their role; however, R is an amazing tool to have in your programming toolset.
It's clear there are a lot of strong opinions about R!
One kind of obscure problem I run into is R's embrace of a global namespace. Package developers sometimes assume people are using this namespace, and access it via the globalenv() function. This means that to use the package anywhere else, you basically have to patch their code.
(In contrast, I don't even think about problems like this occurring in Python packages. Worst case scenario, I can just use a subprocess.)
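A sketch of the pattern being described, with made-up function names: the first helper hard-codes the global environment, the second takes the target environment as an argument, so callers are not forced to use the global namespace.

```r
# Hypothetical package-style helper that always writes to the global env.
store_global <- function(value) {
  assign("last_result", value, envir = globalenv())
}

# Same helper, but the caller chooses where the value goes.
store_in <- function(value, envir = parent.frame()) {
  assign("last_result", value, envir = envir)
}

my_env <- new.env()
store_in(42, envir = my_env)
print(get("last_result", envir = my_env))   # 42

store_global(7)
leaked <- exists("last_result", envir = globalenv(), inherits = FALSE)
print(leaked)   # TRUE: the global namespace was modified
```

With the second form, the same code can run inside another package or a test without touching the user's workspace.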
R is great, haters gonna hate, but when you want to prototype a model, nothing flows like R + tidyverse + RStudio.
YAFL, yet another “fine” language.
The biggest problem with R: it is too slow.
R is free software; see the R site above for the terms of use. It runs on a wide variety of platforms, including UNIX, Windows and MacOS.
There's no reason to use R unless you are unable to learn.