A Quick Introduction to R

(github.com)

171 points · _fnhr · 4y ago · 103 comments

69 comments · 18 top-level

_Wintermute · 4y ago · 17 in thread

My least favourite thing about R is its willingness to keep running when it should have errored some 50 lines earlier, happily spitting out a nonsense result, maybe with a warning, often not.

One of my previous jobs basically turned into being an in-house R consultant for a department in a pharmaceutical company. While investigating other issues, I caught so many bugs that meant the results people were reporting were completely wrong. A really common one is multiplying two vectors of unequal length, where broadcasting shouldn't be possible, and R just recycles the shorter vector. But hey, it ran without error and there's an output, so many researchers don't notice.
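A minimal sketch of the recycling behaviour described above, in base R. The vector names and the `safe_multiply` helper are invented for illustration.

```r
# Hypothetical dose and weight vectors; the names are illustrative only.
dose   <- c(1, 2, 3, 4)
weight <- c(10, 20)

# Length 4 times length 2: the shorter vector is recycled silently,
# because 4 is a multiple of 2. No warning, no error.
silent <- dose * weight

# Length 3 times length 2: R still returns a result, but signals a warning
# because the longer length is not a multiple of the shorter one.
warned <- tryCatch(
  c(1, 2, 3) * weight,
  warning = function(w) conditionMessage(w)
)

# A defensive helper that fails loudly instead of recycling.
safe_multiply <- function(x, y) {
  stopifnot(length(x) == length(y))
  x * y
}
```

The first product is the dangerous case: nothing in the output hints that `weight` was reused.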

Not to mention that handling errors is pretty miserable: if you want to catch a specific error you have to match the error string, and unfortunately the error message changes depending on the locale the R session is running in.
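For errors raised by your own code, one way to avoid matching on message text is to signal a condition with a custom class and let `tryCatch` select the handler by class. This is only a sketch: the class name `negative_input_error` and the function `safe_log` are invented here, and the approach does not help when the failing code signals nothing more specific than a plain `simpleError`.

```r
# Build an error condition that carries a custom class (hypothetical name).
negative_input_error <- function(msg) {
  structure(
    class = c("negative_input_error", "error", "condition"),
    list(message = msg, call = NULL)
  )
}

# Hypothetical function that signals the custom condition.
safe_log <- function(x) {
  if (x < 0) stop(negative_input_error("x must be non-negative"))
  log(x)
}

# The handler is chosen by class, so the message text (and therefore the
# session locale) plays no part in the match.
result <- tryCatch(
  safe_log(-1),
  negative_input_error = function(e) NA_real_
)
```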

haddr · 4y ago

This is why it is a good idea to set options(warn=2), which turns warnings into errors and makes problems easy to spot.
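A small sketch of what that setting does, assuming a fresh base R session. Saving and restoring the old option is a convention added here, not part of the advice above.

```r
# Remember the current setting so it can be restored afterwards.
old <- options(warn = 2)

# as.integer("x") normally returns NA with a warning. With warn = 2 the
# warning is promoted to an error, which the error handler catches here.
outcome <- tryCatch(
  as.integer("x"),
  error = function(e) "stopped"
)

options(old)  # restore the previous warning level
```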

melling · 4y ago

Any other tricks that are helpful for a beginner to know?

    Use Rstudio 
    Include tidyverse
    Turn warnings into errors

salamandersauce · 4y ago

Is this really unique to R, or do all programming languages have some foibles? For example, I recently spent an hour debugging C++ because I forgot that it loves to do integer division even when the result is going into an explicitly typed double. No error, no warning. You just have to know, and I highly doubt it's the desired behavior in most cases.

Most researchers are not programmers and don't care about programming. It's a tool to get the job done and I think you'd run into similar problems with other languages.

dtgriscom · 4y ago

If you divide two integers, you get an integer. You can then cast it to whatever you want. Or, if you want some other type, you need to cast it before the operation is done.

funesrequiem · 4y ago

Hi, would it be possible to contact you to ask some career questions related to the pharmaceutical industry and data science? I'm a biostatistician who uses R for everything and lately I've been thinking about doing a career change, but I'm a bit lost with all the available options.

_Wintermute · 4y ago

Sure, my email is in my profile.

fithisux · 4y ago

They should have done more for the software engineering side of things because people use it for this reason.

For repl driven development or academic code or exercises it is excellent.

stewbrew · 4y ago

Maybe these overly self-confident software engineers should just go RTFM.

clove · 4y ago

Sounds like a fun job. How'd you get that position? If you're retiring soon, I'll fill the position for the company.

sva_ · 4y ago

My least favorite thing so far was indices starting at 1. It seems blasphemous, in a way.

On a more serious note, I agree that R being too charitable in interpreting things (seemingly without warning) seems to be a problem. You'll have to do some debugging to make sure it actually does what you intended it to do. I've only dabbled in it a bit though.

kergonath · 4y ago

> My least favorite thing so far was indices starting at 1. It seems blasphemous, in a way.

In the real world we start counting from 1. CS people cannot stop complaining about it but it makes sense in languages used for mathematics and statistics. Zero-indexing is not very relevant if you don’t care about memory layout.

CornCobs · 4y ago

Honestly indices starting from 1 fits really nicely in most situations. 1-based indexing together with ranges and inclusive range-based indexing makes loops and subsetting code really readable IMO
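A short illustration of the point about 1-based indices and inclusive ranges, using a made-up character vector.

```r
x <- c("a", "b", "c", "d", "e")

first  <- x[1]          # indices start at 1
middle <- x[2:4]        # 2:4 includes both endpoints
last   <- x[length(x)]  # the last index equals the length

# seq_along(x) yields 1, 2, ..., length(x), so the loop index doubles
# as the position in the vector.
labels <- character(length(x))
for (i in seq_along(x)) {
  labels[i] <- paste0(i, ":", x[i])
}
```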

hashimotonomora · 4y ago

It’s pretty standard in math software such as Matlab and Octave.

ProjectArcturis · 4y ago

That's my favorite thing for R relative to Python. Far more intuitive to start at 1 rather than 0.

ekianjo · 4y ago

> It seems blasphemous, in a way.

It's more natural. You never count from zero with real life objects.

Mikeb85 · 4y ago

It's standard because indexes start at 1 in Fortran. Not sure why it's an issue, especially because you never need to use loops in R anyway.

epgui · 4y ago

Many languages are like that. I learned recently that indices also start at 1 in PostgreSQL / PLpgSQL.

j7ake · 4y ago · 8 in thread

Once you include the statistical packages, ggplot2, and dplyr, there is nothing that beats R in ease of prototyping for data exploration, model fits and sanity checks, and data visualisation of high dimensional data.

funesrequiem · 4y ago

I don't know if you've heard about it, because it is a relatively recent development, but the tidymodels ecosystem of packages (https://www.tidymodels.org) is also bridging the gap from data exploration/visualization to advanced modeling and machine learning in a way that feels really natural if you're used to the tidyverse way of doing things. It's developed by RStudio as the improved version of caret. I've been using it for differential gene expression analysis and it's a game changer in how much time it saves me.

folli · 4y ago

What about python and its countless packages? (Honest question, I 'grew up' using python in an academic setting, but haven't caught up with the latest developments)

j7ake · 4y ago

I use python for more data engineering, large scale processing, and PyTorch.

In my experience, the specific things R does well, python does it in a clunkier way.

Statistical software written by statisticians in academia, bioconductor, and quick prototyping is still much faster in R than in python.

My use case is to prototype in R, then move to python if things become more production rather than exploratory.

FranzFerdiNaN · 4y ago

Python can of course do the same, it's just so much clunkier to do it.

stewbrew · 4y ago

Python isn't really an advancement. But it's a more obvious choice for people with a background in software engineering. I have some hopes for Julia though.

jhbadger · 4y ago

As someone who used Ruby (yes, real Ruby, not Rails) before Python or R, I definitely think R is better for data science and Ruby better for everything else. Sadly, I predict a future where Python rules over everything.

Dyac · 4y ago

I've been using https://exploratory.io/ a lot, which is R in a really nice wrapper where you can do everything by point and click, by writing code by hand, or with a mix of both.

fithisux · 4y ago

Tidyverse!

nomilk · 4y ago · 7 in thread

1^NA is 1, and 2^NA is NA. Bizarre!

nosianu · 4y ago

It works and is IMO quite okay because NA is not the same as NaN (not a number). NA _does_ actually stand for a number, it's just that we don't know it.

Which is an interesting detail in R that should be mentioned anyway: the difference between NA and NaN. Anyone used to languages which only have NaN may confuse NA for that non-value.

https://www.r-bloggers.com/2012/08/difference-between-na-and...

https://cran.r-project.org/doc/manuals/r-release/R-lang.html... (lots of details how NA and NaN are handled)

Except - 1^NaN also is 1... now that IMO is wrong. But you can try the same in your browser's JS console and you will get 1 as a result too, so R is not the only one.

There are several NA values in R - NA_integer_, NA_real_, NA_complex_ and NA_character_, and the results will be different if you use some of them. NA_character_ and NA_complex_ will produce errors (different ones).
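The NA/NaN distinction and the powers discussed above can be checked directly in base R. This sketch only covers cases that are uncontroversial; the variable names are illustrative.

```r
# NA is a placeholder for a missing value; NaN is the result of an
# undefined floating-point operation such as 0 / 0.
na_is_na   <- is.na(NA)      # TRUE
nan_is_na  <- is.na(NaN)     # TRUE: is.na() also reports NaN as missing
na_is_nan  <- is.nan(NA)     # FALSE: NA is not NaN
nan_is_nan <- is.nan(0 / 0)  # TRUE

# The powers from the parent comments.
one_na  <- 1^NA   # 1
two_na  <- 2^NA   # NA
one_nan <- 1^NaN  # 1

# The typed NA constants differ in their storage type.
na_types <- c(typeof(NA), typeof(NA_integer_), typeof(NA_real_),
              typeof(NA_complex_), typeof(NA_character_))
```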

_fnhr (OP) · 4y ago

Agree about 1^NaN being strange.

Also R's cleverness with NA's is not so consistent. For example:

    median(c(1,1,1,NA))

Should return 1, since no matter what value is behind NA the median is still 1. But it returns NA.
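For reference, a sketch of the behaviour and the usual workaround. Note that `na.rm = TRUE` simply drops the missing value before computing; it is not the same as reasoning about what the unknown value could be.

```r
x <- c(1, 1, 1, NA)

with_na    <- median(x)                # NA
without_na <- median(x, na.rm = TRUE)  # 1, computed from c(1, 1, 1)
```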

fithisux · 4y ago

R has many cases of inconsistency. Like substitute

"its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged."

dim and dims :-)

R could do more here. I really like R.
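The quoted `substitute` rule can be seen in a few lines. This is a sketch; `capture` and `local_env` are names made up for the example.

```r
# Inside a function, substitute() returns the unevaluated argument expression.
capture <- function(arg) substitute(arg)
expr <- capture(a + b)  # the call a + b; a and b are never evaluated

# In an ordinary environment, a bound variable is replaced by its value.
local_env <- new.env()
assign("x", 10, envir = local_env)
in_local <- substitute(x, local_env)     # 10

# When env is the global environment, the symbol is left unchanged.
in_global <- substitute(x, globalenv())  # the symbol x
```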

nomilk · 4y ago

Interesting. I must admit I've never used substitute.

I tried dims but: Error in dims(iris) : could not find function "dims"

I do find the occasional oddity. I've noticed more very useful messages/warnings (particularly in common tidyverse functions) recently, so I think they help.

To be fair, these quirks are generally very uncommon in day to day use.

nojito · 4y ago

You may want to check out substitute2 which is much easier to use.

https://rdatatable.gitlab.io/data.table/reference/substitute...

sin7 · 4y ago

1^0 = 1, 1^1 = 1, 1^-1 = 1

2^0 = 1, 2^1 = 2, 2^-1 = 1/2

When 1 raised to any power equals 1, does the power matter at all? Even if it's unknown, the answer is 1.

nomilk · 4y ago

Good point. Although if NA were complex: 1^1+0i is 1+0i.

streamofdigits · 4y ago · 4 in thread

R is frequently compared with Python and Julia, which are general-purpose programming languages, but that is not really a proper comparison. Once you approach R as a domain-specific language/system, its various quirks and peculiarities become more palatable and explainable: they are, in a sense, the price to pay for tapping a large body of statistical analysis expertise that is not available elsewhere.

jstx1 · 4y ago

This is mental gymnastics. People have some job to do and are looking for an appropriate tool for it; sometimes that’s R and other times it isn’t. Who cares if you call it a DSL or a general purpose language. If I want to do something and the language makes it difficult, telling myself “oh but it’s a DSL” doesn’t get me any closer to solving my problem.

tmoertel · 4y ago

> If I want to do something and the language makes it difficult, telling myself “oh but it’s a DSL” doesn’t get me any closer to solving my problem.

Unless the thing that makes the language difficult is your expectations. In that case, offering you an alternative mental model that helps you make better decisions when using the language does get you closer to solving your problem.

ineedasername · 4y ago

>makes it more difficult

Yes, sure, as long as you recognize that as a very subjective determination.

From the statistician's non-programmer POV, the syntax of R and that of any other language are similarly opaque. Learning one versus another requires a similar investment of time. From their perspective, R does not make things more difficult, and the fact that it's more of the lingua franca within the field has its own benefits.

The people I see complaining about R are usually people who learned a different general-purpose language first and find that, when work requires data analysis, they much prefer the GPL for working through the non-analytical portions of their work. (Especially with Python, where pandas and numpy have made less specialized tasks much easier.)

stewbrew · 4y ago

It's important to keep this in mind though because R (or rather S) is primarily supposed to be used interactively. A prof of mine used to call the R REPL and then go on from there. He called an editor from the REPL, wrote source files from the REPL etc. Once you see someone working with R like that, you start seeing R as what it is.

As beautiful as it is to use interactively, it really takes a lot of practice to write reliable code that doesn't abort with some error now and then.

folli · 4y ago · 4 in thread

If you are getting started with R, I strongly suggest using some kind of IDE such as RStudio or Jupyter Notebook. It makes your life so much easier.

salamandersauce · 4y ago

ESS (Emacs Speaks Statistics) is also a great Emacs package for dealing with R.

nomilk · 4y ago

I don't have a source for this, but I think R as a language has one of the highest concentrations of users in a single IDE - and for good reason - something like 80% use the free (and amazing) RStudio IDE.

funesrequiem · 4y ago

And what would you say the remaining 20% use? Because I've never seen anyone using R outside of RStudio.

Mikeb85 · 4y ago

Honestly, R Studio is still the best IDE I've used for any language...

awild · 4y ago · 4 in thread

Maybe someone can help me with this: how do you integrate R as a CLI tool? I'm in a mostly R shop, but its integration with other tools is so confusing and/or bad that we usually just rewrite everything in Python for integration (which obviously is a huge waste of time). R packages etc. have me, as an outsider, confused, though they seem like the obvious choice?

dm319 · 4y ago

I love R, nothing better for data analysis, stats and plotting. However, if I was making software for other people to use, repeatedly, I would probably pick another language. The R language does have breaking changes, especially in commonly used packages.

fastaguy88 · 4y ago

You can always write: Rscript --vanilla your_r_script.r (and #!/usr/bin/env Rscript --vanilla )

Command line arguments are available as:

args <- commandArgs(trailingOnly=TRUE)

And there are three getopt()-like packages: getopt, optparse, and argparse.
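A minimal sketch of a script body built on `commandArgs`, without any of the option-parsing packages. The `parse_args` helper, the argument layout (an input path plus an optional threshold), and the default of 0.05 are all assumptions made for this example.

```r
# Hypothetical parser: expects an input path followed by an optional
# numeric threshold, e.g.
#   Rscript --vanilla your_r_script.r data.csv 0.5
parse_args <- function(args) {
  if (length(args) < 1) {
    stop("usage: Rscript your_r_script.r <input> [threshold]")
  }
  list(
    input     = args[1],
    threshold = if (length(args) >= 2) as.numeric(args[2]) else 0.05
  )
}

# Only the arguments after the script name are returned.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) {
  opts <- parse_args(args)
  cat("input:", opts$input, " threshold:", opts$threshold, "\n")
}
```

Keeping the parsing in a function that takes a plain character vector makes it easy to test without going through the shell.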

awild · 4y ago

We've done that but someone used relative includes/require statements and everything broke. It's exceptionally annoying.
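One common workaround for relative paths, sketched under the assumption that the script is launched with Rscript: Rscript passes the script path to R as a `--file=` argument, so the script's own directory can be recovered from the full argument vector. The helper name `script_dir` and the file `helpers.R` are made up.

```r
# Recover the directory of the running script from the argument vector.
# Falls back to the working directory when no --file= argument is present
# (for example in an interactive session).
script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(getwd())
  }
  dirname(sub("^--file=", "", file_arg[1]))
}

# Hypothetical usage: load a helper that sits next to the script,
# regardless of where the script was launched from.
# source(file.path(script_dir(), "helpers.R"))
```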

gompertz · 4y ago

Personally I use my programming language of choice to generate a ".r" script and then use the os exec system call of said language to call Rscript scriptname.r... If I'm understanding your question correctly.

zenlf · 4y ago · 3 in thread

On the contrary, I'm not a fan of R, I'm only a fan of Hadley Wickham and how the Tidyverse and ggplot2's API are designed.

They are just incredibly intuitive and easy to use. ggplot2 has fundamentally influenced how I think about plotting.

With my limited experience, I have never seen anything like it.

dm319 · 4y ago

If it wasn't for some of the lisp-like capabilities of R, you would never have tidyverse

EDIT: reference https://news.ycombinator.com/item?id=15869039

jhbadger · 4y ago

On the other hand, if it were lispiness that was the issue, surely xlispstat would be the winner. I love xlispstat. I used it in grad school in the 1990s and even maintain the github repository https://github.com/jhbadger/xlispstat . But the fact is xlispstat never appealed to the general statistical community and R did.

CornCobs · 4y ago

Seconded. I was taught ggplot by a great stats professor and the framing of visualizations as a language (gg actually stands for the grammar of graphics!) describing the relation between data and visual elements (layers in the graph) really made something click.

The amount of consideration and careful design behind tidyverse APIs (tidyr, ggplot, dplyr) really astounds me. I've never felt the need to actually memorize any of them but they come to me so naturally whenever I type "library(tidyverse)". Very few DSLs, libraries or APIs have ever made me feel this way, and certainly NOT Python and the mess that pandas/matplotlib/scikit is. Even more impressive that he managed to build such a consistent layer atop the hack that is base R.

Note that I've nothing against base R. It really appeals to the hacker in me and it certainly has a ton of cool features (a condition system, multiple function evaluation forms - in what other language are `if`, `while`, `repeat` and even parentheses `(` and the BLOCK STATEMENT `{` all implemented as functions?) but damn if it isn't a mess of corner cases and gotchas.
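The claim that control flow and brackets are functions is easy to try out: backticks let them be called by name. A small sketch with illustrative variable names:

```r
# Backticks let reserved words and operators be called like any function.
a     <- `if`(TRUE, "yes", "no")  # same as: if (TRUE) "yes" else "no"
b     <- `(`(1 + 2)               # same as: (1 + 2)
c_val <- `{`(1, 2, 3)             # a block returns its last expression
d     <- `+`(1, 2)                # same as: 1 + 2

# Because operators are functions, they can be passed to other functions.
e <- sapply(1:3, `+`, 10)

# All of these really are function objects.
all_functions <- all(sapply(list(`if`, `while`, `repeat`, `(`, `{`), is.function))
```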

aseerdbnarng · 4y ago · 2 in thread

This was probably written by a programmer, and that (along with the ‘why R is bad’ comments) shows how misunderstood R is by most programmers. It's like giving someone an introduction to the English language by showing them the alphabet and listing the punctuation. Yes, technically all true, but none of it will stick.

dm319 · 4y ago

Yes, there is a lot of R-bashing by people used to imperative languages designed for efficiency in repetitive tasks, not a functional language designed for numerical analysis. The complaints fall into these categories:

1. It's not zero-indexed (even though most numerical languages aren't)

2. Loops are slow (though if you're looping in R you're probably doing it wrong)

3. It's inconsistent

4. The syntax is weird.

But people don't talk about the somewhat beautiful functional ability of the language to wrangle data almost magically. Its basis in lisp allows for the tidyverse and data.table to exist[1], and ggplot is a formidable analysis/plotting platform that Python doesn't come close to.

[1] https://news.ycombinator.com/item?id=15869039
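Point 2 in the list above is the one that most often trips up newcomers, so here is a minimal sketch of the same computation written as a loop and in vectorised form. The data is made up.

```r
x <- c(1.5, 2.0, 3.5, 4.0)

# Explicit loop: valid R, but every iteration goes through the interpreter.
loop_total <- 0
for (value in x) {
  loop_total <- loop_total + value^2
}

# Vectorised form: x^2 and sum() each operate on the whole vector at once.
vector_total <- sum(x^2)
```

Both give the same number; the vectorised version is shorter and is the style most R code is written in.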

throwawayboise · 4y ago

I attended an intro to R workshop and found it very confusing. Being "functional" had nothing to do with it. Inconsistent, yes very much so in my opinion. It felt like a lot of little separately developed tools thrown together into a bundle. But I think mostly my difficulty with R is that I'm not a researcher or statistician. My exposure to and experience with those domains was an undergrad class or two many decades ago. If you don't deeply understand the problem space for which R is intended, you will be lost and confused trying to learn it.

dm319 · 4y ago · 1 in thread

I love R. Once you get it, there is something beautiful about its functional approach. I like using either tidyverse or data.table with pipes, split, map, reduce. The code looks like layers of a filter that data flows through.
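A rough base R sketch of the split/map/reduce style mentioned above, without tidyverse or data.table. The data frame and its column names are invented for the example.

```r
# Toy data; the column names are made up.
df <- data.frame(
  group = c("a", "b", "a", "b", "a"),
  value = c(1, 2, 3, 4, 5)
)

# split: one vector of values per group.
groups <- split(df$value, df$group)

# map: apply a function to every group.
means <- Map(mean, groups)

# filter: keep only the groups whose values sum to more than 6.
large <- Filter(function(v) sum(v) > 6, groups)

# reduce: fold the per-group sums into a single total.
total <- Reduce(`+`, Map(sum, groups))
```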

halhen · 4y ago

Agree! That, and (almost) everything is a vector... Which makes perfect sense for an analytics language.

Once I grokked that, R became my default language for anything analytics.

scottmcdot · 4y ago · 1 in thread

The difference between assigning variables via "=" versus "<-" is not mentioned. That would be confusing to someone learning R.

kkoncevicius · 4y ago

In practice the difference is almost non-existent, unless you start doing assignments within function calls, which is a popular style among some R stars, like Martin Machler [1]. But on the other hand some of them resolve to just always use "=" everywhere, including one of R's creators - Ross Ihaka [2].

Anyhow, explaining the difference at that part of the tutorial is not easy, so I chose to omit it for now. But might introduce it later, along with "<<-" and "->>", probably after describing closures.
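For readers wondering what the difference looks like in practice, here is a small sketch of `=` versus `<-` inside a function call, plus `<<-` in a closure. The function names `demo` and `make_counter` are invented for the example.

```r
demo <- function() {
  # Inside a call, `=` names the argument; no variable is created.
  mean(x = c(1, 2, 3))
  before <- exists("x", envir = environment(), inherits = FALSE)

  # `<-` assigns x in the current scope and then passes the value on.
  mean(x <- c(1, 2, 3))
  after <- exists("x", envir = environment(), inherits = FALSE)

  c(before = before, after = after)
}

# `<<-` assigns in an enclosing environment, which is what lets a closure
# keep state between calls.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}

counter <- make_counter()
counter()
second <- counter()
```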

[1]: https://github.com/cran/diptest/blob/master/R/dipTest.R#L37

[2]: https://www.youtube.com/watch?v=88TftllIjaY&t=2101s

legerdemain · 4y ago

Saying that R is a domain-specific language for statisticians, and thus its quirks are ignorable, is an incomplete answer. An R program is never just a series of calls to specialized library functions. Programs still need to ingest and emit data, manipulate data ad hoc, take conditional branches based on some runtime condition, and so on. And that glue code must still be written in R. I've had to write a lot of that glue code in R.

As someone who mostly writes not-R, my own R irritation comes from a handful of things:

- The dot character "." has no semantic meaning in identifiers. It's just a valid character for names. Looking at function names like "is.numeric" really messes with my reading comprehension.

- Ambiguously, "." also separates identifiers of objects in one of R's type systems from method calls. In some cases, `foo(bar)` and `bar.foo()` are equivalent. But only in some cases.

- Even better, a popular R library defines a function `.()` (i.e., its name is just a single period character), whose job is to expose a surprising quote/unquote expression evaluation semantics.

- This is not to mention the special meaning of "." in formula literals, which are fairly ubiquitous in R.

- Different authors use different naming conventions. Base prefers "as.numeric," Tidyverse might have "to_factor," another library might prefer camel case.

- Finally, R has a surprisingly extensive syntax, exercised by different libraries to different extents, and a correspondingly rich semantics, with "types," "modes," multiple class systems, "expression" objects, immediate and lazy evaluation, expression quoting and unquoting, metaprogramming, and homoiconicity. It is a zoo of a language.
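To make the first two bullets concrete, here is a sketch of S3 dispatch, where the dot in a function name sometimes separates a generic from a class and sometimes means nothing at all. The generic `area` and the class `circle` are made up for the example.

```r
# A generic and a method for a made-up class "circle".
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$r^2

unit <- structure(list(r = 1), class = "circle")

via_generic <- area(unit)         # dispatches to area.circle
via_method  <- area.circle(unit)  # the same function, called directly

# The dot is not always a dispatch separator: is.numeric is a plain
# function, and t.test is the t-test, not a method of t() for a class "test".
plain <- is.numeric(1)
```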

bitcharmer · 4y ago

In my domain q/kdb is used extensively. I don't have a decade to master obscure syntax/grammar just for one simple purpose of extracting some data set from a larger population and maybe do some basic statistics on it.

If you're like me, R is a godsend. You'll also love the tonnes of free packages. You can't go wrong with R if you appreciate simplicity and intuitiveness.

huhtenberg · 4y ago

> Recycling

https://github.com/karoliskoncevicius/tutorial_r_introductio...

Gotta say this is very elegant.

jack_squat · 4y ago

This is the R resource I recommend: https://www.amazon.com/Using-Introductory-Statistics-Chapman...

Takes a weekend to work through the book and you get a statistics refresher as a bonus.

Gatsky · 4y ago

Great overview. Slight shame it leaves out 'lapply' etc though (and says as much at the top). I just remember realising that you can have lists and run functions on them when I was learning R, and it seemed like a superpower.
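For anyone who has not met the apply family yet, a minimal sketch of running functions over a list. The list contents are made up.

```r
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# lapply always returns a list, one element per input element.
lengths_list <- lapply(samples, length)

# sapply simplifies the result to a vector when it can.
means_vec <- sapply(samples, mean)

# vapply does the same but checks each result against a declared type.
ranges <- vapply(samples, function(v) max(v) - min(v), numeric(1))
```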

upbeat_general · 4y ago

R reminds me a lot of matlab. Used mainly for compatibility/libraries/ecosystem but still a frustrating interpreted language at its core.

curiousgal · 4y ago

Best thing about R is Shiny

fithisux · 4y ago

bookdown has a wealth of online books for R.
