They are super fast and memory efficient, but tricky and error-prone to code; you have to spend lots of time mucking around with I/O, and they have zero visualization and data management support.

Excel has a far, far larger user base than any of these other options. I think it’s underrated by computer scientist sort of people. But it does massively break down at >10k or certainly >100k rows.

SAS people complain about poor graphing capabilities. Matlab visualization support is controversial. One view I’ve heard is, R’s visualizations are great for exploratory analysis, but you want something else for very high-quality graphs. Matlab’s interactive plots are super nice though. Matplotlib follows the Matlab model, which is fine, but is uglier than either IMO.

If your dataset can’t fit on a single hard drive and you need a cluster, none of the above will work. There are a few multi-machine data processing frameworks that are somewhat standard (e.g. Hadoop, MPI), but it’s an open question what the standard distributed data analysis framework will be. (Hive? Pig? Or quite possibly something else.)

(This was an interesting point at the R meetup. Porzak was talking about how going to MySQL gets around R’s in-memory limitations. But Itamar Rosenn and Bo Cowgill (Facebook and Google respectively) were talking about multi-machine datasets that require cluster computation that R doesn’t come close to touching, at least right now. It’s just a whole different ballgame with that large a dataset.)

But: is there ANY package besides SAS that can do analysis for datasets that don’t fit into memory? That is, ones that mostly have to stay on disk? And exactly how good are SAS’s capabilities here anyway?

I know dozens of people under 30 doing statistical stuff and only one knows SAS. At that R meetup last week, Jim Porzak asked the audience if there were any recent grad students who had learned R in school. Then he asked if SAS was even offered as an option. There were boatloads of SAS representatives at that conference and they sure didn’t seem to be on the leading edge.
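To make the point about using a database to get around in-memory limits a bit more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not what Porzak showed: it uses the standard-library `sqlite3` module as a stand-in for MySQL, and the `visits` table, its columns, and the helper names `mean_by_group` and `build_demo_db` are all hypothetical. The idea is that the aggregation runs inside the database, so only the small per-group summary ever comes back into the analysis process.

```python
import sqlite3


def mean_by_group(conn, table, group_col, value_col):
    """Compute per-group means inside the database and return only the summary.

    The table and column names are interpolated into the SQL text, so they
    must come from trusted code, never from user input.
    """
    query = (
        f"SELECT {group_col}, AVG({value_col}) "
        f"FROM {table} GROUP BY {group_col} ORDER BY {group_col}"
    )
    # Only one (group, mean) row per group crosses into Python memory.
    return dict(conn.execute(query).fetchall())


def build_demo_db():
    """Build a tiny demo table (hypothetical data, for illustration only)."""
    # ":memory:" keeps the demo self-contained; passing a file path instead
    # would leave the rows on disk, which is the case the post cares about.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (country TEXT, seconds REAL)")
    rows = [("US", 10.0), ("US", 30.0), ("DE", 5.0), ("DE", 15.0), ("DE", 25.0)]
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_by_group(conn, "visits", "country", "seconds"))
```

This only helps with the single-machine, on-disk case. It does nothing for the multi-machine datasets mentioned above, which need a cluster framework.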