Skip to content

Conversation

@unutbu
Copy link
Contributor

@unutbu unutbu commented Sep 29, 2013

Motivation for this pull request came from this problem: http://stackoverflow.com/q/19039356/190597.

The problems:

  • pandas.rpy.common.load_data currently fails to load high-dimensional arrays such as the "Titanic" dataset. load_data prints an error to stdout and returns None.
  • Applying load_data to "iris3", "state.division" or "state.region" R datasets currently raise a TypeError.
  • Timeseries R data is currently converted to lists that lack a time series index.
  • Factor data is converted to a list of index numbers without any levels information.

This pull request:

  • Provides code which allows load_data to convert the "Titanic" dataset (and similar high-dimensional arrays) to "melted" DataFrames. The code does not use R's melt function, since that would introduce a new dependency.
  • load_data now returns a DataFrame for arrays like "iris3".
  • load_data converts R factors like "state.division" and "state.region" to lists which use the levels as values.
  • load_data converts ts (timeseries) R data, like "airmiles" to Series with an index showing the times. Note however I wasn't able to convert the index to a DatetimeIndex since the same code is used to convert numeric R data (like "euro") which does not have a time-related index.
  • moves already existant tests from rpy/common.py to tests/test_rpy.py
  • adds new unit tests which highlight the problems mentioned above, and tests for what I think is more desireable behavior as described above.
@jreback
Copy link
Contributor

jreback commented Sep 29, 2013

add rpy2 in ci/requirements

2.7, 3.3 maybe

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

i think need to add R in ci/install.sh as well

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

problem is that is a huge honking install on ubuntu

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

all this for r-base on ubuntu 12.04:

autoconf automake autotools-dev cdbs dh-translations dpatch fakeroot fontconfig fontconfig-config intltool libbz2-dev libcairo2 libdatrie1 libencode-locale-perl libfile-basedir-perl libfile-desktopentry-perl libfile-listing-perl libfile-mimeinfo-perl libfont-afm-perl libfontconfig1 libfontenc1 libgl1-mesa-dri libgl1-mesa-glx libglapi-mesa libhtml-form-perl libhtml-format-perl libhtml-parser-perl libhtml-tagset-perl libhtml-tree-perl libhttp-cookies-perl libhttp-daemon-perl libhttp-date-perl libhttp-message-perl libhttp-negotiate-perl libice6 libio-socket-inet6-perl libio-socket-ssl-perl libllvm3.0 liblwp-mediatypes-perl liblwp-protocol-https-perl libmailtools-perl libncurses5-dev libnet-http-perl libnet-ssleay-perl libpango1.0-0 libpaper-utils libpaper1 libpcre3-dev libpcrecpp0 libpixman-1-0 libsm6 libsocket6-perl libthai-data libthai0 libtiff4 liburi-perl libutempter0 libwww-perl libwww-robotrules-perl libx11-xcb1 libxaw7 libxcb-glx0 libxcb-render0 libxcb-shape0 libxcb-shm0 libxcomposite1 libxcursor1 libxdamage1 libxfixes3 libxft2 libxi6 libxinerama1 libxml-parser-perl libxmu6 libxpm4 libxrandr2 libxrender1 libxss1 libxt6 libxtst6 libxv1 libxxf86dga1 libxxf86vm1 m4 patchutils python-scour r-base-core r-base-dev r-base-html r-cran-boot r-cran-class r-cran-cluster r-cran-codetools r-cran-foreign r-cran-kernsmooth r-cran-lattice r-cran-mass r-cran-matrix r-cran-mgcv r-cran-nlme r-cran-nnet r-cran-rpart r-cran-spatial r-cran-survival r-doc-html r-recommended shared-mime-info tcl8.5 tk8.5 ttf-dejavu-core unzip x11-common x11-utils x11-xserver-utils xbitmaps xdg-utils xterm zip 
@jreback
Copy link
Contributor

jreback commented Sep 29, 2013

any way to have it skip all of the display stuff ?

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

someone more knowledgeable than me about apt-get may know the answer to that question. @yarikoptic can you grace us with your debian wisdom? is something like this possible?

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

It's simply not the case that all of those deps are required to install R. On Arch Linux, these are the dependencies:

blas lapack bzip2 libpng libjpeg libtiff ncurses pcre readline zlib perl gcc-libs libxt libxmu pango xz desktop-file-utils

@jtratner
Copy link
Contributor

why do you need to build R, can't you just use apt-get to get the package
without build-dep?

On Sun, Sep 29, 2013 at 12:12 AM, Phillip Cloud notifications@github.comwrote:

It's simply not the case that to build R this is required. On Arch Linux,
these are the following dependencies:

blas lapack bzip2 libpng libjpeg libtiff ncurses pcre readline zlib perl gcc-libs libxt libxmu pango xz desktop-file-utils


Reply to this email directly or view it on GitHubhttps://github.com//pull/5036#issuecomment-25314107
.

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

those ubuntu deps i listed above are from calling apt-get, shouldn't have said "build". meant "install"

@yarikoptic
Copy link
Contributor

well -- it feels that list a bit bloated due to recommends -- what if you do apt-get install r-base --no-install-recommend? that still would need x11-common xdg-utils though, but most probably that is unavoidable... I would recommend on your box to run sudo apt-get install debtree; debtree --no-recommends r-base | dot -Tsvg > /tmp/r-basedepends.svg; firefox /tmp/r-basedepends.svg which would give you a nice graph of dependencies so you could clearly see how they are pulled in.

@unutbu
Copy link
Contributor Author

unutbu commented Sep 29, 2013

Installing R on Travis seems like an unsolved problem for many people: http://yihui.name/en/2013/04/travis-ci-general-purpose/. Would it be helpful if I move the unit tests back into rpy/common.py so those with R installed could run the tests, but not have it part of the official test suite?

@jtratner
Copy link
Contributor

just add a raise nose.SkipTest("R not installed") at the top of the file
instead. (and, you know, check for that too :))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change to nose.SkipTest('R not installed') thxs....then squash to smaller num of commits

@unutbu unutbu closed this Sep 29, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants