R import export manual

There are packages to allow functionality developed in languages such as Java, Perl and Python to be directly integrated with R code, making the use of facilities in these languages even more appropriate. It is also worth remembering that R, like S, comes from the Unix tradition of small re-usable tools, and it can be rewarding to use tools such as awk and perl to manipulate data before import or after export. The traditional Unix tools are now much more widely available, including for Windows.

Since this manual was first written, the number and scope of R packages has increased a hundredfold. For specialist data formats it is worth searching to see if a suitable package already exists.

The easiest form of data to import into R is a simple text file, and this will often be acceptable for problems of small or medium scale.

The primary function to import from a text file is scan, and this underlies most of the more convenient functions discussed in Spreadsheet-like data. Often the simplest thing to do is to use the originating application to export the data as a text file, and statistical consultants will have copies of the most common applications on their computers for that purpose. However, this is not always possible, and Importing from other statistical systems discusses what facilities are available to access such files directly from R.
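As a minimal sketch of the relationship between scan and the more convenient functions built on it, the following writes a small whitespace-delimited file and reads it both ways. The file contents and the column names (id, height, weight) are made up for illustration.

```r
# Create a small text file to import (illustrative data only).
tmp <- tempfile(fileext = ".txt")
writeLines(c("id height weight",
             "1 1.62 58.5",
             "2 1.80 77.0"), tmp)

# scan(): the low-level reader. 'what' is a template list giving the
# name and type of each field; skip = 1 jumps over the header line.
fields <- scan(tmp, what = list(id = 0L, height = 0, weight = 0),
               skip = 1, quiet = TRUE)

# read.table(): the convenient wrapper, returning a data frame.
df <- read.table(tmp, header = TRUE)

unlink(tmp)
```

Here fields is a plain list of vectors, whereas df is a data frame with the header line used for column names; for most spreadsheet-like data the second form is the one to reach for.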

For Excel spreadsheets, the available methods are summarized in Reading Excel spreadsheets. In a few cases, data have been stored in a binary form for compactness and speed of access. One application of this that we have seen several times is imaging data, which is normally stored as a stream of bytes as represented in memory, possibly preceded by a header. Such data formats are discussed in Binary files and Binary connections.

For much larger databases it is common to handle the data using a database management system (DBMS). Importing data via network connections is discussed in Network interfaces.

Unless the file to be imported from is entirely in ASCII, it is usually necessary to know how it was encoded. For text files, a good way to find out something about their structure is the file command-line tool (for Windows, included in Rtools).

This reports a short description of the file. It is not possible to automatically detect with certainty which 8-bit encoding is in use (although guesses may be possible, and file may offer one), so you may simply have to ask the originator for some clues. We have too often been reduced to looking at the file with the command-line utility od or a hex editor to work out its encoding.

In brief, leave the connection in the state you found it in.
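On the encoding question discussed above: once the encoding of a text file is known, one way to read it is to declare that encoding on the connection. The sketch below is only an illustration. It writes a four-byte Latin-1 file itself so that the encoding is known in advance, then reads it back with file(..., encoding = "latin1"), which asks R to re-encode the input to the native encoding of the session.

```r
# Build a Latin-1 encoded file so that the encoding is known in advance.
tmp <- tempfile(fileext = ".txt")
txt <- "caf\u00e9"   # a UTF-8 string ending in e-acute
latin1_bytes <- iconv(txt, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]]

con <- file(tmp, open = "wb")
writeBin(latin1_bytes, con)
close(con)

# In Latin-1 every character here is a single byte.
n_bytes <- file.size(tmp)

# Read it back, declaring the encoding on the connection.
con <- file(tmp, encoding = "latin1")
x <- readLines(con, warn = FALSE)   # warn = FALSE: no final newline was written
close(con)

unlink(tmp)
```

Whether x displays correctly depends on the locale of the R session; in a UTF-8 locale it should print as the original word.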

There are generic functions open and close with methods to explicitly open and close connections. Files compressed via the algorithm used by gzip can be used as connections created by the function gzfile, whereas files compressed by bzip2 can be used via bzfile.
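A short sketch of explicit opening and closing, using a gzip-compressed temporary file (the file name and contents are illustrative):

```r
# Write three lines to a gzip-compressed file through a gzfile connection.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, open = "wt")
writeLines(c("alpha", "beta", "gamma"), con)
close(con)

# A connection can be created without being opened ...
con <- gzfile(tmp)
was_open <- isOpen(con)

# ... and then opened explicitly, here for reading in text mode.
open(con, "rt")
first <- readLines(con, n = 1)   # reads one line and keeps the position
rest  <- readLines(con)          # continues from where the last read stopped
close(con)

unlink(tmp)
```

Because the connection is opened explicitly, successive reads continue from the current position rather than starting again at the beginning of the file.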

Unix programmers are used to dealing with special files stdin, stdout and stderr. These exist as terminal connections in R. They may be normal files, but they might also refer to input from and output to a GUI console. Even with the standard Unix R interface, stdin refers to the lines submitted from readline rather than a file. The three terminal connections are always open, and cannot be opened or closed.

Note carefully the language used here: the connections cannot be re-directed, but output can be sent to other connections. Text connections are another source of input. They allow R character vectors to be read as if the lines were being read from a text file.

A text connection is created and opened by a call to textConnection, which copies the current contents of the character vector to an internal buffer at the time of creation. Text connections can also be used to capture R output to a character vector. The connection is opened by the call to textConnection, and at all times the complete lines output to the connection are available in the R object.
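Both uses of text connections can be sketched as follows. The data and the object name captured are illustrative; the output connection is created with the default local = FALSE, so the character vector is created in the user's workspace.

```r
# Input: treat a character vector as if it were the lines of a text file.
lines_in <- c("x y", "1 10", "2 20")
con <- textConnection(lines_in)
df <- read.table(con, header = TRUE)
close(con)

# Output: capture what is written to the connection in the vector 'captured'.
out <- textConnection("captured", open = "w")
writeLines(c("first line", "second line"), out)
cat("partial", file = out)       # no trailing newline, so not yet a complete line
n_before <- length(captured)     # only complete lines are visible so far
close(out)                       # flushes the incomplete line as a final element
n_after <- length(captured)
```

Comparing n_before and n_after shows that the incomplete line only appears in captured once the connection has been closed.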

Closing the connection writes any remaining output to a final element of the character vector.

Pipes are a special form of file that connects to another process, and pipe connections are created by the function pipe. Opening a pipe connection for writing (it makes no sense to append to a pipe) runs an OS command, and connects its standard input to whatever R then writes to that connection.

Conversely, opening a pipe connection for input runs an OS command and makes its standard output available for R input from that connection.

For convenience, file will also accept URLs as the file specification and call url.

Sockets can also be used as connections via function socketConnection on platforms which support Berkeley-like sockets (most Unix systems, Linux and Windows). Sockets can be written to or read from, and both client and server sockets can be used.
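As a sketch of a pipe connection opened for input, as described above, the following runs a trivial OS command and reads its standard output. It assumes an echo command is available in the shell that R invokes, which is the case on typical Unix and Windows systems.

```r
# Run an OS command and read its standard output through a pipe connection.
con <- pipe("echo hello from the shell")
out <- readLines(con)   # opens the pipe, reads all lines, then closes it again
close(con)              # destroy the connection object

out <- trimws(out)      # guard against stray trailing whitespace from the shell
```

The same pattern works with any command that writes text to standard output, for example a call to awk or perl that preprocesses a data file before R reads it.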

We have described functions cat, write, write.

Translations of manuals into other languages than English are available from the contributed documentation section (only a few translations are available).

Older versions of the manual can be found in the respective archives of the R sources. The HTML versions of the manuals are also part of most R installations accessible using function help.

The R Language Definition documents the language per se. That is, the objects that it works on, and the details of the expression evaluation process, which are useful to know when programming R functions.


