
Uploaded by

RustEd

Reading in Larger Datasets with read.table
With much larger datasets, doing the following things will make your life easier and will prevent R from choking.
Read the help page for read.table, which contains many hints.
Make a rough calculation of the memory required to store your dataset. If the dataset is larger than the amount of RAM on your computer, you can probably stop right here.
Set comment.char = "" if there are no commented lines in your file.


Use the colClasses argument. Specifying this option instead of using the default can make read.table run MUCH faster, often twice as fast, because R no longer has to work out the type of each column itself. To use this option, you have to know the class of each column in your data frame. If all of the columns are numeric, for example, then you can just set colClasses = "numeric". A quick and dirty way to figure out the classes of each column is the following:

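# Read only the first 100 rows to infer each column's class cheaply.
initial <- read.table("datatable.txt", nrows = 100)
classes <- sapply(initial, class)

# Re-read the whole file, passing the inferred classes.
tabAll <- read.table("datatable.txt", colClasses = classes)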

Set nrows. This doesn't make R run faster, but it helps with memory usage. A mild overestimate is okay. You can use the Unix tool wc to count the number of lines in a file.
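The settings above can be combined in one call. The following is a minimal sketch, not a definitive recipe: it writes a small throwaway file with tempfile() so that it runs on its own (the file and the names tmp, n_lines, and big are made up for illustration), counts lines from within R as a stand-in for wc -l, and then passes colClasses, nrows, and comment.char to read.table.

```r
# Create a small whitespace-delimited file so the example is self-contained.
# (Hypothetical stand-in for a large file such as "datatable.txt".)
tmp <- tempfile(fileext = ".txt")
set.seed(1)
write.table(data.frame(x = rnorm(1000), y = rnorm(1000), z = rnorm(1000)),
            file = tmp, row.names = FALSE, quote = FALSE)

# Count lines from within R. This reads the file as text, so for a truly
# large file running `wc -l` in a shell is the cheaper option.
n_lines <- length(readLines(tmp))

# Infer column classes from the first 100 rows only.
initial <- read.table(tmp, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Read the full file with known classes, a row-count bound, and
# comment scanning switched off.
big <- read.table(tmp, header = TRUE,
                  colClasses = classes,
                  nrows = n_lines,       # mild overestimate: counts the header too
                  comment.char = "")
dim(big)
```

One caveat of inferring classes from a sample: a column that looks like integer in the first 100 rows may contain decimals further down, in which case the full read will stop with an error and the class for that column needs to be set by hand.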


Know Thy System

In general, when using R with larger datasets, it's useful to know a few things about your system.
How much memory is available?
What other applications are in use?
Are there other users logged into the same system?
What operating system is it running?
Is the OS 32- or 64-bit?
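Some of these questions can be answered from inside R. The sketch below uses only base R and reports the operating system and whether the running R build is 32- or 64-bit (which is what limits how much memory R can address, and may differ from the OS itself). Available memory, other applications, and other logged-in users have no portable base-R query, so those are left to operating-system tools.

```r
# Operating system name and release, as reported by the OS.
os <- Sys.info()[c("sysname", "release")]

# Pointer size in bytes: 8 on a 64-bit build of R, 4 on a 32-bit build.
bits <- .Machine$sizeof.pointer * 8

# Architecture string that this copy of R was built for.
arch <- R.version$arch

cat("OS:", os[["sysname"]], os[["release"]], "\n")
cat("R build:", bits, "bit,", arch, "\n")
```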


Calculating Memory Requirements

I have a data frame with 1,500,000 rows and 120 columns, all of which are numeric data. Roughly,
how much memory is required to store this data frame?
1,500,000 × 120 × 8 bytes/numeric
= 1,440,000,000 bytes
= 1,440,000,000 / 2^20 bytes/MB
= 1,373.29 MB
= 1.34 GB
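
The same arithmetic can be written as a small helper. This is a rough sketch: estimate_numeric_mb is a hypothetical name introduced here, not a base R function, and it assumes every column is numeric (8 bytes per value) while ignoring the data frame's own small overhead. The last lines compare the estimate with object.size() on a much smaller data frame as a sanity check.

```r
# Rough memory estimate, in MB, for an all-numeric data frame.
# estimate_numeric_mb is a hypothetical helper, not part of base R.
estimate_numeric_mb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 2^20
}

mb <- estimate_numeric_mb(1500000, 120)
gb <- mb / 2^10
round(mb, 2)  # 1373.29
round(gb, 2)  # 1.34

# Sanity check on a small all-numeric data frame: object.size() reports the
# bytes actually used, which sits slightly above the raw estimate because
# of per-column and attribute overhead.
small <- as.data.frame(matrix(0, nrow = 10000, ncol = 12))
actual_mb <- as.numeric(object.size(small)) / 2^20
estimated_mb <- estimate_numeric_mb(10000, 12)
c(actual = actual_mb, estimated = estimated_mb)
```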
