• data.table basics
• data.table
✓ Rows
✓ Columns
✓ Groups
• https://www.rstudio.com/resources/cheatsheets/
Create a data.table
• Use data.table function
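As a minimal sketch of the point above, a data.table can be built column by column, much like data.frame(). The column names and values here are hypothetical and chosen only for illustration.

```r
library(data.table)

# Hypothetical toy data; column names are illustrative only
dt <- data.table(
  id    = 1:4,
  group = c("a", "b", "a", "b"),
  value = c(10, 20, 30, 40)
)

class(dt)  # a data.table also inherits from data.frame
```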
Create a data.table
• Convert a data.frame to a data.table
✓ Using as.data.table() or setDT()
✓ Check whether an object is a data.frame (or a data.table)
• is.data.frame() or is.data.table()
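The conversion and the checks can be sketched as follows, assuming a small hypothetical data.frame named df. Note that as.data.table() returns a new object, while setDT() converts the existing object in place.

```r
library(data.table)

df <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))  # hypothetical data

dt_copy <- as.data.table(df)      # returns a new data.table; df is unchanged
was_dt_before <- is.data.table(df)  # FALSE: df is still a plain data.frame

is.data.table(dt_copy)            # TRUE
is.data.frame(dt_copy)            # TRUE: a data.table is also a data.frame

setDT(df)                         # converts df itself, by reference
is.data.table(df)                 # TRUE
```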
Subset of rows
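A few common ways to subset rows are sketched below. Rows are selected in the first argument inside the brackets, and column names can be used directly without the dt$ prefix. The table and its values are hypothetical.

```r
library(data.table)

# Hypothetical toy data
dt <- data.table(
  id      = 1:6,
  payment = c("cash", "octopus", "cash", "mobile", "octopus", "cash"),
  amount  = c(50, 12, 80, 200, 8, 35)
)

first_two  <- dt[1:2]                             # rows by position
cash_rows  <- dt[payment == "cash"]               # rows matching a condition
big_cash   <- dt[payment == "cash" & amount > 40] # conditions combined with &
by_amount  <- dt[order(-amount)]                  # rows sorted by amount, descending
```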
Subset of columns
• Several ways to extract subset of columns
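Some of these ways are sketched below on a hypothetical table. Columns are selected in the second argument inside the brackets; this list is illustrative rather than exhaustive.

```r
library(data.table)

# Hypothetical toy data
dt <- data.table(
  id      = 1:3,
  payment = c("cash", "octopus", "mobile"),
  amount  = c(50, 12, 200)
)

v      <- dt[, amount]             # a single column, returned as a vector
one    <- dt[, .(amount)]          # the same column, kept as a data.table
two    <- dt[, .(id, amount)]      # several columns with .()
named  <- dt[, c("id", "amount")]  # column names as a character vector

cols   <- c("id", "payment")
by_var <- dt[, ..cols]             # names stored in a variable, using the .. prefix

rest   <- dt[, !c("payment")]      # every column except the named ones
```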
Use group by
• Import ‘db_transaction.csv’
✓ transaction_id: each transaction’s unique id
✓ payment: payment method (e.g., cash, credit_card, debit_card, mobile, octopus)
✓ transaction_amount: amount of the transaction in HKD
✓ day, mon, year: transaction date, month, and year, respectively
✓ time: transaction time
✓ user_id: id of the user who made the transaction
✓ register: registration year of the user
✓ gender: gender of the user
✓ age: age of the user
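A grouped summary on this data might look like the sketch below. The real file would typically be read with fread("db_transaction.csv"); since the file is not available here, a hypothetical miniature table with a few of the same column names stands in for it, and its values are made up.

```r
library(data.table)

# Stand-in for: trans <- fread("db_transaction.csv")
# Hypothetical miniature data using some of the column names listed above
trans <- data.table(
  transaction_id     = 1:6,
  payment            = c("cash", "octopus", "cash", "mobile", "octopus", "cash"),
  transaction_amount = c(50, 12, 80, 200, 8, 35),
  user_id            = c(1, 1, 2, 2, 3, 3)
)

# Number of transactions (.N) and average amount per payment method
by_payment <- trans[, .(n = .N, avg_amount = mean(transaction_amount)), by = payment]

# Total amount spent by each user
by_user <- trans[, .(total = sum(transaction_amount)), by = user_id]
```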
Chaining
• data.table[][][][]…
✓ Perform a sequence of data.table operations by chaining multiple “[]”
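As a small sketch of chaining, each "[]" below receives the result of the previous one. The data is hypothetical; the 50 HKD threshold is an arbitrary value chosen for illustration.

```r
library(data.table)

# Hypothetical toy data
trans <- data.table(
  user_id            = c(1, 1, 2, 2, 3, 3),
  transaction_amount = c(50, 12, 80, 200, 8, 35)
)

# Step 1: total per user; step 2: keep totals above 50; step 3: sort descending
top <- trans[, .(total = sum(transaction_amount)), by = user_id][total > 50][order(-total)]
```

Without chaining, each step would need its own intermediate variable.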
• What if we want to find users who have made transactions with all five payment methods?
✓ To investigate whether the type of payment affects users’ purchasing behavior
✓ Each such user has at least one transaction with every payment method
• cash
• octopus
• credit_card
• debit_card (EPS)
• mobile (e.g., Alipay, WeChat)
• Manual approach
✓ Just check every transaction
✓ But there are 30,000 rows…
ISOM 3390 Business Programming in R 21
Data.table
Chaining
• Semi-manual approach
✓ Check every transaction, but use a loop (e.g., for, while; will be covered later)
✓ Pick a user and check whether he/she has made transactions with all five methods, then repeat for every other user
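The loop-based idea can be sketched as below. This is only one possible way to write it, on hypothetical data in which user 1 has used all five methods and user 2 only two; the variable names are illustrative.

```r
library(data.table)

# Hypothetical toy data
trans <- data.table(
  user_id = c(1, 1, 1, 1, 1, 1, 2, 2),
  payment = c("cash", "octopus", "credit_card", "debit_card", "mobile", "cash",
              "cash", "octopus")
)
methods <- c("cash", "octopus", "credit_card", "debit_card", "mobile")

five_method_users <- c()
for (u in unique(trans$user_id)) {
  used <- unique(trans[user_id == u, payment])  # methods this user has used
  if (all(methods %in% used)) {                 # every method appears at least once
    five_method_users <- c(five_method_users, u)
  }
}
five_method_users
```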
Chaining
• Chaining approach
✓ Use by at the user & payment level
✓ Then use by at the user level
✓ Just one line of code
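One way the chained version could look is sketched below, on hypothetical data. It assumes the payment column contains exactly the five methods listed earlier, so a user with five distinct payment values has used all of them.

```r
library(data.table)

# Hypothetical toy data: user 1 uses all five methods, user 2 only two
trans <- data.table(
  user_id = c(1, 1, 1, 1, 1, 1, 2, 2),
  payment = c("cash", "octopus", "credit_card", "debit_card", "mobile", "cash",
              "cash", "octopus")
)

# by user & payment: one row per (user, method) pair
# by user: count how many distinct methods each user has
# finally keep the users with all five
result <- trans[, .N, by = .(user_id, payment)][, .(n_methods = .N), by = user_id][n_methods == 5, user_id]
result
```

An alternative under the same assumption is trans[, .(n_methods = uniqueN(payment)), by = user_id], which counts distinct methods per user in a single grouping step.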