Data Services Evaluation Questions

Uploaded by

Jia

Evaluation Questions

Name:

Part 1 - Regular Expressions

• Please keep your regular expressions as simple as possible. LESS IS MORE.
• All work must be done on https://regex101.com/
• Settings must match what is shown in the red boxes below.

Question 1: Visit the URL https://www.amazon.com/dp/B084JCXSL6/ and write a regular
expression based on the HTML of the page to extract the product title.

Answer:

______________________________________________________________________________

Question 2: Write a regular expression based on the HTML of the page to extract the price.

Answer:

______________________________________________________________________________

Question 3: Please write a regular expression based on the HTML of the page to extract the
canonical URL.

Answer:

______________________________________________________________________________

Question 4: Find what is common about the three URLs below and write one regular expression
to capture all three URLs.

https://www.homedepot.com/b/Furniture-Kitchen-Dining-Room-Furniture-Dining-Chairs/N-5yc1vZc7p6/Ntk-EnrichedProductInfo/Ntt-chair?Ntx=mode+matchpartialmax&NCNI-5

https://www.homedepot.com/b/Kitchen-Kitchenware/N-5yc1vZaqzo

https://www.homedepot.com/b/Appliances-Small-Kitchen-Appliances-Coffee-Espresso-Coffee-Makers/N-5yc1vZbv4w

Answer:

______________________________________________________________________________

Question 5: Find what is common about the three URLs below and write one regular expression
to capture all three URLs.

Product URLs:
https://www.homedepot.com/p/Cuisinart-Triple-Rivet-15-Piece-White-Knife-Set-with-Storage-Block-C77TR-16P/304088574

https://www.homedepot.com/p/IMAX-Vintage-Silver-Camera-Boxes-Set-of-2-36130-2/204369237

https://www.homedepot.com/p/Cuisinart-14-Cup-Programmable-Black-Stainless-Steel-Drip-Coffee-Maker-DCC-3200BKSP1/312699251

Answer:

______________________________________________________________________________
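
A candidate pattern for Questions 4 and 5 can also be sanity-checked locally, alongside regex101. The sketch below is only a test harness: the helper name `unmatched` and the `PLACEHOLDER` pattern are illustrative assumptions, and the placeholder is deliberately too broad to count as an answer. Python's `re` flavor may differ slightly from the flavor selected on regex101, so treat the result as a cross-check only.

```python
import re

# URLs copied from Question 4 (category pages).
CATEGORY_URLS = [
    "https://www.homedepot.com/b/Furniture-Kitchen-Dining-Room-Furniture-Dining-Chairs/"
    "N-5yc1vZc7p6/Ntk-EnrichedProductInfo/Ntt-chair?Ntx=mode+matchpartialmax&NCNI-5",
    "https://www.homedepot.com/b/Kitchen-Kitchenware/N-5yc1vZaqzo",
    "https://www.homedepot.com/b/Appliances-Small-Kitchen-Appliances-Coffee-Espresso-"
    "Coffee-Makers/N-5yc1vZbv4w",
]

# URLs copied from Question 5 (product pages).
PRODUCT_URLS = [
    "https://www.homedepot.com/p/Cuisinart-Triple-Rivet-15-Piece-White-Knife-Set-with-"
    "Storage-Block-C77TR-16P/304088574",
    "https://www.homedepot.com/p/IMAX-Vintage-Silver-Camera-Boxes-Set-of-2-36130-2/204369237",
    "https://www.homedepot.com/p/Cuisinart-14-Cup-Programmable-Black-Stainless-Steel-Drip-"
    "Coffee-Maker-DCC-3200BKSP1/312699251",
]


def unmatched(pattern: str, urls: list[str]) -> list[str]:
    """Return the URLs in which the pattern finds no match at all."""
    compiled = re.compile(pattern)
    return [url for url in urls if compiled.search(url) is None]


# Placeholder only: replace with the pattern you wrote on regex101.
PLACEHOLDER = r"homedepot\.com/"

if __name__ == "__main__":
    for label, urls in (("Question 4", CATEGORY_URLS), ("Question 5", PRODUCT_URLS)):
        missed = unmatched(PLACEHOLDER, urls)
        print(label, "all matched" if not missed else f"missed: {missed}")
```

An empty list from `unmatched` means the pattern hits every URL in the set. Running a Question 4 pattern against the Question 5 list, and vice versa, shows whether it is specific enough to tell the two page types apart.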

Part 2 – Troubleshooting

Question 6: Suppose the scraper you built for a website has been running successfully for 6
days but on the 7th day, you notice that the price is not getting extracted anymore. What are
some possible causes for this (list as many as you can)?

Answer(s):

______________________________________________________________________________

Question 7: Suppose the scraper you built for a website stopped working because you were
getting blocked. What are some things you can do to get around the blocking problem (list as
many as you can)?

Answer(s):

______________________________________________________________________________

Part 3 – Technical Assessment

Question 8: Please describe all the strategies/methodologies you would use if you were tasked
with creating a scraper to extract all the products from a website.

Answer(s):

______________________________________________________________________________

Question 9: Suppose you were tasked with extracting 30,000 products from a website 4x/day.
How would you accomplish this task?

Answer(s):

______________________________________________________________________________
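
The workload in Question 9 can be sized with some quick arithmetic. The sketch below assumes one request per product and four evenly spaced runs; neither assumption is stated in the question, and real sites often need extra requests for listing pages, retries, or pagination.

```python
# Figures taken from Question 9.
PRODUCTS_PER_RUN = 30_000
RUNS_PER_DAY = 4

# Assumption: runs are evenly spaced, so each run has a quarter of the day.
SECONDS_PER_DAY = 24 * 60 * 60
window_seconds = SECONDS_PER_DAY // RUNS_PER_DAY

# Assumption: exactly one request per product.
requests_per_day = PRODUCTS_PER_RUN * RUNS_PER_DAY
min_rate = PRODUCTS_PER_RUN / window_seconds  # requests per second

print(f"requests per day: {requests_per_day}")         # 120000
print(f"seconds available per run: {window_seconds}")  # 21600
print(f"minimum sustained rate: {min_rate:.2f} req/s") # 1.39
```

Under these assumptions, a run must sustain roughly 1.4 requests per second just to finish before the next one starts, which leaves no headroom for retries or throttling.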

Question 10: Please describe all the ways you can design a scraper to extract information from
a specific location (e.g., all products from a store with postal code 10001).

Answer(s):

______________________________________________________________________________

Question 11: Go to the URL https://www.amazon.com/dp/B08H99878P/ and find the endpoint
(URL/API) that contains all the information highlighted in the green box (see screenshot below).

(Hint: this is not the answer: https://www.amazon.com/dp/B08H99878P/ref=olp-opf-redir?aod=1&ie=UTF8&condition=ALL)

Answer:
