Intermediate Test 1

Started on: Wednesday, 2 November 2022, 6:50 PM
State: Finished
Completed on: Wednesday, 2 November 2022, 8:00 PM
Time taken: 1 hour 9 mins

Test instructions:
1. This is an open-book individual test with a duration of 60 minutes.
2. You are not allowed to use your mobile phone or to communicate with anyone but the professor.
3. Questions 1 to 6 are worth 1 point each.
4. Questions 7 and 8 are worth 2 points each (the code must be filled in in questions 8 and 10).
5. Questions 7 and 9 will be worth [illegible in source] points without the corresponding code (dbc file) in questions 8 and 10.

Question 1
In the following DBFS path: /databricks-datasets/definitive-guide/data/flight-data/csv, you will find several CSV files.
What is the size of the file 2010-summary.csv?
a. 6999
b. [illegible in source]
c. [illegible in source]
d. 7007

Question 2
Load the following file to a DF (DataFrame): /databricks-datasets/definitive-guide/data/flight-data/csv/2015-summary.csv
How many rows do we have in the DF?
a. [illegible in source]
b. 246
c. 296
d. 256

Question 3
Load the following file to a DF: '/databricks-datasets/definitive-guide/data/flight-data/csv/2015-summary.csv'
How many different origin countries (origin_country_name) do we have in the DF?
a. 125
b. 12
c. 18
d. [illegible in source]

Question 4
Load the following file to a DF: '/databricks-datasets/definitive-guide/data/flight-data/csv/2015-summary.csv'
How many flights have the origin (origin_country_name) 'Portugal'?
a. [illegible in source]
b. 14
c. 156
d. 123

Question 5
Load the following file to a DF: '/databricks-datasets/definitive-guide/data/flight-data/csv/2015-summary.csv'
[The rest of the question and its answer options are illegible in the source; the only legible fragment is "USA".]

Question 6
Read the following text file to a DF: '/databricks-datasets/samples/docs/README.md'
How many lines do we have in the DF with the word 'Spark' and ending with a [character illegible in source]?
a. 3
b. 4
c. [illegible in source]
d. 2

Question 7
Load the following file to a DF: '/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv'
What is the average price of the diamonds with color = 'E' and cut = 'Premium'?
[A hint about grouping follows in the source but is illegible.]
a. 3538
b. 368
c. [illegible in source]
d. 2828

Question 8
Please write the code for question 7 - GroupBy | Average price of diamonds
(Not graded)

    # Question 7
    from pyspark.sql.functions import avg

    # Read the CSV with a header row; inferSchema makes price a numeric column.
    df7 = spark.read.csv('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv', header=True, inferSchema=True)
    # Keep only color 'E' and cut 'Premium', then average the price.
    df7.filter(df7['color'] == 'E').filter(df7['cut'] == 'Premium').select(avg('price')).show()

Question 9
1. Load the file 'd2buy' from the Moodle ABD class page (technical resources folder) to the Databricks file system.
2. Join the diamonds DF, '/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv' (same as the previous exercise), with the 'd2buy' DF, using the join condition: diamonds._c0 == d2buy.id
3. Calculate the sum of the prices of the diamonds with the flag 'Y' in the field 'd2buy' of the DF 'd2buy' and with the field color = 'E'.
a. 2401
b. 2009
c. 2626
d. 2107

Question 10
Please write the code for question 8 - Join | Sum of the prices of the diamonds

    from pyspark.sql import functions as F

    # Load both CSVs with a header row; inferSchema makes price numeric.
    df91 = spark.read.csv('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv', header=True, inferSchema=True)
    df92 = spark.read.csv('/FileStore/tables/d2buy.csv', header=True, inferSchema=True)
    # The join line is illegible in the source; this one follows the condition stated in Question 9.
    df9 = df91.join(df92, df91['_c0'] == df92['id'])
    display(df9)
    # Keep flagged diamonds of color 'E' and sum their prices.
    df9.filter(df9['d2buy'] == 'Y').filter(df9['color'] == 'E').agg(F.sum('price')).show()
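
The join, filter, and sum steps of the last exercise can be traced by hand in plain Python. The sketch below is not the test's PySpark solution: it is a standard-library illustration of the same logic, and every row in it is invented for the example (none come from diamonds.csv or d2buy.csv). The helper name `sum_flagged_prices` is hypothetical, and the sketch assumes each id appears at most once in the d2buy table.

```python
# Hypothetical miniature tables. The rows are invented for illustration only
# and are NOT taken from diamonds.csv or d2buy.csv. Values are strings, as
# they would be when a CSV is read without schema inference.
diamonds = [
    {"_c0": "1", "color": "E", "price": "326"},
    {"_c0": "2", "color": "E", "price": "400"},
    {"_c0": "3", "color": "J", "price": "500"},
    {"_c0": "4", "color": "E", "price": "700"},  # no matching id in d2buy
]
d2buy = [
    {"id": "1", "d2buy": "Y"},
    {"id": "2", "d2buy": "N"},
    {"id": "3", "d2buy": "Y"},
]


def sum_flagged_prices(diamonds, d2buy, color="E", flag="Y"):
    """Inner-join on diamonds._c0 == d2buy.id, filter, and sum the prices."""
    # Index the smaller table by its join key (assumes ids are unique).
    flag_by_id = {row["id"]: row["d2buy"] for row in d2buy}
    total = 0.0
    for row in diamonds:
        # An inner join drops diamonds that have no matching id.
        if row["_c0"] not in flag_by_id:
            continue
        # Apply both filters before adding the price to the running sum.
        if flag_by_id[row["_c0"]] == flag and row["color"] == color:
            total += float(row["price"])
    return total


print(sum_flagged_prices(diamonds, d2buy))
```

Only the first invented row passes both filters: row 2 has flag 'N', row 3 has color 'J', and row 4 is dropped by the join, which is the same narrowing the chained `filter` calls perform on the joined DataFrame.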
