PySpark Learning Hub 1700684461
WWW.LINKEDIN.COM/IN/AKASHMAHINDRAKAR
PYSPARK LEARNING HUB : ARTICLE - 11
Agenda:
1. Standard way of creating a schema (scan the whole data)
2. Standard way of creating a schema (scan 10% of the data)
3. How to enforce a schema, style 1: schema DDL
4. How to enforce a schema, style 2: StructType
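Agenda items 1 and 2 both let Spark infer the schema from the data. A minimal sketch, assuming the two variants correspond to the CSV reader's `inferSchema` and `samplingRatio` options; the helper name `read_with_inferred_schema` and the `header` default are illustrative and do not come from the original slides:

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame, SparkSession


def read_with_inferred_schema(
    spark: "SparkSession",
    path: str,
    sampling_ratio: Optional[float] = None,
    header: bool = True,  # assumption: the file has a header row
) -> "DataFrame":
    """Read a CSV file and let Spark infer the column types.

    sampling_ratio=None leaves the reader at its default (scan every row);
    sampling_ratio=0.1 infers types from roughly 10% of the rows.
    """
    reader = (
        spark.read.format("csv")
        .option("header", header)
        .option("inferSchema", True)
    )
    if sampling_ratio is not None:
        reader = reader.option("samplingRatio", sampling_ratio)
    return reader.load(path)
```

Calling `read_with_inferred_schema(spark, path)` scans the whole file, while `read_with_inferred_schema(spark, path, sampling_ratio=0.1)` infers from about 10% of it. Sampling is faster on large files but may pick a wrong type if the sampled rows are not representative, so check the result with `df.printSchema()`.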
df.printSchema()
df.printSchema()
df = spark.read \
    .format("csv") \
    .schema(orders_schema) \
    .load("/public/trendytech/datasets/orders_sample1.csv")
df.show(5)
df.printSchema()
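The `orders_schema` value passed to `.schema()` is not defined in the text that survives here. A sketch of what a DDL-style schema string might look like; the column names and types are hypothetical placeholders and should be replaced with the actual layout of the orders file, and the helper name `read_with_ddl_schema` is likewise illustrative:

```python
# Hypothetical column layout: the source does not list the real columns.
orders_schema = "order_id long, order_date date, cust_id long, order_status string"


def read_with_ddl_schema(spark, path, ddl_schema=orders_schema):
    """Read a CSV file with an enforced DDL schema (no inference pass)."""
    return (
        spark.read.format("csv")
        .schema(ddl_schema)  # a DDL string: "name type, name type, ..."
        .load(path)
    )
```

Because the schema is supplied up front, Spark does not need to scan the data to work out types, and `df.printSchema()` reports exactly the types named in the string.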
])
df = spark.read \
    .format("csv") \
    .schema(order_schema_struct) \
    .load("/public/trendytech/datasets/orders_sample1.csv")
df.show(5)
df.printSchema()