PwC PySpark Interview Question | How to handle multiple delimiters in a CSV file
Input:
data="""
Id|Name|Marks
1|Sagar|20,30,40
2|Alex|34,32,12
3|David|45,67,54
4|John|10,34,60
"""
dbutils.fs.put('/FileStore/tables/mutliple_delimiter.csv',str(data),True)
Solution:
from pyspark.sql.functions import col, split

# Read with '|' as the column separator; Marks is still one comma-separated string.
df = spark.read.format('csv').option('header', True).option('sep', '|').load('/FileStore/tables/mutliple_delimiter.csv')
# Split Marks on ',' once, then pick each subject by position.
marks = split(col("Marks"), ',')
df_output = df.withColumn("Physics", marks[0]).withColumn("Chemistry", marks[1]).withColumn("Maths", marks[2]).drop("Marks")
display(df_output)
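To sanity-check the expected result without a cluster, here is a minimal plain-Python sketch of the same two-step idea: split each row on '|' first, then split the Marks field on ','. The helper name parse_rows and the conversion of marks to integers are assumptions made for illustration; the PySpark solution above leaves the split values as strings.

```python
import csv
import io

# Same sample data as the input file (leading blank line omitted).
data = """Id|Name|Marks
1|Sagar|20,30,40
2|Alex|34,32,12
3|David|45,67,54
4|John|10,34,60
"""


def parse_rows(text):
    """Split columns on '|', then split the Marks column on ','.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    rows = []
    for rec in reader:
        # Second-level delimiter: each Marks value holds three comma-separated scores.
        physics, chemistry, maths = (int(m) for m in rec["Marks"].split(","))
        rows.append({
            "Id": int(rec["Id"]),
            "Name": rec["Name"],
            "Physics": physics,
            "Chemistry": chemistry,
            "Maths": maths,
        })
    return rows


rows = parse_rows(data)
for row in rows:
    print(row)
```

Each printed row should have the same Physics, Chemistry and Maths columns that display(df_output) shows, which gives a quick way to compare against the Spark output.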
I have prepared several courses on Azure Data Engineering:
1. Build an Azure End-to-End Project
https://www.geekcoders.co.in/courses/...
2. Build a Delta Lake Project
https://www.geekcoders.co.in/courses/...
3. Master Azure Data Factory with an ETL Project and Power BI
https://www.geekcoders.co.in/courses/...
4. Master Python
https://www.geekcoders.co.in/courses/...
Check out all my courses on Azure Data Engineering:
https://www.geekcoders.co.in/s/store/...
#dataengineer #interviewquestions #spark