Remove columns from a Spark DataFrame

I have a Spark DataFrame with a very large number of columns. I want to remove two of them to get a new DataFrame.

If there were fewer columns, I could use the select method in the API as follows:

  pcomments = pcomments.select(
      pcomments.col("post_id"),
      pcomments.col("comment_id"),
      pcomments.col("comment_message"),
      pcomments.col("user_name"),
      pcomments.col("comment_createdtime"));

But since selecting columns from a long list is a tedious task, is there a workaround?

1 answer

Use the drop method to remove columns. If you also need to rename a column, use the withColumnRenamed method.

Example:

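  val initialDf = ....
  // drop returns a new DataFrame without the named column
  val dfAfterDrop = initialDf.drop("column1").drop("column2")
  // withColumnRenamed returns a new DataFrame with the column renamed
  val dfAfterColRename = dfAfterDrop.withColumnRenamed("oldColumnName", "newColumnName")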
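If both columns should go at once, drop also accepts several names in a single call. Below is a minimal sketch, assuming Spark 2.0 or later and a local SparkSession. The object name DropColumnsSketch, the helper functions, and the sample row are made up for illustration; the column names are borrowed from the question. It also shows the same result obtained through select, by filtering the column list instead of typing it out.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical names throughout; only drop, select, columns and col are Spark API.
object DropColumnsSketch {

  // Pure helper: the column names left after removing `unwanted`, order preserved.
  def remainingColumns(all: Seq[String], unwanted: Set[String]): Seq[String] =
    all.filterNot(unwanted.contains)

  // Drop several columns in one call using the varargs overload of drop.
  // Names that are not in the schema are ignored by drop.
  def dropAll(df: DataFrame, unwanted: Seq[String]): DataFrame =
    df.drop(unwanted: _*)

  // Same result through select: keep every column that is not unwanted.
  def keepRemaining(df: DataFrame, unwanted: Set[String]): DataFrame =
    df.select(remainingColumns(df.columns.toSeq, unwanted).map(col): _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample row, only there to give the DataFrame a schema.
    val pcomments = Seq((1, 10, "hello", "alice", "2017-01-01"))
      .toDF("post_id", "comment_id", "comment_message", "user_name", "comment_createdtime")

    val unwanted = Seq("user_name", "comment_createdtime")

    dropAll(pcomments, unwanted).printSchema()
    keepRemaining(pcomments, unwanted.toSet).printSchema()

    spark.stop()
  }
}
```

Both calls return a new DataFrame and leave the original untouched. For removing a handful of columns from a wide table, drop is usually the shorter option; the select variant is mainly useful when the set of columns to keep is computed from some rule.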

Source: https://habr.com/ru/post/1263094/

