When filtering a DataFrame with string values, I find that the pyspark.sql.functions lower and upper functions come in handy if your data could have column entries such as "foo" and "Foo":
import pyspark.sql.functions as sql_fun
result = source_df.filter(sql_fun.lower(source_df.col_name).contains("foo"))
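
For context, here is a minimal self-contained sketch of the same idea (assuming a local Spark installation; the column name "col_name" and the sample values are just illustrative):

from pyspark.sql import SparkSession
import pyspark.sql.functions as sql_fun

spark = SparkSession.builder.master("local[*]").appName("case-insensitive-filter").getOrCreate()
source_df = spark.createDataFrame([("Foo",), ("foobar",), ("bar",)], ["col_name"])

# Lower-case the column before matching so "Foo" and "foo" are both caught.
result = source_df.filter(sql_fun.lower(source_df.col_name).contains("foo"))
result.show()  # returns the rows with "Foo" and "foobar"; "bar" is filtered out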