When filtering a DataFrame with string values, I find that lower and upper from pyspark.sql.functions come in handy if your data can have column entries such as "foo" and "Foo":
import pyspark.sql.functions as sql_fun
result = source_df.filter(sql_fun.lower(source_df.col_name).contains("foo"))