PySpark RDD.filter() with a wildcard pattern

I have a PySpark RDD with a text column that I want to filter on, so I have the following code:

table2 = table1.filter(lambda x: x[12] == "*TEXT*")

The problem: as you can see, I put * around the text to have it interpreted as a wildcard pattern, but it does not work. Can anyone help?

1 answer

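The lambda is plain Python, so == tests for exact equality and the asterisks are matched literally rather than as wildcards. A substring test with the in operator should do what you want: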

table2 = table1.filter(lambda x: "TEXT" in x[12])
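If real wildcard patterns are needed rather than a plain substring test, one option is the standard-library fnmatch module. The sketch below is only an illustration: the helper name matches_pattern and the sample rows are hypothetical, and it uses Python's built-in filter on a list so it runs without a Spark cluster. The isinstance check is an added assumption that some fields may be None, in which case the in operator would raise a TypeError.

```python
from fnmatch import fnmatchcase


def matches_pattern(row, index=12, pattern="*TEXT*"):
    """Return True if row[index] is a string matching a shell-style wildcard pattern.

    fnmatchcase is case-sensitive, like the `in` operator on strings.
    """
    value = row[index]
    # Guard against missing values (assumption: some fields may be None).
    return isinstance(value, str) and fnmatchcase(value, pattern)


# Hypothetical sample rows with 13 fields each; only index 12 matters here.
rows = [
    tuple(range(12)) + ("some TEXT here",),
    tuple(range(12)) + ("nothing relevant",),
    tuple(range(12)) + (None,),
]

# Plain-Python stand-in for the RDD call; keeps only rows whose field matches.
matched = list(filter(matches_pattern, rows))
print(matched)
```

Since RDD.filter accepts any Python callable that returns a boolean, the same function could presumably be passed directly, as in table2 = table1.filter(matches_pattern).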

Source: https://habr.com/ru/post/1653209/

