>>> df.printSchema()
root
 |-- skills: array (nullable = true)
 |    |-- element: string (containsNull = true)
>>> df.show()
+--------------------+
|              skills|
+--------------------+
|        [Java, java]|
|[Java Developer, ...|
|               [dev]|
+--------------------+
Now let's register it as a temporary table:
>>> df.registerTempTable("t")
Then explode the array and match each element case-insensitively with lower() and LIKE:
>>> res = sqlContext.sql("select skills, lower(skill) as skill from (select skills, explode(skills) skill from t) a where lower(skill) like '%java%'")
>>> res.show()
+--------------------+--------------+
|              skills|         skill|
+--------------------+--------------+
|        [Java, java]|          java|
|        [Java, java]|          java|
|[Java Developer, ...|java developer|
|[Java Developer, ...|      java dev|
+--------------------+--------------+
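For intuition, here is a rough plain-Python sketch of what that query does, with no Spark involved: one output row per array element (the explode), lowercased, kept only if it contains the search string. The `rows` data and the `explode_and_filter` helper are hypothetical, made up for illustration; the full contents of `df` are not shown above, so the sample values only approximate it.

```python
# Hypothetical sample rows standing in for the "skills" column;
# the real contents of df are truncated in the output above.
rows = [
    ["Java", "java"],
    ["Java Developer", "Java Dev"],
    ["dev"],
]


def explode_and_filter(rows, needle):
    """Yield (skills, lowered_skill) for each element containing needle, ignoring case."""
    needle = needle.lower()
    for skills in rows:            # one input row per array
        for skill in skills:       # "explode": one output row per element
            lowered = skill.lower()
            if needle in lowered:  # plays the role of lower(skill) LIKE '%java%'
                yield skills, lowered


result = list(explode_and_filter(rows, "java"))
for skills, skill in result:
    print(skills, skill)
```

Note that a row whose array has several matching elements appears once per match, which is why [Java, java] shows up twice in the Spark result as well. If you want each original row only once, you would need to deduplicate on `skills` afterwards.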