You can use something similar to this great solution from @zero323:
df.toDF(*(c.replace('.', '_') for c in df.columns))
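As an alternative:
from pyspark.sql.functions import col

# Map only the dotted column names to their underscore versions
replacements = {c: c.replace('.', '_') for c in df.columns if '.' in c}

# Backticks keep Spark from parsing a dotted name as struct-field access
df.select([col('`{}`'.format(c)).alias(replacements.get(c, c)) for c in df.columns])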
The replacement dictionary will then look like this:
{'emp.city': 'emp_city', 'emp.dno': 'emp_dno', 'emp.sal': 'emp_sal'}
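To see how the dictionary and the `replacements.get(c, c)` fallback behave without a Spark session, here is a minimal sketch in plain Python. The `columns` list is a hypothetical stand-in for `df.columns`, and the extra `id` column is an assumption added only to show that names without a dot pass through unchanged.

```python
# Hypothetical column names standing in for df.columns
columns = ['emp.city', 'emp.dno', 'emp.sal', 'id']

# Only names that contain a dot get an entry in the mapping
replacements = {c: c.replace('.', '_') for c in columns if '.' in c}

# .get(c, c) falls back to the original name when no replacement exists
new_names = [replacements.get(c, c) for c in columns]

print(replacements)
print(new_names)
```

The same `new_names` list could be passed to `df.toDF(*new_names)`, which is equivalent to the first one-liner.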
UPDATE:
If the column names also contain spaces, you can replace both '.' and whitespace with '_' in the same way:
import re
df.toDF(*(re.sub(r'[\.\s]+', '_', c) for c in df.columns))
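As a rough illustration of what that regular expression does to individual names, the sketch below applies it to a plain list. The sample names are hypothetical and stand in for `df.columns`; no Spark session is needed.

```python
import re

# Hypothetical column names standing in for df.columns
columns = ['emp.city', 'emp sal', 'emp. dno']

# [\.\s]+ matches any run of dots and whitespace, so '. ' becomes one '_'
cleaned = [re.sub(r'[\.\s]+', '_', c) for c in columns]

# Collapsing runs can make two different names identical, so it may be
# worth checking for duplicates before calling df.toDF(*cleaned)
has_duplicates = len(set(cleaned)) != len(cleaned)

print(cleaned)
print(has_duplicates)
```

Because the pattern ends in `+`, consecutive dots and spaces collapse into a single underscore rather than producing one underscore per character.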