I have two lists:
Hourly_Sports,DEF (show,channel)
Hourly_Sports,21 (show,views)
I split each line and rebuilt it as a tuple using this code:
def split_show_views(line):
    show, views = line.split(',')
    return (show, views)
show_views = show_views_file.map(split_show_views)
def split_show_channel(line):
    show, channel = line.split(',')
    return (show, channel)
show_channel = show_channel_file.map(split_show_channel)
joined_dataset = show_views.join(show_channel)
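For reference, the shape of a joined record can be sketched in plain Python without Spark. This is only an illustration: `join_pairs` is a hypothetical helper, not a Spark API, and it assumes the same convention as the pair-RDD `join`, where the value from the left dataset comes first in the inner tuple. The order you actually see in your own collected output is the one to rely on.

```python
def join_pairs(left, right):
    """Plain-Python stand-in for an inner join on (key, value) pairs."""
    right_by_key = {}
    for key, value in right:
        right_by_key.setdefault(key, []).append(value)
    # The left value comes first in the inner tuple, then the right value.
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_by_key.get(key, [])
    ]


# Sample pairs taken from the two example lines in the question.
show_views = [("Hourly_Sports", "21")]
show_channel = [("Hourly_Sports", "DEF")]

joined = join_pairs(show_views, show_channel)
print(joined)
```

Each joined record is a nested tuple: the key, followed by a pair holding one value from each side.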
Now when I call collect(), each record in the result looks like this:
(u'Baked_Talking', (u'MAN', u'138'))
and now I only want the "channel" and "views" parts. The instructions give this template:
def extract_channel_views(show_views_channel):
    <INSERT_CODE_HERE>
    return (channel, views)
It seems that the joined result is made up of already-split tuples rather than lines of text, so I cannot use the split function again. I checked the Python built-in functions but did not find anything for extracting the parts. It seems to me that "channel" and "views" were already defined in the previous steps, so maybe I don’t need to add anything? If that is not the case, how can I define channel and views here? I tried something like show,channel,views=split('',('','')), which I don’t think is right, but I really don’t know how to do it.
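One possible way to fill in the template is plain tuple unpacking, since each record is already a nested tuple and needs no splitting. This is a minimal sketch that assumes the record has the shape shown in the collected output above, (show, (channel, views)). If your join returns the inner pair in the other order, swap the two names. Note that names such as channel and views from the earlier functions are local to those functions, so they have to be bound again here.

```python
def extract_channel_views(show_views_channel):
    # Assumed record shape: (show, (channel, views)), as in the collected output.
    # Nested unpacking binds all three names in one step.
    show, (channel, views) = show_views_channel
    return (channel, views)


# The sample record from the question.
record = (u'Baked_Talking', (u'MAN', u'138'))
result = extract_channel_views(record)
print(result)

# With Spark, the function would be applied to every record via map, e.g.:
# channel_views = joined_dataset.map(extract_channel_views)
```

The views value is still a string at this point, so it would need int(views) before any arithmetic such as summing views per channel.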