I'm scraping a page with BeautifulSoup, and part of the logic has to handle the fact that the content of a <td> tag may sometimes contain a <br>.
So sometimes it looks like this:
<td class="xyz">
text 1
<br>
text 2
</td>
and sometimes it looks like this:
<td class="xyz">
text 1
</td>
I loop through the cells and append each value to output_row, which I then append to a list of lists. Whether I see the first format or the second, I want the text to end up in one cell.
I found a way to detect the <br> tag: td.string comes back as None, and I also know that text 2 always contains "ABC". So:
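elif td.string is None:
    # td.string is None when the cell has more than one child (text, <br>, text)
    if 'ABC' in td.contents[2]:
        # contents[0] is the text before <br>, contents[2] the text after it
        new_string = td.contents[0] + ' ' + td.contents[2]
        output_row.append(new_string)
        print(new_string)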
else:
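In Jupyter Notebook this prints as "text 1 text 2". In the CSV, however, it does not come out that way: when td.string is None (because of the <br>), text 1 and the text after the <br> do not end up as a single line of text in one cell.
This is how I write the CSV: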
import csv

with open('C:/location/file.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    for row in output_rows:
        writer.writerow(row)
# The with block closes the file on exit, so no explicit close() is needed.
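
One possible cause, sketched here under assumptions: the pieces in td.contents usually keep the newlines from the HTML source, and csv.writer preserves a newline inside a field by quoting it, which shows up as a line break inside the cell. The helper normalize_cell and the two sample strings below are hypothetical stand-ins, not taken from the real page; the sketch only uses the standard library and writes to an in-memory buffer instead of a file.

```python
import csv
import io

def normalize_cell(*pieces):
    """Join text pieces into one line, collapsing newlines and repeated spaces."""
    # str.split() with no arguments splits on any whitespace, including '\n'
    return ' '.join(' '.join(pieces).split())

# Assumed values: what td.contents[0] and td.contents[2] might hold for the
# first <td> layout, including the newlines from the HTML source.
first_piece = '\ntext 1\n'
second_piece = '\ntext 2 ABC\n'

raw = first_piece + ' ' + second_piece          # same concatenation as in the question
clean = normalize_cell(first_piece, second_piece)

buffer = io.StringIO()                          # in-memory stand-in for the CSV file
writer = csv.writer(buffer, delimiter=',')
writer.writerow(['raw', raw])                   # field keeps its embedded newlines
writer.writerow(['clean', clean])               # field is a single line
csv_text = buffer.getvalue()
print(repr(csv_text))
```

If this matches what is happening, appending the normalized string to output_row instead of new_string should keep the text on one line in one cell. BeautifulSoup's td.get_text(' ', strip=True) may give a similar result in a single call, though I have not checked it against this page.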