HTML parser for creating formatted GTFS data

A transit agency I am interested in does not publish its schedule data in GTFS format. I would like to build an Android application that can search the schedule, so having the data in GTFS would be very useful. The timetable is available on the agency's website, but the useful data is hard to separate from the surrounding HTML:

<td class="b stopPoint p0" background="nline.gif"><a href="line.cgi?id=1&dir=back&zero=15901&city=so&term=20141214"><img src="coming.gif" class="stopPoint" alt="A megállóhoz tartozó indulási időpontok megjelenítéséhez kérem, kattintson ide!" /></a></td>
<td class="b stopTime p0">2</td>
<td class="b stopPeakTime p0">2</td>
<td class="b stopName p0" colspan="1">Frankenburg úti aluljáró</td>
<td class="b stopTransfer p0"><img src="transfer.gif" class="iconTransfer" alt="Átszállási lehetőség a felsorolt autóbuszvonalakra" />&nbsp;&nbsp;<a href="line.cgi?id=10&dir=to&zero=1590&city=so&term=20141214">10</a>, <a href="line.cgi?id=10Y&dir=to&zero=1590&city=so&term=20141214">10Y</a></td>

Is there an existing parser that would be useful for this purpose? Does anyone have one that works?

1 answer

Ask the transit agency whether it can provide the schedule data in a more usable format. It may already hold the data in another format that is easier to work with than the website.

If not, you will have to scrape and parse the HTML yourself. For HTML parsing in Python, look at BeautifulSoup.
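As a rough sketch of the scraping route, the following pulls the stop fields out of the row fragment from the question. It uses Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without third-party installs; the same logic could be written with BeautifulSoup instead. The names `StopRowParser` and `parse_stop_row` and the output dictionary keys are hypothetical, and the meaning of the `stopTime` and `stopPeakTime` values (assumed here to be minutes) is a guess from the class names. Mapping the result into GTFS files is not shown.

```python
from html.parser import HTMLParser

# Row fragment from the question, limited to the cells that carry text.
ROW = (
    '<td class="b stopTime p0">2</td>'
    '<td class="b stopPeakTime p0">2</td>'
    '<td class="b stopName p0" colspan="1">Frankenburg úti aluljáró</td>'
    '<td class="b stopTransfer p0"><img src="transfer.gif" class="iconTransfer" '
    'alt="Átszállási lehetőség a felsorolt autóbuszvonalakra" />&nbsp;&nbsp;'
    '<a href="line.cgi?id=10&dir=to&zero=1590&city=so&term=20141214">10</a>, '
    '<a href="line.cgi?id=10Y&dir=to&zero=1590&city=so&term=20141214">10Y</a></td>'
)

# CSS classes of the cells whose text is copied straight into the result.
FIELDS = ("stopTime", "stopPeakTime", "stopName")


class StopRowParser(HTMLParser):
    """Collects the text of the stop cells and the transfer line names."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stop = {"transfers": []}
        self._field = None         # field name of the <td> currently open
        self._in_transfer = False  # inside the stopTransfer cell
        self._in_link = False      # inside an <a> within that cell

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td":
            self._field = next((f for f in FIELDS if f in classes), None)
            self._in_transfer = "stopTransfer" in classes
        elif tag == "a" and self._in_transfer:
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
            self._in_transfer = False
        elif tag == "a":
            self._in_link = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._field:
            self.stop[self._field] = text
        elif self._in_link:
            # Link text in the transfer cell is a connecting line name.
            self.stop["transfers"].append(text)


def parse_stop_row(html):
    parser = StopRowParser()
    parser.feed(html)
    parser.close()
    return parser.stop


if __name__ == "__main__":
    print(parse_stop_row(ROW))
```

A real scraper would fetch each `line.cgi` page, feed every table row through a parser like this, and then write the collected stops and times out as GTFS text files.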


Source: https://habr.com/ru/post/1609573/

