NSXMLParser chokes on ampersand &

I am parsing some HTML with NSXMLParser, and it hits a parser error every time it encounters an ampersand. I could filter out the ampersands before parsing, but I would rather parse everything that's there.

This gives me error 68, NSXMLParserNAMERequiredError: name required.

My best guess is that it's a character set problem. I'm a little hazy on character sets, so I suspect my ignorance is coming back to bite me. The source HTML uses charset iso-8859-1, so I use this code to initialize the parser:

NSString *dataString = [[[NSString alloc] initWithData:data encoding:NSISOLatin1StringEncoding] autorelease];
NSData *dataEncoded = [[dataString dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES] autorelease];
NSXMLParser *theParser = [[NSXMLParser alloc] initWithData:dataEncoded];

Any ideas?

+3
3 answers

To the other posters: of course the XML is invalid ... it's HTML!

You probably shouldn't be using NSXMLParser to parse HTML; use libxml2 instead.

+7

Are you sure this is valid XML? In XML, special characters such as & have to be escaped.
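If you do go the pre-filtering route mentioned in the question, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The function names escape_bare_ampersands and reference_length are made up for this example. It rewrites every & that does not look like the start of a reference (&name;, &#38;, &#x26;) as &amp;. It only recognizes ASCII entity names, which is a simplification, and HTML-only entities such as &nbsp; will pass through untouched and may still be rejected by an XML parser as undeclared.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of a reference such as "&amp;", "&#38;" or "&#x26;"
   starting at s (s must point at '&'), or 0 if s does not start one.
   Simplification: entity names are limited to ASCII letters and digits. */
static size_t reference_length(const char *s)
{
    size_t i = 1;                                   /* skip the '&' */
    if (s[i] == '#') {
        size_t start;
        i++;
        if (s[i] == 'x') {                          /* hexadecimal form */
            i++;
            start = i;
            while (isxdigit((unsigned char)s[i])) i++;
        } else {                                    /* decimal form */
            start = i;
            while (isdigit((unsigned char)s[i])) i++;
        }
        if (i == start) return 0;                   /* no digits at all */
    } else {
        if (!isalpha((unsigned char)s[i])) return 0;
        while (isalnum((unsigned char)s[i])) i++;
    }
    return s[i] == ';' ? i + 1 : 0;                 /* must end with ';' */
}

/* Returns a malloc'd copy of input in which every bare '&' has been
   replaced by "&amp;". The caller frees the result. Returns NULL if
   the allocation fails. */
char *escape_bare_ampersands(const char *input)
{
    /* Worst case: every byte is a bare '&' and grows to 5 bytes. */
    char *out = malloc(strlen(input) * 5 + 1);
    char *dst = out;
    const char *src = input;

    if (out == NULL) return NULL;

    while (*src != '\0') {
        size_t ref = (*src == '&') ? reference_length(src) : 0;
        if (*src == '&' && ref == 0) {
            memcpy(dst, "&amp;", 5);                /* bare ampersand */
            dst += 5;
            src++;
        } else if (ref > 0) {
            memcpy(dst, src, ref);                  /* keep the reference */
            dst += ref;
            src += ref;
        } else {
            *dst++ = *src++;                        /* ordinary byte */
        }
    }
    *dst = '\0';
    return out;
}
```

You would run this over the UTF-8 bytes before handing them to the parser. It only helps when stray ampersands are the sole problem; for real-world HTML a tolerant HTML parser is the more robust choice.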

+2

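As an aside on memory management: the NSString is the object you alloc yourself, so that is the one to release, while the NSData returned by the conversion (dataUsingEncoding) is already autoreleased and should not be autoreleased again: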

NSString *dataString = [[NSString alloc] initWithData:data
                             encoding:NSISOLatin1StringEncoding];

NSData *dataEncoded = [dataString dataUsingEncoding:NSUTF8StringEncoding 
                                     allowLossyConversion:YES];

[dataString release];

NSXMLParser *theParser = [[NSXMLParser alloc] initWithData:dataEncoded];
0

Source: https://habr.com/ru/post/1722638/

