I want to take the HTML generated by the QTextEdit editor and convert it to something more convenient to use on a real web page. Unfortunately, the HTML generator inside QTextEdit is not part of the public API and cannot be modified. I would prefer not to write my own WYSIWYG HTML editor when QTextEdit already gives me most of what I need.
In a short discussion on the qt-interest mailing list, someone suggested using XQuery through the QtXmlPatterns module.
As an example of the ugly HTML the editor produces, it uses <span style=" font-weight:600;"> for bold text, <span style=" font-weight:600; text-decoration: underline;"> for bold underlined text, and so on. Here is a sample:
    <html>
    <head>
    </head>
    <body style=" font-family:'Lucida Grande'; font-size:14pt; font-weight:400; font-style:normal;">
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text</p>
    <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"></p>
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text <span style=" font-weight:600;">bold text</span></p>
    <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; font-weight:600;"></p>
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text <span style=" font-style:italic;">italics text</span></p>
    <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; font-style:italic;"></p>
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text <span style=" text-decoration: underline;">underline text</span></p>
    <p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"></p>
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text <span style=" font-weight:600; text-decoration: underline;">bold underline text</span></p>
    <p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">plain text <span style=" font-weight:600;">bold text </span><span style=" font-weight:600; text-decoration: underline;">bold underline text</span></p>
    </body>
    </html>
I would like to convert it to something like this:
    <body>
    <p>plain text</p>
    <p/>
    <p>plain text <b>bold text</b></p>
    <p/>
    <p>plain text <em>italics text</em></p>
    <p/>
    <p>plain text <u>underline text</u></p>
    <p/>
    <p>plain text <b>bold text <u>bold underline text</u></b></p>
    </body>
I have gotten about 90% of the way there. I can correctly convert the first four cases, where each <span> style has only one of the italic, bold, or underline attributes. I am having problems when the span style has several attributes, for example when it has both font-weight:600 and text-decoration: underline.
Here is the XQuery code I have so far:
    declare function local:process_span_data($node as node())
    {
        for $n in $node
        return (
            for $attr in $n/@style
            return (
                if (contains($attr, 'font-weight:600')) then (
                    <b>{data($n)}</b>
                )
                else if (contains($attr, 'text-decoration: underline')) then (
                    <u>{data($n)}</u>
                )
                else if (contains($attr, 'font-style:italic')) then (
                    <em>{data($n)}</em>
                )
                else (
                    data($n)
                )
            )
        )
    };

    declare function local:process_p_data($data as node()+)
    {
        for $d in $data
        return (
            if ($d instance of text()) then $d
            else local:process_span_data($d)
        )
    };

    let $doc := doc('myfile.html')
    for $body in $doc/html/body
    return
        <body>
        {
            for $p in $body/p
            return (
                if (contains($p/@style, '-qt-paragraph-type:empty;')) then (
                    <p />
                )
                else (
                    if (count($p/*) = 0) then (
                        <p>{data($p)}</p>
                    )
                    else (
                        <p>
                        {for $data in $p/node() return local:process_p_data($data)}
                        </p>
                    )
                )
            )
        }</body>
Which gives ALMOST the correct result:
    <body>
    <p>plain text</p>
    <p/>
    <p>plain text <b>bold text</b> </p>
    <p/>
    <p>plain text <em>italics text</em> </p>
    <p/>
    <p>plain text <u>underline text</u> </p>
    <p/>
    <p>plain text <b>bold underline text</b> </p>
    <p>plain text <b>bold text </b> <b>bold underline text</b> </p>
    </body>
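The missing <u> in the last two paragraphs follows from the if / else if chain in local:process_span_data, which stops at the first style property that matches. A minimal, untested sketch of one possible direction is to test each property independently and nest the wrappers. Both function names below (local:wrap_if and local:process_span_nested) are hypothetical and not part of the original query, and the sketch assumes the processor supports computed element constructors, which are standard XQuery 1.0:

```xquery
(: Hypothetical helper, not part of the original query: wraps $content in an
   element named $tag when $on is true, otherwise returns $content unchanged. :)
declare function local:wrap_if($on as xs:boolean, $tag as xs:string, $content as item()*) as item()*
{
    if ($on) then element { $tag } { $content } else $content
};

(: Hypothetical stand-in for local:process_span_data: every style property is
   checked on its own, so a span with several properties ends up inside
   several nested elements instead of only the first one that matches. :)
declare function local:process_span_nested($span as node()) as item()*
{
    let $style := string($span/@style)
    let $u  := local:wrap_if(contains($style, 'text-decoration: underline'), 'u', string($span))
    let $em := local:wrap_if(contains($style, 'font-style:italic'), 'em', $u)
    return local:wrap_if(contains($style, 'font-weight:600'), 'b', $em)
};
```

These declarations would sit in the query prolog next to the existing ones, with local:process_p_data calling local:process_span_nested instead of local:process_span_data. For a span carrying both font-weight:600 and text-decoration: underline this should produce <b><u>bold underline text</u></b>. It does not merge adjacent bold spans into a single <b>, so the last paragraph would still differ from the desired output in that respect.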
Can someone point me in the right direction to achieve my desired result? Thanks in advance from an XQuery n00b!