HTML to RTF Converter for .NET.

I have already seen many posts on this site about RTF to HTML, and some others about HTML to RTF converters, but I'm really trying to get a complete breakdown of what is considered the most widely used commercial product or open source product, or whether people recommend going home-grown. I apologize if you consider this a duplicate question, but I'm trying to create a product matrix to see which option is the most viable for our application. I also think it would be helpful to others.

The converter will be used in an ASP.NET 2.0 application (we will upgrade to 3.5 soon, but are sticking with WebForms), using SQL Server 2005 (soon 2008) as the database.

From reading a few posts, SautinSoft seems to be popular as a commercial component. Are there any other commercial components that you recommend for converting HTML to RTF? Price matters, but even if a component is a bit on the expensive side, please list it.

For open source, I read that OpenOffice.org can be run as a service so that it can convert files. However, it seems to be Java only. I suppose I would need some kind of interop to use it? What open source .NET components, if any, are designed to convert HTML to RTF?

For home-grown, is XSLT the way to go with XHTML? If so, which component do you recommend for generating the XHTML? Otherwise, what other home-grown avenues do you recommend?

Also, note that I currently do not care about RTF to HTML. If a commercial component offers it and the price stays the same, fine; otherwise, please do not mention it.

+5
4 answers

I would recommend doing it yourself, as the task is not that difficult. First, the easiest way to convert one XML format to another XML format is with XSLT. Transforming XML documents in C# is very simple.
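To make the XSLT suggestion concrete, here is a minimal sketch using System.Xml.Xsl.XslCompiledTransform. The stylesheet is a hypothetical toy that only handles <p> and <b>, the class and method names are made up for illustration, and it assumes the input elements carry no XML namespace (real XHTML normally declares one, so the match patterns would need a prefix). A usable stylesheet would need far more templates.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltRtfSketch
{
    // Hypothetical toy stylesheet: emits RTF control words as plain text.
    // Braces are literal here because they appear in text, not in attributes.
    const string Stylesheet = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <xsl:template match='/'>{\rtf1\ansi <xsl:apply-templates/>}</xsl:template>
  <xsl:template match='p'><xsl:apply-templates/>\par </xsl:template>
  <xsl:template match='b'>{\b <xsl:apply-templates/>}</xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xhtml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader sheet = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(sheet);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xhtml)))
        using (StringWriter output = new StringWriter())
        {
            // Writing to a TextWriter lets the xsl:output method='text' setting apply.
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling XsltRtfSketch.Transform("<body><p>Hello <b>world</b></p></body>") returns a small RTF fragment with the bold run wrapped in its own group. The text content is not escaped here, so backslashes or braces in the input would still have to be handled separately.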

Here is a good MSDN blog post to get you started. Mike even mentions that it was easier to do this by hand than to deal with a third party.

link

Actually, I already answered this question here. Consider this a duplicate.

0

For what it's worth, in no particular order.

Some time ago I wanted to export to RTF and then import RTF that had been through MS Word.

The first problem is that RTF is not an open standard. It is an internal MS standard, so they change it as and when they like and usually don’t worry about compatibility. Currently, RTF versions range from 1.3 to 1.9, and they are all different. Internally, they measure in twips, just for good measure.
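Since twips trip up most people the first time, here is a small sketch of the arithmetic: a twip is one twentieth of a point, so there are 1440 twips to the inch. The class and method names are hypothetical helpers, not part of any library.

```csharp
using System;

public static class Twips
{
    public const int PerPoint = 20;   // 1 point = 20 twips
    public const int PerInch = 1440;  // 1 inch = 72 points = 1440 twips

    public static int FromPoints(double points)
    {
        return (int)Math.Round(points * PerPoint);
    }

    public static int FromInches(double inches)
    {
        return (int)Math.Round(inches * PerInch);
    }

    public static int FromMillimeters(double millimeters)
    {
        // 25.4 mm per inch
        return (int)Math.Round(millimeters / 25.4 * PerInch);
    }
}
```

For example, a one-inch left margin is written as \margl1440. Font sizes are a separate trap: \fs takes half-points, not twips, so 12 pt text is \fs24.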

I bought the O'Reilly pocket guide on the subject, which helped, and read a lot of the MS documentation, which is good, but there is a great deal of it for each version.

Because of the way RTF is encoded, handling it with regular expressions is incredibly hard work and takes care and concentration to test and get working. I use a Mac editor with regular expressions built in, so I could test each section with confidence and then embed it in the code.

Because of the number of versions, there is also a lot of incompatibility between them, but there is a lot of commonality too. In the end it was reasonably manageable to get to where I wanted (after about a week of reading and a week of coding) by creating a really simple version.

I never found a commercial solution, but I needed a free one for budget reasons, which ruled out a lot. If you do choose one, be careful to make sure that it does what you want and is supported.

I don’t think it matters where you are coming from, whether HTML, XML or XHTML (I was converting CSV formats); the hard part is the RTF.

I'm not sure whether I would recommend DIY or buying. On balance maybe DIY, but your own circumstances will dictate that.

Edit: One more thing: going from content to RTF is easier than the other way around.
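As a rough illustration of why the content-to-RTF direction is the easier one, here is a minimal sketch that escapes plain text and wraps it in a bare-bones RTF document. The class and method names are hypothetical, and the header (one Arial font, 12 pt) is an arbitrary choice rather than anything from the original code.

```csharp
using System;
using System.Text;

public static class RtfWriterSketch
{
    // Escapes plain text so it can sit safely inside an RTF group.
    public static string Escape(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c == '\\' || c == '{' || c == '}')
            {
                sb.Append('\\').Append(c);   // the three characters RTF treats as syntax
            }
            else if (c == '\n')
            {
                sb.Append("\\par ");         // line break becomes a paragraph mark
            }
            else if (c == '\r')
            {
                continue;                    // drop CR so CRLF yields a single \par
            }
            else if (c > 127)
            {
                // \uN takes a signed 16-bit decimal, followed by an ASCII fallback character.
                sb.Append("\\u").Append(unchecked((short)c)).Append('?');
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Wraps escaped text in a minimal document: one font, 12 pt (\fs is in half-points).
    public static string BuildDocument(string text)
    {
        return @"{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}\f0\fs24 " + Escape(text) + "}";
    }
}
```

RtfWriterSketch.BuildDocument("Hello") gives a document that Word and most other RTF readers should open. Reading RTF back in means coping with whatever control words the producing application chose to emit, which is where the version differences bite.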

BTW, I'm not criticizing MS for the versions of RTF; hey, it's proprietary, so they can do what they like.

+1

I just stumbled upon this WYSIWYG rich text editor (RTE) for the web, which also has an HTML to RTF converter: Cute Editor for .NET. Does anyone have experience with this component? My main experience with web RTEs has been CKEditor (FCKeditor) and TinyMCE, but as far as I can tell, neither CKEditor nor TinyMCE has a built-in HTML to RTF converter.

0

Since I had to implement some mail-merge features with rich text formatting in a web application, I thought it would be nice to share my experience.

Basically, I explored two alternatives:

  • using the Google Docs API to take advantage of the power of Google Docs.
  • using XSLT as shown in this essay

The Google Docs API works well. The problem is that when you load an HTML document with page breaks, for example:

<p style="page-break-before:always;display:none;"/> 

and ask Google to convert the document to RTF, you will lose all the page breaks, which does not meet my requirements. However, if page breaks are not a problem for you, you can look into this solution.

The XSLT solution works... sort of.

It works if you reference the MSXML3 COM object directly, bypassing the System.Xml classes. Otherwise, I could not get it to work. Moreover, it seems to respect only basic formatting tags, ignoring text color, size, etc. It does, however, honor page breaks. :-)

Here is a quick library I wrote, using Tidy.NET to convert the HTML to XHTML first. Hope it helps.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;

    namespace ADDS.Mailmerge
    {
        public class XHTML2RTF
        {
            MSXML2.FreeThreadedDOMDocument _xslDoc;
            MSXML2.FreeThreadedDOMDocument _xmlDoc;
            MSXML2.IXSLProcessor _xslProcessor;
            MSXML2.XSLTemplate _xslTemplate;

            static XHTML2RTF instance = null;
            static readonly object padlock = new object();

            XHTML2RTF()
            {
                _xslDoc = new MSXML2.FreeThreadedDOMDocument();

                // XSLData.xhtml2rtf is a resource file
                // containing XSL for transformation
                // I got XSL from here:
                // http://www.codeproject.com/KB/HTML/XHTML2RTF.aspx
                _xslDoc.loadXML(XSLData.xhtml2rtf);

                _xmlDoc = new MSXML2.FreeThreadedDOMDocument();

                _xslTemplate = new MSXML2.XSLTemplate();
                _xslTemplate.stylesheet = _xslDoc;

                _xslProcessor = _xslTemplate.createProcessor();
            }

            public string ConvertToRTF(string xhtmlData)
            {
                try
                {
                    string sXhtml = "";

                    TidyNet.Tidy tidy = new TidyNet.Tidy();
                    tidy.Options.XmlOut = true;
                    tidy.Options.Xhtml = true;

                    using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(xhtmlData)))
                    {
                        StringBuilder sb = new StringBuilder();
                        using (MemoryStream sw = new MemoryStream())
                        {
                            TidyNet.TidyMessageCollection messages = new TidyNet.TidyMessageCollection();
                            tidy.Parse(ms, sw, messages);
                            sXhtml = Encoding.UTF8.GetString(sw.ToArray());
                        }
                    }

                    _xmlDoc.loadXML(sXhtml);
                    _xslProcessor.input = _xmlDoc;
                    _xslProcessor.transform();

                    return _xslProcessor.output.ToString();
                }
                catch (Exception exc)
                {
                    throw new Exception("Error in xhtml conversion. ", exc);
                }
            }

            public static XHTML2RTF Instance
            {
                get
                {
                    lock (padlock)
                    {
                        if (instance == null)
                        {
                            instance = new XHTML2RTF();
                        }
                        return instance;
                    }
                }
            }
        }
    }

0

Source: https://habr.com/ru/post/887525/

