The easiest way to transfer html table data to a readable document

Hello,

Over the past 6 months, I have been struggling to build a system that lets users enter data in large rich-text areas (with support for tables, lists, etc.), so that they can type much as they would in Word. However, when it comes to exporting all this data, I have not been able to find a working solution...

My first step was to look for reporting software that accepts raw HTML from the data source and displays it as normal HTML. That worked fine, except that the keep-together function is awful: either the data (tables, lists, etc.) is split in half, which I do not want, or the report always jumps to the next page to avoid the split, leaving 15+ blank pages in the final document.

So I'm looking for advice or direction on the best way to export my data to a readable document (PDF or Word preferred).

My data is organized in the following hierarchy, where the data itself is often raw HTML:

- Period

-- Block

--- Group

---- Question

----- Data

What would be the best choice? Should I try to render the HTML to PDF or to RTF? I need some advice :(

Also, a single data item is sometimes 2-3 pages long and mixes lists, tables, and plain text.

4 answers

I would suggest keeping this in the browser and adding a stylesheet to the HTML so that it renders one way on screen and another way on paper. Adding a print stylesheet to your HTML is as easy as:

<link rel="stylesheet" media="print" href="print.css"> 
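The split tables and lists described in the question are what the CSS `page-break-inside` property is meant to control in print output. The following is a minimal sketch, assuming the stored fragments are wrapped into a single HTML page on the server. The class `PrintPage`, the method `Wrap`, and the chosen selectors are hypothetical, and how well `page-break-inside: avoid` is honored varies between browsers and converters.

```csharp
using System.Net;
using System.Text;

// Hypothetical helper: wraps stored rich-text fragments in a full HTML page
// whose print-only rules ask the renderer not to split tables and lists.
public static class PrintPage
{
    // Print rules; support for these properties differs between renderers.
    private const string PrintCss =
        "table, tr, ul, ol { page-break-inside: avoid; }\n" +
        "h1, h2, h3 { page-break-after: avoid; }\n";

    public static string Wrap(string title, params string[] htmlFragments)
    {
        var sb = new StringBuilder();
        sb.Append("<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n");
        // The title is plain text, so it is HTML-encoded.
        sb.Append("<title>").Append(WebUtility.HtmlEncode(title)).Append("</title>\n");
        sb.Append("<style media=\"print\">\n").Append(PrintCss).Append("</style>\n");
        sb.Append("</head>\n<body>\n");
        foreach (var fragment in htmlFragments)
        {
            // Fragments are assumed to be trusted HTML and are inserted as-is.
            sb.Append("<div class=\"fragment\">").Append(fragment).Append("</div>\n");
        }
        sb.Append("</body>\n</html>\n");
        return sb.ToString();
    }
}
```

A call such as `PrintPage.Wrap("Report", fragment1, fragment2)` returns the page as a string. A block that is itself longer than one page still has to break somewhere, so these rules only help with blocks that fit on a page.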

You should also be able to parse the input with something like the Html Agility Pack and convert it (e.g. with XSLT) to whatever output format you want.
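As a rough illustration of the parsing step, the sketch below uses the Html Agility Pack (NuGet package `HtmlAgilityPack`) to split one stored fragment into its top-level elements, so that each table or list can be converted or kept together on its own. The class `FragmentSplitter` and its method are hypothetical names, and the sketch assumes the fragment consists of sibling block elements.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper: splits a stored HTML fragment into its top-level
// elements (tables, lists, headings, ...) so each can be handled separately.
public static class FragmentSplitter
{
    public static List<KeyValuePair<string, string>> TopLevelBlocks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var blocks = new List<KeyValuePair<string, string>>();
        foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
        {
            // Skip text and comment nodes between elements
            // (bare top-level text is dropped in this sketch).
            if (node.NodeType != HtmlNodeType.Element) continue;

            // Key: tag name as reported by the parser; value: the element's markup.
            blocks.Add(new KeyValuePair<string, string>(node.Name, node.OuterHtml));
        }
        return blocks;
    }
}
```

Each returned pair can then be passed to whatever converter is chosen, with the tag name deciding how the block is treated.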

Another option is to write the HTML to the browser, but with the Content-Type set to one of the Microsoft Word types (there are several to choose from, depending on the version of Word you are targeting). This should make the browser ask whether the user wants to open the page with Microsoft Word. With Word 2007 and newer, you can also write Office Open XML directly, since it is XML-based.

The content types you can use are as follows:

 application/msword 

For Microsoft Word binary files, but it should also work for HTML.

 application/vnd.openxmlformats-officedocument.wordprocessingml.document 

For the newer Office Open XML formats in Word 2007 and later.


One solution is to run an application on the server, via System.Diagnostics.Process, that converts the page and saves it as a PDF document.

You can use wkhtmltopdf, an open-source console program that converts HTML to PDF or to an image.

The installer for Windows can be obtained from wkhtmltox-0.10.0_rc2 Windows Installer (i386).

After installing wkhtmltopdf, you can copy its files from the installation folder into a folder inside your solution. The code below assumes a wkhtmltopdf folder at the site root that contains wkhtmltopdf.exe and a pdf subfolder.

The converted PDF will be saved in that pdf folder.

And here is the code to convert:

    // Requires: using System.Diagnostics;
    var wkhtmltopdfLocation = Server.MapPath("~/wkhtmltopdf/") + "wkhtmltopdf.exe";
    var htmlUrl = @"http://stackoverflow.com/q/7384558/750216";
    // Quote the output path in case the physical path contains spaces.
    var pdfSaveLocation = "\"" + Server.MapPath("~/wkhtmltopdf/pdf/") + "question.pdf\"";

    using (var process = new Process())
    {
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.CreateNoWindow = true;
        process.StartInfo.FileName = wkhtmltopdfLocation;
        // wkhtmltopdf takes the input URL first, then the output file.
        process.StartInfo.Arguments = htmlUrl + " " + pdfSaveLocation;
        process.Start();
        process.WaitForExit();
    }

htmlUrl is the location of the page to convert to PDF. Here it is set to this Stack Overflow page. :)
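If the page being converted carries a print stylesheet, wkhtmltopdf can be asked to use it through its `--print-media-type` option, and the paper size can be set with `--page-size`. The sketch below only builds the process settings. The class `WkHtmlToPdf`, the method `BuildStartInfo`, and the choice of A4 are assumptions made for illustration.

```csharp
using System.Diagnostics;

// Hypothetical helper: builds the process settings for one wkhtmltopdf run.
public static class WkHtmlToPdf
{
    public static ProcessStartInfo BuildStartInfo(string exePath, string inputUrl, string outputPdfPath)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            // --print-media-type: render using media="print" stylesheets.
            // --page-size A4: paper size of the generated PDF.
            Arguments = "--print-media-type --page-size A4 "
                        + Quote(inputUrl) + " " + Quote(outputPdfPath),
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Wraps a value in double quotes so a path with spaces stays one argument.
    // Assumes the value itself contains no double quotes.
    private static string Quote(string value)
    {
        return "\"" + value + "\"";
    }
}
```

The returned object can be passed to `Process.Start`, followed by `WaitForExit()` and a check of `ExitCode` to detect a failed conversion.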


This is a general question, but two things come to mind: the Visitor pattern and changing the MIME type.

Visitor pattern: you can have two separate rendering methods. The details will depend on your implementation.
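To make the idea concrete, here is a minimal sketch with two renderers that walk the same data: one produces HTML and one a plain-text outline. All type names (`IReportVisitor`, `Group`, `Question`, `HtmlRenderer`, `OutlineRenderer`) are hypothetical, and only the Group and Question levels of the hierarchy are modeled to keep it short.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical visitor interface: one method per node type.
public interface IReportVisitor
{
    void VisitGroup(Group group);
    void VisitQuestion(Question question);
}

public class Question
{
    public string Title;
    public string DataHtml; // raw HTML as entered by the user
    public void Accept(IReportVisitor visitor) { visitor.VisitQuestion(this); }
}

public class Group
{
    public string Name;
    public List<Question> Questions = new List<Question>();

    // Visits the group itself, then each of its questions in order.
    public void Accept(IReportVisitor visitor)
    {
        visitor.VisitGroup(this);
        foreach (var question in Questions) question.Accept(visitor);
    }
}

// Renderer 1: HTML output that keeps the user's markup as-is.
public class HtmlRenderer : IReportVisitor
{
    public readonly StringBuilder Output = new StringBuilder();

    public void VisitGroup(Group group)
    {
        Output.Append("<h2>" + WebUtility.HtmlEncode(group.Name) + "</h2>\n");
    }

    public void VisitQuestion(Question question)
    {
        Output.Append("<h3>" + WebUtility.HtmlEncode(question.Title) + "</h3>\n");
        Output.Append(question.DataHtml + "\n");
    }
}

// Renderer 2: a plain-text outline that lists names and titles only.
public class OutlineRenderer : IReportVisitor
{
    public readonly StringBuilder Output = new StringBuilder();
    public void VisitGroup(Group group) { Output.Append("---" + group.Name + "\n"); }
    public void VisitQuestion(Question question) { Output.Append("----" + question.Title + "\n"); }
}
```

Calling `group.Accept(new HtmlRenderer())` or `group.Accept(new OutlineRenderer())` produces the two outputs from the same objects, so adding a third export format means adding one more visitor class rather than touching the data model.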

MIME type: when handling the request, write the data to the response with a Word content type, for example:

    // Requires: using System.Web; using System.Text;
    // "filename" is assumed to be defined by the caller.
    var response = HttpContext.Current.Response;
    response.Clear();
    // Keep the declared charset and the actual encoding in agreement.
    response.Charset = "utf-8";
    response.ContentEncoding = Encoding.UTF8;
    response.AddHeader("content-disposition", string.Format("attachment; filename={0}.doc", filename));
    response.ContentType = "application/msword";
    // "\n" is the newline escape; the original "/n" would be written literally.
    response.Write("-Period\n");
    response.Write("--Unit\n");
    response.Write("---Group\n");
    response.Write("----Question\n");
    response.Write("-----Data\n");
    response.End();

Here is another option: take screenshots of the page (this does not handle scrolling, though I think that could be added). The example can be extended to meet your business needs, although it is a bit of a hack. You pass it a URL and it creates the image.

Calling it:

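    // Page code-behind; requires: using System.IO; using System.Drawing.Imaging;
    protected void Page_Load(object sender, EventArgs e)
    {
        int screenWidth = Convert.ToInt32(Request["ScreenWidth"]);
        int screenHeight = Convert.ToInt32(Request["ScreenHeight"]);
        string url = Request["Url"];
        string bitmapName = Request["BitmapName"];

        WebURLToImage webUrlToImage = new WebURLToImage()
        {
            Url = url,
            BrowserHeight = screenHeight,
            BrowserWidth = screenWidth,
            ImageHeight = 0, // 0 x 0 means "do not scale to a thumbnail"
            ImageWidth = 0
        };
        webUrlToImage.GenerateBitmapForUrl();

        // Path.Combine avoids a missing separator between the site root and "Images".
        string path = Path.Combine(Server.MapPath("~/Images"), bitmapName + ".bmp");
        // Pass the format explicitly so the file content matches the .bmp extension.
        webUrlToImage.GeneratedImage.Save(path, ImageFormat.Bmp);
    }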

The class that creates an image from a web page:

    using System;
    using System.Drawing;
    using System.Threading;
    using System.Windows.Forms;

    public class WebURLToImage
    {
        public string Url { get; set; }
        public Bitmap GeneratedImage { get; private set; }
        public int ImageWidth { get; set; }
        public int ImageHeight { get; set; }
        public int BrowserWidth { get; set; }
        public int BrowserHeight { get; set; }

        public Bitmap GenerateBitmapForUrl()
        {
            // The WebBrowser control must run on a single-threaded apartment thread.
            Thread thread = new Thread(new ThreadStart(ImageGenerator));
            thread.SetApartmentState(ApartmentState.STA);
            thread.Start();
            thread.Join();
            return GeneratedImage;
        }

        private void ImageGenerator()
        {
            using (WebBrowser webBrowser = new WebBrowser())
            {
                webBrowser.ScrollBarsEnabled = false;
                // Subscribe before navigating so the completion event cannot be missed.
                webBrowser.DocumentCompleted += webBrowser_DocumentCompleted;
                webBrowser.Navigate(Url);
                // Pump messages on this thread until the page has finished loading.
                while (webBrowser.ReadyState != WebBrowserReadyState.Complete)
                    Application.DoEvents();
            }
        }

        void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser webBrowser = (WebBrowser)sender;
            webBrowser.ClientSize = new Size(BrowserWidth, BrowserHeight);
            webBrowser.ScrollBarsEnabled = false;
            GeneratedImage = new Bitmap(webBrowser.Bounds.Width, webBrowser.Bounds.Height);
            webBrowser.BringToFront();
            webBrowser.DrawToBitmap(GeneratedImage, webBrowser.Bounds);

            // Scale down only when both thumbnail dimensions were requested.
            if (ImageHeight != 0 && ImageWidth != 0)
                GeneratedImage = (Bitmap)GeneratedImage.GetThumbnailImage(ImageWidth, ImageHeight, null, IntPtr.Zero);
        }
    }

Source: https://habr.com/ru/post/1386318/

