How to get images from a .pptx file using MS Open XML SDK?

I have started experimenting with the Open XML SDK 2.0 for Microsoft Office.

So far I can do a few things, such as extracting all the text on each slide and getting the size of the presentation. For example, I do the latter like this:

    using (var doc = PresentationDocument.Open(pptx_filename, false))
    {
        var presentation = doc.PresentationPart.Presentation;

        // SlideSize is stored in EMUs; 9525 EMUs correspond to one pixel at 96 DPI.
        Debug.Print("width: " + (presentation.SlideSize.Cx / 9525.0).ToString());
        Debug.Print("height: " + (presentation.SlideSize.Cy / 9525.0).ToString());
    }

Now I would like to get the images embedded in a slide. Does anyone know how to do this, or can you point me to some documentation on the subject?

2 answers

First, get the SlidePart of the slide whose images you want:

    // Requires: using System;
    //           using DocumentFormat.OpenXml.Packaging;
    //           using DocumentFormat.OpenXml.Presentation;
    public static SlidePart GetSlidePart(PresentationDocument presentationDocument, int slideIndex)
    {
        if (presentationDocument == null)
        {
            throw new ArgumentNullException("presentationDocument", "GetSlidePart Method: parameter presentationDocument is null");
        }

        // Get the number of slides in the presentation.
        // Note: CountSlides is a helper that is not shown in this answer.
        int slidesCount = CountSlides(presentationDocument);

        if (slideIndex < 0 || slideIndex >= slidesCount)
        {
            throw new ArgumentOutOfRangeException("slideIndex", "GetSlidePart Method: parameter slideIndex is out of range");
        }

        PresentationPart presentationPart = presentationDocument.PresentationPart;

        // Verify that the presentation part and presentation exist.
        if (presentationPart != null && presentationPart.Presentation != null)
        {
            Presentation presentation = presentationPart.Presentation;

            if (presentation.SlideIdList != null)
            {
                // Get the collection of slide IDs from the slide ID list.
                var slideIds = presentation.SlideIdList.ChildElements;

                if (slideIndex < slideIds.Count)
                {
                    // Get the relationship ID of the slide.
                    string slidePartRelationshipId = (slideIds[slideIndex] as SlideId).RelationshipId;

                    // Get the specified slide part from the relationship ID.
                    SlidePart slidePart = (SlidePart)presentationPart.GetPartById(slidePartRelationshipId);

                    return slidePart;
                }
            }
        }

        // No slide found.
        return null;
    }

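Then search the slide for the Picture element of the image you are looking for, matching on the image file name. The match is made against the picture's non-visual properties (its name and description), so it only works if those contain the file name:

    // SingleOrDefault returns null when nothing matches and throws
    // if more than one picture matches.
    Picture imageToRemove = slidePart.Slide.Descendants<Picture>()
        .SingleOrDefault(picture => picture.NonVisualPictureProperties.OuterXml.Contains(imageFileName));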
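
Finding the Picture element does not yet give you the image bytes. As a rough sketch of the remaining step, assuming the usual case where the picture's blip embeds the image through a relationship ID, you can resolve that ID to an ImagePart on the slide and copy its stream to disk. The class and method names here (SlideImageExtractor, SaveSlideImages) and the output directory parameter are made up for illustration and are not part of the SDK or of the answer above:

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using P = DocumentFormat.OpenXml.Presentation;

public static class SlideImageExtractor
{
    // Hypothetical helper: writes every embedded picture of the slide into
    // outputDirectory and returns the paths of the files it wrote.
    public static List<string> SaveSlideImages(SlidePart slidePart, string outputDirectory)
    {
        var written = new List<string>();
        Directory.CreateDirectory(outputDirectory);

        foreach (P.Picture picture in slidePart.Slide.Descendants<P.Picture>())
        {
            // The blip's Embed attribute holds the relationship ID of the image part.
            var blip = picture.BlipFill != null ? picture.BlipFill.Blip : null;
            string relId = (blip != null && blip.Embed != null) ? blip.Embed.Value : null;
            if (string.IsNullOrEmpty(relId))
            {
                continue; // linked (external) picture or no image data
            }

            var imagePart = slidePart.GetPartById(relId) as ImagePart;
            if (imagePart == null)
            {
                continue;
            }

            // The part URI looks like /ppt/media/image1.png; reuse its file name.
            string fileName = Path.GetFileName(imagePart.Uri.OriginalString);
            string target = Path.Combine(outputDirectory, fileName);

            using (Stream source = imagePart.GetStream())
            using (FileStream destination = File.Create(target))
            {
                source.CopyTo(destination);
            }

            written.Add(target);
        }

        return written;
    }
}
```

It could be called as SlideImageExtractor.SaveSlideImages(GetSlidePart(doc, 0), outputFolder), where outputFolder is whatever directory you choose. If the same image is used by several pictures, the file is simply written again under the same name.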

The easiest way to get images out of Open XML formats:

A .pptx file is a zip archive, so you can use any zip library to extract the files in its ppt/media folder, which holds the images used in the document. You can also do this by hand: rename the file from .pptx to .zip, unpack it, and take the images from the ppt/media folder.
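
For illustration, here is a minimal sketch of the zip approach using System.IO.Compression from .NET 4.5 or later (on .NET Framework this needs references to System.IO.Compression and System.IO.Compression.FileSystem). The class and method names (PptxMediaExtractor, ExtractMedia) and both parameters are invented for this example:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class PptxMediaExtractor
{
    // Hypothetical helper: copies every file under ppt/media/ in the .pptx
    // archive into outputDirectory and returns the paths it wrote.
    public static List<string> ExtractMedia(string pptxPath, string outputDirectory)
    {
        var written = new List<string>();
        Directory.CreateDirectory(outputDirectory);

        using (ZipArchive archive = ZipFile.OpenRead(pptxPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Only entries inside the media folder are of interest.
                if (!entry.FullName.StartsWith("ppt/media/", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                // Directory entries have an empty Name; skip them.
                if (string.IsNullOrEmpty(entry.Name))
                {
                    continue;
                }

                string target = Path.Combine(outputDirectory, entry.Name);
                entry.ExtractToFile(target, true); // overwrite existing files
                written.Add(target);
            }
        }

        return written;
    }
}
```

Note that this returns every media file in the presentation, which can include audio or video, and it does not tell you which slide uses which image.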

Hope this helps.


Source: https://habr.com/ru/post/895521/

