readPDF (tm package) in R

I tried to read an online PDF document in R using the readPDF function. My script is as follows:

 safex <- readPDF(PdftotextOptions='-layout')(elem=list(uri='C:/Users/FCG/Desktop/NoteF7000.pdf'),language='en',id='id1') 

R reported that the running command had status 309. I tried different pdftotext options, but the message is always the same, and the generated text file is empty.

Can anyone read this PDF?

+1
r cygwin tm
Jul 10 '13 at 6:32
1 answer

readPDF has bugs and is probably not worth the trouble (check out this well-documented struggle with it).

Assuming that ...

  • you have xpdf installed (see here for details)

  • all your PATHs are okay (see here for how to do this) and you restarted your computer.

Then you might be better off avoiding readPDF and using this workaround instead:

 system(paste('"C:/Program Files/xpdf/pdftotext.exe"', '"C:/Users/FCG/Desktop/NoteF7000.pdf"'), wait=FALSE) 
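Note that with wait=FALSE the call returns before pdftotext has finished, so the text file may not exist yet when the next step tries to read it. A minimal sketch of a blocking variant follows, assuming pdftotext is either on the PATH or at the xpdf location used above. The helper name pdf_to_text is hypothetical (it is not part of tm or xpdf), and the -layout flag is the one from the question:

```r
# Hypothetical helper (not part of tm or xpdf): convert one PDF to text
# and fail loudly if the conversion did not produce a usable file.
pdf_to_text <- function(pdf, exe = Sys.which("pdftotext"), layout = TRUE) {
  exe <- unname(exe)
  if (!nzchar(exe) || !file.exists(exe)) {
    stop("pdftotext not found; check the xpdf install and PATH")
  }
  if (!file.exists(pdf)) stop("PDF not found: ", pdf)

  # Pass the output path explicitly so there is no guessing about
  # where the text file ends up.
  txt <- paste0(tools::file_path_sans_ext(pdf), ".txt")
  args <- c(if (layout) "-layout", shQuote(pdf), shQuote(txt))

  # system2() waits for the command by default and returns its exit status.
  status <- system2(exe, args)
  if (status != 0L || !file.exists(txt) || file.info(txt)$size == 0) {
    stop("pdftotext failed or wrote an empty file (exit status ", status, ")")
  }
  txt
}

# Usage, with the paths from the question and the answer:
# txt_file <- pdf_to_text("C:/Users/FCG/Desktop/NoteF7000.pdf",
#                         exe = "C:/Program Files/xpdf/pdftotext.exe")
```

Stopping on a non-zero status or an empty file surfaces the kind of silent failure described in the question at the conversion step, rather than later when the corpus is built.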

And then read the text file into R, like this:

 require(tm)
 # pdftotext writes NoteF7000.txt next to NoteF7000.pdf
 mycorpus <- Corpus(URISource("C:/Users/FCG/Desktop/NoteF7000.txt"))
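Since the generated text file in the question was empty, it may be worth checking the file before building a corpus from it. A minimal base-R sketch, where read_converted_text is a hypothetical helper name and the demonstration uses a temporary file rather than the real notice:

```r
# Hypothetical helper: read a pdftotext output file into one string,
# stopping early if the file is missing or empty.
read_converted_text <- function(txt_file) {
  if (!file.exists(txt_file)) stop("text file not found: ", txt_file)
  if (file.info(txt_file)$size == 0) stop("text file is empty: ", txt_file)
  lines <- readLines(txt_file, warn = FALSE)
  # pdftotext normally separates pages with form feeds; drop them here.
  lines <- gsub("\f", "", lines, fixed = TRUE)
  paste(lines, collapse = "\n")
}

# Small self-contained demonstration using a temporary file.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("Market Notice", "Number: F7001"), demo_file)
demo_text <- read_converted_text(demo_file)
```

The resulting string could then be handed to tm with Corpus(VectorSource(demo_text)) instead of URISource, if reading the file by hand is preferred.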

And try to confirm that everything went well:

 inspect(mycorpus)
 A corpus with 1 text document
 The metadata consists of 2 tag-value pairs and a data frame
 Available tags are:
   create_date creator
 Available variables in the data frame are:
   MetaID
 [[1]]
 Market Notice Number: Date F7001 08 May 2013
 New IDX SSF (EWJG)
 The following new IDX SSF contract will be added to the list and will be available for trade today.
 Summary Contract Specifications
 Contract Code Underlying Instrument Bloomberg Code ISIN Code
 EWJG EWJG IShares MSCI Japan Index Fund (US) EWJ US EQUITY US4642868487 1 (R1 per point)
 Contract Size / Nominal Expiry Dates & Times
 10am New York Time; 14 Jun 2013 / 16 Sep 2013
 Underlying Currency Quotations Minimum Price Movement (ZAR) Underlying Reference Price
 USD/ZAR Bloomberg Code (USDZAR Currency)
 Price per underlying share to two decimals.
 R0.01 (0.01 in the share price)
 4pm underlying spot level as captured by the JSE.
 Currency Reference Price
 The same method as the one utilized for the expiry of standard currency futures on standard quarterly SAFEX expiry dates.
 JSE Limited Registration Number: 2005/022939/06
 One Exchange Square, Gwen Lane, Sandown, South Africa.
 Private Bag X991174, Sandton, 2146, South Africa.
 Telephone: +27 11 520 7000, Facsimile: +27 11 520 8584, www.jse.co.za
 Executive Director: NF Newton-King (CEO), A Takoordeen (CFO)
 Non-Executive Directors: HJ Borkum (Chairman), AD Botha, MR Johnston, DM Lawrence, A Mazwai, Dr. MA Matooane , NP Mnxasana, NS Nematswerani, N Nyembezi-Heita, N Payne
 Alternate Directors: JH Burke, LV Parsons
 Member of the World Federation of Exchanges
 Company Secretary: GC Clarke
 Settlement Method Cash Settled - Clearing House Fees - On-screen IDX Futures Trading:
 o 1 BP for Taker (Aggressor)
 o Zero Booking Fees for Maker (Passive)
 o No Cap
 o Floor of 0.01
 Reported IDX Futures Trades
 o 1.75 BP for both buyer and seller
 o No Cap
 o Floor of 0.01
 Initial Margin Class Spread Margin VSR Expiry Date
 R 10.00 R 5.00 3.5 14/06/2013, 16/09/2013
 The above instrument has been designated as "Foreign" by the South African Reserve Bank
 Should you have any queries regarding IDX Single Stock Futures, please contact the IDX team on 011 520-7399 or idx@jse.co.za
 Graham Smale Director: Bonds and Financial Derivatives
 Tel: +27 11 520 7831 Fax:+27 11 520 8831 E-mail: grahams@jse.co.za
 Distributed by the Company Secretariat +27 11 520 7346
 Page 2 of 2
+3
Nov 12 '13 at 10:10