Modifying / Editing Tags Using PHP

This is my first question, so please bear with me. I apologize if I did not post it correctly.

I managed to get the job description from the XML file created by our database, but the database software is very old and it converts certain characters.

My problem is this: bullet points are converted to the following:

" Production of Monthly Management Accounts and variance analysis<BR> 

So a quotation mark (") is inserted where the bullet should be, and <BR> is inserted to start a new line.

I was wondering if anyone knows how to convert the quotation mark and <BR> to <li> and </li> for every item. I considered several options, such as preg_match and substr_replace, but none of them gave the desired results.

Obviously, the text between <li> and </li> will change depending on the job, etc.

To top it all off, once this works, I need to wrap the list items in <ul> and </ul>, but I assume I can find the first instance of <li> and replace it with <ul><li>, and find the last instance of </li> and replace it with </li></ul>.

I apologize for rambling. I hope this makes sense.

Edit: Thanks so much for all the quick answers, I'm going to give them a bash tomorrow. I've been stuck on this for most of the day, so I think it's time to leave.

Just to give more information if this helps ...

The database software is about 12 years old and its support is rather limited. If we want to do something, it tends to cost a lot of money. It has several options for exporting data, but XML has retained HTML formatting for some reason, so I went with this route.

All jobs were first written in Word, and then inserted into the "job" field in the database, so there is a chance that the code was misinterpreted.

I created a test job, made sure I used bullet points in Word, and copy-pasted it into the "job" field. Quotation marks appeared where the bullets should be, so I believe the old software does not "understand" bullet points.

I will try all your wonderful answers and report back tomorrow!

Thanks!

EDIT 2

Hi, below I have pasted the actual output as it appears in the page source. I tried the preg_replace suggestion below, which works on a single line, but as you can see, the output annoyingly runs everything together without line breaks.

 An exciting opportunity has arisen to join an established company based in Luton for a high calibre Management Accountant. Reporting to the Finance Director, the Management Accountant will provide accurate and reliable management information and financial support to the business. <BR>Key Responsibilities:<BR>" Production of Monthly Management Accounts and variance analysis<BR>" Preparation of Management Reports for Management Meetings.<BR>" Production of Monthly Forecasts and Annual Budgets using Excel.<BR>" Decision support to the business<BR>" Attending and presenting at meetings with business managers<BR>" Assisting external auditors with their audit process at each year end<BR>" Ad-hoc project work<BR>Experience:<BR>" Qualified accountant (ACA or CIMA) <BR>" Strong communication skills - to communicate effectively with all levels of management<BR>" High level of personal motivation, focus and a commitment to quality<BR>" Ability to adapt to the demands of a constantly changing business<BR>" Ability to interact with people at all levels in a sensitive and effective way<BR>If you are interested in this role then please apply now.<BR> 
3 answers

Assumptions:

  • Each line begins with " (a quote followed by 3 spaces; markdown removes the extra spaces)
  • <BR> is at the very end of the line
  • There are no other variations, and items are not split across lines.

RegEx:

 /^" (.*)<BR>$/ 

PHP:

 $replacedData = preg_replace( '/^" (.*)<BR>$/', '<li>\1</li>', $data ); 
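Note that without a modifier, ^ and $ only anchor to the start and end of the whole string. If the input really does contain one item per line, a minimal sketch with the m (multiline) modifier could look like this. The sample input is hypothetical, and \s+ is used in place of the three literal spaces so the pattern tolerates any amount of whitespace after the quote:

```php
<?php
// Hypothetical two-line input: each item starts with a quote and spaces
// and ends with <BR>, one item per line.
$data = "\"   First item<BR>\n\"   Second item<BR>";

// The m modifier makes ^ and $ match at every line boundary,
// so each line is converted independently.
$out = preg_replace('/^"\s+(.*)<BR>$/m', '<li>$1</li>', $data);

echo $out, PHP_EOL;
```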

Since you said the content is all run together on one line, you can try this regular expression instead:

 /" (.*?)<BR>/ 

Although you should be warned that it may get the wrong quote if the lines contain "quoted" text.
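As a rough sketch of the non-greedy pattern applied to run-together text: the input below is a shortened, hypothetical excerpt in the same shape as the export, and \s+ again stands in for the spaces that markdown collapsed.

```php
<?php
// Shortened, hypothetical excerpt of the run-together export.
$data = 'Key Responsibilities:<BR>" Ad-hoc project work<BR>" Decision support to the business<BR>Experience:<BR>';

// .*? is non-greedy, so each match stops at the nearest <BR>
// instead of swallowing everything up to the last one.
$items = preg_replace('/"\s+(.*?)<BR>/', '<li>$1</li>', $data);

echo $items, PHP_EOL;
```

The headings and their <BR> tags are left alone because they are not preceded by a quote and whitespace. A closing quotation mark followed by a space inside ordinary prose would still be matched, which is the risk mentioned above.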

Alternatively, if you know that items are separated by <BR>" (the 3 spaces are removed by markdown), you can use 3 replacements, applied in order, to get the desired effect. Each pattern needs its own delimiters:

 $repData = preg_replace( array( '/<BR>" /', '/<BR>/', '/" /' ), array( '</li><li>', '</li></ul>', '<ul><li>' ), $data ); 

Again, this can pick up the wrong elements, especially if <BR> exists elsewhere in the code.
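One way to reduce that risk, and to get the <ul> wrapper at the same time, is to match each run of consecutive items as a unit. The following is only a sketch under the same assumptions as above (every item starts with a quote plus whitespace and ends with <BR>). The function name bulletsToList and the sample input are hypothetical, not from the original post.

```php
<?php
// Hypothetical helper: wraps each run of consecutive `" item<BR>` entries
// in a single <ul> and leaves all other text untouched.
function bulletsToList(string $text): string
{
    return preg_replace_callback(
        // One or more consecutive items: quote, whitespace, text, <BR>.
        '/(?:"\s+.*?<BR>\s*)+/i',
        function (array $run): string {
            // Inside the run, turn every item into <li>...</li>.
            $items = preg_replace('/"\s+(.*?)\s*<BR>\s*/i', '<li>$1</li>', $run[0]);
            return '<ul>' . $items . '</ul>';
        },
        $text
    );
}

$data = 'Key Responsibilities:<BR>" Ad-hoc project work<BR>" Decision support<BR>'
      . 'Experience:<BR>" Qualified accountant (ACA or CIMA) <BR>'
      . 'If you are interested, apply now.<BR>';

$html = bulletsToList($data);
echo $html, PHP_EOL;
```

Because each run is wrapped separately, the two bullet groups in the sample become two separate lists, and the <BR> tags after headings and ordinary paragraphs stay as they were.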


My first comment is: fix the bad database.

Also, why is there formatting in an XML file or in a database? If this is an XML file, just drop everything except the actual job description text from the element, and your PHP script can display it however you like. In the above example, crop the <BR> from the end, then run the result through trim() with '" ' as the character list to clear any stray quotation marks and spaces.

Or is this one of those cases when you get XML from a database, and the person who wrote this part clearly did not understand what XML is for?

Edit: Ahh, it just hit me. Perhaps you mean that the job description is one plain text block, and what should really be separate sub-elements is formatted as you demonstrated. If so, it will be very difficult to get this reliably right, because with unstructured data there are bound to be some deviations in the formatting. I think your best bet is a regular expression that pulls out all the text between the " and the <BR> tags, builds an array from it, and then to check some samples manually. Oh, and fix the database.
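A minimal sketch of that idea, assuming each item starts with a quote plus whitespace and ends at the next <BR> (the sample input is a hypothetical excerpt):

```php
<?php
// Hypothetical excerpt in the same shape as the exported description.
$data = 'Experience:<BR>" Qualified accountant (ACA or CIMA) <BR>" Strong communication skills<BR>';

// Capture the text between each quote-plus-whitespace marker and the next <BR>.
preg_match_all('/"\s+(.*?)\s*<BR>/i', $data, $matches);

// $matches[1] holds only the captured item text, one entry per item.
$items = $matches[1];

print_r($items);
```

Having the items in an array makes it easy to inspect a few samples by hand before generating any markup from them.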


Assuming you have extracted the text into a variable; for convenience, I'll just set one:

 $myVar = '" Production of Monthly Management Accounts and variance analysis<BR>'; 

As another answer says, trim() is your friend, and so is str_replace() or strip_tags(), depending on what you want to do and what else you might have in your database.

Try this (assuming you saved the contents in $myVar, as in my example).

 $cleanedVar = strip_tags(trim($myVar,'" ')); 

Or this:

 $cleanedVar = str_replace("<BR>","",trim($myVar,'" ')); 

Both of these lines will give you the following result in $cleanedVar:

Production of Monthly Management Accounts and variance analysis


Source: https://habr.com/ru/post/1345912/

