Regex to match the space after a file extension (PCRE, PHP)

I am trying to migrate a fairly large and rather old database. One of the columns holds file names, and the problem is that a single field can contain several file names separated by spaces. For instance:

"Filename.mp3 file anem.mid fi le nam e.rm"

I tried to split this string with preg_split(); the closest regex I could come up with was

/(?<=\.[\w]{3})(\s)/

I know that /(?<=\.[\w]+)(\s)/ will not work, since in PCRE a lookbehind must have a fixed width. And since this is a music database, there are unconventional extensions, so I cannot assume they are all three characters long.
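To make the limitation concrete, here is a minimal sketch that runs the fixed-width pattern from above on a made-up string (the sample value is an assumption, not data from the database) whose first name has a two-letter extension:

```php
<?php
// Hypothetical sample: "a.rm" has a two-letter extension.
$sample = "a.rm b.mp3 c.mid";

// The fixed-width lookbehind only accepts a space that is preceded by
// a dot plus exactly three word characters.
$parts = preg_split('/(?<=\.[\w]{3})(\s)/', $sample);

print_r($parts);
// The space after "a.rm" is not treated as a separator,
// so "a.rm b.mp3" stays glued together.
```

The split happens only after ".mp3", which is why a pattern tied to one extension length is not enough here.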

Any suggestions?

+4
2 answers

You can use this regex for the split:

~\.\w+\K\h+~

RegEx Details:

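  • \.: match a literal dot
  • \w+: match 1+ word characters
  • \K: reset the start of the reported match (everything matched so far is left out of the result)
  • \h+: match 1+ horizontal whitespace characters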
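The pattern can be tried directly with preg_split(). A minimal sketch using the sample string from the question (the variable names are only illustrative):

```php
<?php
$field = "Filename.mp3 file anem.mid fi le nam e.rm";

// \K drops the ".ext" part from the reported match, so only the
// whitespace that follows an extension acts as the delimiter.
$files = preg_split('~\.\w+\K\h+~', $field);

print_r($files);
```

Spaces inside a name are left alone because they are not preceded by a dot and word characters, and no lookbehind is needed, so the extension can be any length.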
+7

Here is an alternative that does not use a regex:

<?php
$filename = "Filename.mp3 file anem.mid fi le nam e.rm";

// Temp storage for the pieces of a single file name
$new_filename = [];

// Store whole files
$filenames = [];

// Split up the string based on spaces
$spaces = explode( ' ', $filename );

// Loop over the pieces separated by spaces
foreach( $spaces as $piece )
{
    // just keep adding pieces to this array
    $new_filename[] = $piece;

    // if this piece contains a period then we have a whole filename
    if( strpos( $piece, '.' ) !== false )
    {
        // add this whole filename to the list by rejoining the temp var on spaces
        $filenames[] = implode( ' ', $new_filename );

        // reset the temp variable
        $new_filename = [];
    }
}

print_r( $filenames );

Output:

Array
(
    [0] => Filename.mp3
    [1] => file anem.mid
    [2] => fi le nam e.rm
)
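The same loop can be wrapped in a function so it can be applied to every row during the migration. This is only a sketch: the name split_filenames is hypothetical, and the extra step at the end is an assumption about what is wanted, since the loop as written drops any trailing pieces that never reach a period.

```php
<?php
// Hypothetical helper; same idea as the loop above.
function split_filenames(string $field): array
{
    $filenames = [];
    $current = [];

    foreach (explode(' ', $field) as $piece) {
        $current[] = $piece;

        // A piece containing a period is treated as the end of a file name.
        if (strpos($piece, '.') !== false) {
            $filenames[] = implode(' ', $current);
            $current = [];
        }
    }

    // Assumption: keep leftover pieces without an extension instead of losing them.
    if ($current !== []) {
        $filenames[] = implode(' ', $current);
    }

    return $filenames;
}

$first = split_filenames("Filename.mp3 file anem.mid fi le nam e.rm");
$second = split_filenames("one.mp3 no extension here");

print_r($first);
print_r($second);
```

Note that this approach ends a name at the first piece containing a period, so a name with a period before its last word would be cut early.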
0

Source: https://habr.com/ru/post/1696359/

