php - RegEx: Issue parsing forum post body (with quote)


This is the type of thing I'd normally hammer away on until I get it right, but in this case I believe it's a part of regex I've never gotten my head around completely: the greedy vs. non-greedy stuff.

I have this content:

[quote=mick-mick topic=33586] gave dayz hour of life. can never back. :/ had wait wait. slow loads server selection screen once           chose server took 3 minutes server.  i'll still give h1z1 shot sure. :) [/quote]  test 

The regex I'm attempting to use is:

/(\[quote=[a-zA-Z0-9]+\](.*)\[\/quote\])?(.*)/m

But it's only matching the quote line.

As you can see, I need the username (mick-mick), the topic ID, the content of the quote, and the content following the quote. Also, the quote may not exist in the content at all.

Can anyone help me with this? What am I missing? I'm using preg_match in PHP.

Final update:

Matching multiple quotes and also grabbing the content that isn't in a quote got a little difficult. But, here goes:

(?:
  \[quote=([a-z0-9\-]+)
  \s*topic=(\d+)\]
  (.*?)
  \[/quote\]
|
  (.+?)
  (?=\[quote|$)
)

This time we use an alternating non-capture group around everything. We either match a quote (with our capture groups 1, 2, and 3) or match 1+ other characters into capture group 4 (this is anything that isn't part of a quote). The crucial addition here is the positive lookahead ((?=...)). It is a zero-length assertion (meaning it "checks" but doesn't consume anything) that looks for either [quote or the end of the string ($) following it. It's used so that we don't keep matching into a new quote.

Note: for a global match in PHP, you'll need to use preg_match_all().
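As a rough sketch of how the alternation could be used with preg_match_all(): the sample string, the $parts array layout, and the choice of the s, i, and x modifiers (x so the whitespace and comments inside the pattern are ignored) are my assumptions, not something stated above.

```php
<?php
// Made-up sample input: two quotes with loose text between and after them.
$html = "[quote=mick-mick topic=33586]first quote[/quote] between "
      . "[quote=other-user topic=42]second quote[/quote] after";

// Assumed modifiers: s (dot matches newlines), i (case-insensitive),
// x (ignore whitespace and allow # comments inside the pattern).
$pattern = '~(?:
    \[quote=([a-z0-9\-]+)   # 1: username
    \s*topic=(\d+)\]        # 2: topic ID
    (.*?)                   # 3: quote contents, lazy
    \[/quote\]
  |
    (.+?)                   # 4: text outside of a quote
    (?=\[quote|$)           # stop before the next quote or at the end
)~six';

// PREG_SET_ORDER gives one array per match instead of one array per group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$parts = [];
foreach ($matches as $m) {
    if (isset($m[4]) && $m[4] !== '') {
        // Second branch matched: plain text outside of a quote.
        $parts[] = ['type' => 'text', 'text' => $m[4]];
    } else {
        // First branch matched: a quote with username, topic ID and contents.
        $parts[] = ['type' => 'quote', 'user' => $m[1], 'topic' => $m[2], 'text' => $m[3]];
    }
}

print_r($parts);
```

Group 4 is only present when the second branch matched, which is why the loop checks it with isset() before deciding which kind of part it is looking at.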


Update:

I updated this to grab the content before/after the quote and to make the quote optional (by adding an optional non-capturing group: (?:...)?). I re-read your question and saw that quotes always have quote/topic (if that isn't the case, you'll need to combine these expressions a bit). Here it is:

(.*?)(?:\[quote=([a-z0-9\-]+)\s*topic=(\d+)\](.*)\[/quote\])?(.*) 

And it's used like this:

preg_match('~(.*?)(?:\[quote=([a-z0-9\-]+)\s*topic=(\d+)\](.*)\[/quote\])?(.*)~si', $html, $matches);
$matches[0]; // full match
$matches[1]; // before quote (empty if quote doesn't exist)
$matches[2]; // quote value: `mick-mick`
$matches[3]; // topic value: `33586`
$matches[4]; // quote contents: `i just...`
$matches[5]; // everything else (the entire string if the quote doesn't exist)
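To make the group numbering easier to follow, here is a small sketch that wraps this pattern in a helper and returns the pieces under descriptive keys. The function name parsePost(), the key names, and the sample strings are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): runs the pattern
// above and returns the five capture groups under descriptive keys.
function parsePost(string $html): array
{
    $pattern = '~(.*?)(?:\[quote=([a-z0-9\-]+)\s*topic=(\d+)\](.*)\[/quote\])?(.*)~si';
    preg_match($pattern, $html, $m);

    return [
        'before' => $m[1], // text before the quote
        'user'   => $m[2], // quote value (username)
        'topic'  => $m[3], // topic ID
        'quote'  => $m[4], // quote contents
        'after'  => $m[5], // everything after the quote
    ];
}

// Quote at the start of the post, followed by more text.
$withQuote = parsePost('[quote=mick-mick topic=33586] quoted text [/quote]  test ');

// No quote at all: everything ends up in 'after'.
$noQuote = parsePost('just a plain post');

// Text in front of the quote (see the note below the code).
$leading = parsePost('hello [quote=mick-mick topic=33586]quoted[/quote] bye');

print_r($withQuote);
print_r($noQuote);
print_r($leading);
```

One thing this sketch brings out: when text comes before the quote, the lazy first group matches nothing, the optional quote group is skipped because no quote starts at that position, and the last (.*) takes the whole string. So for input like $leading the quote is not split out, and the alternation pattern from the final update is the one to reach for in that case.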



You have a few issues in your expression, but you were pretty close. Here is a cleaned-up version:

\[quote=([a-z0-9\-]+)\s*(.*?)\](.*)\[/quote\] 

You can use it like this:

preg_match('~\[quote=([a-z0-9\-]+)\s*(.*?)\](.*)\[/quote\]~si', $html, $matches);
$matches[0]; // full match
$matches[1]; // quote value: `mick-mick`
$matches[2]; // quote parameters: `topic=33586`
$matches[3]; // quote contents: `i just...`
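Since the question was about greedy vs. non-greedy, here is a small sketch contrasting the greedy (.*) in this pattern with a lazy (.*?) in the same spot. The two-quote input is made up for the comparison.

```php
<?php
// Made-up input with two quotes, to contrast greedy and lazy matching.
$html = '[quote=a-user topic=1]one[/quote] middle [quote=b-user topic=2]two[/quote]';

// Greedy (.*): grabs as much as possible, so it runs to the LAST [/quote].
preg_match('~\[quote=([a-z0-9\-]+)\s*(.*?)\](.*)\[/quote\]~si', $html, $greedy);

// Lazy (.*?): grabs as little as possible, so it stops at the FIRST [/quote].
preg_match('~\[quote=([a-z0-9\-]+)\s*(.*?)\](.*?)\[/quote\]~si', $html, $lazy);

echo $greedy[3], "\n"; // one[/quote] middle [quote=b-user topic=2]two
echo $lazy[3], "\n";   // one
```

With a single quote in the post both versions give the same result. The difference only shows up once a second [/quote] appears later in the string.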



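The fundamental issue you had is that you wrapped everything in (...)? and followed it with (.*). This means the first part is optional, so when it couldn't be matched it was simply skipped, and the trailing (.*) then matched 0+ characters. Since . does not match a newline (unless you use the s modifier as in my example), it only matched the first line of the quote.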
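To see that behaviour in isolation, here is a minimal sketch that runs the pattern from the question against a multi-line post. The post text is made up, in the same shape as the content in the question.

```php
<?php
// Made-up multi-line post in the same shape as the question's content.
$html = "[quote=mick-mick topic=33586]\nquoted line\n[/quote]\ntest";

// The pattern from the question, unchanged.
preg_match('/(\[quote=[a-zA-Z0-9]+\](.*)\[\/quote\])?(.*)/m', $html, $matches);

var_dump($matches[1]); // optional quote group: empty string, it was skipped
var_dump($matches[3]); // trailing (.*): only the first line of the post
```

The optional group is skipped because [a-zA-Z0-9]+ cannot get past the hyphen in mick-mick, and the trailing (.*) stops at the first newline because the s modifier is not set.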

Also, you used quote=[a-zA-Z0-9]+ when your quote ([quote=mick-mick topic=33586]) had a hyphen, a space, and an equals sign in it. Instead, I used [a-z0-9\-] (with the i modifier for case-insensitivity), followed by optional whitespace (\s*), followed by a lazy capture of the rest of the parameters.
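A quick sketch of that difference, checking the two character classes against a username with a hyphen and then running just the opening-tag part of the cleaned-up pattern. The capitalised test name and the variable names are assumptions for the demo.

```php
<?php
// Assumed test value: a username with a hyphen and mixed case.
$name = 'Mick-Mick';

// Letters and digits only: the hyphen is not in the class, so this fails.
$strict = preg_match('~^[a-zA-Z0-9]+$~', $name);

// Hyphen added to the class; the i modifier covers upper case.
$loose = preg_match('~^[a-z0-9\-]+$~i', $name);

var_dump($strict, $loose); // int(0), int(1)

// Opening tag only: username, optional whitespace, then a lazy capture
// of whatever parameters remain before the closing bracket.
preg_match('~\[quote=([a-z0-9\-]+)\s*(.*?)\]~i', '[quote=mick-mick topic=33586]', $tag);

var_dump($tag[1], $tag[2]); // "mick-mick", "topic=33586"
```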

Let me know if you have questions or want different functionality.

