regex - Regular expression that both includes and excludes certain strings in R


I am trying to use R to parse through a number of entries. I have two requirements for the entries I want back: I want all the entries that contain the word apple but don't contain the word orange.

for example:

  1. i apples
  2. i apples
  3. i apples , oranges

I want entries 1 and 2 back.

How do I go about using R to do this?

Thanks.

Using a regular expression, you could do the following.

    x <- c('i apples', 'i apples',
           'i apples , oranges', 'i oranges , apples',
           'i oranges , apples oranges more')

    # perl = TRUE switches to the PCRE engine, which lookaheads require
    x[grepl('^((?!.*orange).)*apple.*$', x, perl = TRUE)]
    # [1] "i apples" "i apples"

The regular expression uses a negative lookahead to check that the substring orange does not appear anywhere ahead, after any number of characters other than line breaks. If the check passes, the dot . matches any character except a line break. This is wrapped in a group and repeated (0 or more times). Next it looks for apple, followed by any character except a line break (0 or more times). Finally, the start and end of line anchors are in place to make sure the whole input is consumed.
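To see how the pattern treats each case, here is a minimal sketch that runs it against a few strings. The vector `tests` and the entry 'no fruit here' are made up for illustration and are not part of the original question.

```r
# Pattern from the answer: no "orange" anywhere, and "apple" must appear
pattern <- '^((?!.*orange).)*apple.*$'

# Hypothetical test strings, one per case
tests <- c('i apples',            # apple only
           'i apples , oranges',  # orange after apple
           'i oranges , apples',  # orange before apple
           'no fruit here')       # neither word

# Lookaheads are only supported with perl = TRUE
matched <- grepl(pattern, tests, perl = TRUE)

# Label each result with the string it belongs to
print(setNames(matched, tests))
```

Only the first string should come back TRUE: the two strings containing orange fail the lookahead, and the last one never reaches apple.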


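Update: You could use the following if performance is an issue. It checks for orange once at the start of the string instead of at every character.

    x[grepl('^(?!.*orange).*apple.*$', x, perl = TRUE)]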
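If lookaheads are not needed at all, the same filter can be sketched as two plain substring tests combined with logical operators. This is an alternative technique, not the approach used in the answer. The comparison below assumes the single-lookahead pattern is meant to require apple as well, i.e. `'^(?!.*orange).*apple.*$'`. The names `simple` and `lookahead` are illustrative.

```r
x <- c('i apples', 'i apples',
       'i apples , oranges', 'i oranges , apples',
       'i oranges , apples oranges more')

# Alternative: two fixed-string tests, keep "apple" and drop "orange"
simple <- x[grepl('apple', x, fixed = TRUE) & !grepl('orange', x, fixed = TRUE)]

# Single lookahead at the start of the string, then require "apple"
lookahead <- x[grepl('^(?!.*orange).*apple.*$', x, perl = TRUE)]

# Both approaches should select the same entries
print(identical(simple, lookahead))
```

With `fixed = TRUE` the patterns are treated as literal text, so no regular expression is compiled at all. Note that both versions match substrings, so a word like pineapple would also count as containing apple.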
