regex - Regexp matching separated proxy IP and port


I am trying to scrape proxy addresses from websites. I have never learned regex in depth. There are a few common formats, and here is the regex I am using:

Regex ip = new Regex(@"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?:\t*)(?: *)(?::*)(\d{2,5})");

Different websites use different formats, for example 8.8.8.8\t\t 80, 8.8.8.8:80 and 8.8.8.8 \t80.

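This regex is able to capture most of the addresses, but it can also mismatch: 123.123.123.123 is split into IP 123.123.123.1 and port 23 when the IP address is not followed by a port separated by one of the three elements. Because every separator group may match zero characters, the engine backtracks into the last octet and borrows its trailing digits as the port.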
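To make the mismatch concrete, here is a minimal sketch that runs the pattern from the question against a bare IP. It is written in Python rather than C# (the pattern string is unchanged, and `re.search` stands in for `Regex.Match`); `split_proxy` is a hypothetical helper name used only for this illustration.

```python
import re

# Pattern copied from the question: each separator group may match nothing.
ORIGINAL = re.compile(
    r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?:\t*)(?: *)(?::*)(\d{2,5})"
)

def split_proxy(text):
    """Return (ip, port) for the first match, or None if nothing matches."""
    m = ORIGINAL.search(text)
    return m.groups() if m else None

print(split_proxy("8.8.8.8:80"))       # ('8.8.8.8', '80')
# No separator and no port: the last octet gives up "23" to the port group.
print(split_proxy("123.123.123.123"))  # ('123.123.123.1', '23')
```

The second call shows why at least one separator character has to be mandatory.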

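I want each of the three common elements (\t, space, :) to be allowed 0 or more times, but at least one of the three must appear.

I thought of using a negative lookahead, but I am too much of a noob to make use of it.

Any suggestions?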

If you are OK with also matching addresses like 123.123.123.123 : :: : 80, you can use the following:

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[ \t:]+(\d{2,5}) 
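As a rough check of the character-class version, the sketch below runs it over the three formats from the question plus a bare IP. It uses Python instead of C# with the same pattern string; `find_proxies` and the sample text are made up for the illustration.

```python
import re

# [ \t:]+ requires at least one space, tab or colon between IP and port.
SEP_CLASS = re.compile(r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[ \t:]+(\d{2,5})")

def find_proxies(text):
    """Return every (ip, port) pair found in text."""
    return SEP_CLASS.findall(text)

page = "8.8.8.8\t\t 80\n8.8.8.8:80\n8.8.8.8 \t80\n123.123.123.123\n"

# The three separated formats are captured; the bare IP on the last line is skipped.
print(find_proxies(page))

# Any mix of the three separator characters is accepted, including repeated colons.
print(find_proxies("123.123.123.123 : :: : 80"))
```

Because the class does not include a newline, an IP at the end of one line is not paired with a number at the start of the next.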


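If you only want to match addresses like 123.123.123.123 : 80, 123.123.123.123 80 and 123.123.123.123:80 (at most one :), you can use a lookahead. The lookahead requires a non-digit right after the IP, so the last octet can no longer give up digits to the port.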

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?=[^\d])\s*:?\s*(\d{2,5}) 
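A small sketch of how the lookahead version behaves, again in Python with the pattern string unchanged (`match_proxy` is a hypothetical helper). It also shows one thing to keep in mind: `\s` matches line breaks as well, so an IP on one line can pair with a number on the next.

```python
import re

# (?=[^\d]) asserts that the character after the IP is not a digit.
LOOKAHEAD = re.compile(
    r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?=[^\d])\s*:?\s*(\d{2,5})"
)

def match_proxy(text):
    """Return (ip, port) for the first match, or None if nothing matches."""
    m = LOOKAHEAD.search(text)
    return m.groups() if m else None

print(match_proxy("123.123.123.123 : 80"))  # ('123.123.123.123', '80')
print(match_proxy("123.123.123.123"))       # None: no digits are borrowed
print(match_proxy("123.123.123.123::80"))   # None: more than one colon
print(match_proxy("1.2.3.4\n80"))           # ('1.2.3.4', '80'): \s spans the newline
```

If pairing across lines is unwanted, replacing `\s` with `[ \t]` in the pattern is one way to restrict the separator to spaces and tabs.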


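Or you can use an OR operation (alternation), which makes the separator either whitespace only or a single colon with optional whitespace around it: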

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?:\s+|\s*:\s*)(\d{2,5}) 
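To compare the lookahead and alternation patterns side by side, here is a hedged Python sketch (same pattern strings as above, only split into pieces for readability). `SAMPLES` and `first_match` are invented for the comparison, and agreement on these inputs is an observation about the samples, not a proof for every input.

```python
import re

IP = r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
PORT = r"(\d{2,5})"

LOOKAHEAD = re.compile(IP + r"(?=[^\d])\s*:?\s*" + PORT)
ALTERNATION = re.compile(IP + r"(?:\s+|\s*:\s*)" + PORT)

SAMPLES = [
    "8.8.8.8\t\t 80",
    "8.8.8.8:80",
    "8.8.8.8 \t80",
    "8.8.8.8 : 80",
    "8.8.8.8::80",      # two colons: expected to be rejected
    "123.123.123.123",  # bare IP: expected to be rejected
]

def first_match(pattern, text):
    """Return (ip, port) for the first match of pattern, or None."""
    m = pattern.search(text)
    return m.groups() if m else None

for s in SAMPLES:
    print(repr(s), first_match(LOOKAHEAD, s), first_match(ALTERNATION, s))
```

On these samples both patterns give the same result for every line, so the choice between them is mostly a matter of which one you find easier to read.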
