What's wrong in this regexp ? [SOLVED,FIXED] [message #36272] Thu, 17 May 2012 16:08
awlee
Messages: 4
Registered: May 2012
Junior Member
Hi,

I have a problem with this code and I don't understand what's wrong :(

#include <CtrlLib/CtrlLib.h>
#include <plugin/pcre/pcre.h>
#include <Web/Web.h>

using namespace Upp;

GUI_APP_MAIN
{
	String content;
	HttpClient web;
	web.URL("http://ultimatepp.org");
	content = web.ExecuteRedirect(3,3); // read html page

	// Capture 1: charset from the content-type meta tag; capture 2: page title
	String regexp = "http-equiv=\"content-type\"[^>]+charset=([^\\s\"]+).*?<title>([^<]+)";

	RegExp r0(regexp,PCRE_CASELESS|PCRE_MULTILINE|PCRE_DOTALL);

	// Test 1: match against the downloaded page
	PromptOK("Test1");
	if (r0.Match(content))
		for (int i=0; i<r0.GetCount(); ++i)
			PromptOK(r0[i]);

	// Test 2: match against a hard-coded, single-line string
	PromptOK("Test2");
	content = "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" /><meta name=\"generator\" content=\"U++ HTML Package\"><title>Woooow!</title>";
	if (r0.Match(content))
		for (int i=0; i<r0.GetCount(); ++i)
			PromptOK(r0[i]);

	PromptOK("End");
}

After "Test1" nothing is shown (the regexp does not match the downloaded page).
After "Test2" everything works as expected.

P.S. Windows 7 Home Basic x64, MSSDK 7.1 x64, U++ b4193

[Updated on: Wed, 23 May 2012 00:34]


Re: What's wrong in this regexp ? [message #36275 is a reply to message #36272] Thu, 17 May 2012 18:42
sergeynikitin
Messages: 748
Registered: January 2008
Location: Moscow, Russia
Contributor

I recommend the free, open-source tool KIKI for testing regular expressions. On Linux it is available in the standard repositories. For Windows, use this link:
http://code.google.com/p/kiki-re/downloads/list

(I use version 0.5.6.)
It has a very simple interface.


SergeyNikitin<U++>( linux, wine )
{
    under( Ubuntu || Debian || Raspbian );
}
Re: What's wrong in this regexp ? [message #36276 is a reply to message #36272] Thu, 17 May 2012 18:52
awlee
Messages: 4
Registered: May 2012
Junior Member
I think I described my problem poorly...

- the regular expression is the same in both cases
- the results are different
- if we save the HttpClient response, we can see that it is the HTML page

Why does this regexp behave so differently in the two cases?

Neither string contains any special UTF-8 characters (I think).
Re: What's wrong in this regexp ? [message #36278 is a reply to message #36272] Thu, 17 May 2012 19:05
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

Hi awlee!

It seems that U++ doesn't handle the PCRE_DOTALL flag correctly at the moment. As a quick workaround, add the '(?s)' switch at the beginning of your regexp:
	String regexp = "(?s)http-equiv=\"content-type\"[^>]+charset=([^\\s\"]+).*?<title>([^<]+)";
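
To illustrate why the flag matters, here is a minimal sketch using the standard library's std::regex rather than the U++ RegExp class. It assumes the downloaded page has a line break somewhere between the meta tag and <title>, which is what makes '.' stop short when DOTALL is not in effect. std::regex has no DOTALL flag and does not accept '(?s)', so the sketch swaps in '[\s\S]*?' to get the same "match across newlines" behaviour. HeadInfo and ExtractHead are hypothetical names made up for this example.

```cpp
#include <regex>
#include <string>

// Result of one match attempt (hypothetical helper type for this sketch).
struct HeadInfo {
    bool matched;
    std::string charset;
    std::string title;
};

// Searches html for the content-type charset and the page title.
// spanLines == false: the gap between the tags is matched by ".*?",
//                     which does not cross line breaks.
// spanLines == true:  the gap is matched by "[\s\S]*?", which also consumes
//                     newlines (the effect DOTALL / "(?s)" has in PCRE).
HeadInfo ExtractHead(const std::string& html, bool spanLines)
{
    const std::string gap = spanLines ? "[\\s\\S]*?" : ".*?";
    const std::regex re(
        "http-equiv=\"content-type\"[^>]+charset=([^\\s\"]+)" + gap + "<title>([^<]+)",
        std::regex::ECMAScript | std::regex::icase);

    std::smatch m;
    if (std::regex_search(html, m, re))
        return {true, m[1].str(), m[2].str()};
    return {false, "", ""};
}
```

Called on a single-line string like the one in "Test2", ExtractHead matches either way. Called on text where the meta tag and <title> sit on different lines, it matches only with spanLines set to true, which mirrors the difference between the two tests in the original post.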


I'll try to put together a patch to support this (and other) missing features soon.

Best regards,
Honza
Re: What's wrong in this regexp ? [message #36279 is a reply to message #36276] Thu, 17 May 2012 19:06
sergeynikitin
Messages: 748
Registered: January 2008
Location: Moscow, Russia
Contributor

I suggest you explain what you want to do with this expression.

SergeyNikitin<U++>( linux, wine )
{
    under( Ubuntu || Debian || Raspbian );
}
Re: What's wrong in this regexp ? [message #36280 is a reply to message #36278] Thu, 17 May 2012 19:09
awlee
Messages: 4
Registered: May 2012
Junior Member
dolik.rce wrote on Thu, 17 May 2012 21:05

Hi awlee!

It seems that U++ doesn't handle the PCRE_DOTALL flag correctly at the moment. As a quick workaround, add the '(?s)' switch at the beginning of your regexp:
	String regexp = "(?s)http-equiv=\"content-type\"[^>]+charset=([^\\s\"]+).*?<title>([^<]+)";


I'll try to put together a patch to support this (and other) missing features soon.

Best regards,
Honza


Yes, thanks a lot for the workaround. It works for me :)

Question resolved.
=)
Re: What's wrong in this regexp ? [message #36281 is a reply to message #36280] Thu, 17 May 2012 20:16
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

Issue reported and patch submitted to Mirek for review :)

You can see it all here: http://www.ultimatepp.org/redmine/issues/286

Honza

[Updated on: Thu, 17 May 2012 21:03]


Re: What's wrong in this regexp ? [message #36317 is a reply to message #36272] Sun, 20 May 2012 11:42
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

Hi awlee,

The issue should be fixed in version 4970. All the PCRE_* options should work now, even without the (?s) workaround. See plugin/pcre/RegExp.h for details (such as the list of all recognized options).

Best regards,
Honza
Re: What's wrong in this regexp ? [message #36367 is a reply to message #36272] Wed, 23 May 2012 00:33
awlee
Messages: 4
Registered: May 2012
Junior Member
Thanks a lot for the quick answer and fix! :)

I applied the patch by hand.