HttpRequest : problem with multiple redirections ? [message #43460]
Thu, 07 August 2014 11:12
jibe
Messages: 294 Registered: February 2007 Location: France
Experienced Member
Hi,
I want to get some data about books from various websites: ISBNdb, GoogleBooks, Worldcat... This works well with all of them, but not always with Amazon.fr: with the same URL, I do not always get the same content! Sometimes it is the page I get with Firefox, and sometimes it is a different one. This seems to be random, and whichever content I get, there is never any error (HttpRequest::GetError() returns 0).
Trying with wget (I am using Linux), I see that this URL goes through 3 redirections. Could that be the problem? And if not, what could it be?
My code is very simple: I am supposed to already know the URL of the book (if not, I search the site by the book's ISBN, and I always find the right URL, even on Amazon.fr, provided the site knows the book, of course!)
String content;
HttpRequest http;
...
http.Url(url);             // the URL stored for this book
content = http.Execute();  // performs the whole request and returns the response body
A URL showing this problem:
http://www.amazon.fr/14-Jean-Echenoz/dp/2707322571/ref=sr_1_1/278-1397759-3160153?ie=UTF8&qid=1372075436&sr=8-1&keywords=9782707322579
Any ideas?
Re: HttpRequest : problem with multiple redirections ? [message #43470 is a reply to message #43460]
Fri, 08 August 2014 15:02
jibe
Hi Mirek,
Thanks for your reply. The "bad" content is not so bad: it seems to be another, similar (outdated?) page about the same book. The problem is that it is not organized the same way, so I cannot retrieve the data I need unless I parse it differently.
I will try tracing the requests and see what can be done.
What is surprising is that I (sometimes) get this bad content only when I fetch the URL directly. The first time I look for a book, I search the site, obtain a list of books, select the right one and follow the link. This link is the URL that I store and use the following times, but curiously, I always get the right page when I search for the book first rather than using the stored URL!
I just wanted other people's opinions on this. In any case, I can work around the problem either by parsing the "bad" content when I get it, or by searching for the book first rather than using the direct URL. I'll let you know if I find the reason for this bad content.
Thanks for your advice.
Re: HttpRequest : problem with multiple redirections ? [message #43476 is a reply to message #43470]
Sat, 09 August 2014 09:51
mirek
Messages: 13975 Registered: November 2005
Ultimate Member
Are you using the same HttpRequest for both? In that case, it would mean cookies are responsible... HttpRequest preserves cookies across successive calls. You can also test whether that is the issue by using "CopyCookies" (it copies cookies from one HttpRequest to another).
Mirek
[Updated on: Sat, 09 August 2014 09:51]
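A note on the mechanism suggested in this reply: if a site picks the page layout from a cookie, then a client that keeps cookies between calls and a freshly started client can receive different pages for the same URL. The following is only a toy model in plain C++, not U++ code; `ToySite`, `ToyClient`, the paths and the cookie name are all hypothetical and merely mimic the behaviour discussed in this thread.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a site whose page layout depends on a session cookie.
struct ToySite {
    // Returns the page body; the search page also sets a cookie in `cookies`.
    std::string Get(const std::string& path,
                    std::map<std::string, std::string>& cookies) const {
        if (path == "/search") {
            cookies["session"] = "abc";  // the search page hands out a session cookie
            return "result list";
        }
        // Book page: the layout depends on whether the session cookie was sent.
        return cookies.count("session") ? "book page, layout A"
                                        : "book page, layout B";
    }
};

// Hypothetical client that keeps its cookies between calls,
// as the reply above describes for HttpRequest.
class ToyClient {
public:
    explicit ToyClient(const ToySite& site) : site_(site) {}

    std::string Fetch(const std::string& path) { return site_.Get(path, cookies_); }

private:
    const ToySite& site_;
    std::map<std::string, std::string> cookies_;  // survives across Fetch() calls
};
```

In this model, one `ToyClient` that fetches `/search` and then `/book` receives layout A, while a new `ToyClient` that fetches `/book` directly receives layout B. That matches the symptom reported here: the right page after a search, another page when the stored URL is used from a fresh start.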
Re: HttpRequest : problem with multiple redirections ? [message #43488 is a reply to message #43460]
Mon, 11 August 2014 09:34
jibe
Hi Mirek,
Yes, that's it.
I tried removing the cookies in my browser, and I got the "bad" page (a curious site: depending on the cookies, it serves an almost identical page with very different code, where all the CSS classes and ids are different...).
Here is what my application does: the first time, it looks up the book by its ISBN, obtains a list of matching books (normally only one, since two different books cannot have the same ISBN), then follows the link to get the page. I keep this URL in the database. It is only some time later, when we use the link, that we get the "bad" page. But in that case, I think the cookie is no longer available, since the application has been stopped in the meantime...
Perhaps I should keep the cookie in the database? Well, I will see: a workaround will probably be simpler in the end.
Thank you for your help!
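If the cookies were kept in the database next to the stored URL, one possible approach would be to flatten them into a single string and parse them back on the next run. This is a minimal sketch in plain C++, assuming cookies are simple name/value pairs. `CookieJar`, `SerializeCookies` and `ParseCookies` are hypothetical helpers, not part of U++, and the expiry, domain and path attributes are ignored. How the restored pairs are handed back to HttpRequest is left out.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory cookie store: cookie name -> cookie value.
using CookieJar = std::map<std::string, std::string>;

// Flattens the jar into "name=value; name2=value2", a form that fits
// in one text column of a database.
std::string SerializeCookies(const CookieJar& jar) {
    std::string out;
    for (const auto& kv : jar) {
        if (!out.empty()) out += "; ";
        out += kv.first + "=" + kv.second;
    }
    return out;
}

// Parses the string produced by SerializeCookies back into a jar.
// Entries without a name or without '=' are skipped.
CookieJar ParseCookies(const std::string& stored) {
    CookieJar jar;
    std::size_t pos = 0;
    while (pos < stored.size()) {
        std::size_t end = stored.find(';', pos);
        if (end == std::string::npos) end = stored.size();
        std::string pair = stored.substr(pos, end - pos);
        std::size_t start = pair.find_first_not_of(' ');  // drop leading spaces
        if (start != std::string::npos) {
            pair = pair.substr(start);
            std::size_t eq = pair.find('=');
            if (eq != std::string::npos && eq > 0)
                jar[pair.substr(0, eq)] = pair.substr(eq + 1);
        }
        pos = end + 1;
    }
    return jar;
}
```

A stored cookie may well have expired by the time it is reused, so the workaround mentioned above (searching first, or parsing both page variants) would still be needed as a fallback.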