Unable to scrape certain pages with "UseBasicParsing" for unknown reason - by weqew

ID: 781613
Status: Active
Type: Bug
Repros: 5
Opened: 3/18/2013 2:50:36 PM
Access Restriction: Public

Description



When using Invoke-WebRequest against Microsoft Download, the cmdlet hangs on certain pages. With the parameters shown below it hangs forever. Even with a timeout parameter, it still stops responding.

Invoke-Webrequest -UseBasicParsing http://www.microsoft.com/en-us/download/details.aspx?id=26617

If you run Invoke-WebRequest against the same URL without -UseBasicParsing, you get results. With -UseBasicParsing, the entire process freezes.
Posted by Keith Garner on 4/11/2015 at 11:18 AM
This also occurs for me when reading a simple Microsoft support page, KB2894518:

invoke-WebRequest "http://support.microsoft.com/kb/2894518" -UseBasicParsing # HANGS
Invoke-WebRequest http://support.microsoft.com/kb/2894518    # OK
Invoke-WebRequest http://support.microsoft.com/kb/2894518 -UseBasicParsing -OutFile .\test.txt # OK

This bug reproduces on several versions of PowerShell, on Windows 8.1, Windows Server 2012 R2, *AND* Windows 10 build 10041.

CPU runs at 25% for the process while in the hang state.
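
Based only on the observation above that the -OutFile variant returns, one possible workaround is to write the response to a temporary file and read the HTML back from disk. This is a minimal sketch, not a confirmed fix: the variable names and the temporary-file handling are assumptions added for illustration, and it has not been verified against every affected page.

```powershell
# Workaround sketch: let Invoke-WebRequest write the body to disk instead of
# returning a response object. $url, $tmpFile and $html are illustrative
# names, not part of the original report.
$url     = 'http://support.microsoft.com/kb/2894518'
$tmpFile = [System.IO.Path]::GetTempFileName()

try {
    # This is the form reported as "OK" in the comment above.
    Invoke-WebRequest -Uri $url -UseBasicParsing -OutFile $tmpFile

    # Read the saved response back as a single string.
    $html = Get-Content -Path $tmpFile -Raw
    "Downloaded {0} characters" -f $html.Length
}
finally {
    # Clean up the temporary file whether or not the request succeeded.
    Remove-Item -Path $tmpFile -ErrorAction SilentlyContinue
}
```

The trade-off is that $html is plain text, so properties such as Links or StatusCode from the normal response object are not available.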



Posted by Oscar Virot on 11/29/2014 at 11:11 AM
While trying to parse SVTPlay (region-locked to Sweden), I found that if I saved the result of Invoke-WebRequest in a variable ($page), the call to the cmdlet went well. Trying to view the whole variable doesn't work. I have tried accessing parts of the data using $page.Content, BUT if I view $page.Links, PowerShell freezes.
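
The comment above suggests that $page.Content can be read while $page.Links cannot. If that holds, one way to get link targets without touching the Links property is to run a regular expression over the raw content. The following is a rough sketch under that assumption: Get-HrefFromHtml is a hypothetical helper, not a built-in cmdlet, and the sample HTML stands in for $page.Content so the example runs offline.

```powershell
# Hypothetical helper (not part of PowerShell): pull href values out of raw
# HTML text with a regular expression, without touching the Links property.
function Get-HrefFromHtml {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Html
    )

    # Matches href="..." or href='...' and captures the quoted value.
    $pattern = 'href\s*=\s*["'']([^"'']+)["'']'
    $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase

    foreach ($match in [regex]::Matches($Html, $pattern, $options)) {
        $match.Groups[1].Value
    }
}

# Stand-in for $page.Content so the sketch runs without a network call.
$sampleHtml = '<a href="/video/1">One</a> <A HREF=''/video/2''>Two</A>'
Get-HrefFromHtml -Html $sampleHtml
```

With a real response, the call would be Get-HrefFromHtml -Html $page.Content. The pattern is deliberately crude: it ignores unquoted attribute values and does not check which tag the href belongs to, so it is a stopgap rather than a replacement for a real HTML parser.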