Welcome to my blog. If you like my posts, please subscribe to my feed and if you want to keep in touch, follow me on twitter! Happy reading!

Programmatically extract body of a web page

The technique used in this article can be used to display the contents of an HTML page inside your ASP.NET pages instead of using IFRAMES.

System.Net.WebRequest req = System.Net.WebRequest.Create(Request.QueryString["url"]);
System.Net.HttpWebResponse resp = (System.Net.HttpWebResponse)req.GetResponse();
System.IO.Stream respStream = resp.GetResponseStream();
System.IO.StreamReader sr = new System.IO.StreamReader(respStream);
string responseFromServer = sr.ReadToEnd();
System.Text.RegularExpressions.Regex bodyRegex = new System.Text.RegularExpressions.Regex(@"(]*>[\u0000-\uFFFF]+?)");
System.Text.RegularExpressions.Match bodyMatch = bodyRegex.Match(responseFromServer);
Literal1.Text = bodyMatch.Result("$0");
sr.Close();
respStream.Close();
resp.Close();

Thanks to ClayCo at the ASP.NET Forums for his help. Here is the link to the original thread http://forums.asp.net/t/1023144.aspx

Leave a Reply