Monday 13 October 2008

Capturing an image/thumbnail of a web page in C#

For a while I've been trying to work out how to capture an image of a web page programmatically using C# and .Net, for example to display a thumbnail of the page. Googling turned up several methods, but on the whole they seemed a bit lengthy. Eventually I combined bits of several of them and simplified things with an approach of my own. My main goal was to create a mini picture of a website for display in a new desktop app I'm developing.

In the end I wrote a simple class with a single static method called GrabImageOfWebPage. It takes a .Net WebBrowser control instance as an argument, together with the required size for the captured image. The web page loaded in the WebBrowser control is captured (the entire client area of the control) and shrunk or enlarged into a bitmap of the required size. Here's the code:


using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using Microsoft;
using mshtml;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;
using System.Drawing;

namespace BrowserComponents
{
    /// <summary>
    /// Class providing a static method to return a bitmap of a web page rendered in
    /// a .Net WebBrowser control.
    /// </summary>
    public class CBrowserImageGrabber
    {
        [ComVisible(true), ComImport()]
        [GuidAttribute("0000010d-0000-0000-C000-000000000046")]
        [InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
        private interface IViewObject
        {
            [return: MarshalAs(UnmanagedType.I4)]
            [PreserveSig]
            int Draw(
                //tagDVASPECT
                [MarshalAs(UnmanagedType.U4)] UInt32 dwDrawAspect,
                int lindex,
                IntPtr pvAspect,
                [In] IntPtr ptd,
                //[MarshalAs(UnmanagedType.Struct)] ref DVTARGETDEVICE ptd,
                IntPtr hdcTargetDev,
                IntPtr hdcDraw,
                [MarshalAs(UnmanagedType.Struct)] ref tagRECT lprcBounds,
                [MarshalAs(UnmanagedType.Struct)] ref tagRECT lprcWBounds,
                IntPtr pfnContinue,
                [MarshalAs(UnmanagedType.U4)] UInt32 dwContinue);
        }

        public static Image GrabImageOfWebPage(WebBrowser Browser, Size ImageSize)
        {
            // Get the view object of the browser
            //
            IViewObject VObject = Browser.Document.DomDocument as IViewObject;

            if (VObject != null)
            {
                // Construct a bitmap as big as the required image.
                //
                Bitmap bmp = new Bitmap(ImageSize.Width, ImageSize.Height);

                // The size of the portion of the web page to be captured:
                // the client area of the control. (Right and Bottom would
                // include the control's offset within its parent.)
                //
                mshtml.tagRECT SourceRect = new tagRECT();
                SourceRect.left = 0;
                SourceRect.top = 0;
                SourceRect.right = Browser.ClientSize.Width;
                SourceRect.bottom = Browser.ClientSize.Height;

                // The size to render the target image. This can be used
                // to shrink the image to a thumbnail.
                //
                mshtml.tagRECT TargetRect = new tagRECT();
                TargetRect.left = 0;
                TargetRect.top = 0;
                TargetRect.right = ImageSize.Width;
                TargetRect.bottom = ImageSize.Height;

                // Draw the web page into the bitmap.
                //
                using (Graphics gr = Graphics.FromImage(bmp))
                {
                    IntPtr hdc = gr.GetHdc();
                    int hr =
                        VObject.Draw((uint)DVASPECT.DVASPECT_CONTENT,
                            (int)-1, IntPtr.Zero, IntPtr.Zero,
                            IntPtr.Zero, hdc, ref TargetRect, ref SourceRect,
                            IntPtr.Zero, (uint)0);
                    gr.ReleaseHdc();
                }

                // Return the bitmap.
                //
                return bmp;
            }
            else
            {
                return null;
            }
        }
    }
}

To make the types in this example visible, you need the following using directives:
using mshtml;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;
using System.Drawing;
and you also need to add a .Net reference to your project for Microsoft.mshtml.
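
One thing the listing leaves open: the Draw call returns an HRESULT, which GrabImageOfWebPage stores in hr but never inspects. The sketch below shows one way such a value could be checked. The HResult class and its method names are hypothetical and not part of the original code. Marshal.ThrowExceptionForHR is the standard .Net interop call for turning a failed HRESULT into an exception.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the original CBrowserImageGrabber class.
static class HResult
{
    // COM convention: negative values (high bit set) signal failure,
    // zero and positive values signal success.
    public static bool Succeeded(int hr)
    {
        return hr >= 0;
    }

    // Throws the .Net exception that corresponds to a failed HRESULT;
    // does nothing for success codes.
    public static void ThrowIfFailed(int hr)
    {
        if (hr < 0)
        {
            Marshal.ThrowExceptionForHR(hr);
        }
    }
}
```

Under these assumptions, GrabImageOfWebPage could test HResult.Succeeded(hr) after releasing the device context and return null (disposing the bitmap first) when drawing failed, rather than handing back a blank image.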

Using the method is then pretty easy. The example code below creates a WebBrowser control and loads a web page. When the page has fully loaded, it grabs an image 10% of the size of the control (1024 x 768 scaled down to 102 x 77) and displays it in a picture box.

WebBrowser mWebBrowser;

public Form1()
{
    InitializeComponent();

    mWebBrowser = new WebBrowser();
    mWebBrowser.Width = 1024;
    mWebBrowser.Height = 768;
    mWebBrowser.ScrollBarsEnabled = false;

    mWebBrowser.DocumentCompleted +=
        new WebBrowserDocumentCompletedEventHandler(mWebBrowser_DocumentCompleted);
    mWebBrowser.Navigate(@"http://www.software-product-development.blogspot.com");
}


void mWebBrowser_DocumentCompleted(object sender,
    WebBrowserDocumentCompletedEventArgs e)
{
    if (mWebBrowser.ReadyState == WebBrowserReadyState.Complete)
    {
        Image Img = BrowserComponents.CBrowserImageGrabber.GrabImageOfWebPage(
            mWebBrowser, new Size(102, 77));

        if (Img != null)
        {
            pictureBox1.Image = Img;
        }
    }
}
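
The thumbnail size in the example is hard-coded as 102 x 77, which is 10% of the 1024 x 768 control rounded to whole pixels. As a minimal sketch, the size could instead be derived from the control's size and a scale factor. The ThumbnailSize class and its Scale method are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of the original post.
static class ThumbnailSize
{
    // Scales a size by the given factor, rounding each dimension to the
    // nearest pixel. Each dimension is kept at 1 pixel or more because
    // the Bitmap constructor rejects zero-sized images.
    public static Size Scale(Size original, double factor)
    {
        int width = Math.Max(1, (int)Math.Round(original.Width * factor));
        int height = Math.Max(1, (int)Math.Round(original.Height * factor));
        return new Size(width, height);
    }
}
```

With this in place, the call in mWebBrowser_DocumentCompleted could pass ThumbnailSize.Scale(mWebBrowser.Size, 0.1) instead of new Size(102, 77), so the thumbnail follows the control if its dimensions are changed later.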

Monday 6 October 2008

Losing a PageRank Value

In the mini-update toolbar export on Sept 26th this blog lost its PR, i.e. the value went to N/A. Previously the PR had been 3. I don't know why this has happened: I haven't linked to any silly sites or reduced the frequency of posting.

Today I also noticed that a few inner pages on the blog have PR, which I hadn't seen before. It's strange that the older posts have PR but the blog home page is back to PR N/A. Google Analytics isn't showing any change in the level of traffic to the blog.