awstats

Parsing HTML table into Excel spreadsheet

This script fetches a HTML page from the internet and parses the tables it finds in it, into an Excel spreadsheet.
This has been used for grabbing an AWstats index.html summary of sites and their bandwidth, and then parsing to Excel so one can tally up totals.

It's not perfect - I haven't used it in a while but from memory, it overwrites some rows due to being poorly coded :)

When the time comes that I need to use this again, I'll have another look at it.

#!/usr/bin/perl -w
use Spreadsheet::WriteExcel;
use HTML::TableExtract;
use LWP::Simple;
# Create a Table extraction
my $te = new HTML::TableExtract(gridmap=>1);
# Get the table out of my stats page
my $content = get("http://localhost/stats/index.html");
# Pass the data out of the table
$te->parse($content);
# Create a new Excel workbook
my $workbook = Spreadsheet::WriteExcel->new("totals.xls");
# Add a worksheet
my $worksheet = $workbook->add_worksheet();
# For each table, dump its parsed table rows into the worksheet
for my $ts ($te->table(1,0))
{
foreach my $row ($ts->rows)
{
$worksheet->write(0,0, $row, @{$row});
}
}

for my $ts ($te->table(1,1))
{
foreach my $row ($ts->rows)
{
$worksheet->write(1,0, $row, @{$row});
}
}

# Now add up the Total bandwidth
$worksheet->write_formula(2, 3, '=SUM(D1:D2)/1000000' );
#worksheet->write_formula(2, 3, '=SUM(D3/1000000)');

Subscribe to RSS - awstats