XML, the Perl Way

Previous
4. First Examples
Table of Content
Table of Content
Next
6. Other features

5. Data base integration

We now have all the tools we need to build documents that include data straight out of relationnal data bases. The only decision we have to make is how to design our documents, and our DTD's. Are we going to include entire tables or single values, and how.

Here are some simple examples of what can be done:

5.1 Including a table

For this first example we will include a whole table in the document.

The document we use is books1.xml, where the table is generated by the <rel_table query="SELECT code, name, price FROM books"/> tag.

<?xml version="1.0"?>
<html>
  <head><title>XML::Twig Data Base Example (1)</title></head>
  <body>
    <h1>XML::Twig Data Base Example</h1>
    <h2>Catalog</h2>
    <rel_table query="SELECT code, name, price FROM books"/>
  </body>
</html>

The code in ex2_1.pl mixes DBI and XML::Twig to build the table.

#!/bin/perl -w

use strict;

use DBI;
use XML::Twig;

my $file= shift;

my $dbh= connect_to_db();

my $twig= new XML::Twig( twig_roots => 
                           { rel_table => \&create_table },
                         twig_print_outside_roots => 1,
                       );

$twig->parsefile( $file);

$dbh->disconnect();
exit;



# connect to the data base
sub connect_to_db
  { my $driver = "mysql";
    my $dsn = "DBI:$driver:database=test;";
    my $dbh = DBI->connect($dsn, 'test', '', {AutoCommit=>1});
    my $drh = DBI->install_driver($driver);
    return( $dbh);
  }


sub create_table
  { my( $twig, $rel_table)= @_;
    my $query= $rel_table->att( 'query');

    # prepare the select
    my $select= $query;
    unless ($select=~ /;$/) { $select.= ";"; } # I always forget the ; 
    my $sth= $dbh->prepare( $select);

    print "<table border=\"1\">\n";

    $sth->execute();

    # print column titles
    print "<tr>";
    my $field= $sth->{NAME};
    for( my $i=0; $i<=$#{$field}; $i++)
      { print "<td><strong>" . ucfirst( $field->[$i]) . "</strong></td>"; }
    print "</tr>\n";

    my $row;
    while( $row= $sth->fetchrow_arrayref())
      { # print each row
        print "<tr>";
        for( my $i=0; $i<=$#{$row}; $i++)
          { my $value= $row->[$i];
            # format prices
            $value=~ s/^(\d*)(\d\d)$/\$$1.$2/ if( $field->[$i] eq 'price');
            print "<td>$value</td>";
          }
    print "</tr>\n";
      }

    print "</table>\n";
  }
    

This code can also be used to process slightly trickier queries, as in books2.xml.

<?xml version="1.0"?>
<html>
  <head><title>XML::Twig Data Base Example (2)</title></head>
  <body>
    <h1>XML::Twig Data Base Example</h1>
    <h2>Catalog</h2>
    <rel_table query="SELECT books.code, books.name, books.price, ratings.rating FROM books, ratings where books.code=ratings.code"/>
  </body>
</html>

5.2 Including values from a table

Depending on how generic, and how convenient to write we want the queries to be, several options are possible. Here are a couple:

The first document is books3.xml, which includes very generic queries.

<?xml version="1.0"?>
<html>
  <head><title>XML::Twig Data Base Example (3)</title></head>
  <body>
    <h1>XML::Twig Data Base Example (3)</h1>
    <p>I really liked <include query='select name from books where code="P001"'/>, which, along with <include query='select name from books where code="P002"'/> and <include query='select name from books where code="P003"'/> (<include query='select rating from ratings where code="P003"'/>)are the basis of a good Perl library.</p>
  </body>
</html>

It can be processed using the ex2_2.pl script.

#!/bin/perl -w

use strict;

use DBI;
use XML::Twig;

my $file= shift;

my $dbh= connect_to_db();

my $twig= new XML::Twig( twig_roots => 
                           { include => \&include },
                         twig_print_outside_roots => 1,
                       );

$twig->parsefile( $file);

$dbh->disconnect();
exit;



# connect to the data base
sub connect_to_db
  { my $driver = "mysql";
    my $dsn = "DBI:$driver:database=test;";
    my $dbh = DBI->connect($dsn, 'test', '', {AutoCommit=>1});
    my $drh = DBI->install_driver($driver);
    return( $dbh);
  }


sub include
  { my( $twig, $include)= @_;
    my $query= $include->att( 'query');
    $query=~ s/&quot;/"/;                      # because of slight
                                               # oversight in XML::Twig
    # prepare the select
    my $select= $query;
    unless ($select=~ /;$/) { $select.= ";"; } # I always forget the ; 
    my $sth= $dbh->prepare( $select);
    $sth->execute();

    my $row= $sth->fetchrow_arrayref();        # there will be only one row
    print $row->[0];                           # and one field in the result
  }
    

A shorter but less generic way would be a document like books3.xml.

<?xml version="1.0"?>
<html>
  <head><title>XML::Twig Data Base Example (3)</title></head>
  <body>
    <h1>XML::Twig Data Base Example (3)</h1>
    <p>I really liked <include query='select name from books where code="P001"'/>, which, along with <include query='select name from books where code="P002"'/> and <include query='select name from books where code="P003"'/> (<include query='select rating from ratings where code="P003"'/>)are the basis of a good Perl library.</p>
  </body>
</html>

It can be processed using the ex2_3.pl script.

#!/bin/perl -w

use strict;

use DBI;
use XML::Twig;

my $file= shift;

my $dbh= connect_to_db();

my $twig= new XML::Twig( twig_roots => 
                           { include => \&include },
                         twig_print_outside_roots => 1,
                       );

$twig->parsefile( $file);

$dbh->disconnect();
exit;



# connect to the data base
sub connect_to_db
  { my $driver = "mysql";
    my $dsn = "DBI:$driver:database=test;";
    my $dbh = DBI->connect($dsn, 'test', '', {AutoCommit=>1});
    my $drh = DBI->install_driver($driver);
    return( $dbh);
  }


sub include
  { my( $twig, $include)= @_;
    my $field= $include->att( 'field');
    my $code= $include->text;

    my $query= "select $field from books where code='$code'";; 

    # prepare the select
    unless ($query=~ /;$/) { $query.= ";"; } # I always forget the ; 
    my $sth= $dbh->prepare( $query);
    $sth->execute();

    my $row= $sth->fetchrow_arrayref();      # there will be only one row
    print $row->[0];                         # and one field in the row
  }
    

5.3 Dumping an XML table into a data base table

We are now going to fill a relationnal table from an XML file, which could come from another, incompatible, data base for example.

The XML file looks like this: teams_extract.xml (the whole file is in teams.xml).

<?xml version="1.0"?>
<teams>
<team name="New York">
  <name>Houston, Allan</name>
  <name>Sprewell, Latrell</name>
</team>
<team name="San Antonio">
  <name>Duncan, Tim</name>
</team>
</teams>

The script to load the table: ex2_4.pl is pretty simple, the only notable features being the fact that we prepare the SQL statement once and then bind parameters to it, and that the purge does not delete the parent element of a name.

#!/bin/perl -w

use strict;

use DBI;
use XML::Twig;

my $file= shift;

my $dbh= connect_to_db();
# prepare the select just once, the actual values will replace the ? later
my $insert= $dbh->prepare(  "INSERT INTO team (name, team) VALUES (?, ?);");

my $twig= new XML::Twig( twig_handlers => { name => \&insert_row});

$twig->parsefile( $file);

$dbh->disconnect();
exit;



# connect to the data base
sub connect_to_db
  { my $driver = "mysql";
    my $dsn = "DBI:$driver:database=test;";
    my $dbh = DBI->connect($dsn, 'test', '', {AutoCommit=>1});
    my $drh = DBI->install_driver($driver);
    return( $dbh);
  }


sub insert_row
  { my( $twig, $ename)= @_;
    my $name= $ename->text;
    my $team= $ename->parent->att( 'name');


    # finalize the select
    $insert->bind_param( 1, $name);          # refers to the first ? in the query
    $insert->bind_param( 2, $team);          # refers to the second ? in the query

    $insert->execute();                      # excute the SQL statement

    $twig->purge;                            # will not delete the parent

  }
    


Previous
4. First Examples
Table of Content
Table of Content
Next
6. Other features