DBPrism Servlet Engine
DBPrism CMS Search Engine

Every dynamic component of DBPrism CMS is compouned by Cocoon Content Aggregation. These components are defined into the sitemap.xmap file and are the rss and document . The source below is a fragment of DBPrism CMS sitemap which defines this content aggregation for the tag content :

   <map:match pattern="**.html">
    <map:aggregate element="content">
     <map:part src="cocoon:/content/{1}.xml"/>
     <map:part src="cocoon:/content/rss.xml"/>
    </map:aggregate>
    <map:transform type="extractor"/>
    <map:transform type="i18n"/>
    <map:transform src="stylesheets/document2html.xsl">
      <map:parameter name="base" value="@WEB_BASE@"/>
    </map:transform>
    <map:serialize type="html"/>
   </map:match>
      

Document is also compouned by header and body, here the definition:

   <map:match pattern="content/live/search.xml">
     <map:aggregate element="document">
        <map:part src="cocoon:/header/live/search.xml"/>
        <map:part src="cocoon:/internal/live/search.xml"/>
     </map:aggregate>
     <map:serialize/>
   </map:match>
      

Header is generated using regular CMS operations, it means loading the title, authors, and context information from the page stored into the XMLDB repository at the path /home/CMS_USER/cms/[en|es]/live/search.xml, with this information the CMS can defined where the Search service is located into the directory structure. Body is generated dinamically by an stored procedure excecution, here the sitemap.xmap entry with match with the above pattern:

   <map:match pattern="internal/live/search.xml">
    <map:generate src="/cms/search.html">
        <map:parameter name="use-request-parameters" value="true"/>
        <map:parameter name="Cache-Control" value="no-cache"/>
        <map:parameter name="base" value="/public"/>
    </map:generate>
    <map:serialize/>
   </map:match>
      

The above configuration defines the the body content of the CMS document will be generated using the Java Stored Procedure html of the package search using the DAD cms , the HTTP request parameters are merged with the sitemap.xmap parameter base , Cache-Control and use-request-parameters are not included, to get more information about this look at the page Cocoon Request Wrapper , The package definition is:

CREATE OR REPLACE PACKAGE search AUTHID CURRENT_USER AS
  PROCEDURE html;
  PROCEDURE html(webUrl VARCHAR2,
                 avQueryString VARCHAR2);
  PROCEDURE html(base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2);
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2);
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2,
                 fromRow NUMBER);
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2,
                 fromRow NUMBER,
                 toRow NUMBER);
  FUNCTION authorize RETURN NUMBER;
END search;

CREATE OR REPLACE PACKAGE BODY search AS
  -- Private store procedure, main entry point to the Java Code
  PROCEDURE doIt(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2,
                 fromRow NUMBER,
                 toRow NUMBER)
        AS LANGUAGE JAVA
        NAME 'com.prism.cms.ext.AvQuery.search(java.lang.String, java.lang.String, java.lang.String, java.lang.String, int, int)';

  PROCEDURE html is
  begin
     doIt(user,'/home','/dbprism/doc/','nullQuery',1,11);
  end html;
  PROCEDURE html(webUrl VARCHAR2,
                 avQueryString VARCHAR2) is
  begin
     doIt(user,'/home',webUrl,avQueryString,1,11);
  end html;
  PROCEDURE html(base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2) is
  begin
     doIt(user,base,webUrl,avQueryString,1,11);
  end html;
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2) is
  begin
     doIt(cmsUser,base,webUrl,avQueryString,1,11);
  end html;
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2,
                 fromRow NUMBER) is
  begin
     doIt(cmsUser,base,webUrl,avQueryString,fromRow,fromRow+10);
  end html;
  PROCEDURE html(cmsUser VARCHAR2,
                 base VARCHAR2,
                 webUrl VARCHAR2,
                 avQueryString VARCHAR2,
                 fromRow NUMBER,
                 toRow NUMBER) is
  begin
     doIt(cmsUser,base,webUrl,avQueryString,fromRow,toRow);
  end html;
  FUNCTION authorize RETURN NUMBER is
  begin
     return 1;
  end authorize;
END search;
     

Most of the PLSQL code showed above are for providing default values to the Java Code. The code below shows the most important part of the Search Engine, this search engine uses a function

translate(avQueryString)
taked from the section code examples of Oracle Technology Network which translates Altavista (TM) query syntax to Oracle Text (TM) syntax and is not relevant at this point. I will try to explain each part of the code following.

public void doSearch(String ext, String baseURL, String
              avQueryString) throws SQLException,IOException,SAXException {

Arguments , this function receives three arguments. ext means html, wml or xml given the posibilty of making different behaviours depending on the target output. baseURL is added to the begining of the links when the search engine makes urls to documents which contain the query string. avQueryString is the query string with Altavista (TM) syntax.

String interMediaQueryString = translate(avQueryString);

interMediaQueryString hold the converted query string, now with Oracle Text (TM) query syntax.

    if (avQueryString.equalsIgnoreCase("nullQuery")) {
      Jxtp.tagOpen("body","xmlns:i18n='http://apache.org/cocoon/i18n/2.0'");
      Jxtp.tagOpen("section");
        Jxtp.tag("title",Jxtf.tag("i18n:text","AdvSearch"));
      Jxtp.tagClose("section");
      Jxtp.tagClose("body");
      return;
    }
      

If no query string is given as argument value implements Advance Search modality, like "find documents which contains DBPrism in titles of level 1. The code below makes the tags for the top of the result page.

    Jxtp.tagOpen("body","xmlns:i18n='http://apache.org/cocoon/i18n/2.0'");
    Jxtp.tagOpen("section");
    Jxtp.tagOpen("title");
      Jxtp.tagOpen("i18n:translate");
        Jxtp.tag("i18n:text","SearchResult");
        Jxtp.tag("i18n:param",avQueryString);
      Jxtp.tagClose("i18n:translate");
    Jxtp.tagClose("title");
    Jxtp.tagOpen("table");
      // Title
      Jxtp.tagOpen("tr");
        Jxtp.tag("th",Jxtf.tag("i18n:text","Rank"));
        Jxtp.tag("th",Jxtf.tag("i18n:text","Title"));
        Jxtp.tag("th",Jxtf.tag("i18n:text","SubTitle"));
      Jxtp.tagClose("tr");
      

The resulting output of the search is enclosed in a html table like structure, at this point the code makes the corresponding header for the table.

      long startTime = System.currentTimeMillis();
      OraclePreparedStatement stmt = null;
      ResultSet rset = null;
      try {
        Connection conn = DriverManager.getConnection("jdbc:default:connection:");
        String referer = Jowa.GETCGIVAR("HTTP_REFERER");        
        StringBuffer qry = new StringBuffer("<query>\n<textquery lang=\"");
        qry.append(("es".equalsIgnoreCase(Actions.getLang())) ? "spanish" : "english").append("\" grammar=\"CTXCAT\">\n");
        qry.append(interMediaQueryString);
        qry.append("\n</textquery>\n<score datatype=\"INTEGER\"/>\n</query>");
        stmt = (OraclePreparedStatement) conn.prepareStatement(
           "insert into av_queries(owner,refererurl,avquery,ctxquery) "+
                   "values (?,?,?,?)");
        stmt.setString(1,user);
        stmt.setString(2,referer);
        stmt.setString(3,avQueryString);
        stmt.setString(4,qry.toString());
        stmt.executeUpdate();
        stmt.close();
        String whereToFind = baseUrl+"/"+user+"/cms/"+Actions.getLang()+"/";
        stmt = (OraclePreparedStatement) conn.prepareStatement(
           "select * from (select rownum as ntop_pos, q.* from ("+
           "select score(1) rank,"+
                   "extract(c.object_value,'/document/header/*'),"+
                   "substr(r.any_path,?) "+
                   "from cms_docs c,resource_view r "+
                   "where under_path(r.res,?,2)=1 and "+
                   "sys_op_r2o(extractValue(r.res,'/Resource/XMLRef'))=c.object_id "+
                   "and contains(c.object_value,?,1)>1 "+
                   "order by rank desc "+
                   ") q) where ntop_pos>=? and ntop_pos<?");
        stmt.setInt(1,whereToFind.length()+1);
        stmt.setString(2,whereToFind);
        stmt.setString(3,qry.toString());
        stmt.setInt(4,from);
        stmt.setInt(5,to);
        rset = stmt.executeQuery();
        long elapsedTime = System.currentTimeMillis()-startTime;
        int count = 0;
        while(rset.next()) {
          int rank = rset.getInt(2);
          XMLType header = (XMLType)rset.getObject(3);
          String path = rset.getString(4);
          Jxtp.tagOpen("tr");
            Jxtp.tagOpen("td");
              Jxtp.tag("a","("+rank+")",
                       "href='"+webUrl+path.replaceFirst(".xml",".html")+"'");
            Jxtp.tagClose("td");
            Jxtp.tagOpen("td");
              String title;
              try {
                title = header.extract("/title/text()","").getStringVal();
              } catch (Exception e) {
                title = e.getMessage();
              }
              Jxtp.tag("a",title,
                       "href='"+webUrl+path.replaceFirst(".xml",".html")+"'");
            Jxtp.tagClose("td");
            XMLType subtitle = header.extract("/subtitle","");
            if (subtitle!=null)
              Jxtp.tag("td",subtitle.getStringVal());
            else
              Jxtp.tag("td","");
          Jxtp.tagClose("tr");
          count++;
        }
      

Execute the query on the table cms_docs only with documents located under the path /[home|public]/cms/[en|es] directory and with the pagination Top-N query.

The SQL select expresion uses Oracle Text function
contains(c.object_value,?,1)>1
which returns values higher than cero when the XML documents stored in the selected column contains the given query.
        Jxtp.tagOpen("tr");
          Jxtp.tagOpen("td");
            Jxtp.tagOpen("i18n:translate");
              Jxtp.tag("i18n:text","ElapsedTime");
              Float seconds = new Float(elapsedTime);
              Jxtp.tag("i18n:param",""+seconds.floatValue()/1000f);
            Jxtp.tagClose("i18n:translate");
          Jxtp.tagClose("td");
          Jxtp.tag("td","");
          Jxtp.tagOpen("td");
            Jxtp.tagOpen("i18n:translate");
              Jxtp.tag("i18n:text","PagesFound");
              Jxtp.tag("i18n:param",""+count);
            Jxtp.tagClose("i18n:translate");
          Jxtp.tagClose("td");
        Jxtp.tagClose("tr");
        Jxtp.tagOpen("tr");
          Jxtp.tagOpen("td");
            if ((from-10)>0)
              Jxtp.tag("a",Jxtf.tag("i18n:text","Prev"),"href='search.html?cmsUser="+user+"&amp;webUrl="+webUrl+"&amp;avQueryString="+avQueryString+"&amp;fromRow="+(from-10)+"&amp;toRow="+(to-10)+"'");
            else
              Jxtp.tag("a",Jxtf.tag("i18n:text","Prev"),"href='javascript:void(0)'");
          Jxtp.tagClose("td");
          Jxtp.tag("td"," ");
          Jxtp.tagOpen("td");
            if (count==(to-from))
              Jxtp.tag("a",Jxtf.tag("i18n:text","Next"),"href='search.html?cmsUser="+user+"&amp;webUrl="+webUrl+"&amp;avQueryString="+avQueryString+"&amp;fromRow="+(from+10)+"&amp;toRow="+(to+10)+"'");
            else
              Jxtp.tag("a",Jxtf.tag("i18n:text","Next"),"href='javascript:void(0)'");
          Jxtp.tagClose("td");
        Jxtp.tagClose("tr");
      } catch (SQLException sqe) {
        Jxtp.tagOpen("tr");
          Jxtp.tag("td","");
          Jxtp.tagOpen("td");
            Jxtp.tagOpen("i18n:translate");
              Jxtp.tag("i18n:text","PagesFound");
              Jxtp.tag("i18n:param","0");
            Jxtp.tagClose("i18n:translate");
          Jxtp.tagClose("td");
          Jxtp.tag("td",sqe.getLocalizedMessage());
        Jxtp.tagClose("tr");
      } finally {
        if (rset!=null) {
          rset.close();
          rset=null;
        }
        if (stmt!=null) {
          stmt.close();
          stmt=null;
        }
      }
    Jxtp.tagClose("table");
    Jxtp.tagClose("section");
    Jxtp.tagClose("body");
      

Close the opened table, shows the time elapsed in the query excecution and close the body section. Also handle the exceptions.

This work is licensed under a Creative Commons License . Creative Commons License
(C) 1999-2008 - DBPrism ~ DBPrism CMS | Marcelo F. Ochoa | TANDIL ~ Argentina | 2008-10-07T20:16:32
DBPrism at SourceForgeBuilt with Cocoon2