Site Applications Package Format Specification

0.1-draft1

All rights reserved.


Table of Contents

1. Introduction
2. Document Structure
3. Basic Package Format
3.1. File format
3.2. Files
3.3. Basic Metadata
3.4. Configuration script
4. PHP aspect
4.1. Declarations
4.2. Processing
4.3. Files
4.4. Environment Variable
4.5. Configuration Script Language
5. Database Aspect
5.1. Declarations
5.2. Processing
5.3. Environment Variables
5.4. Database Server Types
6. Apache Aspect
6.1. Declarations
6.2. Processing
7. Web content aspect
7.1. Declarations
7.2. Processing
7.3. Example
7.4. Environment Variables
8. Entry Points Aspect
8.1. Declarations
8.2. Processing
8.3. Example
References

1. Introduction

A Site Application Package is a file that contains the files and metadata required to create and manage instances of a web application.

The package is a ZIP file that contains:

main metadata file APP-META.xml
additional metadata files, such as icons and pictures, referenced from APP-META.xml
management scripts. scripts/configure script performs application-specific tasks during installation, upgrade, configuration and uninstallation of the application

Here is the structure of a typical package.

APP-META.xml      # Metadata container. XML file.
scripts/
  configure       # This script will be invoked when application
  ...             # instance is being managed
  ...
  ...             # Additional files to be used by the 'configure'
  ...             # reside in the same directory
images/
  icon1.png       # Icon and screenshots of the application
  screenshot2.jpg
  screenshot.jpg
  ...
php-lib1/         # Metadata may specify bundled PHP libraries to
  ...             # be added to the PHP include-path of the
php-lib2/         # application.
  ...
htdocs/           # Metadata may specify Web content directories
  index.php
  logo.png        # There might be several content directories,
  ...             # shareable and non-shareable by the
upload/           # instances of the application, writable
  skeleton.png    # and non-writable by the web server.
  ...
      

APP-META.xml contains all the metadata required to instantiate and manage the application. This includes the name, version, description and changelog of the application, the resources required for the application to function properly, and a description of user-supplied configuration settings. Typical structure of the metadata:

<site-application xmlns="http://swsoft.com/schemas/siteapps/1">

  <!-- common properties shared by all packages -->

  <name>phpBB</name>
  <version>2.0.22</version>
  <release>6</release>
  <homepage>http://phpbb.com/</homepage>
  <description>...</description>

  ...

  <changelog>
    <version version="2.0.22" release="6">
      <entry>Fixed bug in ...</entry>
    </version>
  </changelog>

  ...

  <!-- Per-package resources and requirements.
       Format of these entries is declared in aspects -->

  <!-- PHP aspect.
       Package declares earliest supported PHP version -->

  <php>
    <min-version>4.0.3</min-version>
  </php>

  <!-- DB aspect.
       Package declares need of single database on one of the
       supported database servers -->

  <db>
    <id>main</id>
    <default-name>phpbb</default-name>
    <db-server min-version="3.22">mysql</db-server>
    <db-server min-version="7.0.3">postgresql</db-server>
    <db-server min-version="7">microsoft:sqlserver</db-server>
  </db>

  ...

  <!-- Probably more requirements -->

</site-application>
      

The scripts directory contains executable scripts that configure an application instance upon its creation, upgrade or reconfiguration.

The php-lib1, php-lib2, htdocs and upload directories must be described in APP-META.xml with additional attributes that specify:

the location where code and files from these directories should be stored when an application instance is created
whether the code can be shared by several instances of the application.

2. Document Structure

This specification is divided into three parts:

basic package format
points of extensibility
standard extension aspects

The first part of the specification describes the basic metadata required for a web application to be instantiated and operated.

The second part describes the extension mechanisms needed to describe additional languages, operating systems, software components, etc. A group of related extensions is called an aspect.

The third part is the standard set of aspects that Site Applications implementations are expected to support. However, implementers may omit aspects that are not applicable.

3. Basic Package Format

3.1. File format

A package is a ZIP file [ZIP] with the .app.zip extension.

3.2. Files

Package MUST contain only regular files and directories.

There MUST NOT be two files or directories in one directory whose filenames differ only in case.

Names of the files included in a package SHOULD contain only printable ASCII characters (ASCII codes 32-126), which excludes control characters such as TAB, NL, and CR. To ensure web application compatibility with Microsoft Windows, names of the files included in a package SHOULD NOT contain the following characters: <, >, :, ", /, \, |, *, ?.

Special Windows device names (con, con.*, nul, nul.*, lpt, etc.) MUST NOT be used in packages. It is recommended to check for such files when unpacking a package on Windows.
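
The naming rules above can be checked mechanically before a package is accepted. The following is a minimal sketch in Python, not part of this specification. The function name validate_names is hypothetical, "printable ASCII" is assumed to mean codes 32-126, and the reserved device list (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9) is the commonly documented Windows set, which this document only abbreviates.

```python
FORBIDDEN = set('<>:"\\|*?')  # '/' is the separator inside ZIP entry names
DEVICES = {"con", "prn", "aux", "nul"} | {
    f"{prefix}{n}" for prefix in ("com", "lpt") for n in range(1, 10)
}


def validate_names(names):
    """Return a list of problems found in a list of ZIP entry names."""
    problems = []
    children = {}  # parent directory -> {lowercased name: set of spellings}
    for name in names:
        parts = [p for p in name.split("/") if p]
        for depth, part in enumerate(parts):
            parent = "/".join(parts[:depth])
            spellings = children.setdefault(parent, {}).setdefault(part.lower(), set())
            if part in spellings:
                continue  # this file or directory was already checked
            spellings.add(part)
            if any(not 32 <= ord(c) <= 126 for c in part):
                problems.append(f"non-printable or non-ASCII character in {part!r}")
            if FORBIDDEN & set(part):
                problems.append(f"forbidden character in {part!r}")
            if part.split(".")[0].lower() in DEVICES:
                problems.append(f"reserved Windows device name {part!r}")
    # Names in one directory must not differ only in case.
    for parent, by_lower in children.items():
        for spellings in by_lower.values():
            if len(spellings) > 1:
                problems.append(f"case collision in {parent!r}: {sorted(spellings)}")
    return problems
```

The entry names could be obtained, for example, from zipfile.ZipFile(path).namelist() in the Python standard library.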

3.3. Basic Metadata

Each package MUST contain a well-formed XML file named APP-META.xml in the package's root directory. This file contains all the metadata of the package. Aspects declare additional XML elements to be added after the basic metadata.

The basic metadata uses the XML namespace http://swsoft.com/schemas/siteapps/1. Future incompatible versions of metadata will use a different namespace.
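
As an illustration of how a Controller might read the basic metadata, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_basic_metadata is hypothetical, and the sketch only extracts a few of the elements described below; it does not validate against the RELAX NG schema.

```python
import xml.etree.ElementTree as ET

NS = {"sa": "http://swsoft.com/schemas/siteapps/1"}


def read_basic_metadata(xml_text):
    """Parse APP-META.xml content and return a few basic fields."""
    root = ET.fromstring(xml_text)
    # A different namespace would indicate an incompatible metadata version.
    if root.tag != "{%s}site-application" % NS["sa"]:
        raise ValueError("unexpected root element or namespace: " + root.tag)
    return {
        field: root.findtext("sa:" + field, namespaces=NS)
        for field in ("name", "version", "release", "homepage")
    }


SAMPLE = """
<site-application xmlns="http://swsoft.com/schemas/siteapps/1">
  <name>phpBB</name>
  <version>2.0.22</version>
  <release>6</release>
  <homepage>http://phpbb.com/</homepage>
</site-application>
"""

print(read_basic_metadata(SAMPLE))
```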

RELAX NG schema of basic metadata

The following items are described in the basic metadata:

3.3.1. Package name

<sa:name>phpbb</sa:name>

A free-form string that specifies the user-visible name of the web application in the package.

3.3.2. Package version

<sa:version>2.0.22</sa:version>
<sa:release>6</sa:release>

Package version consists of two parts: application version and package release. The former corresponds to the version of the packaged application, and the latter to the release of the package containing that version of the application (a package may be released many times, e.g. to fix packaging bugs or add localizations).

The version format and the algorithm for determining the chronological relationship between different package versions are specified by the Debian Policy (see Version Format in Debian Policy).

Unlike Debian's version-release approach, application version and package release are kept in separate elements to ease parsing.
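
To make the ordering concrete, the following Python sketch follows the comparison rules of the Debian Policy as they are commonly implemented: non-digit runs are compared character by character (with "~" sorting before everything, and letters before other characters), and digit runs are compared numerically. The function names are hypothetical, epochs are not handled, and ASCII input is assumed. The version is compared first and the release only breaks ties, which is an interpretation of the two-element scheme described above.

```python
def _order(c):
    """Sort weight of one character inside a non-digit run."""
    if c == "" or c.isdigit():
        return 0        # end of string, or the start of a digit run
    if c == "~":
        return -1       # tilde sorts before everything, even the end
    if c.isalpha():
        return ord(c)   # letters sort before all other characters
    return ord(c) + 256


def compare_part(a, b):
    """Return -1, 0 or 1 if a is older than, equal to, or newer than b."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit runs character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Compare the following digit runs numerically (empty run counts as 0).
        si, sj = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        x, y = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if x != y:
            return -1 if x < y else 1
    return 0


def compare_packages(a, b):
    """Compare two (version, release) pairs; release breaks ties."""
    return compare_part(a[0], b[0]) or compare_part(a[1], b[1])
```

For example, compare_packages(("2.0.22", "6"), ("2.0.22", "5")) reports the first package as newer because only the release differs.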

3.3.3. Homepage

<sa:homepage>http://phpbb.com/</sa:homepage>

URL of the official site of web application in the package.

3.3.4. Package homepage

<sa:homepage>http://swsoft.com/</sa:homepage>

URL of the official site of the application packager.

3.3.5. Default Installation Prefix

<sa:default-prefix>/forum/</sa:default-prefix>

Relative URL with the default path at which the application is expected to be installed on a domain.

3.3.6. Summary

<sa:summary>High powered, fully scalable, and highly customizable Open Source bulletin board package.</sa:summary>
<sa:summary xml:lang="es-ES">...</sa:summary>

Single-sentence summary of the package for end users.

3.3.7. Description

<sa:description>
  phpBB is a high powered, fully scalable, and highly customizable
  Open Source bulletin board package. phpBB has a user-friendly
  interface, simple and straightforward administration panel, and
  helpful FAQ. phpBB is the ideal free community solution for all web
  sites.
</sa:description>
<sa:description xml:lang="it-IT">...</sa:description>

One-paragraph description of the package for the users.

3.3.8. Icon

<sa:icon><sa:file>images/phpbb.png</sa:file></sa:icon>

An icon may be provided to be displayed in the UI for the site application. The file element must contain the full path in the archive to a 64x64 pixel image file. The icon must be in JPEG, PNG or TIFF format.

3.3.9. Screenshots

<sa:screenshot>
  <sa:file>images/admin.png</sa:file>
  <sa:description>Administrative interface</sa:description>
  <sa:description xml:lang="he-IL">...</sa:description>
</sa:screenshot>
<sa:screenshot>
  <sa:file>images/main.png</sa:file>
  <sa:description>Main page</sa:description>
  <sa:description xml:lang="ja-JP">...</sa:description>
</sa:screenshot>

Several screenshots with descriptions may be provided. The file element must contain the full path in the archive to the image. It is recommended to use 800x600 pixel images. Images must be in JPEG, PNG or TIFF format.

3.3.10. License

<sa:license must-accept="true">
  <sa:text>
    <sa:name>GPLv2</sa:name>
    <sa:file>licenses/gplv2.txt</sa:file>
  </sa:text>
  <sa:text xml:lang="de-DE">
    <sa:name>GPLv2</sa:name>
    <sa:file>licenses/gplv2-de_DE.txt</sa:file>
  </sa:text>
</sa:license>
        

or

<sa:license>
  <sa:text>
    <sa:name>Revised BSD</sa:name>
    <sa:url>http://opensource.org/licenses/bsd-license</sa:url>
  </sa:text>
</sa:license>
        

The license element specifies the name of the license, whether the license must be accepted by the user, and either the full path to the license file in the package or a URL to the full text of the license.

3.3.11. Configuration script language

<sa:configuration-script-language>php</sa:configuration-script-language>

Interpreter used to run the configuration script. Valid interpreters are described in the specification aspects.

3.3.12. Upgrading limit

[TODO: this element needs better name]

<sa:upgradable-from version="1.0" release="1"/>

Minimal version of the package from which the current package can upgrade. If this element is absent, the package is assumed not to support upgrades at all.

3.3.13. Changelog

<sa:changelog>
  <sa:version version="2.1.22" release="1">
    <sa:entry>New upstream version</sa:entry>
    <sa:entry xml:lang="ru-RU">...</sa:entry>
  </sa:version>
  <sa:version version="2.1.21" release="5">
    ...
  </sa:version>
  ...
</sa:changelog>

The changelog contains a human-readable list of changes between consecutive package versions. The order of entries in the changelog is not specified; the Controller should sort them.

3.3.14. URL to go after installation

<sa:after-installation-url>/tutorial/</sa:after-installation-url>

URL to navigate to after application instantiation. It is not guaranteed that this page will ever be visited, so post-installation tasks such as additional configuration or creating initial accounts must not rely on this mechanism.

3.3.15. Application settings

Applications often need additional parameters for successful installation and configuration. While most of the questions asked during conventional web application installation are related to various resources, and answers may be provided by the Controller without user intervention, some settings need to be entered by the user.

Settings are declared in the settings element in the basic metadata. Settings may be grouped by the group element. Each group has a name, declared by the name element, and a list of settings. Groups may not nest.

Each setting element represents a single setting to be requested from the user.

To make the user interface more convenient, the following information is supplied for each setting:

Label. Short name for the setting.
Description. Description of the setting.
Default value for the setting.
Data type of setting (string, number, enum, etc)

Settings and groups are listed in the order suggested to be used in interface.

3.3.15.1. Data types

Settings are typed; control panels may use the type information to validate input. The following types are defined:

3.3.15.2. Predefined settings

Some settings are required by many applications. Several identifiers are predefined in this specification so that implementors can create better interfaces (the control panel may use predefined settings to provide values without asking the user).

The following identifiers are predefined:

title. Title of application instance.
admin_name. Name of user administering application instance.
admin_password. Password of administrative account of application instance.
admin_email. Email of administrative account of application instance.
locale. Locale of application instance

Values of the locale setting must be in the format defined in RFC 3066. Two-part identifiers are mandatory, with an ISO 639 language code as the first part and an ISO 3166 country code as the second part. In other words, 'i-' and 'x-' locale names are not allowed.

It is recommended to declare the locale setting with the enum type, listing all languages supported by the application.
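
A shape check for such locale values could look like the sketch below. It is only an approximation under stated assumptions: the function name is_valid_locale is hypothetical, language codes are assumed to be two or three letters and country codes two letters, and the check does not verify that the codes are actually registered in ISO 639 or ISO 3166.

```python
import re

# Two-part identifier: language code, hyphen, two-letter country code.
_LOCALE = re.compile(r"[A-Za-z]{2,3}-[A-Za-z]{2}")


def is_valid_locale(value):
    """Return True if value has the two-part language-COUNTRY shape."""
    return _LOCALE.fullmatch(value) is not None
```

With this check, values such as es-ES or ru-RU are accepted, while i-klingon, x-custom and a bare en are rejected.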

3.4. Configuration script

A package may include a configuration script for the web application. This script is invoked when the application is added to a site, updated, reconfigured, or removed from the site.

The configuration script must be named configure and reside in the scripts directory of the package root directory.

The script file and the script language declaration in metadata must be either both included in the package or both omitted from the package.

The configuration script may be written in any programming language declared as described in the Basic Metadata section.

Any aspect may declare a language to be available for executing configuration scripts. Such an aspect must declare the name used to identify the language in basic metadata (as there is no established registry of programming language names, it is recommended to use the lowercased name from the Wikipedia list of programming languages [Langs]) and the rules for executing the configuration script.

The configuration script must be run on the host where the application is being installed, with the permissions of the user who owns the application instance, so only unprivileged actions may be performed in the configuration script.

During execution, the working directory of the configuration script must be set to the actual location of the script. All contents of the scripts directory must also reside in the script's current working directory.

If package does not contain configuration script, Controller should skip all configuration script invocations.

If any script invocation fails (script returns non-zero exit code), it must be treated as fatal error and Controller must refuse to continue operation. Stdout and stderr of the script should be captured to log error in this case.

It is recommended to capture and log stdout and stderr of all configuration script runs to keep a paper trail.
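
The invocation rules above could be implemented along the following lines. This is a sketch under stated assumptions rather than a reference implementation: run_configure and ConfigureError are hypothetical names, the interpreter is passed in as an argument list chosen by the aspect that defines the script language, and switching to the owning user's permissions is left out.

```python
import subprocess


class ConfigureError(RuntimeError):
    """Raised when the configuration script exits with a non-zero code."""


def run_configure(interpreter, scripts_dir, args, env):
    """Run scripts/configure and return its captured output as a dict."""
    result = subprocess.run(
        [*interpreter, "configure", *args],
        cwd=scripts_dir,       # working directory is the script's location
        env=env,               # all information is passed via the environment
        capture_output=True,   # keep stdout and stderr for logging
        text=True,
    )
    log = {
        "args": list(args),
        "exit": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    if result.returncode != 0:
        # A failed invocation is fatal: the operation must not continue.
        raise ConfigureError("configure %s failed: %r" % (" ".join(args), log))
    return log
```

A caller would typically write the returned dictionary to its own log, and treat ConfigureError as the signal to abort the installation, upgrade, reconfiguration or removal in progress.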

3.4.1. Configuration script actions

[TODO: this section needs better name]

Configuration script is invoked when package is installed, configured, upgraded or uninstalled.

3.4.1.1. Installing application to the site

The configuration script is invoked when the application is installed to the site. At the moment of invocation, all resources declared by the application must be allocated, and instance files must be unpacked and placed in the filesystem. The script is invoked with the following arguments:

add

[TODO: this argument should probably be changed to 'install' to match text of specification]

3.4.1.2. Upgrading application on the site

The configuration script is invoked when an application instance is being upgraded. At the moment of invocation, all resources needed by the new version of the application must already be allocated. The script of the new application version is invoked with the following arguments, where [old version] is the old version and [old release] is the old release of the application being upgraded:

upgrade [old version] [old release]

3.4.1.3. Changing settings

Configuration script is invoked when application is being configured (this does not include installing and upgrading). The script is invoked with the following arguments:

configure

3.4.1.4. Removing application from the site

The configuration script is invoked during removal, before allocated resources are released and application files are removed. The script is invoked with the following arguments:

remove
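
The four invocations described in this section can be summarized as a small helper that builds the argument list. The helper name configure_args is hypothetical; the action names are the ones given above.

```python
def configure_args(action, old_version=None, old_release=None):
    """Build the argument list passed to the configure script."""
    if action == "upgrade":
        # Upgrades also pass the version and release being upgraded from.
        if old_version is None or old_release is None:
            raise ValueError("upgrade needs the old version and old release")
        return ["upgrade", old_version, old_release]
    if action in ("add", "configure", "remove"):
        return [action]
    raise ValueError("unknown action: %r" % (action,))
```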

3.4.2. Environment Variables

All information about application, resources and settings is passed to the configuration script through environment variables.

Several predefined environment variables are always passed to the script, and any aspect may declare additional environment variables.

3.4.2.1. Full instance URL

Full URL specifying where the application is to be installed, represented by the four environment variables corresponding to the URL parts as defined in RFC 1738:

BASE_URL_SCHEME - URL scheme. Allowed values: http, https.
BASE_URL_HOST - URL host.
BASE_URL_PORT - URL port (may be omitted if default port for protocol is used: 80 for http, 443 for https).
BASE_URL_PATH - URL path including trailing slash.

For example:

BASE_URL_SCHEME=http
BASE_URL_HOST=example.com
BASE_URL_PORT not defined
BASE_URL_PATH=phpBB/
            

Note that leading slash is not included in BASE_URL_PATH, as defined by RFC 1738.
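
As a sketch of how a Controller might derive these variables from a full URL, the following Python function uses urllib.parse from the standard library. The function name base_url_env is hypothetical. Representing an installation at the site root as an empty BASE_URL_PATH is an assumption, since this section does not cover that case.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def base_url_env(url):
    """Split an instance URL into the BASE_URL_* environment variables."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("scheme must be http or https: %r" % (url,))
    # No leading slash, but always a trailing one for a non-empty path.
    path = parts.path.lstrip("/")
    if path and not path.endswith("/"):
        path += "/"
    env = {
        "BASE_URL_SCHEME": parts.scheme,
        "BASE_URL_HOST": parts.hostname,
        "BASE_URL_PATH": path,
    }
    # The port is omitted when it is the default one for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS[parts.scheme]:
        env["BASE_URL_PORT"] = str(parts.port)
    return env
```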

3.4.2.2. Settings

For each application setting declared in the package, the corresponding environment variable SETTINGS_[id] must be passed on to the configuration script, where [id] is the value of the id attribute of the setting description.

For settings of the boolean, string, float, and integer data types, the corresponding environment variables must contain the values entered by the user.

For an enum setting, the environment variable must contain the identifier of the choice selected by the user, as defined by the id attribute of the enum/choice element (e.g., if a setting with id interface_color offers choices with ids black and blue, then the variable SETTINGS_interface_color with value black or blue will be exported).
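
The mapping from settings to environment variables can be sketched as follows. The function name settings_env and its arguments are hypothetical; values are assumed to be already validated strings, because the textual encoding of non-string types is not defined in this section.

```python
def settings_env(values, enum_choices=None):
    """Map setting ids to SETTINGS_[id] environment variables.

    values: {setting id: value entered or selected by the user}
    enum_choices: {enum setting id: collection of allowed choice ids}
    """
    enum_choices = enum_choices or {}
    env = {}
    for setting_id, value in values.items():
        allowed = enum_choices.get(setting_id)
        # For enum settings only a declared choice id may be exported.
        if allowed is not None and value not in allowed:
            raise ValueError("%r is not a choice of %r" % (value, setting_id))
        env["SETTINGS_" + setting_id] = str(value)
    return env
```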

4. PHP aspect

This aspect describes how the needs of a PHP web application are declared and how the Controller handles them.

4.1. Declarations

RELAX NG schema of PHP aspect metadata

<p:min-version>4.0</p:min-version>

If the web application works only with PHP starting from a particular version, the min-version element with the minimal required PHP version must be specified in metadata.

<p:required-extension>fcntl</p:required-extension>

If the web application needs some PHP extensions to operate, the name of each such extension must be specified in a required-extension element in metadata. The name of an extension is specified in the zend_module_entry structure of the extension code and may be obtained with the get_loaded_extensions PHP function.

<p:file-uploads>true</p:file-uploads>

If the web application needs one of the allow_url_fopen, file_uploads, safe_mode, short_open_tag, register_globals or magic_quotes_gpc settings to be true or false, the appropriate element with the required value must be added to metadata. If the web application does not care about the value of a setting, the corresponding element must not be included in metadata.

<p:memory-limit>16m</p:memory-limit>

If the web application needs a particular value of the max_execution_time, memory_limit or post_max_size settings, the appropriate elements with the minimal acceptable values of the settings must be added to metadata. If the web application does not care about those settings, the corresponding elements must not be included in metadata. Values are in the format understood by PHP (an integer, or an integer with a k, m or g suffix for kilobytes, megabytes and gigabytes).
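
To compare such values, a Controller first has to convert PHP's shorthand notation to a number. The sketch below assumes the usual PHP convention that the K, M and G suffixes are case-insensitive multiples of 1024; the function names are hypothetical, and special values such as a memory_limit of -1 (no limit in PHP) are not handled.

```python
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def php_bytes(value):
    """Convert a PHP shorthand value such as '16m' to an integer."""
    value = value.strip()
    suffix = value[-1:].lower()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value)


def limit_satisfied(site_value, required_value):
    """True if the site's setting is at least the declared minimum."""
    return php_bytes(site_value) >= php_bytes(required_value)
```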

<p:required-function>system</p:required-function>

If web application needs specific PHP functions known to be frequently disabled (such as system function), each such function name must be declared in required-function element in metadata.

<p:code-dir>lib/php-ws-federation</p:code-dir>

If the web application needs additional PHP code to be available in include_path, but not in document root, the directory with such code should be placed somewhere in the package, and the code-dir element must contain the full path to that directory inside the archive. There may be many such directories.

4.2. Processing

When the Controller adds a site application with the PHP aspect to a site, the Controller must ensure all of the following, and refuse to install the package if any of the requirements cannot be fulfilled.

PHP must be enabled on the site the application is installed to.

If the min-version element exists in metadata, the PHP version on the site must be the same as or greater than min-version.

For each required-extension element, the corresponding PHP extension must be available on the site.

For each allow-url-fopen, file-uploads, safe-mode, short-open-tag element present in metadata, the corresponding PHP setting must have a matching value on the site.

For each execution-time-limit, memory-limit and post-size-limit element in metadata, the corresponding PHP setting (max_execution_time, memory_limit, post_max_size) must have the same or a bigger value on the site.

For each required-function element in metadata, the corresponding function must not be listed in the disable_functions setting.
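
Part of these checks can be sketched as a function that returns the list of unmet requirements. The dictionary layout used for the site description and for the parsed metadata is purely hypothetical, versions are assumed to be plain dotted numbers with at most four components, and the limit checks are left out for brevity.

```python
def version_tuple(version, width=4):
    """Turn '4.0.3' into (4, 0, 3, 0) so that '4.0' equals '4.0.0'."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def check_php(site, required):
    """Return a list of unmet PHP requirements (empty if all are met)."""
    problems = []
    minimum = required.get("min-version")
    if minimum and version_tuple(site["version"]) < version_tuple(minimum):
        problems.append("PHP %s is older than %s" % (site["version"], minimum))
    for ext in required.get("required-extension", []):
        if ext not in site["extensions"]:
            problems.append("missing extension " + ext)
    for func in required.get("required-function", []):
        if func in site["disable_functions"]:
            problems.append("function %s is disabled" % func)
    for name, wanted in required.get("flags", {}).items():
        if site["flags"].get(name) != wanted:
            problems.append("setting %s must be %s" % (name, wanted))
    return problems
```

A Controller following the rules above would refuse to install the package whenever the returned list is not empty.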

4.3. Files

For each code-dir element in metadata, code from the mentioned directory must be made available to the application in include_path. This code is potentially shareable between application instances.

4.4. Environment Variable

The PHP_VERSION environment variable must be passed on to the configuration script with the version of PHP (as a string value) installed on the Web site where the Package is to be installed.

4.5. Configuration Script Language

This extension defines the php language for use by configuration scripts. When a configuration script uses the php language, it should be run by a standalone PHP interpreter. All the requirements described in the PHP section of the site application metadata apply to the interpreter running configuration scripts.

5. Database Aspect

This aspect describes how the database connectivity required by a web application is declared and how the Controller handles it.

5.1. Declarations

RELAX NG schema of DB aspect metadata

<d:db>
  <d:id>main</d:id>
  <d:default-name>phpbb</d:default-name>
  <d:db-server min-version="4.0" tables-prefix="true">mysql</d:db-server>
  <d:db-server min-version="7.4">postgresql</d:db-server>
</d:db>

If the web application requires one or more databases to work, a single db element must be added to metadata for each database, with the following content:

default-name element declares proposed name of database to be provided to the application. It is not guaranteed that actual database will be of the proposed name.

One or more db-server elements declare acceptable types of database servers. Each such element declares name of DB server and (optionally) minimal version of DB server web application can work with. Database servers should be sorted from the most preferable one on the top of list to the least preferable on the bottom of list. tables-prefix attribute declares that application instance can share database by using prefixed tables.

id element declares identifier of database. This identifier will be used in environment variables passed to configuration scripts.

5.2. Processing

When Controller adds site application with DB aspect to the site, Controller must ensure the following things and refuse to install package if some of the requirements can't be fulfilled.

For each db element, a database of one of the types mentioned in the db-server elements must be allocated to the user installing the application. The minimal DB server version given in the min-version attribute must be taken into account. If the tables-prefix attribute is present, the Controller may choose to reuse an existing database by giving a unique table prefix to the application. In this case the Controller must ensure that tables of different applications do not overlap.

Controllers may use any policy to allocate database, from automatic provisioning to providing connection manually by the user.

5.3. Environment Variables

For each database, the following environment variables must be passed on to configuration script:

  • DB_<identifier>_TYPE - the database server type (one of those specified in the db-server elements)

  • DB_<identifier>_NAME - the database name

  • DB_<identifier>_LOGIN - the database user login name

  • DB_<identifier>_PASSWORD - the database user password

  • DB_<identifier>_HOST - the database server host IP address or domain name

  • DB_<identifier>_PORT - the port number for connecting to the database server. If the port number is the default for the selected DB server, this variable may be omitted.

  • DB_<identifier>_VERSION - the version of the database server

Environment variables DB_<identifier>_HOST and DB_<identifier>_PORT must not be specified if application is to use the local transport (UNIX sockets or named pipes) to connect to the database.

  • DB_<identifier>_PREFIX - name of tables prefix.

This variable is defined only if the application shares the database with other applications. In this case the application must create tables with the provided prefix, and must not touch tables without this prefix. Applications which declare support for a shared database in metadata must also correctly handle the situation when a whole database is allocated to the instance.
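
The variables listed in this section could be assembled as in the sketch below. The function name db_env and its parameters are hypothetical. Local transport is modelled by passing no host, and the database identifier is used verbatim in the variable names, which is an assumption since no case conversion is specified.

```python
def db_env(db_id, server_type, name, login, password, version,
           host=None, port=None, prefix=None):
    """Build the DB_<identifier>_* variables for one database."""
    key = "DB_%s_" % db_id
    env = {
        key + "TYPE": server_type,
        key + "NAME": name,
        key + "LOGIN": login,
        key + "PASSWORD": password,
        key + "VERSION": version,
    }
    # HOST and PORT are not set for local transport (sockets, named pipes).
    if host is not None:
        env[key + "HOST"] = host
        if port is not None:  # may be omitted for the server's default port
            env[key + "PORT"] = str(port)
    # PREFIX is set only when the database is shared with other applications.
    if prefix is not None:
        env[key + "PREFIX"] = prefix
    return env
```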

5.4. Database Server Types

db-server element describes name of database server. Currently defined names are:

mysql - MySQL
postgresql - PostgreSQL
microsoft:sqlserver - Microsoft SQL Server

Other names should be taken from the JDBC drivers registry [JDBCDRIVERS]. The official database server driver should be used if more than one driver is available. The JDBC driver name (and subname, if the name specifies a company, as with 'microsoft:sqlserver') is used.

6. Apache Aspect

This aspect is used by web applications using Apache-specific features.

6.1. Declarations

RELAX NG schema of Apache aspect metadata

This aspect should be used only if web application works exclusively with Apache due to some Apache-specific features.

For each Apache module required for application, required-module element with name of module must be declared in metadata.

6.2. Processing

When Controller adds site application with Apache aspect to the site, Controller must ensure the following:

  • Site is served from Apache

  • For each required-module element in metadata, corresponding Apache module must be enabled on the site.

7. Web content aspect

This aspect describes how to declare web applications with web content available from document root (typically static HTML, images, and PHP files).

7.1. Declarations

RELAX NG schema of Web content metadata

[TODO: this aspect needs to be extended to cover more generic URL-mapping]

This aspect should be used if the web application has files in document root (e.g. it is not a purely mod_python based application which keeps all its configuration in the database).

Web applications typically keep different kinds of data on disk: static media, uploaded files, PHP code, CGIs etc. Those kinds differ in the possibility of sharing data between applications: uploaded files are per-instance, while PHP code may be used by the several instances of the application.

Each identified kind of data should be placed in the separate directory in package, and separate content element must be added to metadata, describing:

  • Possibility to share data between instances.

  • URL of content relative to the base URL of application.

  • Directory in Package where the data is stored.

  • Approximate size of content in package in bytes. This will be used by the Controller to check whether the application can be installed.

  • Whether web server should be able to create and change files in this directory.

7.2. Processing

When Controller adds site application with Web content aspect to the site, Controller must ensure the following:

  • User has enough disk space available to copy all non-shared parts.

  • Each content part is available at its URL relative to the base URL of the web application.

  • If content part is writable, Web server is able to create/modify/delete files in the corresponding directory.

7.3. Example

<s:web xmlns:s="http://swsoft.com/schemas/siteapps/1">
  <s:content shared="true" rel-url="/images" web-server-writable="false">
    <s:dir>code/images</s:dir>
  </s:content>
  <s:content shared="true" rel-url="/" web-server-writable="false">
    <s:dir>code/src</s:dir>
  </s:content>
  <s:content shared="false" rel-url="/upload" web-server-writable="true">
    <s:dir>code/upload</s:dir>
  </s:content>
</s:web>

Package contents:

APP-META.xml
code/
  images/
    arrow.png
  src/
    about.php
  upload/
      

If application is installed to the http://example.com/myapp/, then arrow.png will be available at http://example.com/myapp/images/arrow.png and about.php will be available at http://example.com/myapp/about.php.

Controller may use single instance of arrow.png and about.php for the several instances of application, but upload directory must never be shared.
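
The URL mapping in this example can be reproduced with a few lines of Python. The function name content_url is hypothetical; it simply joins the base URL of the application, the rel-url of the content part, and the path of a file inside the content directory.

```python
def content_url(base_url, rel_url, path_in_dir):
    """Return the public URL of a file from a content part."""
    pieces = [rel_url.strip("/"), path_in_dir.strip("/")]
    return base_url.rstrip("/") + "/" + "/".join(p for p in pieces if p)


print(content_url("http://example.com/myapp/", "/images", "arrow.png"))
print(content_url("http://example.com/myapp/", "/", "about.php"))
```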

7.4. Environment Variables

WEB_MODE environment variable must be passed to the configuration script with one of two possible string values: shared or instance, reflecting mode of application installation.

For each content part WEB_<id>_DIR must be passed with the absolute path to the directory where content is installed.

For each content part WEB_<id>_MODE must be passed with the one of two possible string values: shared or instance, reflecting the mode of the content part installation. Obviously, when application is installed in non-shared mode, all such variables will contain instance value.

id of each content part is the text content of dir element.
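
Put together, the variables of this section might be built as in the following sketch. The function name web_env and the installation paths in the usage note are hypothetical; the id of each part is the text of its dir element, as stated above.

```python
def web_env(app_shared, parts):
    """Build WEB_* variables.

    app_shared: True if the application is installed in shared mode.
    parts: iterable of (dir id, absolute installed path, shareable flag).
    """
    env = {"WEB_MODE": "shared" if app_shared else "instance"}
    for dir_id, path, shareable in parts:
        env["WEB_%s_DIR" % dir_id] = path
        # A part is shared only in shared mode and only if it is shareable.
        shared = app_shared and shareable
        env["WEB_%s_MODE" % dir_id] = "shared" if shared else "instance"
    return env
```

For instance, web_env(True, [("code/upload", "/var/www/myapp/upload", False)]) would report shared mode for the application but instance mode for the upload part.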

8. Entry Points Aspect

This aspect describes how Site Applications declare additional entry points and how Controllers process such declarations.

8.1. Declarations

XML element described in the following schema must be added to metadata to declare use of Entry Points aspect.

RELAX NG schema of entry points aspect metadata

If the web application contains entry points other than the main one (the base URL of the web application), the packager may declare them using the entry points extension.

For each such entry point, an entry element should be added with the following content:

url element. This is the URL relative to the base URL of the web application.
label element. This is the label to use on the button in the control panel. This element is localizable.
description element. This is the description to be used as a tooltip in the control panel. This element is also localizable.

8.2. Processing

When the Controller adds a site application with the Entry Points aspect to a site, it may let the user choose which entry points to show in the control panel interface and where. Installed entry points must point to the base URL of the site application followed by the URL of the entry point.

Controller should provide base URL of installed application as main entry point.

Controller should use provided label and description. Controller should use icon of site application, if it needs icon to show entry point in interface.

8.3. Example

<entry-points xmlns="http://swsoft.com/schemas/siteapps/1">
  <entry>
    <url>/admin</url>
    <label>Administrative interface</label>
  </entry>
  <entry>
    <url>/guest</url>
    <label>Guest view</label>
    <label xml:lang="ru-RU">Вид сбоку</label>
    <description>View the gallery as user not logged in</description>
    <description xml:lang="ru-RU">Галерея, как видна случайному посетителю</description>
  </entry>
</entry-points>
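
A Controller could turn such a declaration into control panel links roughly as follows. This is a sketch using Python's standard xml.etree.ElementTree; the function name entry_points, the sample document and the fallback to the label without xml:lang are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://swsoft.com/schemas/siteapps/1"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """
<entry-points xmlns="http://swsoft.com/schemas/siteapps/1">
  <entry>
    <url>/admin</url>
    <label>Administrative interface</label>
  </entry>
  <entry>
    <url>/guest</url>
    <label>Guest view</label>
    <label xml:lang="de-DE">Gastansicht</label>
  </entry>
</entry-points>
"""


def entry_points(xml_text, base_url, lang=None):
    """Return (full URL, label) pairs for all declared entry points."""
    result = []
    for entry in ET.fromstring(xml_text).findall("{%s}entry" % NS):
        url = entry.findtext("{%s}url" % NS)
        # Labels keyed by language; None is the label without xml:lang.
        labels = {el.get(XML_LANG): el.text
                  for el in entry.findall("{%s}label" % NS)}
        label = labels.get(lang, labels.get(None))
        # Full URL = base URL of the application + URL of the entry point.
        result.append((base_url.rstrip("/") + "/" + url.lstrip("/"), label))
    return result


print(entry_points(SAMPLE, "http://example.com/myapp/", lang="de-DE"))
```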

References

XML technologies

[XMLNS] Namespaces in XML (Second Edition). W3C Recommendation 16 August 2006.

[RNG] RELAX NG specification. OASIS Committee Specification 3 December 2001.

[RNC] RELAX NG compact syntax specification. OASIS Committee Specification 21 November 2002.