BaseX

Original author(s) Christian Grün
Initial release 2007
Stable release 11.0 / June 6, 2024
Repository
Written in Java
Platform Java SE
Available in English, Dutch, French, German, Hungarian, Indonesian, Italian, Japanese, Mongolian, Romanian, Russian, Spanish [1]
Type XML database
License BSD-3-Clause [2]
Website basex.org

BaseX is a native, lightweight XML database management system and XQuery processor, developed as a community project on GitHub. [3] It specializes in storing, querying, and visualizing large XML documents and collections. [4] BaseX is platform-independent and distributed under the BSD-3-Clause license. [2]

In contrast to other document-oriented databases, XML databases support standardized query languages such as XPath and XQuery. BaseX is highly conformant to the World Wide Web Consortium (W3C) specifications [5] [6] and to the official Update and Full Text extensions. The included GUI lets users interactively search, explore, and analyze their data, and evaluate XPath/XQuery expressions in real time (that is, while the user types).
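As a rough illustration of what a path query looks like, the sketch below evaluates a small XPath expression with Python's standard library. It does not use BaseX: the `CATALOG` document and the `titles_for_year` helper are made up for this example, and `xml.etree.ElementTree` supports only a limited subset of XPath, so this shows the flavor of a path query rather than the full XPath/XQuery feature set.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
CATALOG = """
<catalog>
  <book year="2007"><title>Alpha</title></book>
  <book year="2011"><title>Beta</title></book>
</catalog>
"""


def titles_for_year(xml_text, year):
    """Return the titles of all <book> elements whose year attribute matches."""
    root = ET.fromstring(xml_text.strip())
    # Path: child <book> elements filtered by attribute value, then their <title>.
    path = f"./book[@year='{year}']/title"
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    print(titles_for_year(CATALOG, "2007"))
```

A full XQuery processor accepts far richer expressions (joins, FLWOR clauses, updates, full-text predicates) than the attribute filter shown here.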

Technologies

Database layout

BaseX uses a tabular representation of XML tree structures to store XML documents. The database acts as a container for a single document or a collection of documents. The XPath Accelerator encoding scheme and Staircase Join Operator have been taken as inspiration for speeding up XPath location steps. [8] Additionally, BaseX provides several types of indices to improve the performance of path operations, attribute lookups, text comparisons and full-text searches. [9]
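The idea behind a tabular tree encoding can be sketched in a few lines. The example below assigns each element a pre-order and a post-order rank, the scheme described in the XPath Accelerator work, and uses the two ranks to answer a descendant step with a plain table scan. It is a minimal sketch of that general technique and not BaseX's actual table layout; the function names and row format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Flatten an XML tree into rows of (pre, post, parent_pre, tag).

    Rows are stored in document (pre) order, so rows[i] has pre == i.
    """
    root = ET.fromstring(xml_text)
    rows = []
    post_counter = 0

    def visit(node, parent_pre):
        nonlocal post_counter
        pre = len(rows)
        rows.append(None)  # reserve the slot; the post rank is known only later
        for child in node:
            visit(child, pre)
        rows[pre] = (pre, post_counter, parent_pre, node.tag)
        post_counter += 1

    visit(root, -1)  # -1 marks "no parent" for the root element
    return rows


def descendants(rows, pre):
    """Tags of all descendants of the node with the given pre rank.

    A node d is a descendant of v exactly when
    d.pre > v.pre and d.post < v.post.
    """
    post = rows[pre][1]
    return [tag for (p, q, _, tag) in rows if p > pre and q < post]


if __name__ == "__main__":
    table = encode("<a><b><c/></b><d/></a>")
    for row in table:
        print(row)
    print(descendants(table, 1))
```

Because the descendant test is a pair of range comparisons on two integer columns, a location step becomes a region query over the table, which is the kind of access pattern that indexes and join operators such as the staircase join are designed to speed up.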

History

BaseX was started by Christian Grün at the University of Konstanz in 2005. In 2007, BaseX went open source and has been under the BSD-3-Clause license since then. [10] [11]

Supported systems

The BaseX server is a pure Java 1.8 application and thus runs on any system that provides a suitable Java implementation. It has been tested on Windows, Mac OS X, Linux and OpenBSD. [12] In particular, packages are available for Debian [13] and Ubuntu. [14]

References

  1. "Translations - BaseX Documentation".
  2. "BaseX Open Source". Retrieved 28 June 2021.
  3. GitHub: BaseX
  4. "Overview on database instances created with BaseX". Retrieved 30 June 2011.
  5. "W3C: XQuery Test Suite Result Summary". World Wide Web Consortium. Retrieved 30 June 2011.
  6. "W3C: XPath and XQuery Full Text 1.0 Test Suite Result Summary". World Wide Web Consortium. Retrieved 30 June 2011.
  7. BaseX XQJ API
  8. Christian Grün; Marc Kramis; Alexander Holupirek; Marc H. Scholl; Marcel Waldvogel (30 June 2006). "Pushing XPath accelerator to its limits" (PDF). Universität Konstanz. Archived from the original (PDF) on 27 September 2011. Retrieved 30 June 2011.
  9. "Storing and Querying Large XML Instances" (PDF). Universität Konstanz. Archived from the original (PDF) on 9 October 2011. Retrieved 30 June 2011.
  10. "BaseX 5.0: XML Database with Visual Frontend". Linux Magazine. Retrieved 30 June 2011.
  11. "Open Source Kompetenzzentrum of the German Bundesverwaltungsamt" (in German). Archived from the original on 3 November 2011. Retrieved 30 June 2011.
  12. "Startup - BaseX Documentation".
  13. "Debian -- Package search results -- basex".
  14. "basex package: Ubuntu". 25 April 2023.