INTRODUCTION:
The term database refers to a collection of
related records, while the software that manages it is called the database
management system (DBMS). Database management systems are usually categorized
according to the data model that they support: relational, object-relational,
network, and so on. The data model tends to determine the query languages
that are available to access the database. The database field offers many
kinds of technologies, which cater to the needs of many kinds of organizations. Since
1970, different models and methods have been developed to describe, analyze, and
design computer-based files and databases. The existing relational DBMS
technology has been successfully applied to many application domains. RDBMS
technology has proved to be an effective solution for data management
requirements in large and small organizations, and today this technology forms
a key component of most information systems. However, applications in domains
such as multimedia, geographical information systems, digital libraries, and
mobile databases place a completely different set of requirements on the
underlying database models. The conventional relational database model is no
longer appropriate for these types of data. Furthermore, the volume of data is
significantly larger than in classical database systems. Finally, indexing,
retrieving, and analysing these data types requires specialized functionality
that is not available in conventional database systems. These trends have
resulted in the development of new database technologies to handle new data
types and applications. This paper covers some requirements of these emerging
databases, such as multimedia, spatial, temporal, biological/genome, and mobile
databases and big data, together with their underlying technologies, data
models, and languages.
Some emerging database technologies are:
1-MULTIMEDIA DATABASE:
Multimedia computing has emerged as a major
area of research and now touches almost every facet of daily life. A
multimedia database is a database that hosts one or more primary media file
types, such as video, audio, radar signals, documents, or pictures, in various
encodings. What these forms have in common is that they are much larger than
earlier forms of data (integers and fixed-length character strings) and that
they vary vastly in size. Media types fall into three main categories:
Static media (time-independent, e.g. images and handwriting)
Dynamic media (time-dependent, e.g. video and sound bites)
Dimensional media (e.g. 3D games or computer-aided drafting (CAD) programs)
All primary media files are stored as binary strings of zeroes and ones and are
encoded according to file type. The term "data" is typically
used from the computer's point of view, whereas the term
"multimedia" is used from the user's point of view. There are
numerous types of multimedia databases, including:
The Authentication Multimedia Database, which performs a one-to-one (1:1) data
comparison.
The Identification Multimedia Database, which performs a one-to-many data
comparison.
A newly emerging type of multimedia database is the Biometrics Multimedia
Database, which specializes in automatic human verification based on
algorithms applied to a person's behavioral or physiological profile. This
method of identification is superior to traditional methods that require
personal identification numbers and passwords, because the person being
identified must be physically present where the identification check takes
place and does not need to remember a PIN or password. Fingerprint
identification technology is also based on this type of multimedia database.
Traditional relational databases, which store multimedia data as Binary Large
Objects (BLOBs, a data type developed for SQL databases), do not conveniently
support content-based searches for multimedia content. This is because a
relational database cannot recognize the internal structure of a Binary Large
Object, so the internal components of the multimedia data cannot be retrieved.
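The BLOB limitation described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and byte strings are hypothetical and chosen only for illustration. The sketch shows that a search on ordinary metadata columns works as usual, whereas the database treats the BLOB column as an opaque byte string that it can compare only byte for byte.

```python
import sqlite3

# Hypothetical table and column names, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, title TEXT, kind TEXT, content BLOB)"
)

# Stand-in byte strings; a real application would read image or audio files.
items = [
    ("sunset", "image", bytes([0x89, 0x50, 0x4E, 0x47, 1, 2, 3])),
    ("interview", "audio", bytes([0x52, 0x49, 0x46, 0x46, 9, 8, 7])),
]
conn.executemany("INSERT INTO media (title, kind, content) VALUES (?, ?, ?)", items)


def find_by_kind(kind):
    """Metadata search: works because 'kind' is an ordinary typed column."""
    rows = conn.execute(
        "SELECT title, content FROM media WHERE kind = ? ORDER BY id", (kind,)
    )
    return rows.fetchall()


def find_by_exact_bytes(blob):
    """Byte-for-byte equality: the database sees the BLOB as opaque bytes."""
    rows = conn.execute("SELECT title FROM media WHERE content = ?", (blob,))
    return [title for (title,) in rows]


images = find_by_kind("image")
exact = find_by_exact_bytes(items[1][2])
```

A query such as "find all images that show a sunset" cannot be expressed this way, because nothing in the schema describes what the bytes depict. Content-based retrieval therefore needs features extracted from the media and stored alongside it.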
2-TEMPORAL DATABASE:
Time is an important aspect of real-world phenomena. Events occur at specific
points in time, and objects and the relationships among them exist over time.
The ability to model this temporal dimension of the real world is essential to
many computer applications, such as econometrics, inventory control, airline
reservations, medical records, accounting, law, banking, and land and
geographical information systems. Despite this, conventional database
technology provides little support for managing such data. A temporal database
is formed by compiling and storing temporal data. The difference between
temporal and non-temporal data is that temporal data has a time period
attached that expresses when the data was valid or when it was stored in the
database. Conventional databases consider data to be valid at the present
time, that is, at the time instant "now". When data in such a database is
modified, removed, or inserted, the state of the database is overwritten to
form a new state, and the state prior to the change is no longer available. By
associating time with data, it becomes possible to store the different
database states. In essence, temporal data is formed by
time-stamping ordinary data (the type of data we store in
conventional databases). In a relational data model, tuples are time-stamped;
in an object-oriented data model, objects or attributes are time-stamped. Each
item of ordinary data has two time values attached to it, a start time and an
end time, which establish the time interval of the data. In a relational data
model, relations are therefore extended with two additional attributes, one
for the start time and another for the end time. Temporal databases take
different forms depending on how time is interpreted: as valid time (when the
data occurred or is true in reality) or as transaction time (when the data was
entered into the database).
A historical database stores data with respect to valid time.
A rollback database stores data with respect to transaction time.
A bitemporal database stores data with respect to both valid time and
transaction time, and so keeps the history of the data along both dimensions.
A
central goal of conventional relational database design is to produce a
database schema consisting of a set of relation schemas. In normalization
theory, normal forms are attempts at characterizing "good" relation
schemas, and a wide variety of normal forms has been proposed, the most
prominent being third normal form and Boyce-Codd normal form. An extensive
theory has been developed to provide a solid formal footing for relational
database design, and most database textbooks expose their readers to the core
of this theory. In temporal databases, there is an even greater need for
database design guidelines. However, the conventional normalization concepts
are not applicable to temporal relational data models, because these models
employ relational structures different from conventional relations. New
temporal normal forms and underlying concepts that can serve as guidelines
during temporal database design are needed. Temporal data models generally
define time-slice operators, which may be used to determine the snapshots
contained in a temporal relation. Accepting a temporal relation as their
argument and a time point as their parameter, these operators return the
snapshot of the relation corresponding to the specified time point. From a
longer-term and more abstract perspective, it is likely that new database
management technologies and application areas will continue to emerge that
pose "temporal" challenges. Because time is ubiquitous and important
to most database management applications, and because built-in temporal support
offers many benefits yet is challenging to provide, research into the
temporal aspects of new database management technologies will continue to
flourish for existing as well as new application areas.
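The time-slice operator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation is a plain list of time-stamped tuples, time points are integers (years), each interval is treated as half-open (start inclusive, end exclusive), and the attribute names and values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalTuple:
    """One row of a valid-time relation: ordinary data plus a start and end time."""
    employee: str      # hypothetical attribute
    salary: int        # hypothetical attribute
    valid_from: int    # start of validity, inclusive (assumption)
    valid_to: int      # end of validity, exclusive (assumption)


def time_slice(relation, time_point):
    """Return the snapshot of the relation at time_point, without the timestamps."""
    return [
        (row.employee, row.salary)
        for row in relation
        if row.valid_from <= time_point < row.valid_to
    ]


salaries = [
    TemporalTuple("Ana", 50000, 2018, 2021),
    TemporalTuple("Ana", 58000, 2021, 2024),
    TemporalTuple("Ben", 45000, 2020, 2023),
]

snapshot_2019 = time_slice(salaries, 2019)
snapshot_2022 = time_slice(salaries, 2022)
```

Each snapshot is an ordinary, non-temporal relation. A bitemporal variant would carry a second pair of timestamps for transaction time and slice on both dimensions.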
3-MOBILE DATABASE
The rapid technological development of
mobile phones (cell phones) and of wireless and satellite communications,
together with the increased mobility of individual users, has resulted in
growing demand for mobile computing. Portable computing devices such as laptop
and palmtop computers, coupled with wireless communications, allow clients to
access data from virtually anywhere in the world at any time. Mobile databases
built on these developments allow users such as CEOs, marketing
professionals, and finance managers to access any data, anywhere, at any
time, and to take business decisions in real time. Mobile databases are
especially useful to geographically dispersed organisations.
The proliferation of mobile devices is
driving businesses to deliver data to employees and customers wherever they may
be. The potential of mobile devices combined with mobile data is enormous. A
salesperson equipped with a PDA running corporate databases can check order
status, sales history, and inventory instantly from the client's site. Drivers
can use handheld computers to log deliveries and report order changes, making
the supply chain more efficient.
Recent advances in portable and wireless technology have led to mobile
computing, a new dimension in data communication and processing. Nowadays it is
even possible to connect to a corporate intranet from an aeroplane. A mobile
database allows the development and deployment of database applications for
handheld devices, thus putting relational database applications in the hands
of mobile workers. The technology allows employees using handheld devices to
link to their corporate networks, download data, work offline, and then connect
to the network again to synchronize with the corporate database. Mobile
computing applications, residing fully or partially on mobile devices,
typically use cellular networks to transmit information over wide areas and
wireless LANs over short distances. Commercially available mobile relational
database systems include IBM's DB2 Everywhere 1.0, Oracle Lite, and Sybase's
SQL, among others.
These databases run on palmtop and handheld devices (such as Windows CE
devices) and provide a local data store for relational data acquired from
enterprise SQL databases. The main constraint on such databases is program
size, because handheld devices have limited RAM. The commercially available
mobile database systems support a wide variety of platforms and data sources.
They also allow users of handheld devices to synchronise with Open Database
Connectivity (ODBC) database content, as well as with personal information
management data and email from Lotus Development's Notes or Microsoft's
Exchange. These database technologies support either query-by-example (QBE) or
SQL statements. Mobile computing has proved useful in many applications. Many
business travelers use laptop computers to work and to access data while
traveling. Delivery services use mobile computers to assist in tracking the
delivery of goods. Emergency response services use mobile computers at
disaster sites, medical emergencies, and similar situations to access
information and to provide data pertaining to the situation. Newer
applications of mobile computers are also emerging.
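The download, work offline, and synchronize cycle described in this section can be sketched as follows. This is a simplified illustration, not the mechanism of any of the products named above: the class name, method names, and records are hypothetical, the corporate database is stood in for by a plain dictionary, and conflicts between concurrent updates are ignored.

```python
class MobileStore:
    """Hypothetical local data store on a handheld device."""

    def __init__(self):
        self.local = {}      # local copy of downloaded records
        self.pending = []    # changes made while offline, in order

    def download(self, server, keys):
        """While connected: copy the requested records from the server."""
        for key in keys:
            self.local[key] = server[key]

    def update(self, key, value):
        """While offline: change the local copy and log the change."""
        self.local[key] = value
        self.pending.append((key, value))

    def synchronize(self, server):
        """On reconnect: replay logged changes on the server, then clear the log."""
        for key, value in self.pending:
            server[key] = value
        applied = len(self.pending)
        self.pending.clear()
        return applied


# Stand-in for the corporate database.
corporate_db = {"order-17": "open", "order-18": "open"}

device = MobileStore()
device.download(corporate_db, ["order-17"])
device.update("order-17", "delivered")          # work offline
status_before_sync = corporate_db["order-17"]   # server is unchanged so far
changes_applied = device.synchronize(corporate_db)
```

Keeping only a log of pending changes, rather than a full copy of the server data, fits the limited memory of handheld devices mentioned above.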
4-GEOGRAPHIC INFORMATION SYSTEMS
GIS is a technological field that combines geographical features with tabular
data in order to map, analyze, and assess real-world problems. The key word in
this technology is geography: some portion of the data is spatial, that is,
referenced in some way to locations on the earth. This spatial data is usually
coupled with tabular data known as attribute data, which can be generally
defined as additional information about each of the spatial features.
Geographic information systems (GIS) are used to collect, model, and analyze
information describing physical properties of the geographical world. The
scope of GIS broadly encompasses two types of data:
Spatial data, originating from maps, digital images,
administrative and political boundaries, roads, and transportation networks,
as well as physical data such as rivers, soil characteristics, climatic
regions, and land elevations.
Non-spatial data, such as socio-economic data (like census counts), economic
data, and sales or marketing information.
GIS is a rapidly developing domain that
offers highly innovative approaches to meet some challenging technical demands.
GIS applications can be divided into three categories:
Cartographic applications
Digital terrain modelling applications
Geographic objects applications
The figure shows the GIS categories and the grouping of
different GIS application areas. GIS data can be broadly represented in two
formats: vector data and raster data. Vector data represents geometric objects
such as points, lines, and polygons. Raster data is characterized as an array of
points, where each point represents the value of an attribute for a real-world
location. Informally, raster images are n-dimensional arrays in which each
entry is a unit of the image and represents an attribute. Two-dimensional
units are called pixels, while three-dimensional units are called voxels.
Three-dimensional elevation data is stored in a raster-based digital elevation
model (DEM) format. Another format, the triangular irregular network
(TIN), is a topological vector-based approach that models surfaces by connecting
sample points as vertices of triangles, with a point density that may vary with
the roughness of the terrain. Rectangular grids (or elevation matrices) are
two-dimensional array structures.
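The difference between the vector and raster formats described above can be made concrete with a small sketch. All coordinates and elevation values are invented for illustration, and the mapping from ground coordinates to raster cells (cell size 1.0, row 0 covering the lowest y values) is an assumption of the example, not a GIS standard.

```python
# Vector data: geometric objects described by coordinates (hypothetical values).
well = (2.5, 1.5)                                            # a point (x, y)
road = [(0.0, 0.0), (2.0, 1.0), (4.0, 1.0)]                  # a line as a list of points
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # a polygon, closed implicitly


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Raster data: a two-dimensional array in which each cell (pixel) holds an
# attribute value, here an elevation in metres, as in a small elevation matrix.
elevation = [
    [120, 122, 125, 130],
    [118, 121, 127, 133],
    [117, 119, 124, 129],
]
CELL_SIZE = 1.0   # assumed ground distance covered by one cell


def elevation_at(x, y):
    """Look up the raster cell containing the location (x, y)."""
    col = int(x // CELL_SIZE)
    row = int(y // CELL_SIZE)
    return elevation[row][col]


parcel_area = polygon_area(parcel)
well_elevation = elevation_at(*well)
```

The vector form stores exact geometry and suits operations such as area or length, while the raster form stores one attribute value per cell and suits lookups by location.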
5-GENOME DATA
The
biological sciences encompass an enormous variety of information. Environmental
science gives us a view of how species live and interact in a world filled with
natural phenomena. Biology and ecology study particular species. Anatomy
focuses on the overall structure of an organism, documenting the physical
aspects of individual bodies. Traditional medicine and physiology break the
organism into systems and tissues and strive to collect information on the
workings of these systems and the organism as a whole. Histology and cell
biology delve into the tissue and cellular levels and provide knowledge about the
inner structure and function of the cell. This wealth of information that has
been generated, classified, and stored for centuries has only recently become a
major application of database technology. Genetics has emerged as an ideal
field for the application of information technology. In a broad sense, it can
be thought of as the construction of models based on information about genes
(which can be defined as units of heredity) and populations, and the search
for relationships in that information. The study of genetics can be divided into
three branches:
Mendelian genetics. This is the study of the
transmission of traits between generations.
Molecular genetics. This is the study of the
chemical structure and function of genes at the molecular level.
Population genetics. This is the study of how genetic information varies across
populations of organisms.
Biological data exhibits many special characteristics that make the management
of biological information a particularly challenging problem. A
multidisciplinary field called bioinformatics has emerged to address it.
Bioinformatics deals with the management of genetic information, with special
emphasis on DNA sequence analysis. Applications of bioinformatics span the
design of targets for drugs, the study of mutations and related diseases,
anthropological investigations of the migration patterns of tribes, and
therapeutic treatments. The term genome is defined as the total genetic
information that can be obtained about an entity. The human genome, for
example, generally refers to the complete set of genes required to create a
human being, estimated to be more than 30,000 genes spread over 23 pairs of
chromosomes, with an estimated 3 to 4 billion nucleotides. The goal of the
Human Genome Project (HGP) has been to obtain the complete sequence (the
ordering of the bases) of those nucleotides.
6-DIGITAL LIBRARY
Digital libraries are an important and active research area. Conceptually, a
digital library is an analog of a traditional library (a large collection of
information sources in various media) coupled with the advantages of digital
technologies. However, digital libraries differ from their traditional
counterparts in significant ways: storage is digital, remote access is quick
and easy, and materials are copied from a master version. Furthermore, keeping
extra copies on hand is easy and is not hampered by budget and storage
restrictions, which are major problems in traditional libraries. Thus, digital
technologies overcome many of the physical and economic limitations of
traditional libraries. The Digital Library Initiative (DLI), jointly funded by
NSF, DARPA, and NASA, has been a major accelerator of the development of
digital libraries. In its first phase, this initiative provided significant
funding to six major projects at six universities, covering a broad spectrum of
enabling technologies. The initiative's web page defines its focus as to
"dramatically advance the means to collect, store, and organize information in
digital forms, and make it available for searching, retrieval, and processing
via communication networks - all in user-friendly ways." The magnitude of these
data collections, as well as their diversity and multiple formats, provides
challenges on a new scale. The future development of digital
libraries is likely to move from the present technology of retrieval via the
internet, through net searches of indexed information in repositories, to a time
of information correlation and analysis by intelligent networks. Techniques for
collecting information, storing it, and organizing it to support informational
requirements, learned over decades of database design and implementation, will
provide the baseline for developing approaches appropriate for digital
libraries.
7-BIG DATA
Nowadays, advances in technology generate large, diverse, longitudinal,
complex, and/or distributed data sets, mainly from instruments, sensors,
Internet transactions, email, video, click streams, and other digital
sources. The amount of data has been exploding. Companies capture trillions of
bytes of information about their customers, suppliers, and operations, and
millions of networked sensors are being embedded in the physical world in
devices such as mobile phones and automobiles, sensing, creating, and
communicating data. Multimedia and individuals with smartphones and on social
network sites will continue to fuel exponential growth. Big data (large pools
of data that can be captured, communicated, aggregated, stored, and analysed)
is now part of every sector and function of the global economy. Like other
essential factors of production, such as hard assets and human capital, data
has become indispensable: much of modern economic activity, innovation, and
growth simply couldn't take place without it. Big data represents a sea change
in the technology we draw upon
for making decisions. Organizations will integrate and analyse data from
diverse sources, complementing enterprise databases with data from social
media, video, smart mobile devices, and other sources. The evolution of
information architectures to include big data will likely provide the
foundation for a new generation of enterprise infrastructure. To exploit these
diverse sources of data for decision-making, an organization must develop an
effective strategy for acquiring, organizing, and analysing big data, using it
to generate new insights about the business and make better decisions. The
previously nebulous definition of "big data" is growing more concrete as it
becomes the focus of more applications. As seen in Figure 2 (below), volume,
velocity, and variety make up three key characteristics of big data:
Volume. Rather than just capturing business
transactions and moving samples and aggregates to another database for
analysis, applications now capture all possible data for analysis.
Velocity. Traditional transaction-processing
applications might have captured transactions in real time from end users, but
newer applications are increasingly capturing data streaming in from other
systems or even sensors. Traditional applications also move their data to an
enterprise data warehouse through a deliberate and careful process that
generally focuses on historical analysis.
Variety. The variety of data is much richer now, because data no longer comes
solely from business transactions. It often comes from machines, sensors and
unrefined sources, making it much more complex to manage.
8-NOSQL DATABASES
The
term NoSQL has been around for just a few years and was invented to provide a
descriptor for a variety of database technologies that emerged to cater for
what is known as "Web-scale" or "Internet-scale" demands.
In computing, NoSQL (commonly interpreted as "not only SQL") is a
broad class of database management systems identified by non-adherence to the
widely used relational database management system model. NoSQL databases are not
built primarily on tables, and generally do not use SQL for data manipulation.
NoSQL database systems are often highly optimized for retrieval and appending
operations and often offer little functionality beyond record storage (e.g.
key–value stores). The reduced run-time flexibility compared to full SQL
systems is compensated by marked gains in scalability and performance for
certain data models. In short, NoSQL database management systems are useful
when working with a huge quantity of data whose nature does not
require a relational model. The data can be structured, but NoSQL is used when
what really matters is the ability to store and retrieve great quantities of
data, not the relationships between the elements. Usage examples might be to store
millions of key–value pairs in one or a few associative arrays or to store
millions of data records. The fledgling NoSQL marketplace is going through a
rapid transition from predominantly community-driven platform development
to a more mature application-driven market. Scaling up web infrastructure on
a NoSQL basis has proven successful for Facebook, Digg, and Twitter. Successful
attempts have also been made to develop NoSQL applications in biotechnology,
defence, and image/signal processing. Interest in key-value pair (KVP)
technology has re-emerged to the point where the traditional RDBMS vendors
are evaluating strategies for developing in-house NoSQL solutions and
integrating them into their current product offerings. Acquisitions driven by
emerging NoSQL technology can be expected before long, and future deals will
likely be made to compete better both in platform offerings and in vertical
market segments.
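The key-value usage pattern mentioned above (record storage, retrieval, and appending, with no relationships between records) can be sketched with a Python dictionary as the associative array. This is a toy in-memory illustration, not the API of any NoSQL product; the class, method names, keys, and records are all hypothetical.

```python
class KeyValueStore:
    """A toy in-memory key-value store backed by a dict (an associative array)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store or overwrite the value for a key."""
        self._data[key] = value

    def get(self, key, default=None):
        """Retrieve the value for a key; there are no joins or cross-record queries."""
        return self._data.get(key, default)

    def append(self, key, item):
        """Append to a list-valued record, creating it if needed."""
        self._data.setdefault(key, []).append(item)


store = KeyValueStore()
store.put("user:42", {"name": "Ana", "city": "Lisbon"})   # hypothetical record
store.append("clicks:42", "/home")
store.append("clicks:42", "/search")

profile = store.get("user:42")
clicks = store.get("clicks:42")
missing = store.get("user:99")
```

Because every operation touches a single key, records can be spread across many machines by key, which is one reason this model scales well. The cost is that any relationship between records has to be handled by the application.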
CONCLUSIONS
Applications in domains such as multimedia, geographical information systems,
digital libraries, and big data place a completely different set of
requirements on the underlying database models, requirements that conventional
relational databases can no longer meet. The conventional relational database
model is no longer appropriate for these types of data. Furthermore, the volume
of data is typically significantly larger than in classical database systems.
Finally, indexing, retrieving, and analyzing these data types requires
specialized functionality that is not available in conventional database
systems. Hence, new directions in DBMS technology, such as those described
above, are necessary.