Any information processing application used in running a business
Cache
A rapid computer memory where frequently or recently used data is temporarily stored
CAP theorem
A distributed system cannot guarantee Consistency, Availability, and Partition tolerance all at the same time
Category
A flat or hierarchical semantic dimension added to a document, or to part of a document
Categorization
Assigning, usually through statistical means, one or more categories to text
CDM
Customer Data Management
Cloud services
Computer applications that are executed on computers outside the enterprise rather than in-house. Examples are Salesforce, Google Apps, and Yahoo Mail.
Clustering
Grouping documents according to content similarity
CMS
Content Management System
Consistency
A quality of an information system in which only valid data is recorded; that is, there are not two conflicting versions of the same data
Connector
A program that extracts information from a certain file format, or from a database
Consolidation
Making all the data concerning one entity available in one output
COTS
Commercial off-the-shelf software
Crawl
Fetching web pages for indexing by following URLs found in each page
CRM
Customer Relationship Management, applications used by businesses to interact with customers
CSIS
Customer Service Information System
Data integration
Merging data from different data sources or different information systems
Data mart
A subset of data found in an enterprise information system, relevant for a specific group or purpose
Data warehouse
A database which is used to consolidate data from disparate sources
DBA
Database administrator, the person who is responsible for maintaining (and often designing) an organization’s database(s)
Deep Web
Web pages that are dynamically generated as a result of form input and/or database querying
Directory
A listing of the files or websites in a particular storage system
DIS
Decision Intelligence System, a computer-based system for helping decision making
Document model
A way of modeling a database entity as a single persistent document, composed of typed fields and categories corresponding to the entity’s attributes
Dublin Core Metadata
A standard for metadata associated with documents, such as Title, Creator, Publisher, etc.
Durability
A database quality that means that successfully completed transactions must persist (or be recoverable) in the case of a system failure
EDI
Electronic Data Interchange, an early standard for exchanging structured business data between computer systems
ETL
Extract-Transform-Load, any method for extracting all or part of a database and storing it in another database
Enterprise Search
Searching access-controlled, structured and unstructured data found within the enterprise
ERP
Enterprise Resource Planning
Evolutive Data Model
Model that can be easily extended with new fields or data types without rebuilding the entire data structure
Facet
A dimension of meaning that can be used for restricting search; for example, shirts and coats are two facets that could be found on a shopping site
Field
A labeled part of a document in a search engine. Fields can be typed to contain text, numbers, dates, GPS coordinates, or categories
Firewall
A computer-implemented protection that isolates internal company data from outside access
File server
A service that provides sequential or direct access to computer files
Full-text engine
A system for searching any of the words found in documents, rather than just a set of manually assigned keywords
Garbage collection
A process for recovering memory, usually by recognizing deleted or out-of-date data
Gartner
An information technology research and advisory firm that reports on technology issues
GPS
Global Positioning System, a system of satellites for geolocating a point on the globe
Hash table
A data structure built on hashing: hashing converts a data item into a single number, and the hash table maps this number to a list of items
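As a rough illustration of this entry, the sketch below hashes a string to a number and uses that number to pick a list (bucket) of items. The names `bucket_index` and `ToyHashTable`, the byte-sum hash, and the bucket count of 8 are all assumptions made for the example; real hash tables use much better hash functions and resize as they grow.

```python
def bucket_index(item: str, n_buckets: int) -> int:
    # Toy hash (an assumption for illustration): sum of UTF-8 byte values,
    # reduced to a bucket number with modulo.
    return sum(item.encode("utf-8")) % n_buckets


class ToyHashTable:
    def __init__(self, n_buckets: int = 8) -> None:
        # One list of items per possible hash number.
        self.buckets = [[] for _ in range(n_buckets)]

    def add(self, item: str) -> None:
        bucket = self.buckets[bucket_index(item, len(self.buckets))]
        if item not in bucket:
            bucket.append(item)

    def __contains__(self, item: str) -> bool:
        # Only the one bucket selected by the hash needs to be scanned.
        return item in self.buckets[bucket_index(item, len(self.buckets))]


table = ToyHashTable()
for word in ["cache", "crawl", "facet"]:
    table.add(word)

print("cache" in table, "index" in table)
```

Items whose hashes coincide (here, for instance, "ab" and "ba") end up in the same list, which is why the table maps a number to a list rather than to a single item.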
Heuristics
Methods based more on demonstrated performance than on theory; weighting words by their inverse frequency in a collection is an example
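The inverse-frequency example mentioned in this entry can be sketched as follows. The function name `inverse_document_frequency`, the sample documents, and the specific formula log(N / df) are assumptions chosen for illustration; several variants of this weighting are in common use.

```python
import math


def inverse_document_frequency(documents):
    """Weight each word by log(N / df), where N is the number of documents
    and df is the number of documents that contain the word."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for word in set(doc):  # count each word once per document
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Hypothetical three-document collection, already split into words.
docs = [
    ["the", "cache", "is", "fast"],
    ["the", "index", "is", "inverted"],
    ["the", "crawl", "follows", "links"],
]
weights = inverse_document_frequency(docs)

for word in ("the", "is", "cache"):
    print(word, round(weights[word], 3))
```

A word that appears in every document ("the") gets weight zero, while a word found in only one document ("cache") gets the highest weight.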
HTTP
HyperText Transfer Protocol, an application layer protocol for accessing web pages
IDC
International Data Corporation, a global provider of market intelligence and analysis concerning information technology
ILM
Information Lifecycle Management
IMAP
Internet Message Access Protocol, a protocol for retrieving and managing email messages stored on a mail server
Index, inverted
A data structure that contains lists of words with pointers to where the words are found in documents
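A minimal sketch of such a structure, assuming whitespace tokenization and lowercase matching: each word maps to the documents that contain it, together with the word positions inside each document. The function name `build_inverted_index` and the document identifiers are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to {doc_id: [positions]} for the given documents."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    # Convert to plain dicts so that lookups of unknown words raise KeyError
    # instead of silently creating empty entries.
    return {word: dict(postings) for word, postings in index.items()}


# Hypothetical two-document collection.
docs = {
    "d1": "fetch web pages",
    "d2": "index web pages for search",
}
index = build_inverted_index(docs)

print(index["web"])
print(index["search"])
```

Answering a query for a word then only requires reading that word's entry, rather than scanning every document.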
Index slice
One section of an inverted index which can be distributed over many different computer stores
Intranet
A secure network that gives authorized users Web-style access to an organization’s information assets (e.g., internal documents and web pages)
IR
Information Retrieval, the study of how to index and retrieve information, usually from unstructured text
IS
Information System, a generic term for any computer system for storing and retrieving information
Isolation
The database constraint specifying that data involved in a transaction are isolated from (inaccessible to) other transactions