Web intelligence

Web intelligence is the area of scientific research and development that explores the roles of, and makes use of, artificial intelligence and information technology in new products, services and frameworks that are empowered by the World Wide Web. [1]

The term was coined in a paper by Ning Zhong, Jiming Liu, Yiyu Yao and Setsuo Ohsuga presented at the 24th Annual International Computer Software and Applications Conference (COMPSAC 2000). [2]

Research

Research on web intelligence covers many fields, including data mining (in particular web mining), information retrieval, pattern recognition, predictive analytics, the semantic web, and web data warehousing, typically with a focus on web personalization and adaptive websites. [3]

Related Research Articles

Text mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources." Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends through means such as statistical pattern learning.

According to Hotho et al. (2005), three different perspectives of text mining can be distinguished: information extraction, data mining, and the knowledge discovery in databases (KDD) process. Text mining usually involves structuring the input text, deriving patterns within the structured data, and finally evaluating and interpreting the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interest.

Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling.
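
A minimal sketch of that "structure, then derive patterns" pipeline, using scikit-learn (an assumption of this example, not part of any particular text mining system): short documents are structured as TF-IDF vectors and then grouped by k-means clustering.

    # Minimal text mining sketch: structure raw text as TF-IDF vectors,
    # then derive groupings with k-means clustering.
    # Assumes scikit-learn is installed; the toy corpus is illustrative.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    docs = [
        "lemon tart recipe with fresh citrus",
        "sour lemon drinks for summer",
        "stock markets fell sharply today",
        "investors react to falling markets",
    ]

    X = TfidfVectorizer(stop_words="english").fit_transform(docs)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    for doc, label in zip(docs, labels):  # evaluate/interpret the output
        print(label, doc)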

<span class="mw-page-title-main">Xenu's Link Sleuth</span> Broken hyperlink checking computer program

Xenu, or Xenu's Link Sleuth, is a computer program that checks websites for broken hyperlinks. It is written by Tilman Hausherr and is proprietary software available at no charge. The program is named after Xenu, the Galactic Ruler from Scientology scripture.
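
The general technique behind any link checker (a generic sketch using only the Python standard library, not Xenu's actual implementation) is to fetch a page, collect its hyperlinks, and probe each target:

    # Generic broken-link checking sketch: fetch one page, extract
    # href targets, and report those that fail to respond.
    from html.parser import HTMLParser
    from urllib.request import urlopen, Request
    from urllib.parse import urljoin
    from urllib.error import URLError, HTTPError

    class LinkCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def check_links(page_url):
        html = urlopen(Request(page_url)).read().decode("utf-8", "replace")
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(page_url, href)  # resolve relative links
            try:
                # Note: some servers reject HEAD; a GET fallback is
                # omitted here for brevity.
                urlopen(Request(target, method="HEAD"), timeout=10)
            except (HTTPError, URLError) as err:
                print("BROKEN:", target, err)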

Granular computing is an emerging computing paradigm of information processing that concerns the processing of complex information entities called "information granules", which arise in the process of data abstraction and derivation of knowledge from information or data. Generally speaking, information granules are collections of entities that usually originate at the numeric level and are arranged together due to their similarity, functional or physical adjacency, indistinguishability, coherency, or the like.
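
As a toy illustration of granulation at the numeric level (the threshold and data are arbitrary choices for this sketch), values can be grouped into granules by mutual similarity:

    # Toy granulation: group numeric readings into "information granules"
    # of similar values (within a fixed distance of the granule's seed).
    def granulate(values, threshold):
        granules = []
        for v in sorted(values):
            if granules and v - granules[-1][0] <= threshold:
                granules[-1].append(v)  # similar enough: same granule
            else:
                granules.append([v])    # start a new granule
        return granules

    print(granulate([1.0, 1.2, 1.1, 5.0, 5.3, 9.8], threshold=0.5))
    # [[1.0, 1.1, 1.2], [5.0, 5.3], [9.8]]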

In artificial intelligence research, commonsense knowledge consists of facts about the everyday world, such as "Lemons are sour" or "Cows say moo", that all humans are expected to know. Acquiring and using such knowledge remains an unsolved problem in artificial general intelligence. The first AI program to address commonsense knowledge was the Advice Taker, proposed by John McCarthy in 1959.

In predictive analytics, data science, machine learning and related fields, concept drift (or simply drift) is a change in the data over time that invalidates the data model. It occurs when the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because predictions become less accurate as time passes. Drift detection and drift adaptation are therefore of paramount importance in fields that involve dynamically changing data and data models.
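
Drift detection can be sketched, in a deliberately simplified form, as comparing a model's recent error rate with its long-run error rate; the window size and tolerance below are illustrative assumptions:

    # Simplified drift detection: flag drift when the error rate over
    # the most recent window rises well above the long-run error rate.
    from collections import deque

    class DriftMonitor:
        def __init__(self, window=100, tolerance=0.15):
            self.recent = deque(maxlen=window)  # sliding window of 0/1 errors
            self.total_errors = 0
            self.total_seen = 0
            self.tolerance = tolerance

        def observe(self, prediction, actual):
            error = int(prediction != actual)
            self.recent.append(error)
            self.total_errors += error
            self.total_seen += 1
            long_run = self.total_errors / self.total_seen
            recent = sum(self.recent) / len(self.recent)
            # Drift is suspected when recent errors exceed the
            # long-run rate by more than the tolerance.
            return recent - long_run > self.tolerance

    monitor = DriftMonitor(window=50, tolerance=0.2)
    # Feed (prediction, actual) pairs from a live stream:
    # if monitor.observe(pred, y): retrain or adapt the model.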

High Performance Knowledge Bases (HPKB) was a DARPA research program to advance the technology by which computers acquire, represent and manipulate knowledge. Its successor was the Rapid Knowledge Formation (RKF) project.

Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice-of-the-customer materials such as reviews and survey responses, to online and social media, and to healthcare materials, for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models such as RoBERTa, more difficult data domains can also be analyzed, e.g., news texts, where authors typically express their opinions less explicitly.
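
At its crudest, sentiment analysis can be approximated with a fixed lexicon of polarity-bearing words; the tiny word lists below are illustrative assumptions, and the deep language models mentioned above replace such lists with learned representations:

    # Minimal lexicon-based sentiment scoring (illustrative only; modern
    # systems use trained language models rather than fixed word lists).
    POSITIVE = {"good", "great", "excellent", "love", "happy"}
    NEGATIVE = {"bad", "terrible", "awful", "hate", "sad"}

    def sentiment_score(text):
        words = text.lower().split()
        # Positive hits minus negative hits: >0 positive, <0 negative.
        return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

    print(sentiment_score("I love this great product"))  # 2
    print(sentiment_score("terrible service very sad"))  # -2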

Terminology extraction is a subtask of information extraction. The goal of terminology extraction is to automatically extract relevant terms from a given corpus.
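
A rudimentary sketch of the idea (a simplification, not a production method): score each word by how much more frequent it is in the domain corpus than in a reference corpus, and keep the top-scoring candidates.

    # Toy terminology extraction: rank words by domain frequency relative
    # to a reference corpus (a "weirdness"-style ratio, smoothed by +1).
    from collections import Counter

    def candidate_terms(domain_text, reference_text, top_n=5):
        domain = Counter(domain_text.lower().split())
        reference = Counter(reference_text.lower().split())
        scores = {w: c / (reference[w] + 1) for w, c in domain.items()}
        return sorted(scores, key=scores.get, reverse=True)[:top_n]

    domain = "ontology learning extracts ontology terms from corpus text"
    general = "the text of a general corpus"
    print(candidate_terms(domain, general, top_n=3))
    # ['ontology', 'learning', 'extracts']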

<span class="mw-page-title-main">Ontology learning</span> Automatic creation of ontologies

Ontology learning is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between the concepts that these terms represent from a corpus of natural language text, and encoding them with an ontology language for easy retrieval. As building ontologies manually is extremely labor-intensive and time-consuming, there is great motivation to automate the process.
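
One classic building block for the extraction step is a Hearst-style lexical pattern; the single regular expression below (an illustrative simplification) harvests is-a pairs that could seed an ontology:

    # Toy ontology-learning step: mine "X such as Y" patterns for is-a
    # relations, yielding (subclass, superclass) pairs.
    import re

    PATTERN = re.compile(r"(\w+)\s+such as\s+(\w+)", re.IGNORECASE)

    def is_a_pairs(text):
        return [(sub, sup) for sup, sub in PATTERN.findall(text)]

    print(is_a_pairs("fruits such as lemons are sold here"))
    # [('lemons', 'fruits')]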

A relationship extraction task requires the detection and classification of semantic relationship mentions within a set of artifacts, typically from text or XML documents. The task is very similar to that of information extraction (IE), but IE additionally requires the removal of repeated relations (disambiguation) and generally refers to the extraction of many different relationships.
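
A minimal pattern-based sketch of detection and classification (the two relation types and surface patterns are assumptions of this example; practical systems use trained classifiers):

    # Minimal relation extraction sketch: detect and classify two
    # relation types between entity mentions using surface patterns.
    import re

    RULES = [
        ("works_for", re.compile(r"(\w+) works for (\w+)")),
        ("located_in", re.compile(r"(\w+) is located in (\w+)")),
    ]

    def extract_relations(text):
        triples = []
        for label, pattern in RULES:
            for a, b in pattern.findall(text):
                triples.append((a, label, b))
        return triples

    print(extract_relations("Alice works for Acme. Acme is located in Berlin."))
    # [('Alice', 'works_for', 'Acme'), ('Acme', 'located_in', 'Berlin')]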

<span class="mw-page-title-main">James Z. Wang</span> Chinese-American computer scientist

James Ze Wang is a Chinese-American computer scientist. He is a distinguished professor of the College of Information Sciences and Technology at Pennsylvania State University. He is also an affiliated professor of the Molecular, Cellular, and Integrative Biosciences Program; the Computational Science Graduate Minor; and the Social Data Analytics Graduate Program. He is co-director of the Intelligent Information Systems Laboratory. He was a visiting professor of the Robotics Institute at Carnegie Mellon University from 2007 to 2008. In 2011 and 2012, he served as a program manager in the Office of International Science and Engineering at the National Science Foundation. He is the second son of Chinese mathematician Wang Yuan.

Knowledge retrieval seeks to return information in a structured form, consistent with human cognitive processes as opposed to simple lists of data items. It draws on a range of fields including epistemology, cognitive psychology, cognitive neuroscience, logic and inference, machine learning and knowledge discovery, linguistics, and information technology.

An adaptive website is a website that builds a model of user activity and modifies the information and/or presentation of information to the user in order to better address the user's needs.
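
A toy version of that idea (the page names are hypothetical): record per-user page visits as the activity model and reorder navigation links to match it.

    # Toy adaptive-website behaviour: build a per-user activity model
    # (visit counts) and adapt the navigation order to match it.
    from collections import Counter

    class UserModel:
        def __init__(self, pages):
            self.pages = list(pages)
            self.visits = Counter()

        def record_visit(self, page):
            self.visits[page] += 1

        def navigation(self):
            # Most-visited pages first; unvisited pages keep their order.
            return sorted(self.pages,
                          key=lambda p: (-self.visits[p], self.pages.index(p)))

    model = UserModel(["home", "news", "shop", "help"])
    for page in ["shop", "shop", "help"]:
        model.record_visit(page)
    print(model.navigation())  # ['shop', 'help', 'home', 'news']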

AMiner is a free online service used to index, search, and mine big scientific data.

<span class="mw-page-title-main">Mobile mapping</span>

Mobile mapping is the process of collecting geospatial data from a mobile vehicle, typically fitted with a range of remote sensing systems such as GNSS receivers, photographic cameras, radar, laser scanners or LiDAR. Such systems are composed of an integrated array of time-synchronised navigation sensors and imaging sensors mounted on a mobile platform. The primary outputs from such systems include GIS data, digital maps, and georeferenced images and video.
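
Time synchronisation is what makes georeferencing possible: each image timestamp can be interpolated against the navigation track. A simplified sketch, assuming the track is a list of (time, latitude, longitude) fixes:

    # Simplified georeferencing: linearly interpolate a GNSS track
    # (time, lat, lon) to assign a position to an image timestamp.
    def georeference(track, t):
        track = sorted(track)
        for (t0, lat0, lon0), (t1, lat1, lon1) in zip(track, track[1:]):
            if t0 <= t <= t1:
                f = (t - t0) / (t1 - t0)
                return (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))
        raise ValueError("timestamp outside the navigation track")

    track = [(0.0, 52.0, 13.0), (10.0, 52.1, 13.2)]
    print(georeference(track, 5.0))  # approximately (52.05, 13.1)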

NetOwl is a suite of multilingual text and identity analytics products that analyze big data in the form of text data – reports, web, social media, etc. – as well as structured entity data about people, organizations, places, and things.

<span class="mw-page-title-main">Infobox</span> Template used to collect and present a subset of information about a subject

An infobox is a digital or physical table used to collect and present a subset of information about its subject, such as a document. It is a structured document containing a set of attribute–value pairs, and in Wikipedia represents a summary of information about the subject of an article. In this way, they are comparable to data tables in some aspects. When presented within the larger document it summarizes, an infobox is often presented in a sidebar format.
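
Because an infobox is essentially a set of attribute–value pairs, it maps naturally onto a dictionary; this sketch (with illustrative fields) renders one as a plain-text sidebar:

    # An infobox as attribute-value pairs, rendered as a plain-text table.
    def render_infobox(title, fields):
        width = max(len(k) for k in fields)
        lines = [title, "-" * len(title)]
        lines += [f"{k.ljust(width)} : {v}" for k, v in fields.items()]
        return "\n".join(lines)

    print(render_infobox("Web intelligence", {
        "Field":  "Computer science",
        "Coined": "2000 (COMPSAC)",
    }))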

Semantic Scholar is a research tool for scientific literature powered by artificial intelligence. It is developed at the Allen Institute for AI and was publicly released in November 2015. Semantic Scholar uses modern techniques in natural language processing to support the research process, for example by providing automatically generated summaries of scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval.

A large memory storage and retrieval neural network (LAMSTAR) is a fast deep learning neural network with many layers that can use many filters simultaneously. These filters may be nonlinear, stochastic, logic-based, non-stationary, or even non-analytical. They are biologically motivated and learn continuously.

References

  1. Zhong, Ning; Liu, Jiming; Yao, Yiyu (2003). Web Intelligence. Springer. p. 1. ISBN 978-3-540-44384-1.
  2. Zhong, Ning; Liu, Jiming; Yao, Y.Y.; Ohsuga, S. (2000). "Web Intelligence (WI)". Proceedings of the 24th Annual International Computer Software and Applications Conference (COMPSAC 2000). p. 469. doi:10.1109/CMPSAC.2000.884768. ISBN 0-7695-0792-1. S2CID 37683026.
  3. Velásquez, Juan D.; Palade, Vasile (2008). Adaptive Web Sites: A Knowledge Extraction from Web Data Approach (1st ed.). IOS Press. ISBN 978-1-58603-831-1.
