3D Content Retrieval

A 3D content retrieval system is a computer system for browsing, searching, and retrieving three-dimensional digital content (e.g., computer-aided design models, molecular biology models, and cultural heritage 3D scenes) from a large database. The earliest approach adds descriptive text to 3D content files, such as the file name, link text, and web page title, so that related 3D content can be found through text retrieval. Because manually annotating 3D files is inefficient, researchers have investigated ways to automate annotation and to provide a unified standard for text descriptions of 3D content. Moreover, the growth in 3D content has demanded and inspired more advanced ways to retrieve 3D information. As a result, shape matching methods for 3D content retrieval have become popular. Shape matching retrieval is based on techniques that measure the similarity between 3D models.

3D retrieval methods

Derive a high-level description (e.g., a skeleton) and then find matching results

This method describes a 3D model by its skeleton. The skeleton encodes geometric and topological information in the form of a skeletal graph, and graph matching techniques are used to compare skeletons. [1] However, this method requires a 2-manifold input model and is very sensitive to noise and small details. Many existing 3D models were created for visualization purposes and do not meet the input quality that the skeleton method requires. The skeleton retrieval method therefore needs further work before it can be used widely.

Compute a feature vector based on statistics

Unlike skeleton modeling, which requires high-quality input, statistical methods place no restrictions on the validity of the input source. Common examples of statistical descriptions of 3D information include shape histograms, feature vectors composed of global geometric properties such as circularity and eccentricity, and feature vectors created by frequency decomposition of spherical functions. [2]
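As a rough illustration of a shape histogram, the sketch below bins the distances of sample points from the model's centroid. It is a minimal example under stated assumptions and not the method of any cited paper: the function names `shape_histogram` and `l1_distance`, the bin count, and the toy point sets are all hypothetical.

```python
import math

def shape_histogram(points, bins=8):
    """Histogram of point distances from the centroid, normalized to sum to 1."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    dists = [math.dist(p, centroid) for p in points]
    max_d = max(dists) or 1.0  # guard against a degenerate single-point model
    hist = [0] * bins
    for d in dists:
        # Dividing by the largest distance makes the histogram scale invariant.
        idx = min(int(d / max_d * bins), bins - 1)
        hist[idx] += 1
    return [h / n for h in hist]

def l1_distance(a, b):
    """Sum of absolute bin differences; 0 means identical histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Toy models given as sample points (assumed data, for illustration only).
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
big_cube = [(3 * x + 5, 3 * y, 3 * z) for x, y, z in cube]  # scaled and shifted
rod = [(float(i), 0.0, 0.0) for i in range(8)]

same = l1_distance(shape_histogram(cube), shape_histogram(big_cube))
different = l1_distance(shape_histogram(cube), shape_histogram(rod))
print(same, different)
```

The scaled and shifted cube yields the same histogram as the original, while the rod yields a clearly different one. Note that nothing here requires the input to be a valid manifold mesh; any set of sample points works.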

2D projection method

Some approaches use 2D projections of a 3D model, justified by the assumption that if two objects are similar in 3D, they should have similar 2D projections from many viewing directions. Prototypical views [3] and light field descriptors [4] are good examples of 2D projection methods.
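The projection idea can be sketched with three axis-aligned orthographic silhouettes compared by set overlap. This is a simplified illustration, not the prototypical views or light field method: it assumes the models are non-empty, already aligned, and scaled into the unit cube, and the names `silhouette` and `view_similarity` are hypothetical.

```python
def silhouette(points, drop_axis, res=8):
    """Occupied grid cells after projecting along one axis.

    Assumes all coordinates already lie in [0, 1].
    """
    keep = [a for a in range(3) if a != drop_axis]
    cells = set()
    for p in points:
        u = min(int(p[keep[0]] * res), res - 1)
        v = min(int(p[keep[1]] * res), res - 1)
        cells.add((u, v))
    return cells

def view_similarity(a, b, res=8):
    """Mean Jaccard overlap of the three axis-aligned silhouettes."""
    total = 0.0
    for axis in range(3):
        sa, sb = silhouette(a, axis, res), silhouette(b, axis, res)
        total += len(sa & sb) / len(sa | sb)
    return total / 3

# Toy models (assumed data): a flat plate and a thin rod inside the unit cube.
plate = [(i / 7, j / 7, 0.5) for i in range(8) for j in range(8)]
rod = [(i / 7, 0.5, 0.5) for i in range(8)]

print(view_similarity(plate, plate))
print(view_similarity(plate, rod))
```

The plate and the rod look identical when viewed along the y axis but differ in the other two views, which is why several viewing directions are compared rather than one.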

3D Engineering Search System

At Purdue University, researchers led by Professor Karthik Ramani at the Research and Education Center for Information created a 3D search engine called the 3D Engineering Search System (3DESS). It is designed to find computer-generated engineering parts.

The search engine first converts the query drawing to voxels. A second algorithm, called thinning, then extracts the most important shape information from the voxels and produces a skeleton that captures the object's outline and topology. From this, 3DESS builds a skeletal graph that represents the skeleton using three common topological constructs: loops, edges, and nodes. The resulting graph reduces the amount of data needed to represent an object, which makes the description easier to store and index in a database. [5]
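The first and last stages of this pipeline can be sketched as follows. This is an illustrative simplification and not the 3DESS implementation: the thinning step is omitted, the skeletal graph is assumed to be given as a plain undirected edge list, and the names `voxelize` and `skeletal_summary` are hypothetical.

```python
def voxelize(points, res=16):
    """Map points with coordinates in [0, 1] to a set of occupied voxel indices."""
    return {tuple(min(int(c * res), res - 1) for c in p) for p in points}

def skeletal_summary(num_nodes, edges):
    """Count nodes, edges, and independent loops of an undirected skeletal graph.

    Loops are counted with the cyclomatic number E - V + C, where C is the
    number of connected components, found here with union-find.
    """
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    components = num_nodes
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    return {
        "nodes": num_nodes,
        "edges": len(edges),
        "loops": len(edges) - num_nodes + components,
    }

# Assumed toy inputs: three sample points, a ring-shaped skeleton, a bar-shaped one.
samples = [(0.0, 0.0, 0.0), (0.01, 0.02, 0.03), (1.0, 1.0, 1.0)]
ring = skeletal_summary(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
bar = skeletal_summary(3, [(0, 1), (1, 2)])
print(len(voxelize(samples)), ring, bar)
```

A summary of this kind is only a few integers per object, which illustrates why a skeletal description is cheaper to store and index than the full voxel set.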

According to Ramani, 3DESS can also describe objects using feature vectors built from properties such as volume and surface area. The system processes a query by comparing its feature vector or skeletal graph with those stored in the database. When the system returns models in response to a query, users can pick the object that looks most similar to what they want and leave feedback.
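A feature-vector query of this kind can be sketched as a nearest-neighbor ranking. The example below is a minimal sketch under stated assumptions, not the 3DESS code: the part names, the (volume, surface area) values, the min-max scaling, and the function `rank_by_features` are all hypothetical.

```python
import math

def rank_by_features(query, database):
    """Return part names sorted by Euclidean distance between feature vectors."""
    dims = len(query)
    # Per-feature range across the database, used to put features on one scale.
    lo = [min(v[i] for v in database.values()) for i in range(dims)]
    hi = [max(v[i] for v in database.values()) for i in range(dims)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]

    def scaled(vec):
        return [(x - l) / s for x, l, s in zip(vec, lo, span)]

    q = scaled(query)
    return sorted(database, key=lambda name: math.dist(q, scaled(database[name])))

# Assumed toy database: name -> (volume, surface area), arbitrary units.
parts = {
    "bracket": (12.0, 58.0),
    "gear": (30.0, 140.0),
    "washer": (1.5, 9.0),
}
ranking = rank_by_features((11.0, 60.0), parts)
print(ranking)
```

User feedback, as described above, could then be used to adjust the ranking, for example by re-weighting features; that step is not shown here.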

Challenges

Challenges associated with 3D shape-based similarity queries

With the skeleton modeling 3D retrieval method, finding an efficient way to index 3D shape descriptors is very challenging because 3D shape indexing has very strict criteria. The shape descriptors must be quick to compute, concise to store, easy to index, invariant under similarity transformations, insensitive to noise and small extra features, robust to arbitrary topological degeneracies, and discriminating of shape differences at many scales.
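To make the invariance criterion concrete, the sketch below builds a descriptor from sorted pairwise distances divided by the largest one, and checks that it is unchanged by a rotation, uniform scaling, and translation. It illustrates only this one criterion and is not a descriptor from the cited work; the names `invariant_descriptor` and `similarity_transform` and the sample shape are hypothetical.

```python
import math
from itertools import combinations

def invariant_descriptor(points):
    """Sorted pairwise distances divided by the largest one."""
    d = sorted(math.dist(a, b) for a, b in combinations(points, 2))
    largest = d[-1] or 1.0  # guard against all points coinciding
    return [x / largest for x in d]

def similarity_transform(points, angle, scale, shift):
    """Rotate about the z axis, scale uniformly, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (
            scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1],
            scale * z + shift[2],
        )
        for x, y, z in points
    ]

# Assumed toy shape: four points with distinct pairwise distances.
shape = [(0, 0, 0), (2, 0, 0), (0, 1, 0), (0, 0, 3)]
moved = similarity_transform(shape, angle=0.7, scale=2.5, shift=(4, -1, 2))

a, b = invariant_descriptor(shape), invariant_descriptor(moved)
max_diff = max(abs(x - y) for x, y in zip(a, b))
print(max_diff)  # close to zero, up to floating-point rounding
```

The trade-off is visible even in this toy case: computing all pairwise distances is quadratic in the number of points, so satisfying invariance this way conflicts with the "quick to compute" criterion for large models.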

Challenges of 3D search and retrieval with multimodal support

To make the 3D search interface simple enough for novice users who know little about the input requirements of 3D retrieval, a multimodal retrieval system is necessary: one that accepts various types of input and returns robust query results. So far, only a few approaches have been proposed. The "Princeton 3D search engine" proposed by Funkhouser et al. (2003) [6] supports 2D sketches, 3D sketches, 3D models, and text as queries. Chen et al. (2003) [7] designed a 3D retrieval system that takes 2D sketches as input and retrieves 3D objects. More recently, Ansary et al. (2007) [8] proposed a 3D retrieval framework that uses 2D photographic images, sketches, and 3D models.

References

  1. Sundar, H., Silver, D., Gagvani, N., Dickinson, S., Skeleton based shape matching and retrieval, In: Proc. SMI, Seoul, Korea (2003)
  2. Min, P., Kazhdan, M., Funkhouser, T., A comparison of text and shape matching for retrieval of Online 3D models. Research And Advanced Technology For Digital Libraries, 2004, Vol.3232, pp.209-220
  3. Cyr, C.M., Kimia, B.B., 3D object recognition using shape similarity-based aspect graph, In: Proc. ICCV, IEEE (2001)
  4. Chen, D.Y., Tian, X.P., Shen, Y.T., Ouhyoung, M., On visual similarity based 3D model retrieval, In: Proc. Eurographics, Granada, Spain (2003)
  5. Ortiz, S., 3D searching starts to take shape, Computer, 2004, Vol.37(8), pp.24-26
  6. Funkhouser, T., Min, P., Kazhdan, M., Chen, J., Halderman, A., Dobkin, D., & Jacobs, D. (2003). A search engine for 3D models. ACM Transactions on Graphics, 22(1), 83–105
  7. Chen, D.Y., Tian, X.P., Shen, Y.T., Ouhyoung, M., On visual similarity based 3D model retrieval, In: Proc. Eurographics, Granada, Spain (2003)
  8. Filali Ansary, T., Daoudi, M., & Vandeborre, J.-P. (2007). A Bayesian 3D search engine using adaptive views clustering. IEEE Transactions on Multimedia, 9(1), 78–88.