Search is the most common way today to locate information, present it, and interact with complex data sets. It has brought access to large, complex data sets to the masses, enabling people to locate information that previously would have been difficult or impossible to find. In the context of Big Data, search is mentioned often, usually in one of two ways:
- Search as an interface – A single entry point through which people can reach complex data sets, the components that make them up, and the relationships between pieces of data. Search provides the front-end user interface to a data set that is analyzed and manipulated by other tools.
- Search as a product – The packaged components needed to index data from a variety of sources and present the results through a search interface. That interface commonly supports keyword searches as well as phrase searches.
Search can provide a powerful interface for locating and presenting data, but the search engine still needs the data presented to it in an organized fashion. This is where additional analytical tools come in, enabling richer, more powerful search results.
Companies like IBM, HP, and Microsoft have long provided search as a product. These tools bring in data sets, index them, and let people search for keywords or phrases. In July 2013, Cloudera announced a similar capability on top of Hadoop: the inclusion of SolrCloud to create highly scalable search indexes. This search capability allows customers to locate information quickly, but the customer must already know what they are looking for.
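To make the indexing and keyword or phrase search described above concrete, here is a minimal sketch of a positional inverted index in Python. It is a toy illustration only and does not reflect how SolrCloud or any vendor product is implemented; the function names (`tokenize`, `build_index`, `keyword_search`, `phrase_search`) and the sample documents are assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build a positional inverted index: token -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token][doc_id].append(pos)
    return index


def keyword_search(index, word):
    """Return the ids of documents that contain a single keyword."""
    tokens = tokenize(word)
    if not tokens:
        return set()
    return set(index.get(tokens[0], {}))


def phrase_search(index, phrase):
    """Return the ids of documents where the words appear consecutively."""
    tokens = tokenize(phrase)
    if not tokens:
        return set()
    # Only documents containing every word can contain the phrase.
    candidates = set(index.get(tokens[0], {}))
    for token in tokens[1:]:
        candidates &= set(index.get(token, {}))
    hits = set()
    for doc_id in candidates:
        # Check each occurrence of the first word as a possible phrase start.
        for start in index[tokens[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(tokens)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical sample documents, for illustration only.
docs = {
    1: "Hadoop stores large data sets",
    2: "Search indexes make large data sets searchable",
    3: "Large sets of data need an index",
}
index = build_index(docs)
print(keyword_search(index, "hadoop"))
print(phrase_search(index, "data sets"))
```

Note that both queries require the user to supply the exact words to look for, which is the limitation described above: the index answers "where does this term occur?" but does not suggest what to ask.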
Kitenga enables a richer search experience through its ability to extract entities from unstructured data and analyze the results of that extraction, presenting not only keyword results but also relationships and contextual meaning for the data that was analyzed. Kitenga leverages search as an interface to access these complex data sets and the relationships derived from them. It enables an end-to-end analytical pipeline of data analysis, relationship identification, and presentation of the results through search.
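The general idea of such a pipeline (extract entities, identify relationships, expose them to a query) can be sketched in a few lines of Python. This is not Kitenga's implementation, which is not described here; it is a deliberately simple stand-in that uses a hypothetical lookup table (`KNOWN_ENTITIES`) for entity extraction and same-document co-occurrence counts as the "relationship" signal. All names and sample texts are assumptions for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical lookup table of known entity names and their types.
KNOWN_ENTITIES = {
    "cloudera": "ORG",
    "hadoop": "TECH",
    "solrcloud": "TECH",
}


def extract_entities(text):
    """Return the set of known entity names mentioned in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t in KNOWN_ENTITIES}


def cooccurrence(docs):
    """Count how often each pair of entities appears in the same document."""
    pairs = Counter()
    for text in docs:
        entities = sorted(extract_entities(text))
        pairs.update(combinations(entities, 2))
    return pairs


def related_to(pairs, entity):
    """Rank the entities that co-occur with `entity`, most frequent first."""
    related = Counter()
    for (a, b), count in pairs.items():
        if a == entity:
            related[b] += count
        elif b == entity:
            related[a] += count
    return related.most_common()


# Hypothetical unstructured input, for illustration only.
docs = [
    "Cloudera added SolrCloud on top of Hadoop",
    "Hadoop stores the data",
    "SolrCloud indexes data stored in Hadoop",
]
pairs = cooccurrence(docs)
print(related_to(pairs, "hadoop"))
```

A query for one entity now returns the entities connected to it, ranked by how often they appear together, rather than only the documents that mention the query term. A production system would replace the lookup table with a real entity extractor and a richer notion of relationship.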