SEARCHING VS. SEARCH FILTERING

Posted on October 8, 2010


INTRODUCTION

Search filtering versus searching often causes confusion when it comes up in an electronic discovery project meeting. I hope this post clears up that confusion and helps you design effective workflows based on a better understanding of both techniques.

WHAT IS SEARCHING AND WHEN DO YOU USE THIS TECHNIQUE?

Searching, by definition, means “finding something.” In an electronic discovery environment it means finding the data (i.e. documents or content) you are looking for so that you can take further action. Those further actions vary with your situation. In simple terms, once you find what you are looking for, you do something with it. Maybe you classify what you found in a way that reminds you what it is about. Maybe you take steps such as foldering, exporting and producing the data to complete the eDiscovery lifecycle, or maybe you segregate the data by relevancy for future action, and so on.

WHAT IS SEARCH FILTERING AND HOW IS IT DIFFERENT FROM SEARCHING?

Search filtering, as the name suggests, is a search technique used to subset the data once you have finished searching. This subset can be either included in or excluded from your future actions. Searching lets you find what you are looking for and take several different actions on your data set. Search filtering, on the other hand, only lets you create a subset, which is in essence a focus set: once a search filter is applied, only that focus set is acted upon from that point onwards. This is the major difference between the two techniques. The universe of data never shrinks (or, in eDiscovery terms, never gets culled) to a subset when searching, but it does when search filtering.
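The difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any review platform: the names Document, ReviewSet, search and apply_search_filter are hypothetical, and a case-insensitive substring test stands in for a real search engine.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: int
    text: str
    tags: set = field(default_factory=set)


def matches(doc, terms):
    """Hypothetical matcher: case-insensitive substring test."""
    lowered = doc.text.lower()
    return any(term.lower() in lowered for term in terms)


class ReviewSet:
    """Hypothetical container for the universe of collected data."""

    def __init__(self, documents):
        self.universe = list(documents)

    def search(self, terms):
        # Searching: return the hits; the universe is left untouched.
        return [doc for doc in self.universe if matches(doc, terms)]

    def apply_search_filter(self, terms):
        # Search filtering: the hits become the new universe (the focus set).
        self.universe = self.search(terms)
        return self.universe


review = ReviewSet([
    Document(1, "Merger agreement draft"),
    Document(2, "Lunch menu for Friday"),
    Document(3, "Notes on the merger timeline"),
])

# Searching: act on the hits (here, tag them) without culling anything.
for doc in review.search(["merger"]):
    doc.tags.add("relevant")
print(len(review.universe))  # 3

# Search filtering: from here on, only the focus set can be acted upon.
review.apply_search_filter(["merger"])
print(len(review.universe))  # 2
print(len(review.search(["lunch"])))  # 0
```

After the filter is applied, a later search for a new term such as "lunch" finds nothing, because the document that contained it was culled. That is why a change in search terms is so costly once a workflow is built around filtering.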

WHEN SHOULD YOU PREFER SEARCH FILTERING OVER SEARCHING?

Consider search-based filtering only if all of the following hold: the volume of collected data is very large, more than your data processing infrastructure can handle; you know all of the inclusive search terms or search phrases; and, most importantly, you are 100% sure that these search terms will not change in the future. If you have the slightest doubt about changes to the search terms provided, I recommend searching over search filtering. Project managers and electronic discovery consultants must always ask their clients the relevant questions up front before designing workflows around search filtering.

WHAT ARE INCLUSIVE AND EXCLUSIVE SEARCH FILTERING TECHNIQUES?

There are two methods of search-based filtering:

Inclusive Search Filtering is one where only the data that comes up as a hit to the search terms is considered for further action. This is the more popular method and is used often.

Exclusive Search Filtering is one where all data other than the data that came up as a hit to the search terms is considered for further action. This method is rarely used because it can cause many complications.
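The two methods are complements of each other, which a short Python sketch can make concrete. The function names inclusive_filter, exclusive_filter and is_hit are hypothetical, documents are modeled as a plain dictionary of ID to text, and a case-insensitive substring test is assumed in place of a real search engine.

```python
def is_hit(text, terms):
    """Hypothetical matcher: case-insensitive substring test."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def inclusive_filter(documents, terms):
    # Keep only the documents that hit the search terms.
    return {doc_id: text for doc_id, text in documents.items()
            if is_hit(text, terms)}


def exclusive_filter(documents, terms):
    # Keep everything except the documents that hit the search terms.
    return {doc_id: text for doc_id, text in documents.items()
            if not is_hit(text, terms)}


documents = {
    1: "Quarterly budget forecast",
    2: "Holiday party invitation",
    3: "Budget review meeting notes",
}

print(sorted(inclusive_filter(documents, ["budget"])))  # [1, 3]
print(sorted(exclusive_filter(documents, ["budget"])))  # [2]
```

For the same terms, every document lands in exactly one of the two results, so together they always add up to the original data set.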

WHAT PRECAUTIONS MUST ONE TAKE WHILE RUNNING EXCLUSIVE FILTERS?

Data that contains exclusive search terms may also contain non-exclusive (i.e. inclusive) search terms. When that happens, inclusivity should always take precedence over exclusivity. Consider the following sets:

A is the data from inclusive search hits and B is the data from exclusive search hits. Say Y is the inclusive set and Z is the exclusive set.

X is the data that falls under both inclusive hits and exclusive hits, i.e. the overlap of A and B. When eliminating data from the subset after search filtering, you must always eliminate only B – X, that is, the exclusive hits minus the overlap. This accounts for false positives and gives you the subset most appropriate for further action.
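The rule above maps directly onto set arithmetic. The following Python sketch uses sets of document IDs; the function name apply_exclusive_filter and the sample IDs are made up for illustration, while A, B and X follow the definitions in the text.

```python
def apply_exclusive_filter(universe, inclusive_hits, exclusive_hits):
    """Remove exclusive hits, letting inclusive hits take precedence."""
    overlap = inclusive_hits & exclusive_hits   # X: hits on both term lists
    to_eliminate = exclusive_hits - overlap     # B - X
    return universe - to_eliminate


universe = {1, 2, 3, 4, 5, 6}   # all collected documents (sample IDs)
a = {1, 2, 3}                   # A: inclusive search hits
b = {3, 4, 5}                   # B: exclusive search hits

remaining = apply_exclusive_filter(universe, a, b)
print(sorted(remaining))  # [1, 2, 3, 6]
```

Document 3 hits both term lists, so it survives the exclusive filter. Eliminating all of B instead of B – X would have dropped it along with documents 4 and 5.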
