The biggest hurdle that researchers face in data mining is unstructured text, since the majority of information available online is not structured or organised properly.
Open-Ended Questions: Another place where information is collected in unstructured form is open-ended survey responses on the topic under study. The purpose of open-ended questions is to let respondents express their opinions or views without being constrained to a particular dimension or predefined format. These questions let the researcher explore areas and viewpoints that might remain untouched in a more structured, closed-ended questionnaire. Text mining techniques help in segregating and classifying the data collected from such open-ended sources.
Automatic processing of messages and emails: This is another useful application area of text mining. It helps to automatically filter mail and messages into desirable and undesirable ones on the basis of predefined terms. It identifies junk mail from certain terms or words and makes it easy to discard such messages automatically. The same technique also aids in routing messages to the appropriate agency or department automatically, on the basis of a predefined system.
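The filtering and routing described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the term lists `JUNK_TERMS` and `ROUTES`, the department names, and the function `route_message` are hypothetical and invented for the example. A production filter would normally learn its terms from labelled mail rather than hard-code them.

```python
import re

# Hypothetical term lists (assumptions for illustration only).
JUNK_TERMS = {"lottery", "winner", "prize"}
ROUTES = {
    "billing": {"invoice", "refund", "payment"},
    "support": {"error", "crash", "password"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route_message(text):
    """Return 'junk', a department name, or 'unassigned'."""
    words = tokenize(text)
    # Any predefined junk term marks the whole message as junk.
    if words & JUNK_TERMS:
        return "junk"
    # Otherwise pick the department whose term list overlaps the message most.
    best, best_hits = "unassigned", 0
    for department, terms in ROUTES.items():
        hits = len(words & terms)
        if hits > best_hits:
            best, best_hits = department, hits
    return best
```

With these lists, `route_message("Please send a refund for my invoice")` returns `"billing"`, while a message containing none of the predefined terms falls through to `"unassigned"`.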
Analysis of warranty or insurance claims, interviews, etc.: In many businesses, a lot of information is collected in open-ended textual form: warranty claims, patients' medical information or history, automobile owners' profiles. All this information needs to be sorted and categorized so that appropriate action can be taken. To make the best possible use of this information, it is clustered into relevant data sets with the aid of data mining.
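As a rough sketch of what clustering free-text claims can look like, the Python below groups texts by word overlap. It uses a simple greedy single-pass scheme with Jaccard similarity, chosen here only because it is short. It is not necessarily the method any particular tool uses, and the stopword list, the `threshold` value, and the function names are all assumptions made for the example.

```python
import re

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "of", "my", "is", "was",
             "in", "on", "to", "after", "for"}


def tokens(text):
    """Return the set of lowercase words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(texts, threshold=0.2):
    """Greedy single-pass clustering; returns lists of text indices."""
    clusters = []  # each entry: {"vocab": set of words, "members": [indices]}
    for i, text in enumerate(texts):
        words = tokens(text)
        best, best_score = None, threshold
        # Find the existing cluster most similar to this text.
        for c in clusters:
            score = jaccard(words, c["vocab"])
            if score >= best_score:
                best, best_score = c, score
        if best is None:
            # Nothing is similar enough, so start a new cluster.
            clusters.append({"vocab": set(words), "members": [i]})
        else:
            best["vocab"] |= words
            best["members"].append(i)
    return [c["members"] for c in clusters]
```

Given two claims about a stalled car engine and two about a cracked phone screen, interleaved, this sketch places each pair in its own cluster. The result depends on the order of the input and on the threshold, which is one reason real projects usually prefer an established clustering algorithm.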
Competitor investigation by website crawling: Sometimes there is a need to find out what content is on a competitor's site. The most significant terms and features can be identified using text mining techniques. It is easy to see how these capabilities can efficiently deliver valuable business intelligence about the different activities of competitors.
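One common way to pick out the "most significant" terms on a page is TF-IDF weighting, which favours words that are frequent on one page but rare across the others. The sketch below assumes the pages have already been crawled and reduced to plain text. The crawling step is left out, and the function name `top_terms` and its parameters are hypothetical.

```python
import math
import re
from collections import Counter


def words(text):
    """Split the text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(pages, page_index, k=3):
    """Rank the terms of one page by TF-IDF against all pages."""
    docs = [Counter(words(p)) for p in pages]
    n = len(docs)
    target = docs[page_index]
    total = sum(target.values())
    scores = {}
    for term, count in target.items():
        # Document frequency: number of pages that contain the term.
        df = sum(1 for d in docs if term in d)
        idf = math.log(n / df)
        scores[term] = (count / total) * idf
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Words that appear on every page, such as "we" or "sell" on a set of shop pages, get an IDF of zero and drop to the bottom of the ranking, so the terms that distinguish one competitor's page from the rest rise to the top.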
I am still looking for a topic, and my area of interest revolves around text mining itself. Can you suggest a topic, so that I can be sure I have adopted the right approach and that my research has a good impact?