Comparing Data

My ultimate goal is to present a comparative analysis that EASILY shows the differences and similarities between the data I have assembled and the data others have assembled.

I have assembled a large set of data consisting of many short, succinct text strings. By contrast, the other data I have 1) is published in various file types (.pdf, .doc, .xls and others), 2) is formatted in various ways and 3) contains strings of text that are lengthy, poorly worded and of varying length.

How, in Excel and/or VB, can I compare and contrast the other data against my larger, predetermined data set? My goal is to mine the data QUICKLY, AUTOMATICALLY and with MINIMAL to NO HUMAN INTERACTION to determine which of the “other” strings of text match, nearly match or do not match my internal data set.
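
To make the three outcomes concrete, here is a rough sketch of the kind of classification I have in mind. It is written in Python purely for illustration (the real solution would still need to run in Excel/VB), and the function names, the sample strings and the 0.8 "near match" threshold are all assumptions of mine, not part of any existing tool.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def classify(candidate, reference_set, near_threshold=0.8):
    """Compare one external string against every internal string.

    Returns (label, best_reference, score), where label is one of
    "match", "near match" or "no match". The threshold is an
    arbitrary assumption and would need tuning on real data.
    """
    cand = normalize(candidate)
    best_ref, best_score = None, 0.0
    for ref in reference_set:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, cand, normalize(ref)).ratio()
        if score > best_score:
            best_ref, best_score = ref, score
    if best_score == 1.0:
        label = "match"
    elif best_score >= near_threshold:
        label = "near match"
    else:
        label = "no match"
    return label, best_ref, best_score


# Hypothetical internal data set of short strings
internal = ["net revenue", "operating cost"]

for external in ["Net  Revenue", "net revenues", "zzz"]:
    print(external, "->", classify(external, internal))
```

Since the "other" strings are lengthy, a whole-string score like this would probably rate them poorly against short internal strings, so they would likely need to be split into phrases or sentences before being compared.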