Fast And Effective Approximations For Summarization And Categorization Of Very Large Text Corpora