Mining for Norms in Clouds: Complying to Ethical Communication through Cloud Text Data Mining

  • Authors:
  • Ahsan Nabi Khan;Aslam Muhammad;A. M. Martinez Enriquez

  • Venue:
  • UCC '12 Proceedings of the 2012 IEEE/ACM Fifth International Conference on Utility and Cloud Computing
  • Year:
  • 2012

Abstract

As the world realizes the power and efficiency of cloud computing, enhanced security and intelligence are needed in communication to filter out unethical data that violates norms in clouds. No categorization scheme for such filtering has yet been proposed. Numerous lists of banned, unethical, and objectionable words have been developed, with limited user satisfaction. These lists are usually generated manually, with some programmable extensibility for online forums and public newsgroups. We define a tool and methodology to categorize censored data. We statistically grow the set of words in the categorized data and tag hidden, seemingly neutral words with their meaning in context. Using computational linguistics tools, modified to suit our purposes, we analyze sample text from gigabytes of email newsgroup data hosted on cloud servers. The results present, in broad categories, a sample dataset of the words most frequently used to break norms in recent cloud communication. The categories separate cloud-server data found in newsgroups related to internet crimes, fraud, theft, anti-state elements, and other material of legal importance. The study thus demonstrates a tag cloud of the most frequent critical words in communications, from a legal and ethical point of view, in the current landscape of cloud databases.
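
The per-category frequency counting described in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: the category names, the seed word lists in `SEED_WORDS`, and the function names are hypothetical placeholders, since the abstract does not give the actual lists or implementation. The sketch only shows how flagged words might be tallied per category to produce the input for a tag cloud.

```python
import re
from collections import Counter

# Hypothetical seed lists for illustration only; the paper's real
# categories and word lists are not given in the abstract.
SEED_WORDS = {
    "fraud": {"scam", "phishing"},
    "theft": {"stolen", "steal"},
}

# Lowercase alphabetic tokens (apostrophes allowed inside words).
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split a message into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def category_frequencies(messages, seed_words=SEED_WORDS):
    """Count how often each seed word occurs, grouped by category."""
    counts = {category: Counter() for category in seed_words}
    for message in messages:
        for token in tokenize(message):
            for category, words in seed_words.items():
                if token in words:
                    counts[category][token] += 1
    return counts


def top_words(counts, n=10):
    """Return the n most frequent flagged words per category."""
    return {
        category: counter.most_common(n)
        for category, counter in counts.items()
    }
```

The output of `top_words` (word and count pairs per category) is the kind of data a tag cloud renderer would consume, with font size scaled by count. Growing the lists statistically and tagging neutral words in context, as the abstract describes, would need further steps that this sketch does not attempt.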