Difference between revisions of "Mapping drugs and medicine websites into VAD space"

From Master Projects
(New page: {{Masterproject |Master name=Technical Artificial Intelligence |Student name=Alberto Caroli |Project start date=2015/04/13 |Project end date=2015/10/13 |Supervisor=Guszti Eiben |Second sup...)
 
Line 12: Line 12:
 
|Poster=Posternaam.pdf
 
|Poster=Posternaam.pdf
 
}}
 
}}
 +
Within the EU TAFEIC project (Tools Against Financial and Economic Internet Crime), [http://www.parabots.nl/ Parabots] and [http://sentient.nl/ Sentient] are collecting large sets of websites related to the online sale of drugs. The goal of this thesis is to develop a way to cluster these sites and to give fiscal inspectors a tool that visualizes the data clearly and helps them identify malicious and/or fraudulent websites.
 +
To do so, a 3D representation will be used to show groups of similar websites, so that the final representation is sphere-shaped. Instead of applying a traditional dimensionality reduction algorithm, the three dimensions will be taken and adapted from psychological theory, starting from Wundt's three-dimensional theory of emotion [1]. Based on the studies by Osgood et al. [2], the dimensions used are valence, arousal and dominance.
 +
Valence defines the extent to which a website is trying to sell illegal drugs, rather than selling legal ones or treating the subject in other permitted ways.
 +
Arousal measures how often a certain website has been updated.
 +
Dominance is a measure of the popularity of the website, based on its Alexa rank and Google PageRank.
 +
 +
The purpose of this study is to see whether a model from a different (namely psychological) field can be mapped coherently onto another context, in our case drugs and medicine websites, and yield clusters that are useful to the final users of the tool (fiscal inspectors). The second purpose is to see how a three-dimensional representation, rather than the usual two-dimensional one, can help users understand and make use of the outcome of a clustering task.
 +
 +
'''References'''
 +
 +
[1] Reisenzein, R. (1992). A structuralist reconstruction of Wundt's three-dimensional theory of emotion.
 +
 +
[2] Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana, IL: University of Illinois Press.

Revision as of 20:05, 31 May 2015


Title: Mapping drugs and medicine websites into VAD space
Status: ongoing
Master: Technical Artificial Intelligence
Student name: Alberto Caroli
Dates
Start: 2015/04/13
End: 2015/10/13
Supervision
Supervisor: Guszti Eiben
Second supervisor: Bas Weitjens
Company: Sentient
Thesis: Media:Thesis.pdf
Poster: Media:Posternaam.pdf

Signature supervisor



..................................

Abstract

Within the EU TAFEIC project (Tools Against Financial and Economic Internet Crime), Parabots and Sentient are collecting large sets of websites related to the online sale of drugs. The goal of this thesis is to develop a way to cluster these sites and to give fiscal inspectors a tool that visualizes the data clearly and helps them identify malicious and/or fraudulent websites. To do so, a 3D representation will be used to show groups of similar websites, so that the final representation is sphere-shaped. Instead of applying a traditional dimensionality reduction algorithm, the three dimensions will be taken and adapted from psychological theory, starting from Wundt's three-dimensional theory of emotion [1]. Based on the studies by Osgood et al. [2], the dimensions used are valence, arousal and dominance. Valence defines the extent to which a website is trying to sell illegal drugs, rather than selling legal ones or treating the subject in other permitted ways. Arousal measures how often a certain website has been updated. Dominance is a measure of the popularity of the website, based on its Alexa rank and Google PageRank.
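As a rough illustration of the mapping described above, the sketch below turns per-site measurements into a (valence, arousal, dominance) triple and then squeezes that triple into a unit sphere. It is only a sketch under stated assumptions: the field names, the normalization constants (max_updates, max_rank), the equal weighting of Alexa rank and PageRank, and the cube-to-sphere rescaling are all hypothetical choices and are not taken from the project itself.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteFeatures:
    """Hypothetical per-site inputs; these field names are assumptions."""
    illegal_sales_score: float  # 0.0 (legal content) .. 1.0 (illegal sales)
    updates_per_month: float    # observed update frequency
    alexa_rank: int             # 1 is the most popular site
    pagerank: int               # Google PageRank, 0..10


def clamp(x, lo=0.0, hi=1.0):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def to_vad(site, max_updates=30.0, max_rank=10_000_000):
    """Map raw features to (valence, arousal, dominance), each in [0, 1]."""
    valence = clamp(site.illegal_sales_score)
    arousal = clamp(site.updates_per_month / max_updates)
    # Ranks span several orders of magnitude, so compare them on a log scale.
    rank = min(max(site.alexa_rank, 1), max_rank)
    rank_score = 1.0 - math.log10(rank) / math.log10(max_rank)
    pagerank_score = clamp(site.pagerank / 10.0)
    # Equal weighting of the two popularity signals is an assumption.
    dominance = 0.5 * rank_score + 0.5 * pagerank_score
    return (valence, arousal, dominance)


def to_unit_ball(vad):
    """Rescale [0, 1]^3 to [-1, 1]^3, then map that cube into the unit ball."""
    x, y, z = (2.0 * c - 1.0 for c in vad)
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    # Points on the cube surface land on the sphere surface.
    scale = max(abs(x), abs(y), abs(z)) / norm
    return (x * scale, y * scale, z * scale)
```

With these assumed settings, a site that scores the maximum on all three dimensions lands on the surface of the sphere, while a site with middling scores sits near the centre.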

The purpose of this study is to see whether a model from a different (namely psychological) field can be mapped coherently onto another context, in our case drugs and medicine websites, and yield clusters that are useful to the final users of the tool (fiscal inspectors). The second purpose is to see how a three-dimensional representation, rather than the usual two-dimensional one, can help users understand and make use of the outcome of a clustering task.
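The abstract does not name a clustering algorithm. Purely as an illustration of what a clustering task over three-dimensional points could look like, the sketch below uses plain k-means (Lloyd's algorithm), which is an assumption made here and not the project's stated method. The function name and the choice to seed the centroids with the first k points are likewise hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on 3D points.

    Centroids start at the first k points, which keeps the sketch
    deterministic; this seeding is an assumption, not a recommendation.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids
```

Called as kmeans(points, 2) on a list of (valence, arousal, dominance) triples, this returns one cluster label per website plus the two cluster centres, which a 3D viewer could then colour or group.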

References

[1] Reisenzein, R. (1992). A structuralist reconstruction of Wundt's three-dimensional theory of emotion.

[2] Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana, IL: University of Illinois Press.